Moving to Real World Benchmarks in SSD Reviews @ [H]

FrgMstr

Moving to Real World Benchmarks in SSD Reviews - Many of our readers embrace our "real world" approach with hardware reviews. We have not published an SSD review for almost 2 years while we have been looking to revamp our SSD evaluation program. Today we wanted to give you some insight as to How We Learned to Stop Worrying and Love the Real-World SSD Benchmark.
 
I just care about the access times, price and size. Everything else...okay whatever.

Will look forward to the results and hopefully we'll have some interesting distinguishable data. (fingers crossed)

I also just wish sites would quit posting up extensive RAM reviews. Waste of life.
 
I'm all for it. Synthetic benchmarks are close to being worthless IMO.
 
I just care about the access times, price and size. Everything else...okay whatever.

By size you mean capacity and not physical size I presume?
 
Will look forward to the results and hopefully we'll have some interesting distinguishable data. (fingers crossed)

^This. I want simple, side-by-side comparisons showing time-saving advantages, including performance/endurance loss over the drive's lifetime, since that's the biggest worry from my customers. A drive is rated @ 70GB/day for 5 years? Do your best to write 128TB worth of data on and off the drive (how long would that take? A few days? I dunno..) and see how it performs.

Personally I'd like a benchmark showing what drive will get me loaded into a game of Natural Selection 2 faster :O I have two Samsung 840 Pros in RAID 0 and it's still a bitch >_<
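On the "how long would writing 128TB take" question - a quick back-of-the-envelope in Python, assuming the drive can sustain roughly 400 MB/s of sequential writes the entire time (an illustrative figure, not a measurement):

# Rough time to burn through a 5-year write-endurance rating.
total_bytes = 70e9 * 365 * 5    # 70 GB/day for 5 years ~= 128 TB
assumed_rate = 400e6            # bytes/sec sustained; illustrative only
days = total_bytes / assumed_rate / 86400
print(f"{days:.1f} days")       # ~3.7 days of non-stop writing

So "a few days" is about right at full sequential speed; a realistic mix with small random writes would stretch that out considerably.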
 
What I would like to see is getting away from industry "standard" testing protocols and moving to benchmarks that cover lifecycle, lifespan, access times, write times, and how error correction compares from traditional HDDs to SSDs. Maybe a way to show degradation over time?

But I'd also like to see things like erase/write cycles of real-use data: movies, games, web page access, etc.
 
I agree with somehow being able to show degradation over time. It would tell the real story of an SSD's ability to serve its purpose and how long it will do so - total cost of ownership. If SSD A priced at $150 only has a life expectancy of 3-4 years, then SSD B priced at $180 with a life expectancy of 6-8 years may clearly be the better option for most users...even if SSD B offers only 75% of the real-world performance.
 
If the SSD Endurance Experiment is any indicator, it would take a prohibitively long time to measure life expectancy and degradation. It took them 18 months to fully complete their tests. Every 250GB drive had more than 700 terabytes of data written to it. Two of them lasted to over 2 petabytes. Extrapolating that to 512GB drives, most should last a minimum of 1.4 petabytes, with the high-end ones going over 4 petabytes, and 36 months (3 years) of testing required. And 1TB SSDs would double those numbers.

No, degradation testing really cannot be done in a reasonable amount of time. Especially on larger drives.
 
No, degradation testing really cannot be done in a reasonable amount of time. Especially on larger drives.

Agreed, unfortunately. It's not really feasible.
 
All I truly care about is comparisons for how SSD performance will impact gaming.

Specifically, as someone who plays MMOs (SWTOR), I care about load times and any other type of performance boost that can potentially be gleaned from an SSD.


Also, I am curious how, if at all, the new Intel Skylake architecture will impact SSD speed and load times.
 
I also just wish sites would quit posting up extensive RAM reviews. Waste of life.

Oh boy are you right there. All we really need in a RAM review is: did it run at the advertised clocks, and what was the max overclock they got. Everything else is just filler.

It's interesting to see [H] trying to use better criteria to test SSDs. I honestly believe that all but the absolute worst SSDs currently made are sufficient for all client workloads. It would be nice to know which ones are worth more than the others. I've been trying to stick to big names (currently I have a Samsung 840 PRO as my primary and it's been the best SSD I've ever bought) but it would be nice to know if there was a cheaper SSD that performed on par or better. Also, some drives like OCZ's Vector drives seem to display some rather fishy behaviour and it would be nice to know more about that.

The other problem is that pretty much all these SSDs are saturating SATA so if you're buying a SATA SSD it doesn't matter anymore.

[H] is currently the only site on the web that properly tests PSUs (a few others come close) so I'm confident you guys can come up with a better methodology.
 
You should look into FIO .... It will allow you to see all of the disk stats during a run.

Also... TRIM is a way to cover up shitty NAND and poorly coded GC routines.
You should always run benchmarks for a long enough time to observe a steady state.
AnandTech tends to do a good job with this.
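For anyone who hasn't played with fio, a minimal job-file sketch for that kind of steady-state run - every parameter is just a starting point, and the target is a placeholder (raw-device writes are destructive, so aim it at a scratch drive only):

; fio job: hammer the drive with 4K random writes long enough to leave
; the fresh-out-of-box state and settle into steady state
[global]
ioengine=libaio        ; use windowsaio on Windows
direct=1               ; bypass the OS page cache
filename=/dev/sdX      ; PLACEHOLDER - this destroys data on the target
bs=4k
iodepth=32

[randwrite-steady]
rw=randwrite
time_based
runtime=1800           ; run long; watch for throughput to flatten out

The latency percentiles fio reports per run are also handy for spotting the GC stalls mentioned above.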
 
You could look at using virtual machine software like Hyper-V or VirtualBox. The "clone VM" tool in VirtualBox is incredibly taxing on storage, where you can see obvious gains going from single drive to RAID to NVMe.

Just a thought.
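A rough sketch of how to time that, for anyone curious - assumes VBoxManage is on the PATH and a VM named "TestVM" exists (the name is invented):

# Time a VirtualBox full clone as a crude storage workload.
import subprocess, time

start = time.time()
subprocess.run(["VBoxManage", "clonevm", "TestVM",
                "--name", "TestVM-clone", "--register"], check=True)
print(f"Clone took {time.time() - start:.1f} seconds")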
 
When you do video card reviews there are two metrics...some minimum FPS which feels good (usually 30+) and some notion of "stuttering" which can only be felt in most cases by actually playing the game. I think SSDs on [H] should be reviewed the same way: a set of cases where you feel this is "a good speed", but then also how it feels when being used in a specific application or a series of events. Dunno...
 
That would be very nice. I second this. Some actual "in-game" tests.
 
You could look at using virtual machine software like Hyper-V or VirtualBox.

That's not an everyday workload for the average gamer though.
 
You could look at using virtual machine software like Hyper-V or VirtualBox.

That's a good suggestion, and worth looking into. My concern would be that cloning a VM (rather than testing performance of one that's running in some way) becomes another large file copy test, which is meh. Anyway, in some fashion, virtual machines are a part of power-user workloads - I spin up VMs for development purposes all the time.

When you do video card reviews there are two metrics...some minimum FPS which feels good (usually 30+) and some notion of "stuttering" which can only be felt in most cases by actually playing the game.
It's a great methodology for video cards, but perhaps tougher to apply to SSDs in as intuitive a manner. With video card testing, you pretty much know where the bottlenecks are, and get immediate, visual feedback. I feel like the closest way to get to the same thing with SSDs is through performance stability testing.
 
You should always run benchmarks for a long enough time to observe a steady state.

Agreed all around. FIO looks like a real improvement over other synthetics.
 
All I truly care about is comparisons for how SSD performance will impact gaming. Specifically, as someone who plays MMOs (SWTOR), I care about load times.

Different MMO(s) same thought.

ESO, TSW mostly.

Pillars of Eternity has some long load times too.
 
Find out what the bottleneck is for Star Citizen. That takes damn near forever to load. I'm assuming it's actually a processor bottleneck, but if it's somehow my SSD and there's a better one, that'd be good to know too. :)

I bought a 5960x to see if it lowers it.... and for encoding. Who am I kidding, it's for Star Citizen. Bahahaha
 
Different MMO(s) same thought. ESO, TSW mostly.

Yep.

I think MMOs might be something these drive reviews could focus on, since load times seem to be longer in that genre vs your usual FPS or action game.
 
Agreed all around. FIO looks like a real improvement over other synthetics.

It's really a shame that Intel doesn't distribute their iPEAK test platform anymore. Being able to use a single trace to replicate the test settings across multiple drives would probably be beneficial for this.

While I do agree that cloning a VM is a heavily sequential-influenced workload, when cloning OS files there are usually thousands of tiny files that play a role too. I'm not saying you are wrong, but I think it's more than a circlejerk to look at.
 
A few ideas from my own workloads, both work and personal:
* Updating Windows images.
* Windows install.
* Imaging drives to and from (a timed image apply, like the sketch below, would be repeatable).
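For the imaging item, a rough sketch of how to make it repeatable between drives - the paths and image file are made up, and DISM needs an elevated prompt:

# Time a Windows image apply onto the drive under test.
import subprocess, time

start = time.time()
subprocess.run(["dism", "/Apply-Image",
                "/ImageFile:C:\\images\\install.wim",
                "/Index:1", "/ApplyDir:D:\\apply"], check=True)
print(f"Image apply took {time.time() - start:.1f} seconds")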
 
Often, when doing time comparisons, reviewers like to conclude with "effectively" no difference. I am interested in every tiny choke-point and anomaly. Please don't average these out, or gloss over them; instead try to exploit these differences and give an analogous comparison. While Photoshop swap space makes for an easy example, recognize that there are many moments during general use that can add up to a subjective improvement/degradation.

I'm not looking for start times of computers, but I am looking for noteworthy incremental improvements that separate the new from the old - from a professional review (rather than every forum member with an e[H]ard-on).
 
Find out what the bottleneck is for Star Citizen. That takes damn near forever to load.

My understanding is that Star Citizen is currently processor bottlenecked, especially for loading. It does not use multithreading. I believe it does most everything off of one core. Increasing your clock speed may help.

I just spend most nights endlessly loading and unloading into failed matches and back into the hangar. You really feel those load times.
 
The question for gamers has already been answered on the 1st page:
The fact is that for the vast majority of consumers, pretty much any current SSD will do what you need.

Once you've moved off of mechanical drives, the delays experienced in gaming rarely involve the local storage.
 
My understanding is that Star Citizen is currently processor bottlenecked, especially for loading.

I don't have a copy of Star Citizen, but this would be a good test for the workload profiling stuff I describe in the article for anyone who does :cool:
 
I thought it might be nifty to download a couple of the apps and check them out and see if I am bottlenecking with drive access, queue depth or any of the new info I picked up.

I have run into a wall of sorts; help and suggestions would be very much appreciated.

I have:
c: 221 GB : Boot / Windows / ESO / Word and a few other key games and apps
d: 1.36 TB : TSW / PoE / big installs, MMO or otherwise / pagefile
e: 1.81 TB : low-energy/green drive - bulk data - old games etc

Overall I have plenty of space for everything I do now and might do any year soon - bulk hard drive storage anyway.

All the tools I have looked at (and I have not looked at all of them) want to measure a specific drive. If I am playing TSW - my current main game - along with Pillars of Eternity, I really need to know how two different drives going through different controllers are behaving.

See sig for more details.

Do any utilities look at the whole picture and not just one subsystem? Any utilities that can test how well a given application is stressing the system?
 
If I am playing TSW - my current main game - along with Pillars of Eternity, I really need to know how two different drives going through different controllers are behaving.

Do any utilities look at the whole picture and not just one subsystem?
What's TSW?

You can open multiple instances of HD Tune to profile your disks independently. Otherwise, you can use good ol' PerfMon in Windows and set it up with the counters you need.
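Or, if you're comfortable with Python, the psutil module can sample every physical disk at once, which gets at the whole-picture view you're after (a rough sketch; pip install psutil first):

# Print per-disk throughput every 5 seconds across ALL disks at once.
import time
import psutil

prev = psutil.disk_io_counters(perdisk=True)
while True:
    time.sleep(5)
    cur = psutil.disk_io_counters(perdisk=True)
    for disk, now in cur.items():
        old = prev[disk]
        rd = (now.read_bytes - old.read_bytes) / 5 / 1e6
        wr = (now.write_bytes - old.write_bytes) / 5 / 1e6
        print(f"{disk}: {rd:6.1f} MB/s read  {wr:6.1f} MB/s write")
    prev = cur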
 
This sounds like a plan (until proven to be a bad one ;) ). That said, Ken Rockwell? There are plenty of photo bloggers that talk about the spec-obsessed, but few are less respected.

Hah, regardless of your thoughts on Rockwell's writing, you've got to admit that "measurbator" is a great term :D It's also bandied about quite a bit, or at least used to be, on photography forums. I think it makes an important point about seeing the forest for the trees.
 
It does, but it's something most photographers have said for several years. If you drop onto a Canon/Nikon site, you'll see plenty of photographers constantly stress that it's the photographer, not the equipment (unless you're a lot better than most are). There was a time when specs mattered, but in the last 3 or 4 years there's really not a DSLR out there that isn't more than enough for virtually all non-pro users (especially if you're not a low-light shooter).

But enough of that. I agree with the way [H] is going. I'm far more interested in real-world results than a number that may or may not be meaningful.
 
One thing I think might be useful, even though as a comparison tool they're worthless, is to run one standard synthetic benchmark using incompressible data as a validation technique to show that the system is configured properly and the SSD is performing "as expected". The synthetic benchmark result could even be scrubbed down to a "pass" or "fail" grade, with an explanation if it fails or if it only passes after the system has been given non-standard settings.

For example, we can expect any SSD using a comparable-series SandForce controller to fall within an expected range of write, read, and IOPS performance. So after installation (and maybe after the manufacturer-specific system optimization tool has been run?), run the synthetic benchmark to find out if the SSD is working within an expected range of values.

For example, I've found that my old Intel mSATA SSD seems to need to be TRIM'd more often than expected in order to keep performance anywhere near specified speeds. Using a light web browsing and MS Office workload for a day or two can cut data throughput by up to 80%. I couldn't put my finger on why my system seemed to be fast one day and slow for days afterwards and then fast again for a day, until I started running some synthetic benchmarks to try to figure it out. I'd see miserably low numbers before running TRIM, and then I'd get numbers almost identical to the advertised specifications after TRIM, but for just a day or two before the performance crapped out again.

So running a synthetic bench to verify proper operation might be valuable. Hide the results with a pass/fail grade, if necessary. Maybe run the synthetic bench before real benchmarking begins, and again after it is all over, to see if there is any change in measured raw throughput. Again, maybe hide the raw numbers and only post a pass/fail for the initial configuration, and the percentage change between before and after runs?
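Something like this is all the logic it would take - the controller family and expected ranges below are pure placeholders for illustration:

# Reduce a synthetic result to pass/fail against an expected range
# for the drive's controller family. All numbers are placeholders.
EXPECTED_SEQ_WRITE = {
    "sandforce-2281": (400.0, 520.0),  # MB/s, incompressible data; illustrative
}

def validate(controller: str, measured: float) -> str:
    low, high = EXPECTED_SEQ_WRITE[controller]
    if measured < low:
        return f"FAIL ({measured:.0f} MB/s, expected {low:.0f}-{high:.0f})"
    return "PASS"

print(validate("sandforce-2281", 95.0))  # a drive badly needing TRIM would fail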
 
One suggestion about workloads:
When I played Dragon Age I a few years back, I noticed that it installed and used an instance of MSSQL 2005. I do not know how intensively that DB was used, but it might be a case of a client application with a DB workload.
I don't even know if there are modern games using a full-fledged DB engine backend, but maybe there's something to investigate here.
 
So running a synthetic bench to verify proper operation might be valuable. Hide the results with a pass/fail grade, if necessary. Maybe run the synthetic bench before real benchmarking begins, and again after it is all over, to see if there is any change in measured raw throughput.
This is definitely a frustrating scenario, and mirrors what I've seen with the SandForce drives - after preconditioning (as one must always do before testing), they're great when you first start out, and then seem to choke for a while... until TRIM catches up. As for determining whether a drive performs as advertised... the problem with that is that there's a remarkable amount of fine print that goes into the "advertised" numbers. Still, doing some degree of differential testing, whether it's a performance-stability test series or a pre/post comparison, makes a lot of sense.

When I played Dragon Age I a few years back, I noticed that it installed and used an instance of MSSQL 2005.
You'll see a lot of applications that use embedded databases (SQLite, Firebird, Interbase, or one of the embedded versions of MS SQL). Something very similar is also allowed for websites as part of the HTML5 spec. When you see this sort of thing, it's often very sporadic access that won't cause a bottleneck, but I remember a trading application I used to work with, Multicharts.NET, that used an embedded Firebird database for pricing data. They've had a major release since I last used the software and I'm not sure if it's still the case, but WOW, that thing was a dog with a lot of time-series data.
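For a feel of what that embedded-DB access pattern looks like, here's a toy sqlite3 sketch (table, symbol, and numbers all invented) - lots of small journaled writes, which is exactly the pattern that separates drives with good small-write behavior from the rest:

# Toy time-series workload against an embedded SQLite database.
import sqlite3

con = sqlite3.connect("prices.db")
con.execute("CREATE TABLE IF NOT EXISTS ticks (ts INTEGER, symbol TEXT, price REAL)")
with con:  # one transaction; per-row commits would be far more punishing
    con.executemany("INSERT INTO ticks VALUES (?, ?, ?)",
                    [(i, "XYZ", 100.0 + i * 0.01) for i in range(10000)])
print(con.execute("SELECT COUNT(*) FROM ticks").fetchone()[0])
con.close()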
 
Great points all. You do, however, realize you just created a heap of work for yourselves (and, hopefully, the larger reviewer community) in making these points, right? :)
 
Synthetic benchmarks are still useful to some degree. They helped me confirm my perception that my AMD motherboard RAID support is inferior in almost every way to Storage Spaces built right into Windows, for a secondary game-install array. For the most part, I still think synthetic benchmarks should play a part in SSD reviews, just like they get their page in video card reviews.
 
Well yeah, they still work well to tell you if a drive is up to spec or not.

I've returned one Intel drive that was NIB because it ran at 50% performance vs all the others right out of the box... sometimes you just get a LEMON.
 
"measurbator"

Typo? Didn't you mean masturb... no, wait... :D:D
 
As with many other components - CPUs, GPUs, et al. - SSDs are quickly becoming "fast enough".

Even a Kingston value SSD is orders of magnitude faster than a mechanical drive, but you won't see the same difference between the Kingston and an Intel 750. Yes, the Intel will be much faster, but only in a few instances will that make a difference.

It's somewhat akin to Core i5 vs Core i7. Is the i7 faster? Yes, in some cases up to 80% or more. But is it faster in games? Not really, and when it is, it's not by much.

But there are cases where, for example, an Intel 750 will make a huge difference - say video editing or Photoshop - and that won't show in synthetic benchmarks. That's where I see value in real world testing.


Can't wait to see your "first" SSD review.
 