Help me test systems with a stack of Enterprise SSDs

eduncan911

n00b
Joined
Mar 4, 2012
Messages
15
I am a recent father of 16x 800GB SAS2 6Gb/s SSDs and would like to use them to test various server configurations I could put together. Basic specs are 525 MB/s read, 450 MB/s write each.

I have a (short) window in which I can use a few chassis from the homelab/home LAN I am rebuilding, as well as some chassis for a school I am building this summer, to test everything.

I've never really done disk I/O testing before, certainly not at this scale; nothing beyond CrystalDiskMark a decade ago. So any help on which (Linux) tools to use, or how to tax all 16x SSDs at once, would be appreciated.

CPUs range from X9 and X10 LGA2011 boards to a Threadripper 2950X. Everything has 64GB of DDR3 or DDR4 RDIMM ECC. I could combine a few mixed-and-matched sets to get 128GB into a single-machine testbench (ZFS?!?), but I'd prefer not to.

Generally, how would you stress/tax 16x SSDs on a Linux system? I'm measuring mostly bandwidth, but also latency, since one idea is to use them in a Ceph cluster.
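To put numbers on what I mean, I'm assuming something like fio is the right tool (not something I've used yet, so treat this as a sketch); the device names below are placeholders for whatever the HBA enumerates, and both jobs are read-only:

# aggregate sequential read bandwidth across several drives at once
fio --name=seqread --ioengine=libaio --direct=1 --rw=read --bs=1M \
    --iodepth=32 --runtime=60 --time_based --group_reporting \
    --filename=/dev/sdb:/dev/sdc:/dev/sdd:/dev/sde

# 4K random reads at queue depth 1 to see a single drive's latency percentiles
fio --name=randlat --ioengine=libaio --direct=1 --rw=randread --bs=4k \
    --iodepth=1 --runtime=60 --time_based --filename=/dev/sdb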

Verifying the drives as a baseline is one thing, but I'm really after testing the various SAS2 controllers and backplanes I have, across 3 configurations (4 if you count a JBOD backplane I could swap in, though it's limited to only 8x drives).
 
If you still have access to those SSDs, check these articles out:
I used both dd and hdparm for stress testing some HDDs a few months ago, and I'm sure they would work fine on SSDs, too.
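Something along these lines, for reference; replace /dev/sdX with the real device, and note that the last command overwrites the disk, so only point it at a drive you're willing to wipe:

# cached vs. buffered read timing
hdparm -tT /dev/sdX

# sequential read, bypassing the page cache
dd if=/dev/sdX of=/dev/null bs=1M count=10000 iflag=direct status=progress

# sequential write -- destroys everything on /dev/sdX
dd if=/dev/zero of=/dev/sdX bs=1M count=10000 oflag=direct status=progress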

Thanks for the follow-up!

Turns out all of the SSDs are at their max write endurance (wear level). SMART on these does not report a wear-level percentage, only the number of sectors moved (STEC's counter) and failed sector moves (max wear level reached / reserved blocks all used up). They have 200GB of reserve for write endurance, so if you figure 4 KiB per block multiplied by the number of sector moves, most of them come in at or above that 200GB value!!
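Back-of-the-envelope: at 4 KiB per moved sector, chewing through a 200GB (decimal) reserve works out to roughly 49 million moves, which is what I compared the raw counters against. How STEC exposes that counter is vendor-specific, so this is just the rough math plus a full SMART dump to read the raw values from:

# sector moves needed to consume 200GB of reserve at 4096 bytes per move
echo $(( 200 * 1000**3 / 4096 ))   # ~48.8 million

# full SMART dump; the sector-move counters are in the vendor section
smartctl -a /dev/sdX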

Not to mention, after several ZFS and Ceph tests, 7 of the 16 drives are now showing serious SMART errors and 2 of them are just dead. Ceph refuses to bring up OSDs on 3 of the drives because of their errors. :(
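For anyone else trying to triage a pile like this, a quick SMART sweep (a sketch; adjust the device range to however your drives enumerate) is enough to separate the survivors from the casualties before handing anything to ZFS or Ceph:

# overall health verdict plus error counters for each drive
for d in /dev/sd{b..q}; do
    echo "=== $d ==="
    smartctl -H -l error "$d"
done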

But, as for the results ...

Turns out SAS2 @ 6 Gbps is limited to 24 Gb/s per 4-lane SFF-8087 link (we knew this), which I pegged easily at 2.9-3.1 GB/s sustained on both reads and writes with only 8-10 disks working. Only my SC846 chassis has dual SFF-8087 connectors for 8x 6 Gbps lanes (48 Gb/s total bandwidth). Unfortunately, I didn't have enough 3.5" to 2.5" trays to mount the drives in it. I 3D printed several adapters, but in the end decided to scrub the project due to the number of failed drives and end-of-life sector moves.
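The back-of-the-envelope on why so few disks hit that wall, using the 525 MB/s per-drive read spec from the first post (line-rate math, ignoring protocol overhead):

# single 4-lane SFF-8087 link: 4 x 6 Gbps = 24 Gbps ≈ 3 GB/s
# drives needed to saturate it at 525 MB/s each: about 6
echo "scale=1; (4 * 6 * 1000 / 8) / 525" | bc    # ~5.7

# dual SFF-8087 (SC846 backplane): 8 x 6 Gbps = 48 Gbps ≈ 6 GB/s
echo "scale=1; (8 * 6 * 1000 / 8) / 525" | bc    # ~11.4, so roughly 12 drives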

They were free, so no big deal. Just... What a waste of beautiful drives.

Turns out these were lemons all along: STEC was bought by WD right around when these were released, and the drives failed horribly, with warranty replacements going out as other WD enterprise drives.

Maybe I can find something to do with the massively large (and removable!!) capacitors...
