Best Drives for ZFS

PANiCnz

It's time to upgrade the drives in my ZFS box (3 drives in RAID-Z1), but I'm really unsure which particular model/brand I should be considering.

From what I've read, Hitachi used to be pretty popular, but their drives seem pretty elusive these days.

If there isn't a particular brand or model that stands out from the rest, is there one I should be avoiding? I keep hearing lots of horror stories about the WD Greens, but it's hard to find a single source of truth.

I'm not really that keen to shell out for enterprise-grade drives, so I'm looking at the various Seagate and WD consumer models around the 3TB capacity.
 
I would say these drives are really good, but the info is dated.
I think the current Seagate 3TB models (ST3000DM001), with the 2-year warranty back again, are pretty solid and don't crap around (TLER/WDIDLE) like current WDs do.

Furthermore, I think the hyped WD Red drives are better suited to hardware RAID and don't offer any real advantage in a ZFS setup that would justify the price WD demands.
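
If you want to see whether a given WD is actually doing the aggressive head parking that wdidle3 gets used to tame, a quick sketch like this works (assumes smartmontools is installed and you run it with enough privileges; /dev/sda is just a placeholder). A Load_Cycle_Count that climbs by hundreds a day is the usual giveaway:

[CODE]
# Rough sketch: read SMART attribute 193 (Load_Cycle_Count) via smartctl.
# Assumes smartmontools is installed and sufficient privileges; the device
# path is a placeholder.
import re
import subprocess

def load_cycle_count(device="/dev/sda"):
    out = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True, text=True, check=True
    ).stdout
    for line in out.splitlines():
        # Attribute table rows: ID# ATTRIBUTE_NAME ... RAW_VALUE (last column)
        if re.match(r"\s*193\s+Load_Cycle_Count", line):
            return int(line.split()[-1])
    return None

if __name__ == "__main__":
    print("Load_Cycle_Count:", load_cycle_count())
[/CODE]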
 
TLER does not matter in a software raid solution like ZFS.

I would buy whatever you are comfortable with.
 
I'm using several hundred Seagate 3TB ST3000DM001 drives running firmware CC4H with ZFS and can report that I've had no issues with them. They are also normally cheaper than other 3TB drives.

TLER definitely will mess up ZFS... whether you see it or not will depend on how heavily used your pools are. Mine are under extremely heavy load for a good portion of the day, every day. Under that kind of load, if a drive starts taking a while to respond, ZFS is going to perform horribly while the entire pool waits for the disk, or it will eventually fail the drive. Then you're stuck resilvering while you're under load, with the chance of more drives getting messed up. If all of your drives exhibit this behavior, you could easily end up with a failed pool, or at minimum terrible and inconsistent performance.

If this is just sitting in your house getting occasional use it might not matter much, but why even bother dealing with TLER drives when better drives are available for less money?
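
Whatever drives you end up with, a dumb little watchdog along these lines is worth having (just a sketch; it assumes the zpool CLI is on the PATH and that its healthy-pool wording hasn't changed). It at least tells you the moment ZFS starts flagging errors or faults a drive, instead of you finding out when the pool grinds to a halt:

[CODE]
# Minimal sketch: poll "zpool status -x" and complain when ZFS reports
# anything other than healthy pools. Assumes the zpool CLI is available.
import subprocess
import time

def pools_healthy() -> bool:
    out = subprocess.run(
        ["zpool", "status", "-x"],
        capture_output=True, text=True
    ).stdout.strip()
    # "zpool status -x" prints exactly this line when nothing is wrong.
    return out == "all pools are healthy"

if __name__ == "__main__":
    while True:
        if not pools_healthy():
            print("zpool reports a problem, check zpool status for details")
        time.sleep(60)
[/CODE]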
 
I'm using several hundred Seagate 3TB ST3000DM001 drives running firmware CC4H with ZFS and can report that I've had no issues with them. ...

Any further details on your setup? I'm looking to do something similar and have been weighing up either the WD Reds or the Seagate ST3000DM001s. No failures to date? Performance is good? I'm planning on running SC847 JBODs, 4x 45-bay loaded with 3TB drives, expanding to 8 JBODs. I definitely don't want to pull the pin and buy the wrong drives.
 
Any further details on your setup? I'm looking to do something similar and have been weighing up either the WD Reds or the Seagate ST3000DM001s. ...

I would strongly recommend against the use of SAS expanders like the SC847 has with SATA drives. It simply is not reliable and at some point will cause you a world of hurt.

If you're creative, you can directly attach a lot of drives without SAS expanders. Otherwise, you will need SAS drives if you want to use expanders.

There's a good thread on here somewhere about the use of SATA drives with SAS expanders with a lot more detail. My first-hand experience lined up with all of the other experiences in that thread.

What I found was that if a drive completely fails all at once, as if you had pulled it out, everything will be okay; that case it can handle. What that setup cannot handle is when a drive slowly starts failing: not responding to commands, sending back bad data, doing other random shit. The SAS expander has to multiplex and encapsulate all of the SATA drives' commands into SAS (via STP) to send back to the host, and if one drive starts doing unexpected things the whole SATA side gets fucked up and the expander can't talk reliably to any of the drives. On the ZFS side you'll just see climbing error rates on every drive in the pool, because the expander is confused by the failing drive and can't talk to any of them. IO will be frozen and you can't do anything on the host until you remove the failing drive, and in some cases you won't even be able to tell which drive is bad because the host can't talk to any of them. It's a complete ticking time bomb to run that kind of setup in production; I'll never do it again.

For my implementation it was more cost effective to have many hosts with directly attached drives instead of a couple very large ones with SAS drives.
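
If you do end up stuck with SATA behind an expander anyway, the one thing that helped me when things went sideways was comparing error counters across every drive: errors confined to one disk usually means a bad disk, while errors climbing on every disk at once is the classic sign the expander path itself is wedged. Here's a rough sketch that scrapes those counters out of zpool status (the pool name "tank" and the "sd" device prefix are placeholders, and the parsing is naive, so adjust for your platform):

[CODE]
# Sketch: pull READ/WRITE/CKSUM counters per device out of "zpool status".
# Naive text parsing; assumes the usual column layout and that device names
# start with a recognizable prefix ("sd" here is a placeholder).
import subprocess

def device_errors(pool="tank", dev_prefix="sd"):
    out = subprocess.run(
        ["zpool", "status", pool],
        capture_output=True, text=True
    ).stdout
    errors = {}
    for line in out.splitlines():
        parts = line.split()
        # Data rows look like: NAME STATE READ WRITE CKSUM
        if len(parts) >= 5 and parts[0].startswith(dev_prefix):
            try:
                errors[parts[0]] = tuple(int(x) for x in parts[2:5])
            except ValueError:
                pass  # header or summary line, skip it
    return errors

if __name__ == "__main__":
    errs = device_errors()
    bad = {d: e for d, e in errs.items() if any(e)}
    if errs and len(bad) == len(errs):
        print("errors on EVERY device, suspect the expander/HBA path, not one disk")
    elif bad:
        print("errors confined to:", ", ".join(sorted(bad)))
    else:
        print("no device errors reported")
[/CODE]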
 
I must admit I haven't read through the entirety of that thread; however, I haven't heard of any examples of this happening recently.

The scale-out nature of my customer's deployment necessitates expanders; directly attached isn't really feasible at these sorts of drive counts. If SAS disks are an absolute requirement, then that drastically changes the $/TB price point.

Do you mind if I ask what version of ZFS/OS you were running?
 
Thanks for the responses. I was leaning towards the ST3000DM001, so it's great to hear people have been using them successfully with ZFS.
 
I would strongly recommend against the use of SAS expanders like the SC847 has with SATA drives. It simply is not reliable and at some point will cause you a world of hurt. ...

Are you sure about this? I'm not challenging your expertise; however, I have a customer on a SuperMicro chassis with a backplane running 4 Crucial M4s in RAID as a SQL store, and after nearly a year there have been ZERO issues with their array or the SSDs, and these certainly are NOT SAS SSDs.

I was seriously contemplating using a SuperMicro CSE833 3U case for my ZFS NAS this coming December. I am probably going to use WD Red or RE4 drives in the NAS.
 
So we can't use WD RE drives either? I'm not understanding how TLER is harmful to ZFS if the command is part of the ATA standard. Isn't it the same thing as ERC on other drives?

Also, to the person who recommended those 3TB Seagates, there are A LOT of mixed reviews on the quality of those drives.
 
Are you sure about this? I'm not challenging your expertise; however, I have a customer on a SuperMicro chassis with a backplane running 4 Crucial M4s in RAID as a SQL store...

It's more of an if/when scenario; it's not that this never works. IF a drive fails in just the right shitty way, everything will go very wrong with SATA drives connected to a SAS expander. ZFS is pretty resilient, and if you can isolate the bad drive your pool will probably be safe after you replace it. It's just that it will be hard to isolate the failing disk, no IO will get to your pool until that disk is removed, and from that point on it's very possible you could end up losing the entire pool. For some use cases that's too much risk to take.

4 SSDs are probably going to run just fine behind an expander. You can get away with a bunch of hard disks behind one for a while too, until something goes wrong.

My advice comes more from the standpoint of running machines where you can't go to your boss and report that some box had a huge failure, all the data was lost, or it was down for 12 hours. I've had the SATA disk and SAS expander problem happen, and that's why I no longer run those setups. I personally wouldn't run one at home either, just because it's such a pain in the ass to deal with if something goes wrong.
 
It's more of an if/when scenario; it's not that this never works. IF a drive fails in just the right shitty way, everything will go very wrong with SATA drives connected to a SAS expander. ...

Thanks for the explanation thus far.

Now that I recall, my customer's SuperMicro chassis actually does not use an expander; rather, it uses a backplane. However, the backplane has individual SATA/SAS ports for each disk, and the disks are directly cabled to the RAID card using those SuperMicro ports.

It appears to be just a big PCB with SATA/SAS plugs that each drive tray slides into, and directly behind the PCB there are individual ports for the SAS/SATA cables to attach to. The only thing that appears to be shared on the PCB is power, as there are two Molex connectors for the PSU to plug into to distribute power to the hot-swap caddies.

Would this qualify as a SAS expander or just a plain backplane with individual I/O ports?

Can you take a look at page 11 (the picture) and tell me whether this looks like an expander or just a backplane, and if it is just a backplane, would it be safe to use for ZFS?

http://www.supermicro.com/manuals/chassis/3U/SC833.pdf
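
In case it helps, one host-side check I've been trying on Linux (just a sketch; it assumes the disks hang off a SAS HBA using the kernel's SAS transport class) is to look for expander entries in sysfs. If nothing shows up there, it's most likely a plain pass-through backplane:

[CODE]
# Sketch: on Linux, expanders attached through a SAS HBA show up under the
# kernel's SAS transport class in sysfs. An empty listing suggests a plain
# pass-through backplane (or no SAS HBA at all) rather than an expander.
import os

def list_sas_expanders(path="/sys/class/sas_expander"):
    if not os.path.isdir(path):
        return []  # SAS transport class not loaded on this system
    return sorted(os.listdir(path))

if __name__ == "__main__":
    expanders = list_sas_expanders()
    if expanders:
        print("expander(s) detected:", ", ".join(expanders))
    else:
        print("no SAS expanders visible; likely direct-attach or pass-through")
[/CODE]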
 
Also, to the person who recommended those 3TB Seagates, there are A LOT of mixed reviews on the quality of those drives.

There is simply no drive with only positive ratings :D
DOAs instantly become 1-star ramblings... then there are ratings referring to outdated/bogus firmware, setups where 1 out of 6 disks failed and took out some RAID pool, etc...

I really can't make an educated guess on hard-disk quality based on Newegg or similar customer ratings. I would be concerned, though, if hardforum filled up with reports of misbehaving hard disks or general firmware problems...
 
There is simply no drive with only positive ratings :D ...

Well there you also make some very good points indeed.
 
TLER definitely will mess up ZFS...

I was under the impression that having TLER would be better than not having it. A drive without TLER will take an arbitrary amount of time to recover from an unreadable sector, while TLER limits that to a defined value. I have seen drives (without TLER) spend more than 30 seconds trying to read a specific sector, long enough that the driver just reset the drive. I've yet to see a TLER-enabled drive with an unreadable sector, though.
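
For drives that do support it, you can actually query and cap that recovery time yourself. Here's a rough sketch using smartctl's SCT ERC interface (values are in tenths of a second, so 70 means 7.0 seconds; plenty of desktop drives simply refuse the command, and the device path is a placeholder):

[CODE]
# Sketch: query and optionally cap SCT Error Recovery Control (the ERC/TLER
# timeout) with smartctl. Values are deciseconds, so 70 == 7.0 seconds.
# Many desktop drives don't support SCT ERC and will simply report an error.
import subprocess

def show_erc(device="/dev/sda"):
    subprocess.run(["smartctl", "-l", "scterc", device])

def set_erc(device="/dev/sda", read_ds=70, write_ds=70):
    subprocess.run(["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device])

if __name__ == "__main__":
    show_erc()       # print the current read/write recovery limits
    # set_erc()      # uncomment to cap recovery at 7 seconds (needs root)
[/CODE]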
 