[zfs] Does SATA drive IOPS matter?

cepa

Hello, this is my first post on this forum, and I need some advice on a ZFS topic :)

I'm going to build a new storage server for my development lab. It will be directly connected to a separate machine running XenServer, via NFS over a single 1Gbit NIC. There are 20+ virtual machines running 24/7. Most of the time they're idle, but from time to time I trigger a set of automatic tests and then there is a lot of concurrent IO going to the storage :)
All I care about now is getting as many IOPS as possible without spending too much money.

Hardware:
- Intel server board (S3210SH)
- Single dual-core Xeon CPU on LGA775
- 8GB of DDR2 RAM
- Separate RAID controller card from LSI or 3Ware on a PCI-E or PCI-X slot
- SLC SSD for ZIL (Intel X25-E 20GB/32GB)
- MLC SSD for L2ARC (OCZ Vertex 3 64GB)
- SLC USB stick or SATA DOM module for the OS
- 8 SATA drives in a striped mirror configuration (RAID 10)

Software:
- FreeNAS 8 or generic FreeBSD with ZFS

I need more or less 1TB of capacity, which will be an NFS export with thin provisioning.

The question is which SATA drives I should use for the vdevs. I'm wondering if it makes sense to buy eight WD VelociRaptor 250GB drives (WD2500HHTZ), which have almost twice the IOPS of an average 7200RPM SATA drive, or to simply take a bunch of ordinary consumer SATA drives.
Is it possible to predict the difference in IO performance from using faster SATA drives when SSDs sit as a layer between the RAM and the real storage?

Thanks in advance for any feedback :)
 
Keep in mind that with mirrored vdevs, random reads get roughly double the write IOPS. With 4 vdevs, I'd say just stick with decent (7200RPM) SATA drives. I wouldn't drop the money on an L2ARC device yet - if the hit rate is too low, you will see no real benefit; it depends on your working set. I'd rather see you get a CPU+mobo that lets you use DDR3 RAM, since that is much cheaper nowadays, and you'd be better off putting 16GB in and skipping L2ARC until you see what your ARC hit rate is.
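If you want to see what that hit rate actually is once the box is running, something like this works on FreeBSD/FreeNAS - just a quick Python sketch reading the standard kstat.zfs.misc.arcstats sysctl counters; treat the script itself as illustrative rather than a polished tool:

Code:
# Rough ARC hit-rate check for a FreeBSD/FreeNAS box, using the standard
# kstat.zfs.misc.arcstats sysctl counters.
import subprocess

def arcstat(name):
    # 'sysctl -n' prints just the value of the requested OID
    out = subprocess.check_output(
        ["sysctl", "-n", "kstat.zfs.misc.arcstats." + name])
    return int(out.decode().strip())

hits = arcstat("hits")
misses = arcstat("misses")
total = hits + misses
print("ARC hits:   %d" % hits)
print("ARC misses: %d" % misses)
if total:
    print("hit rate:   %.1f%%" % (100.0 * hits / total))

Run it while your automated tests are hammering the pool; if the hit rate is already high with the RAM alone, an L2ARC device won't buy you much.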
 
How much are those drives? You could almost do a RAID 10 with 4 SSDs. Much faster, and then you may not need a ZIL/L2ARC device since you are on RAID 10 SSD. Some of the larger drives are getting very affordable, and you will save power over time with SSDs, so you get a bit back on the TCO side. Also, I'm not even sure you would need RAID 10 in this config. RAID 1 would likely be fine, so if you could get by with 2x 480GB or 512GB drives from a capacity standpoint, you would have way more IOPS, potentially not need an add-in RAID controller, etc.

There is a healthy IOPS difference between 2.5" 10K and 15K drives and 3.5" 7.2K spindles. Also, SAS drives are going to be better even in the 3.5" size.

If it were me:
Go low cost.
USB stick for the storage OS.
RAID 1 with 2x 512GB or 480GB drives.

My build would probably be: Supermicro X9SCM-iiF + Pentium G630 (or better) + USB drive + 2x modern 6.0Gbps SSDs + lots of RAM.
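Rough numbers on the TCO side of that - everything below (prices, wattages, capacities) is a placeholder assumption, just to frame the comparison:

Code:
# Back-of-envelope TCO sketch: 2 big SSDs in RAID 1 vs 8 spinners in RAID 10.
# All prices, wattages and capacities are assumptions for illustration only.

drives = {
    # name: (unit_price_usd, avg_watts, capacity_gb, count)
    "2x 512GB SSD, RAID 1":      (500, 1.0, 512, 2),
    "8x 500GB 7200RPM, RAID 10": (70, 6.0, 500, 8),
}

KWH_PRICE = 0.12              # assumed electricity price, USD per kWh
HOURS_3_YEARS = 3 * 365 * 24  # assume a 3-year service life

for name, (price, watts, cap_gb, count) in drives.items():
    usable_gb = (count // 2) * cap_gb   # mirrored, so half the raw capacity
    hw_cost = price * count
    power_cost = count * watts * HOURS_3_YEARS / 1000.0 * KWH_PRICE
    print("%s: %d GB usable, $%d hardware, ~$%.0f power over 3 years"
          % (name, usable_gb, hw_cost, power_cost))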
 
Thanks for the replies.

Regarding the DDR3 setup: I already have a spare server mobo, so at the beginning I'll see how it goes and then, if necessary, upgrade it.

Regarding SSD RAID 10: as I wrote, 1TB of usable capacity is the minimum. To have 1TB on SSD I would need at least five 256GB SSDs in RAID 5 - doable, but I don't trust cheap MLC, especially in a RAID 5 setup, and buying 5x 256GB SLC for a dev lab is way too pricey :)
Basically, not all VMs need that kind of performance; I've got some SSDs directly in the VM server for the ones that really need it, like testing databases, proxy caches, etc.
So for the price of less than three 256GB MLC SSDs I can get at least eight decent SATA drives: 8x 500GB => ~4TB of raw space => ~2TB of usable space in RAID 10.
I have spent some time experimenting with FreeNAS and an SSD cache, and the results were very promising, so this is the way I think I will go.
The thing that I really don't know is whether faster SATA spindles will improve total performance or whether the difference will be insignificant.

RAID 1 with 2x 512GB or 480GB drives.
I have already tried RAID 1 and it was way too slow for my needs.

The thing is that this storage will hold virtual machines used in the development/testing process, so when there is a source code change it kicks off more than 36 parallel build processes, which generates lots of random IO on at least 4 different virtual machines at a time.
 
If you need more IOPS, stay away from RAID 5 - it has the IOPS of a single drive. You would be better off with RAID 1. You could mirror three 7200RPM 1TB hard drives. Writes would be limited to the IOPS of one drive, but reads would have 3x the IOPS. You could even use three of the VelociRaptor 1TB hard drives for more performance.
 
Hmmm, I dunno if I'd advocate for 3-way mirrors for performance, more for 'bet the business' redundancy. A 2x2 mirror gives you 4x the reads and 2x the writes (as well as 2x the storage) for only one drive more.
 
Hmmm, I dunno if I'd advocate for 3-way mirrors for performance, more for 'bet the business' redundancy. A 2x2 mirror gives you 4x the reads and 2x the writes (as well as 2x the storage) for only one drive more.

A 2x2 mirror would give you the same IOPS as 1 drive in writes and 2x the IOPS in reads.
 
Depends on the data access pattern. Either 1x or 2x, depending on how 'random' the references are. The hedge about randomness also applies to the reads :) Just so we're clear (I realize my prior post was ambiguous), I was comparing IOPS against a single drive, not the 3-way mirror.
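To put numbers on it, here's the usual rule of thumb as a tiny Python sketch - it assumes perfectly random, concurrent I/O and a made-up 100 IOPS per 7200RPM disk, so it's the best case rather than a benchmark:

Code:
# Rough IOPS scaling for the two layouts being compared.
# Assumes fully random, concurrent I/O and ~100 IOPS per disk (an assumption).

DISK_IOPS = 100

def mirror_layout(n_vdevs, mirror_width):
    disks = n_vdevs * mirror_width
    return {
        "disks": disks,
        # every write lands on all disks of one vdev, so writes scale with vdevs
        "write_iops": n_vdevs * DISK_IOPS,
        # a random read can be served by any copy, so reads scale with total disks
        "read_iops": disks * DISK_IOPS,
    }

print("3-way mirror (1 vdev):        ", mirror_layout(n_vdevs=1, mirror_width=3))
print("2x2 striped mirrors (2 vdevs):", mirror_layout(n_vdevs=2, mirror_width=2))

Under those assumptions the 2x2 layout comes out at 4x reads / 2x writes versus a single disk, and the 3-way mirror at 3x reads / 1x writes - which is the earlier point, modulo how random the workload really is.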
 
The thing that I really don't know is whether faster SATA spindles will improve total performance or whether the difference will be insignificant.

I don't think anyone can really tell you this for sure - it depends on your I/O profile.

It could range from an insignificant difference up to around 2x - but then once you add cache drives into the mix as well, it gets even harder to predict/guess.


All you can really say is that actual I/O to/from the data drives will be faster.
Given that writes must access the data drives, you should see faster write performance overall. Reads, however, may not access the data drives that much (other than the initial read to populate the ARC) - many may be serviced from the ARC/L2ARC, which means the data drives may have little effect on overall performance. However, if your I/O profile means few ARC hits, then the data drive performance becomes more important.

All that said, faster drives are almost never slower :)
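If it helps, here's a toy model of exactly that - effective read IOPS as a weighted mix of cache hits and pool hits. Every number in it is an assumption (20k IOPS for the cache, 100 vs 200 IOPS per spindle, 4 mirror vdevs), so read the output as a shape, not a prediction:

Code:
# Toy model: how much do faster spindles help reads at a given cache hit rate?
# All IOPS figures below are assumptions for illustration.

def effective_read_iops(hit_rate, cache_iops, pool_iops):
    # average service time per read = weighted mix of cache and pool service times
    avg_time = hit_rate / cache_iops + (1.0 - hit_rate) / pool_iops
    return 1.0 / avg_time

CACHE_IOPS = 20000       # assumed ARC/L2ARC service rate
SLOW_POOL = 4 * 100      # 4 mirror vdevs of 7200RPM disks
FAST_POOL = 4 * 200      # 4 mirror vdevs of 10K VelociRaptors

for hit_rate in (0.5, 0.9, 0.99):
    slow = effective_read_iops(hit_rate, CACHE_IOPS, SLOW_POOL)
    fast = effective_read_iops(hit_rate, CACHE_IOPS, FAST_POOL)
    print("hit rate %.0f%%: ~%.0f vs ~%.0f read IOPS (%.2fx from faster spindles)"
          % (hit_rate * 100, slow, fast, fast / slow))

The pattern it shows is the real answer: the higher the ARC/L2ARC hit rate, the less spindle speed matters for reads; writes still land on the spindles either way.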
 
All you can really say is that actual I/O to/from the data drives will be faster.
Given that writes must access the data drives, you should see faster write performance overall. Reads, however, may not access the data drives that much (other than the initial read to populate the ARC) - many may be serviced from the ARC/L2ARC, which means the data drives may have little effect on overall performance. However, if your I/O profile means few ARC hits, then the data drive performance becomes more important.

I assume that having L2ARC on SSD will cache the most often read areas of the storage, so this will raise overall performance significantly. However, with regard to writes, I thought that the ZIL would be used as a "queue" for data before it gets to the spindle drives, and that when it does, the writes would be optimized to minimize movement of the mechanical elements of the spindle disks. This is why I was wondering whether having spindles with more IOPS will do any good, or whether what matters most is sequential block write performance, which can be achieved with common desktop drives.

Some time ago I set up my first NAS storage on ZFS with an SSD cache and 3x 2TB SATA drives, and did some tests:

Iometer running in a VM on a local SSD drive (OCZ Vertex 3):
http://cepa.ognisco.com/freenas/freenas_iometer1.png

Iometer running in the same VM with a VMDK drive located on remote NFS storage via a 1Gbit link (FreeNAS, ZFS, RAIDZ, SSD cache):
http://cepa.ognisco.com/freenas/freenas_iometer3.png

The test result is that remote ZFS with cache can do about 1/3 of the local SSD performance.
In real-world use I got from 30 up to 70 MB/s transfer over the gigabit link regardless of the number of running VMs; copying files over the network was 90..110 MB/s, so total performance of the virtual lab was good enough. However, total speed went down when parallel virtual machines were doing operations like fsck, which probably generated lots of random IO.
So this time I would like to do much better :)
 
I assume that having L2ARC on SSD will cache the most often read areas of the storage, so this will raise overall performance significantly.

Essentially, yes (at least for random reads).
Be aware though that L2ARC is really a kind of second level overflow from the ARC - L2ARC is populated from the ARC, not the data drives (there is no data path directly between L2ARC and the pool).
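If it helps to picture the data path, here's a toy two-tier cache in Python - emphatically not how ZFS actually implements the ARC/L2ARC, just the shape of the flow, i.e. the second tier only ever gets filled by evictions from the first:

Code:
from collections import OrderedDict

# Toy illustration of the "L2ARC is fed by ARC evictions" data path.
# Not the real ZFS implementation; names and sizes are arbitrary.

class TwoTierCache:
    def __init__(self, arc_size, l2arc_size):
        self.arc = OrderedDict()      # first tier (RAM)
        self.l2arc = OrderedDict()    # second tier (SSD)
        self.arc_size = arc_size
        self.l2arc_size = l2arc_size

    def read(self, block, read_from_pool):
        if block in self.arc:                 # ARC hit
            self.arc.move_to_end(block)
            return self.arc[block]
        if block in self.l2arc:               # L2ARC hit, promote back to the ARC
            data = self.l2arc.pop(block)
        else:                                 # miss: go to the pool drives
            data = read_from_pool(block)
        self._insert_arc(block, data)
        return data

    def _insert_arc(self, block, data):
        self.arc[block] = data
        self.arc.move_to_end(block)
        while len(self.arc) > self.arc_size:
            # entries evicted from the ARC are what populate the L2ARC;
            # nothing goes from the pool straight into the L2ARC
            old_block, old_data = self.arc.popitem(last=False)
            self.l2arc[old_block] = old_data
            while len(self.l2arc) > self.l2arc_size:
                self.l2arc.popitem(last=False)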


However, with regard to writes, I thought that the ZIL would be used as a "queue" for data before it gets to the spindle drives, and that when it does, the writes would be optimized to minimize movement of the mechanical elements of the spindle disks.

That's really the transaction group/log, not the intent log (ZIL).
The ZIL is for integrity rather than performance - though you can speed up its operation by using a log device faster than the main pool drives (usually an SSD).

All pools have a ZIL - but it's only used for synchronous writes; asynchronous writes bypass it.
The ZIL resides in the pool (on the HDDs) unless you configure one or more log devices, in which case ZFS moves the ZIL there.
Using a fast log device can certainly speed up synchronous writes, but they'll never be as quick as async writes to the same pool, due to the extra steps involved.
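A toy example of why a separate log device helps but doesn't make sync writes free - the latencies are made-up round numbers and it models a single outstanding request at a time:

Code:
# Toy numbers: sync write rate is bounded by how fast the ZIL device can
# commit each record to stable storage. All latencies below are assumptions.

HDD_ZIL_LATENCY_MS = 8.0   # ZIL living on the pool's 7200RPM disks
SSD_ZIL_LATENCY_MS = 0.1   # ZIL on a fast SLC log device (slog)
ASYNC_ACK_MS = 0.05        # async write acknowledged once it's in RAM

def sync_write_iops(zil_latency_ms):
    # each sync write must be on stable storage before it is acknowledged
    return 1000.0 / zil_latency_ms

print("sync writes, ZIL on pool HDDs: ~%d IOPS" % sync_write_iops(HDD_ZIL_LATENCY_MS))
print("sync writes, ZIL on SSD slog:  ~%d IOPS" % sync_write_iops(SSD_ZIL_LATENCY_MS))
print("async writes (RAM ack):        ~%d IOPS" % (1000.0 / ASYNC_ACK_MS))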


This is why I was wondering whether having spindles with more IOPS will do any good, or whether what matters most is sequential block write performance, which can be achieved with common desktop drives.

Hard to answer really as it depends on the nature of the I/O and how effective the ARC/L2ARC is with it. Write performance will likely show some improvement though - at some point the data has to be written out to the pool drives - faster drives will make this part of the operation faster - simple as that! As I mentioned earlier though, reads are harder to predict.

Bottom line is that faster drives will always be faster - by how much is another question though, and whether the increase is worth the cost or not, is yet another!
 