Another new ESXi/ZFS build

edlin303

n00b
Joined
Apr 6, 2011
Messages
43
Updated 4/26:

Twenty days and many hours of reading later, I have changed just about everything in the first post. For anyone interested in a similar build, here is where I stand. Hardware will be ordered this week, since I can't wait any longer and have not been able to get much feedback on my build plan.

Usage Requirements:
  • NFS share for VMs running on 2 other servers. Estimate of 20-30 Windows and Linux VMs running at any time. *Primary purpose
  • Samba/NFS share for desktops and VMs to map to for central shared storage
  • Possibly run ESXi with hardware passthrough to make best use of hardware resources. If not, possibly run MySQL on server too.
  • Possibly host remote backups for people. Will look into options such as ssh/rsync or a Windows equivalent (a rough rsync sketch follows this list).
  • Possibly run SABnzbd for easy Usenet access.
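
If the remote backup idea pans out, my current thinking is plain rsync over ssh into a dedicated dataset, snapshotted on the ZFS side. This is only a sketch; the host, user, dataset, and paths are made up:

    # run on the machine being backed up (host/user/paths are placeholders)
    rsync -az --delete -e ssh /home/user/ backup@zfsbox:/tank/backups/user/

    # on the ZFS box, snapshot the backup dataset for point-in-time copies
    zfs snapshot tank/backups@$(date +%Y%m%d)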

Budget: I have blown my original budget, so all bets are off. I am now shooting for the best choices I can make.
  • Case: SuperMicro SC846TQ-R900B (I would much prefer the SC847A-R1400LPB 32-bay chassis, but people say to avoid using an expander with ZFS.)
  • Motherboard: Supermicro X8DTH-6F
  • CPU: 2 * E5645 Hex-core
  • RAM: 48GB of Crucial DDR3 PC3-10600 Unbuffered, ECC
  • SAS: 2 * AOC-USAS2-L8I + onboard LSI 2008, all passed through if using ESXi
  • HDD: 18+ * 5k3000. Will probably order more before I am done
  • HDD L2ARC/OS: ??? Should have ordered the 64GB V100, but the rebate has now expired.
  • Power Supply: Included in chassis
  • UPS: Existing 10KVA 110/208/240 UPS.


I will get deeper into the details once hardware arrives, but here is what I think I need based on my research:
Planned layout:
  • ESXi, controllers passed through to OpenIndiana VM for ZFS. May test other options when I get that far.
  • 6 drives in RAID10 for use by MySQL VM and other bandwidth-hungry apps
  • 12 drives in some sort of RAIDz. Last night I was worrying about what would happen with a controller failure, so I might do four 3-drive RAIDzs, or two 5-disk RAIDzs with a spare or two if I decide I am not worried about that.
  • Enable dedupe on a specific dataset for hosting Windows vmdks. This will maximize dedupe efficiency without requiring massive amounts of RAM. (Rough zpool/zfs commands are sketched below this list.)
  • Either a 15k drive or an SSD for the OI VM boot disk, and possibly L2ARC if it is an SSD. I don't think I need RAID for that right now, because I can easily back up the VMDK and copy it to a new drive in a hurry.
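
To make sure I actually understand the ZFS side before the hardware shows up, here is roughly what I think the pool setup looks like for the layout above. Pool, dataset, and disk names are placeholders, and I may still change the vdev widths:

    # "RAID10" pool for MySQL and other bandwidth-hungry VMs: striped mirrors
    zpool create fast mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 mirror c1t4d0 c1t5d0

    # bulk pool: one possible layout, four 3-drive RAIDz vdevs striped together
    zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 raidz c2t3d0 c2t4d0 c2t5d0 \
        raidz c3t0d0 c3t1d0 c3t2d0 raidz c3t3d0 c3t4d0 c3t5d0

    # dedupe only on the dataset holding the near-identical Windows vmdks
    zfs create -o dedup=on tank/winvm
    zfs set sharenfs=on tank/winvm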




*Original outdated post*
Hey everyone, I just recently discovered how much good info is on this board when researching a NAS solution. I have pieced together lots of good info from many sources, and thought I would post here to solicit some peer reviews before I buy all the gear. Once I come up with my final config I plan to buy and build ASAP.

Background: (long, so feel free to skip this)
My home "Lab" currently has 2 servers.
  • Dell 2950 with 6 1TB disks. These were built as one 3-disk RAID 5; then I discovered it couldn't be expanded without Windows, so I added a mirror and a hot spare with the last 3. ESXi running on internal USB.
  • Dell R710 with 2 15k 143GB disks. ESXi also on internal USB. 48GB ram, so this is my primary server.

Currently my only shared storage is an old PC with a single 1TB disk running NFS so I can move VMs between servers and because 143GB is way too small for my 710.

I know I need some sort of shared storage to make this work much better, so I started researching. Here was my progression so far:
  • Rackable Systems 16-bay server. I was looking at buying one of these off eBay and just loading it with drives. It seemed like a quick and dirty solution, but I thought the supply had dried up, so I looked elsewhere.
  • NAS appliances. Next I looked at things like the QNAP 7-bay NAS as an option. It seems pretty straightforward, but it doesn't get the best reviews and I was hoping for more drives. I would also prefer redundant power if I can manage it.
  • UnRAID (Lime Technology) appliances. I found UnRAID and saw the pre-built chassis they make for it. 15 drives in a mid-tower looked very tempting, and I thought I would order one, or build something similar and put an OS with ZFS on it. This was when I found out about ZFS dedupe, which I think will help a lot with my VMs.
  • Custom build. I found threads on the UnRAID forums about custom builds, and found the Norco servers. While researching them I found this forum, and all the threads about different build options.

Usage: My "Lab" is pretty well used by home standards. At any given time I have 20-30 VMs running, mostly XP or Ubuntu with a handful of others mixed in. Each could be idle, or very active depending on the day. I feel today I have some significant performance hits from the low disk count. Especially on the 710.

Requirements: I plan to host most if not all of my VMs on my new shared storage. I plan for something like 10 2TB disks right now, leaving capacity to add another 10 bigger disks when I need them and prices adjust. I have a higher-end Cisco gigabit switch, so I plan to do LACP or similar to try to maximize bandwidth.
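
If I end up on OpenIndiana/Solaris for the storage OS, my understanding is that the aggregation side is handled with dladm, something along these lines (NIC names are placeholders, the exact syntax may differ by release, and the Cisco side needs a matching LACP port-channel):

    # bond two NICs into an LACP aggregation and give it an address
    dladm create-aggr -L active -l e1000g0 -l e1000g1 aggr0
    ipadm create-ip aggr0
    ipadm create-addr -T static -a 192.168.1.50/24 aggr0/v4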

Assumptions:
  • I think based on all my reading I am best off with ZFS with DeDupe. I am not sure yet which OS, but OpenIndiana sounds tempting.
  • I think NFS is my best bet for VMWare, though I am open to iSCSI if there is a good reason for it. I like the flexibility of NFS though because I can have one volume for everything, and hopefully share it out CIFS and other ways as needed.
  • Unless I hit a major snag, I hope to virtualize the ZFS host so I can have some low-use VMs on the same server. Maybe put the 2 15k SAS disks in and run vCenter as a VM. If I understand the concept right, I can run ESXi, pass through my SAS controllers, and create a ZFS appliance to share out all my disks via NFS to other (and the same) ESXi servers. Did I get that right? (Rough commands are sketched below.)
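
If I have that concept right, the sharing/mounting part seems to boil down to only a few commands, roughly like this (dataset names and the IP are placeholders, and the exact ESXi syntax depends on the version I end up running):

    # on the ZFS appliance VM
    zfs create tank/vmstore
    zfs set sharenfs=on tank/vmstore
    zfs set sharesmb=on tank/shared     # CIFS for the desktops, if I go that route

    # on each ESXi host, attach the NFS export as a datastore (4.x-style command)
    esxcfg-nas -a -o 192.168.1.50 -s /tank/vmstore vmstore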

Build: Here is my build as of right now. I have stolen bits from other threads, and factored in feedback from some of the pros with most of this.

Budget: I would like to be cost-effective where possible, but I can spend $3,000-$4,000 on this if I need to before I start reconsidering options.
  • Case: Norco RPC-4224
  • Motherboard: Supermicro X8ST3-F
  • RAM: Minimum of 8GB, more likely 16GB or more
  • SAS: 3 * M1015?
  • HDD: 10 * 5k2000 or 5k3000. Price will help determine.
  • HDD L2ARC: As I dig deeper into ZFS configs, I think I will get an SSD for L2ARC. Too early to know what I need, so that will get filled in later. Might use the same for ZIL if it makes sense. (Adding cache/log devices is sketched after this list.)
  • Power Supply: TBD. Haven't researched yet.
  • UPS: Existing 10KVA 110/208/240 UPS.
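
Mostly as a note to myself: from what I have read, the cache and log devices can be added to an existing pool after the fact, so I don't have to commit to specific SSDs up front. Roughly (pool and device names are placeholders):

    # add an SSD as L2ARC (read cache) later
    zpool add tank cache c4t0d0

    # add a ZIL (log) device; mirrored if I care about sync-write safety
    zpool add tank log mirror c4t1d0 c4t2d0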

Questions:
  1. Is the M1015 the "best" choice if I end up doing ESXi passthrough? I have seen people saying it works, but are there any drawbacks to consider? I like the price, going as low as $75 on eBay.
  2. The X8ST3-F has lots of slots and IPMI, both big pluses. But it is not cheap. Is that the best bang for my buck if I don't foresee ever growing beyond 24 drives in this build?
  3. I'm a little confused by the specs on the motherboard. It lists both a SAS and a SATA controller. Does that depend on what type of disk I add? The SAS controller is an LSI, which would work for ESXi passthrough, but apparently the SATA one wouldn't. I don't intend to spend the extra for SAS disks.
  4. Is the built-in expander a reason to avoid the SuperMicro cases? I like that they have redundant power supplies, which would be harder to do in the 4224. But I have seen lots of comments that lead me to think I should avoid expanders.

Out in left field options:
  • I haven't completely ruled out the Rackable Server. For $450 shipped plus disks, it would be an easy solution even if it would be slower than building my own. I could also build in the future and migrate to a new system.
  • I haven't ruled out the QNAP either. But at $2000 I think I can do better on my own.

Closing thoughts: Any and all feedback and criticism is greatly appreciated, especially specific to a NAS to support lots of VMs. I will try to keep this updated as I firm up my plans, and I plan to post pics as soon as I build it. I would love to get parts ordered within a week.
 
May want to repost to Virtualized Computing for more responses.

Thanks. I figured this would be more appropriate since my main concern is the storage aspect. I will look there too to see if there is talk there about building storage appliances for ESX.

Wow, I thought my home lab was overkill

It is a bit more than a lab, in that various VMs result in significant revenue for me. Nothing that is so critical I can't take it down, but good investments in hardware pay for themselves.
 
All things virtual tend to gravitate toward that forum even if it is about hardware. That it's under Bits & Bytes is a bit off-putting for new folks, but it is a good place to post for the most qualified replies to anything that somehow relates to virtual.
 
I'd vote for the ESXi ZFS route, since I just P2V'd my Solaris box into ESXi (on the same box :D)

It's working great for me. So great I haven't felt the need to reboot the host to see if two disks will get picked up by the controller during boot (they're just spares lol).

http://hardforum.com/showthread.php?t=1592163

The Intel SASUC8I is another option (for the standard PCI bracket); not sure what price point you're picking the IBM card up at. You have plenty of boxes, and this can be flashed to 1068e IT firmware rather easily (use option -o). I popped it in my HTPC and did it from Windows.

I'd definitely go for more RAM, or at least go with the highest density you can afford without filling the other banks.

Solaris-derivative ZFS probably won't be as RAM hungry, but IIRC, L2ARC requires a bit more RAM (depends on size), and I think the same goes for dedupe(?). Dedupe is also CPU hungry...

I only allocated 6GB for my OpenSolaris VM, but you seem to have your performance sights aimed higher than my NAS (iSCSI/CIFS).

All in all, if you're going to build with server-grade equipment, I'd definitely go ESXi, since OpenSolaris at least is so resource-light. Otherwise, IMHO, you're wasting CPU cycles, RAM, and space (4U) on a NAS. Since you can afford VC, you have some cool options at your disposal.

PS I didn't see a write-cache RAID card for your host disks?
 
What do you do with your VM's in XP/Ubuntu?
The Ubuntus do lots of stuff. One is a DB server with 24GB RAM and about 60GB of databases, growing quickly. The rest run misc scripts that are not very invasive: low CPU, sometimes a little bursty on RAM, and usually little or no disk IO. The XPs are the same, usually running unattended programs, often things like AutoIt or iMacros to do automated web tasks. Also on the server is a 2k8 DC/Exchange server, which will migrate to 2010 when I build out this disk array.


I will read up on the Intel card. I liked the IBM because I can eBay it for close to $75 and it seemed to get relatively good reviews here. However, if the Intel is a better card, I don't want to be pound foolish on this build. If being cheap will cause me performance or stability issues, it will cost me more than if I just do it right up front.

The same goes for RAM. The price of everything else may steer me on how much ram I get, but if I end up going with a virtualized server to share out all the disks rather than a dedicated appliance, I will definitely want some extra horsepower on the box so I can shuffle stuff around when needed. I probably will leave it empty under normal circumstances, but if I need to have a downtime I could just move a critical VM or two onto that chassis and shut the other two down. With that said, I am going to cram as much RAM as is practical into it.

I'm also going to get as big of a CPU as I can manage since I do expect DeDupe to add a bit of load. Since the VMs don't reboot often, and don't have a lot of IO, I expect it will only be really bad when I need to do patches or snapshots.

For the host disks, do you mean the ones the NAS appliance will be installed to? I guess I hadn't planned that out yet. I was thinking I could reuse my 15k disks from my 710, but those won't be available until after this is built, if even then. Couldn't I use the on-board RAID for whatever disks I end up using?


Is there a way to request moving the thread? I don't want to cross-post (I assume that is as frowned upon here as at other forums).
 
For the host disks, do you mean the ones the NAS appliance will be installed to? I guess I hadn't planned that out yet. I was thinking I could reuse my 15k disks from my 710, but those won't be available until after this is built, if even then. Couldn't I use the on-board RAID for whatever disks I end up using?
Yes, if you choose to run ESXi on the NAS box. Unless you plan on having your VM datastore not located physically on that box.

See my experience in my other box. I thought I could get away with onboard RAID since the VMs aren't really doing that much IO. Boy was I wrong.
 
To clarify, my intent was: load ESXi onto a USB flash drive like in my current system. Then put some sort of disk on the onboard controller for a small datastore to house my NAS appliance. Then load OpenSolaris or similar, pass through the SAS controllers, build a large NFS/ZFS share, and attach to that from this and my other two ESXi servers for their primary datastore.

It sounds like you were doing the same, but just putting the NAS on local disks caused everything to be slow. Am I reading that right? Do you think it was specific to RAID? If so, I could just have 2 independent disks with copies of the NAS VM, and if one dies I would just have to light up the other VM. Or am I thinking about this the wrong way? If you think it will still have performance issues, I can just add the LSI to my list, I guess. I just didn't expect to pay more for system disks for the NAS than for the controllers for all of my other disks combined.
 
I think that would work. Worth a try anyway. Since I had both ESXi and VMs in the same datastore, I'm pretty sure I was being screwed since I had it in a RAID1 array with no write caching available.

The worst that would happen is you have to pick one up later. You could easily build your ZFS OS (any idea what you are going with?) now and move it over (hell, vMotion it if you want) when you build the NAS/ESXi box. If you have to rebuild the NAS/ESXi host, then shut her down, move it off

I guess the only reason to build your ZFS OS first would be if you want to get the initial configuration out of the way (install/CIFS config/etc.) while you wait for your hardware.
 
Since posting this thread I have spent a few hours a day reading the forum, and constantly changing my mind about hardware. Now I have less of an idea than before. If it's allowed, I would be willing to offer a donation to one of you experts to help me come up with a build. I haven't built a PC in many years, and am out of my element. About my only requirement is I would like to get as much gear as possible from Amazon due to having a bunch of GCs to use up.

As mentioned in the OP, I am hoping to use ESXi passthrough to a VM for a ZFS NFS share (for VM disk usage). However, I would also like to be able to run some VMs on the server. This is causing my budget to go through the roof, so I just want to be sure I am not wasting money where I don't need to. Here is what I currently have in mind, subject to change in a few minutes when I read yet another build thread. :(

Mobo: X8DTH-6F - I like the number of PCIe slots for future expansion, I like the max of 192GB RAM, with an easy 48GB for $800. I like the dual procs so I can run some VMs on it.
Ram: Anywhere up to 48GB depending on remaining funds
CPU(s): I know nothing about CPUs. Reading http://ark.intel.com/ProductCollection.aspx?series=47915, it looks like the E5645 gives me great cores/threads per dollar, but I have no idea if that is a good choice.
Disks: I ordered 20 * 5k3000s, and could add more and/or get SSDs for the NAS VM and whatnot. Right now I think maybe 6 in RAID10 for disk-heavy VMs, and the rest in RAIDz2s with dedupe for less-used Windows XP and Linux VMs.
Case: Norco RPC-4224
SAS: I will probably get a couple of 1068-based cards plus the one on board
 
Careful with the ZFS dedup - budget 1GB RAM per TB of deduped data.

Thanks for the warning. Since many of my VMs are nearly identical XP machines, I might opt for 2 or 4 drives mirrored, for a max of 4TB. I expect to give 16GB +/- to the ZFS VM.
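
From what I have read, the dedupe table needs very roughly a few hundred bytes of RAM per unique block, so the real cost depends a lot on average block size (which is why the per-TB rules of thumb vary), and apparently zdb can simulate dedupe on an existing pool before it gets turned on. My rough math, so take it with salt:

    # simulate dedupe on an existing pool and print the would-be dedupe table histogram
    zdb -S tank

    # ballpark RAM for the table: unique blocks x roughly 320 bytes per entry
    # e.g. 1 TB of 128K records is ~8M blocks -> on the order of 2-3 GB of ARC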
 
I'm in a similar position to you, wondering if I should go with a mixed-mode SAN/ESXi host. While it is very helpful to have the storage internal and accessible to VMs directly instead of over fibre/1GbE, I'm still not comfortable with the idea of the actual storage server being a hosted VM. If I lose that VM to corruption or something of the like, or apply a patch that buggers something, I'm up sh!t's creek, sort of deal.

I'm strongly considering going this route though, but hemming and hawing about whether I want to build my own server or get an IBM/HP server. My disk requirements are somewhat smaller than yours: 6x2TB and 4x15k SAS. But it doesn't matter, as it's ideal to front your disks with SSDs. Thing is, everyone is saying to wait until SSDs with a capacitor(?) are released for both ZIL and L2ARC. Not 100% sure if they're out now, such as the OCZ Vertex 3 that just came out, but at $260/120GB it's not a cheap option.

Currently my SAN running Openfiler is in a separate server, and I'm trying to reduce the heat and noise in my office lab. That server has a noise rating of 60dB, which is damned loud. Plus it adds one extra server to my equation. A small lab should have at least two ESXi hosts for HA/DRS and vCenter. I know mine will continue to need at least two hosts because of vCloud Director. I've currently got three hosts and one SAN server.

What makes me nervous is that it is repeatedly stated that Intel VT-d passthrough of storage controllers is not a supported VMware function. While it works... I don't get warm and fuzzies. *BTW, you should make sure that board and CPU support VT-d, since that's what you need for passthrough. Also double-check the board's capacity for unbuffered DIMMs. Usually the 192GB numbers are for registered DIMMs and something like 48GB for unregistered in that same server. (This is for HP/Dell/IBM gear; I can't say for certain about Supermicro.)

I see people regularly stating that the M1015 can't be flashed into IT mode, which is what you'd want for ZFS... I think because of timeouts. Though I thought I saw somewhere that it was possible to flash the controller; I'll have to look. I just bought four of these cards.
 
To me, what's nice about passthrough is if my ESXi or the host VM gets toasted, I should be able to load SE11 or w/e I go with onto the bare metal and import the physical disks. I don't worry so much about it being supported, although if it could go really wrong and start corrupting data I guess I could be in really bad shape.

The SuperMicro has VT-d based on my reading, but I would of course quadruple check before pulling the trigger. I am probably looking at 1068s instead of the IBMs now, because they seem more popular. I was thinking IBM to save $, but I am well beyond penny pinching at this point.

I am with you on the name brand server. My previous two Dells were outlet deals, and I am tempted to just do the same again and throw an external enclosure on it. I don't think I can get as good of a deal as last time ($3k for my 48gb dual proc r710), but with how high my price is getting already on this build I might be able to buy an outlet deal with similar specs for close to the same and save the work of building it. I was just hoping to get it all in 4U so I didn't waste rack space.
 
Yet another example of my cluelessness: I just figured out the 56xx procs do not do VT-d after all. Back to the drawing board again. I kind of liked the 6-core idea.
 
That link I posted addresses that issue. Since VT-d is not a feature of the CPU it is not listed explicitly on the 5600 CPUs.

VT-d is not a feature of the processor itself but rather a feature of the chipset. As long as the chipset/BIOS combination supports VT-d and the processor supports VT, there will be VT-d support.

Viper GTS
 
Hrm, I wasn't aware you could import the disks that easily. So if the VM gets fubar'd, I'd have a way to access my data?

Can an onboard SAS controller (*quasi-onboard, because it's not part of the chipset but rather a plug-in component or similar) be passed through? Hrm. 'Cuz if not, that complicates matters: you end up buying an HBA and either connecting the backplane to it, or ripping out the backplane and cabling the drives to the HBA.

Part of the reason I'm leery about a self-build is that currently, when something fails with my servers, I pick up the phone and the next day I have a part or a service guy with parts. In the past the problem has, yes, not always been identified or fixed immediately, but it's not like a self-build, where a motherboard RMA is... 2 months.

Aren't there issues with growing ZFS?

I noticed in another thread about Fibre/InfiniBand the guy mentioned he was writing to a RAM drive. Can you front your storage with a RAM drive, then tier to an SSD or the SATA disks after? I'm asking since I know you've figured you'll be fronting your disks with SSDs. Cheers!
 
For most of your questions I have to defer to others, as this will be my first foray into ZFS. I figured I would pick out the hardware, then start messing with the best way to assemble it and the best OS for it. However, here is my understanding of your points:

RE: importing. It is my understanding that with ZFS, if I need to move the disks to a new system, I can import the old pools as long as the new OS sees all of the physical disks. The nice thing is that it is not dependent on a specific controller or anything.
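
In command terms, as far as I can tell it is just an export on the old box and an import on the new one, roughly (pool name is a placeholder):

    zpool export tank       # clean handoff if the old system is still running
    zpool import            # on the new system: scan attached disks, list importable pools
    zpool import -f tank    # -f forces it if the pool was never cleanly exported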

I am not sure what you are asking about the onboard controller. It would probably get a better answer in the virtualization forum, though. I have seen, but not read, threads there about passing through HBAs that ESXi doesn't support; but if ESXi does support it and your CPU and mobo support VT-d, I think it will probably work.

For growing ZFS, again, others can answer better. You might want to start a thread with your questions. I was under the impression that it is pretty easy to upgrade disks if you design your setup right (e.g. three 5-disk RAIDz1s striped instead of one 15-disk RAIDz3 allows you to upgrade one RAIDz at a time; the sketch below shows what I mean).
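
e.g., as far as I can tell, the two ways to grow are adding a whole new vdev to the stripe, or swapping every disk in one vdev for bigger ones and letting it expand (device names are placeholders):

    # option 1: add capacity by striping in another RAIDz vdev
    zpool add tank raidz c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0

    # option 2: grow one vdev in place by replacing its disks one at a time
    zpool replace tank c2t0d0 c6t0d0    # repeat per disk, waiting for each resilver
    zpool set autoexpand=on tank        # use the extra space once all disks are swapped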
 
Will do, cheers. Best of luck with your implementation!
 
Updated OP. My build list is approaching $6k instead of my original $2-3k estimate, so I am going to order ASAP to avoid any further cost increases. Any last minute feedback?
 