Your home ESX server lab hardware specs?

Zarathustra[H];1039860015 said:
Time for another update for me:

The new specs are:

ESXi 5.1
AMD FX-8120
Gigabyte GA-990FXA-UD3
32GB DDR3 RAM
Matrox Millennium 2MB (limited console use only)
32GB OCZ Octane SSD (ESXi Boot Drive + datastore for pfSense and FreeNAS installs)
250GB WD Blue SATA drive (extra datastore, only because I had it kicking around)
Intel EXPI9402PT Dual Port Gigabit Copper NIC, Direct I/O Forwarded to pfSense Guest
Intel EXPI9402PT Dual Port Gigabit Copper NIC, Direct I/O Forwarded to FreeNAS Guest, set up as LAGG interface.
Broadcom NetXtreme gigabit NIC Direct I/O Forwarded to Ubuntu Server 12.04 Guest
IBM M1015 Storage controller (Direct I/O forwarded to FreeNAS)
4 WD 3TB Green drives and 2 WD 2TB Green drives in RAIDz2 mode, for a total of 12TB (18 once I replace the two 2TB drives with 3TB+ drives)
Onboard Ethernet interface is used only for the VMware client.

Guests: (thus far)
- pfSense (2 cores, 1GB RAM) My router and firewall
- Headless Ubuntu Server 12.04 (2 cores, 4GB RAM) General Linux server for rtorrent/wget jobs, and for running the Ubiquiti UniFi controller software.
- FreeNAS (4 cores, 25GB RAM) Running the RAIDz array, sharing it to the LAN via the dual LAGG interface, and internally to Ubuntu via a 10Gig VMXnet3 interface.

I am considering moving away from a RAID card to a ZFS-based system for more flexibility. What kind of read/write performance do you get at the local disk level and at the CIFS level? These are the speeds I am currently seeing locally:
Code:
1000000+0 records in
1000000+0 records out
8192000000 bytes (8.2 GB) copied, 7.80287 s, 1.0 GB/s

real	0m10.838s
user	0m0.120s
sys	0m6.536s

I get about 110MB/s over the network via CIFS today. Thanks in advance for any insight you can provide!
 
Wow, that is pretty fast! What type of drives do you have, and how many? What kind of RAID configuration? I have to say, it almost seems a little suspicious that the performance is exactly 1.0 GB/s, as if something else is limiting it rather than the array speed itself.

I have my 6 WD green drives (which are mismatched, 4x3 TB + 2x2TB, which slows the array down, and makes ZFS treat them all as 2TB drives) configured in RAIDz2, which is the ZFS equivalent of RAID6.

In local speed tests I am seeing ~480MB/s, and now that I have fixed my network configuration to go through a real Intel NIC (instead of the way it was accidentally configured before, through a vswitch to the onboard Realtek NIC), I am seeing SMB speeds over 100MB/s.

The upside to ZFS is that it is controller independent, and more reliable than hardware RAID due to its checksumming design. If you had a problem, you could easily move all your drives to any machine with any controller and just import the array. The downside is that all of this uses lots of RAM and CPU, so you have to make sure you have a powerful enough server, or you'll see slowdowns.
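For example, moving a pool to new hardware is basically just an export and import (a rough sketch, assuming a pool named "tank"):

Code:
zpool export tank      # cleanly release the pool on the old box
zpool import           # on the new box, scan attached disks for importable pools
zpool import tank      # import it, regardless of which controller the disks now hang off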

The truth is the numbers vary greatly depending on how you issue your dd command.

If you give me the exact dd line you used for the above, I can run it on mine and give you the results if you'd like.

I can't remember the exact block size and count I used for my tests, as it's been a while, but the key for me was to make the test HUGE, so that cached reads and writes are relatively small compared to the actual hard drive reads and writes, and I am therefore measuring actual drive performance, not RAM performance.

I would issue the commands something like this:

Code:
dd if=/dev/zero of=bench_file bs=2048k count=50k  (for writes)

and

dd if=bench_file of=/dev/null bs=2048k count=50k (for reads)

That way you are doing a write and a read of 100GB respectively, which should be large enough to minimize the impact of caches. The block size will also be large enough to reduce block size mismatches with the array.
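One caveat with /dev/zero tests on ZFS: if compression is enabled on the dataset, the zeros compress down to almost nothing and the write numbers get badly inflated. Assuming a dataset named tank/bench (placeholder name), you can check and disable it for the test:

Code:
zfs get compression tank/bench
zfs set compression=off tank/bench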
 
Zarathustra[H];1039862576 said:
Wow, that is pretty fast! What type of drives do you have, and how many? What kind of RAID configuration? I have to say, it almost seems a little suspicious that the performance is exactly 1.0 GB/s, as if something else is limiting it rather than the array speed itself.

Code:
dd if=/dev/zero of=bench_file bs=2048k count=50k  (for writes)

and

dd if=bench_file of=/dev/null bs=2048k count=50k (for reads)

That way you are doing a write and a read of 100GB respectively, which should be large enough to minimize the impact of caches. The block size will also be large enough to reduce block size mismatches with the array.

My layout is 20x ST32000641AS in a RAID6 configuration, with the array presented to Linux and carved up with LVM, with a simple EXT4 filesystem on each logical volume. Here are the commands for reference:

Code:
(05-08-2013 02:08 PM)-> dd if=/dev/zero of=bench_file bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 130.431 s, 823 MB/s

Code:
(05-08-2013 02:10 PM)-> dd if=bench_file of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB) copied, 94.3281 s, 1.1 GB/s

Your numbers look good. The configuration I am considering is two RAIDz2 vdevs in a single pool, each vdev being 10+2(p) drives, most likely STBD4000400s, on two LSI SAS 9201-16i controllers. That would let me use all 24 bays in an RPC-4224 and still have 8 ports left over for an external unit if growth is required.
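If I go that route, creating the pool would presumably look something along these lines (device names are just placeholders, FreeBSD style):

Code:
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11 \
  raidz2 da12 da13 da14 da15 da16 da17 da18 da19 da20 da21 da22 da23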
 
Nice, it's actually sustained at those speeds, pretty impressive. The 20 drives probably explain it.

What you are trying to do makes sense to me. The usual recommendation is not to go much beyond 12 drives per vdev, but pooling two of them like that seems to be the standard recommendation for large systems, so that should be a good choice!

Unfortunately I have very little experience with large arrays like yours, so I can't speak to performance concerns much, beyond knowing that CPU use goes up significantly the more parity drives you have, and even more if you opt for encryption. (Having an AES-NI capable CPU seems to help a lot, but it's not enough to negate the significant CPU load from encryption.) RAM demands are going to be pretty high too. The recommendation for ZFS tends to be about 1GB of RAM per TB of drive space, and the more the better. ZFS LOVES RAM.

With ESXi things get tricky, especially if you intend to run FreeNAS as a VM AND use it to serve as a datastore, as my understanding is that ESXi typically expects its datastores to be available on boot.

Also, make sure your LSI controllers are supported in FreeBSD and can be direct forwarded. ZFS should ideally be working against the bare-metal disks, not through VMDKs or RDM'd drives.

I found this hardware wiki entry helpful when I first started with FreeNAS and ZFS.

Also keep in mind that the people over in the FreeNAS forums are a little prejudiced against virtualization, as evidenced in this post. The take seems to be something along the lines of "someone doing something stupid with virtualization could break something, resulting in data loss, thus we are against virtualization in general". I would deem it safe, as long as you are using a passed-through controller that is known to work well that way. The other points he makes are probably better taken as cautions rather than as support for his over-the-top "don't virtualize" stance.

You may need to make a tweak to loader.conf in order to allow the system to boot with a passed through controller. See here.
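If I remember right, the tweak was something along these lines in /boot/loader.conf (disabling MSI/MSI-X for the passed-through controller), but double-check the linked post before relying on it:

Code:
hw.pci.enable_msi="0"
hw.pci.enable_msix="0"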
 
I plan to run an internal datastore drive using mirrored SSD since I only plan to run a Linux VM (mounting ZFS storage), a Windows VM, and a few smaller VMs to play with as I need them. I was shooting for 16GB of ram out of the gate in the FreeNAS VM but plan to put in 32GB with room for 32GB more down the road if required.
 
That is pretty similar to what I do.

I use the FreeNAS VM both for SMB/CIFS LAN storage (with a passed through dual gigabit Intel PRO PT NIC, using link aggregation) and for storage for my Linux box (via internal vswitch and VMXnet3 10gig interface).
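On the Ubuntu side, consuming a share over that internal interface is just a one-liner in /etc/fstab; something like this if you go the NFS route (the IP and export path below are made-up placeholders):

Code:
# /etc/fstab on the Ubuntu guest -- IP and export path are placeholders
10.10.10.2:/mnt/tank/data   /mnt/nas   nfs   rw,hard   0   0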

My array is - of course - much smaller :p

My only regrets are:

1.) That I didn't pick a motherboard capable of using more than 32GB of RAM
2.) That I didn't buy ECC Ram off the bat.
 
Zarathustra[H];1039863401 said:
My only regrets are:

1.) That I didn't pick a motherboard capable of using more than 32GB of RAM
2.) That I didn't buy ECC Ram off the bat.

I just read online that the free version of ESXi only supports up to 32GB of RAM... so maybe you aren't so bad off in that scenario. As for ECC, yes, I am going that route off the bat.
 
Interesting. I did not realize that. That is unfortunate, but I guess they have to do something to prevent organizations from relying on the free version and not licensing it. Even as it is I suppose there is probably a lot of that going on...

I kind of wish they had a paid home/small office license, in addition to their super expensive major corporate licenses. I'd pay a consumer type software price to unlock all of the single server features ESXi provides. I don't need all the multi-server stuff. But I guess those of us who actually use ESXi for "production" home use are still relatively rare.
 
Finally had time to bring the private cloud lab online....working great so far. Still have a fair bit to go.

nexentace.png


vsphereweb.png


vcloud.png
 
Hardware:

Rosewill 4U RSV-L4500 rackmount server case (excellent BTW)
Supermicro X8DTE-F Xeon Motherboard (MP)
*onboard IPMI, 2 other 1GbE NICs, and 6 SATA2 ports.
2 Xeon L5520 Quad-Core HyperThreaded CPUs
6 4GB Samsung 1066MHz RDIMMs for 24GB total (6 slots left open)
Mellanox DDR 2 port PCI-E infiniband card
Brocade 1020 PCI-E 8x 10GbE CNA (2 SFP+ ports)
2 port Intel PCI-E 4x 1GbE (2x RJ45)
LSI 1068 (of some sort) PCI-E 8x 8 port SAS controller card (3Gb/s)
IBM M1015 PCI-E 8x 8 port SAS controller (6Gb/s)
8 10,000RPM Raptor 2.5" 300GB SATA2 drives attached to SAS card
2 5.25" to 4 bay 2.5" hotswap IcyDock bay (2 of them) (8 10k 2.5" drives total)
1 5.25" to 4 bay 2.5" hotswap dock (offbrand) for SSDs and ESXi install
2 3x5.25" bay to 4x3.5" bay drive cages holding 4 2TB 5400rpm drives each (8 2TB drives)
8 2TB 3.5" drives, 2 Samsung's, 2 Seagates, 4 WDs, all 5400rpm drives

Temporary stuff:
40GB 2.5" laptop drive holding the ESXi install and a small ISO repository
750GB 3.5" drive as the temporary VM datastore until they can be moved to the Raptor array.

Software:

ESXi 4.1 U3 (gave up on 5.X because of infiniband driver problems)
Open Indiana + Napp-it for SAN/NAS duty
8 10k drives are running a RAID10 setup and getting 650MB/s Read and 450MB/s Write
Waiting on M1015 controller to arrive before I install the 8 2TB drives (also in a RAID10)
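For reference, a RAID10-style layout under ZFS/napp-it is just a stripe of mirrors; a rough sketch with placeholder Solaris device names:

Code:
zpool create fastpool \
  mirror c1t0d0 c1t1d0 \
  mirror c1t2d0 c1t3d0 \
  mirror c1t4d0 c1t5d0 \
  mirror c1t6d0 c1t7d0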

-----------------------------------------------

Rack itself: Compaq 8000 42u? (it's not the white model and not the newer 9000 series)

Other Hardware in rack:

Backup server:
Dell PowerEdge 850 1u: Pentium 4, 2GB RAM, Windows 2008 R2
*DDR infiniband card (dual port) so it gets access to the 10Gb network
*LVD SCSI card to control Dell 114T powervault

Dell 3u 114T Powervault with a single LTO-3 drive
Dell 1u Powerconnect 2716 16 port managed 1GbE switch (best switch ever IMHO)
Dlink DXS-3227 1u 24 port GbE managed switch w/10GbE fiber XFP port
APC 1500 2u battery backup unit with (mostly) new batteries and management card
2nd 4u Rosewill RSV-R4000 (2" shorter, older model) case: empty (for now)
1U 4 port KVM
1U keyboard/mouse pull-out tray
2U (I think) fold-out Compaq rackmount monitor (going bad, sadly)

The rack itself, KVM, monitor and KB&M tray were freebies from a previous job. :)

------------------------------------------------

Details on ESXi 4u server:

So in the end I should have a 10,000RPM 1.2TB RAID10 pool and an 8TB RAID10 pool, with 10Gb connectivity between the VMs and my client machines (DDR infiniband cards all around). One of my machines *might* be too far away for a CX4 cable. I'll attempt a point-to-point 10GbE connection with the Brocade 1020 CNA(s) if I need to as well.

Only problem I have currently is that I obviously can't put the OI VM on its own RAID array, so it needs to go on another disk. Stupid me made it a 40GB VM and put it on my (current) datastore, which is a 750GB 3.5" SATA drive. Problem is there isn't room for the 8 2TB drives *and* that 750GB drive. I'll probably just put the 3.5" 750GB drive in a SATA enclosure outside the case and use a SATA-to-eSATA adapter plate until I can find/afford a 2.5" drive bigger than the 40GB one I have now to hold it (since I have that 4 bay 2.5" thing).

------------------------

I'd also like to move my HTPC/Steambox down into the rack and "pump" the video/USB up/down using cat6 and converters. I've tested the converters and they work awesome. The current problem is heat dump. I'm not sure the room gets the airflow I need to be able to put another PC in there, along with a GTX460 inside said server. I'll probably still move the HTPC to the basement, just not to the "rack room" just yet.
 
Zarathustra[H];1039865139 said:
Interesting. I did not realize that. That is unfortunate, but I guess they have to do something to prevent organizations from relying on the free version and not licensing it. Even as it is I suppose there is probably a lot of that going on...

I kind of wish they had a paid home/small office license, in addition to their super expensive major corporate licenses. I'd pay a consumer type software price to unlock all of the single server features ESXi provides. I don't need all the multi-server stuff. But I guess those of us who actually use ESXi for "production" home use are still relatively rare.

I wish I could link two of them cheaply at home, so I could mess with hardware etc. without taking my VMs down.


Almost has me switching to Hyper-V.
 
Just interested as to why people recommend ECC ram on here?

I know it has error correction but realistically how often is this a problem?
Can someone provide some solid facts / figures?
 
OK, so I'm adding some workloads regularly now. Working on some nested vApps to set up SRM. I'm thinking of setting up a UBERVNX with the CORE_Inception vApp to present some storage for replication.

Right now I have the following workloads:
Horizon View
Horizon Workspace
Cisco UCS Platform Emulator
vCenter Operations Manager
vInception- To be the Secondary Cluster for SRM lab..etc.

Will add the following:
WordPress hosting: I would like to migrate my blog to my cloud instead of hosting it at WordPress; that will give me more functionality.
An IP PBX of some sort; maybe I'll look at 3CX again.
Some sort of monitoring utility outside of vC OPS that can monitor all my devices (need some advice on this).
Probably some Liquidware Labs stuff like ProU, maybe StratUX.


Going well so far, though I think one of the frustrating things about working with vCloud Director is that it's not .OVA friendly, only .OVF... which sucks.

I need a lot more storage, so I'm going to beef up the Nexenta appliance in the coming months. Right now I'm running an SDRS storage cluster between some slower storage on my Iomega PX4 and some SSD-cached storage on my Nexenta... it works fairly well. I may go all SSD with the Nexenta so I have a true Tier 1... not sure yet; prices need to come down on larger SSDs. I also need to add memory, so I'm going to migrate off of 4GB DIMMs and go 8GB to double my capacity.

I also have two 10GbE cards in the vCloud hosts; I just need to get some TwinAx cables. I'll be running those for vMotion until I can get a switch with 10GbE uplinks, etc.

corecloud.png
 
Just interested as to why people recommend ECC ram on here?

I know it has error correction but realistically how often is this a problem?
Can someone provide some solid facts / figures?

Here is the layman's version, if you want more technical details, the white paper is here.

ECC is really not a big deal on video cards and on desktop systems. A flipped bit may only result in something silly like a pixel slightly off in color or something like that.

When it comes to a VMWare server computing hashes for your ZFS array's parity data, it may be of higher consequence.
 
Zarathustra[H];1039878192 said:
Here is the layman's version, if you want more technical details, the white paper is here.

ECC is really not a big deal on video cards and on desktop systems. A flipped bit may only result in something silly like a pixel slightly off in color or something like that.

When it comes to a VMWare server computing hashes for your ZFS array's parity data, it may be of higher consequence.

Thanks for that, but to be honest I can only see data from 2009 and before. Plus, they were testing with only DDR2 RAM, not DDR3.

On average, about one in three Google servers experienced a correctable memory error each year and one in a hundred an uncorrectable error, an event that typically causes a crash.
And looking at those stats right there I'd say I'm okay at the moment.
 
An interesting point in the article is how the age of the sticks also impacts reliability. Yes, memory sticks grow "old".
 
Just wondering, where do you people get more than your standard ESXi licence?

Clients at my work are either upgrading or migrating to Hyper-V/open source, so I've asked for, and received, their licenses.

A little bit of a grey area, but still, at least I'm not pirating it.
 
Pff, what's the difference? You might tell your conscience that you're not pirating, but you're not paying either way and potentially evading taxes by using a company-paid license.

Just pirate if you're not earning any cash with it and be done with it. I wouldn't pay hundreds just because the 32G limit is in the way if I'm just toying around with it at home. Not every pirate install is a lost sale and it's free advertising because you might later buy it in a new install at work instead of going with a competitor.
 
Hi, new user here. This looked like a pretty good thread to drop this into, but if not, feel free to publicly shame me as long as you steer me elsewhere, if you don't mind. :)

I just purchased and built the following h/w, intending to run a fully virtualized all-in-one home server environment: pfSense router, Untangle UTM, Amahi file server, etc. (for starters)

- AMD Opteron 6376 CPU (16 Cores @ 2.3 GHz)
- SuperMicro H8SGL-F-O Motherboard
- Samsung 64GB Registered ECC RAM
- Adaptec 7805 PCIe SAS RAID Controller
- 4x 3TB Seagate SAS 6.0Gb/s Drives (to run in RAID10, or ZFS in HBA mode)
- SuperMicro SC743 Server Case w/SAS backplane

Prior to purchasing all of this hardware, I had tested KVM and ESXi on different hardware, making sure everything I wanted to do worked (with the exception of the SAS controller/RAID array, since I didn't have any extra of those parts).

Everything worked great in both ESXi and KVM, but I ended up settling on ESXi, so I made all of my purchasing decisions based on hardware that is confirmed compatible in the ESXi 5.1 HCL.

Anyway, things generally "work" with one huge exception -- I CANNOT for the LIFE OF ME get PCI passthrough to function with the Adaptec card, and this is unfortunately a show-stopper. I decided that I would like to pass-through the SAS controller so performance doesn't suffer and so I can see the full ~6TB RAID volume on the file server, instead of dealing with the 2TB limitation within the datastores.

ESXi lets me select the card for passthrough, and the guest OSes definitely see it (I've tried Ubuntu 12.04, Fedora 14, Windows 2008 R2, and even OpenIndiana). All of them choke in some manner when it comes time to load the AAC drivers.

Ubuntu 12.04 - when booting w/passthrough enabled, it will refuse to load the GUI and the ESXi console eventually just completely "dies". The [drm] (vid card?) process causes udev to almost literally vomit and repeatedly time out, and the OS never fully boots, but it doesn't "hang" -- I can get in through SSH if I enable it. The aacraid driver (I've tried multiple versions, DKMS and non) never seems to see the RAID volume. Also tried HBA mode/individual drives -- no dice.

Fedora 14 - OS hangs during boot-up at the white "Fedora" progress bar, near the end.

Windows 2008 R2 - As soon as you update the driver (yellow exclamation mark in Device Manager) with the Adaptec drivers, the VM completely stops (no blue screen or anything, just game over/crash with an error logged in ESXi that I no longer have handy at the moment)

OpenIndiana - Boot-up hangs as soon as the kernel attempts to load the aac driver... this was my last-ditch effort because I was going to virtualize a ZFS server to get around the 2TB limit... but alas, OpenIndiana craps out after loading the AACRAID driver as well.

That all being said...if I disable pass-through and just install the ESXi drivers, I am able to fully see and partition and build datastores/VMFS files on the RAID array just fine in ESXi... but I don't like that 2TB limit for my file server and would like the best performance possible.

I've tried posting to the actual VMWare ESXi community forums, but literally nobody has responded. I would be more than willing to pay for ESXi (to get support), but not if this doesn't work... hence I'm in a catch-22.

For kicks, I downloaded and installed ProxMox 3.0 and the PCI passthrough worked flawlessly. Well, I shouldn't say that. Performance was completely terrible to the RAID array -- I was only able to achieve 30-40Mbytes/sec in reads and writes, when in my current (non-virtualized) setup I am able to saturate GigE @ 100-110Mbytes/sec and that is on an Adaptec 5805 with 3.0Gb/s SAS drives in RAID5. :( I would love to tell you the performance I get using PCI passthrough on ESXi, but I can't make it that far.

Any ideas or suggestions? I would really rather not be forced to use ProxMox, I think KVM is a bit kludge-y and that ESXi is a lot more polished and elegant. Replacing the hardware is not really an option but if I can't make either solution work, I suppose I can turn my old (non-virt) file server into a ZFS/iSCSI host and point ESXi to that. I was really hoping for an all-in-one though, to conserve power and reduce noise.
 
So I believe I've "sort of" resolved my issue! I can at least get the VMs to boot and see the Adaptec 7805 in ESXi, but I am still doing some preliminary benchmarking in terms of performance (more to come on that).

The trick? Editing the .vmx file manually and adding the following line for the PCIe card being passed-through:

pciPassthru0.msiEnabled = "FALSE"
 
Why on earth are you using a GUI on a server? :p
 
Completely different adapter, but there may be some parallels.

When I was passing through my IBM M1015 (as I understand a rebranded LSI SAS controller) to FreeNAS, I had to manually edit boot scripts in order to make it work.

I eventually found this solution:
http://hardforum.com/showthread.php?t=1678655

I wonder if these issues are related.
 
Zarathustra[H];1039917777 said:
Why on earth are you using a GUI on a server? :p

That is a good question; end-game, there won't be any GUIs running on the Linux/UNIX boxes. However, even without loading X at start-up, the boot process was choking on the "drm" module (some video chipset driver-related thing, not the DRM we all hate), which caused udev to hang indefinitely. So not loading the GUI was more a side effect of the boot process not being able to complete than a problem I actually cared about. :)
 
Zarathustra[H];1039917786 said:
Completely different adapter, but there may be some parallels.

When I was passing through my IBM M1015 (as I understand a rebranded LSI SAS controller) to FreeNAS, I had to manually edit boot scripts in order to make it work.

I eventually found this solution:
http://hardforum.com/showthread.php?t=1678655

I wonder if these issues are related.

Wow, yeah that is very interesting... I think it may be close to the same problem, but in my case, the fix was done from the ESXi side (rather than the Guest OS's side). I am not aware of a way to disable the MSI/MSI-X stuff within a Linux kernel; probably some obscure kernel option you pass into GRUB?
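(Presumably it would be something like the pci=nomsi kernel parameter; untested on my end, but on Ubuntu it would go roughly like this:)

Code:
# /etc/default/grub  (untested; then run "sudo update-grub" and reboot)
GRUB_CMDLINE_LINUX_DEFAULT="quiet pci=nomsi"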

My only concern/worry is the potential performance hit by skirting around this MSI/MSI-X option... my numbers aren't lookin' so hot thus far, but I am going to try some more experimentation w/different RAID types and RDM passthrough (vs. the entire controller):

(Control group) 10GB bonnie++ test to my RAID5 xfs-fs array on my pre-existing, non-virtualized file server running SAS 3.0Gb/s drives:

Block Read: 312 MBytes/sec w/435ms latency
Block Write: 377 MBytes/sec w/113ms latency

(Reference point) 10GB bonnie++ test to my VMFS root partition on the new server (the SSD where the datastore resides -- this doesn't matter that much, data won't be stored here):

Block Read: 74 MBytes/sec w/1268ms latency
Block Write: 306MBytes/sec w/60403us latency (!!!)

(Test group) 10GB bonnie++ test to individual drive in HBA mode using PCI passthrough:

Block Read: 108 MBytes/sec w/35603us latency (meh!)
Block Write: 118 MBytes/sec w/313ms latency (meh!)

(Test group) 10GB bonnie++ test to RAID10 array using PCI passthrough:

Block Read: 280 MBytes/sec w/31115us latency (more like it...)
Block Write: 522 MBytes/sec w/91621us latency (yeah baby!)

(Test group) 10GB bonnie++ test to RAID5 array using PCI passthrough*:

* NOTE - RAID5 array is still Building/Verifying, will re-run test after that is done.

Block Read: 237 MBytes/sec w/26003us latency
Block Write: 143 MBytes/sec w/237ms latency

My goal was to hopefully run the disks in HBA mode (seems kind of dumb considering I bought a RAID controller, I guess) and just use ZFS. But if I am passing the entire controller through, that may not be ideal because I plan on installing additional drives to use for the datastores instead of filling my SSD w/VMs and VM data.

We'll see what happens; so far RAID 10 is king, but I plan on experimenting with the "local RDM passthrough hack" for ESXi that I found on here while looking for help with the original issue.
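(For anyone following along, my understanding is that the hack boils down to creating a raw device mapping for the local disk with vmkfstools on the ESXi host and attaching the resulting .vmdk to the VM; the paths below are placeholders:)

Code:
# from the ESXi shell; -z creates a physical (pass-through) RDM -- disk ID and datastore path are placeholders
vmkfstools -z /vmfs/devices/disks/<local_disk_id> /vmfs/volumes/datastore1/fileserver/local_rdm.vmdk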

I will re-test RAID 5 after the build is complete, then switch over to the RDM hack and re-test everything.

I am a little bit dismayed that the numbers are quite a bit less than the original server (except for RAID10 writes), which is running older/inferior hardware, although it is not virtualized -- Fedora 14 server on bare metal... but, I guess that may be the computing cost of virtualization; we'll see.

I would deem the RAID 10 numbers acceptable so far though (as long as I can saturate a GigE link and then-some).
 
Hmm, I don't use hardware RAID, preferring ZFS, but I don't think I am having performance issues after disabling MSI/MSI-X.


A quick and dirty speed test:
Code:
~> dd if=/dev/zero of=bench_file bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 328.846266 secs (326517870 bytes/sec)

~> dd if=bench_file of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 229.066177 secs (468747432 bytes/sec)

So, 447MB/s reads, and 311MB/s writes on six WD Green SATA drives in RAIDz2 (ZFS RAID6 equivalent) doesn't seem too bad to me.
 
My goal was to hopefully run the disks in HBA mode (seems kind of dumb considering I bought a RAID controller, I guess) and just use ZFS. But if I am passing the entire controller through, that may not be ideal because I plan on installing additional drives to use for the datastores instead of filling my SSD w/VMs and VM data.

This is what I did.

I flashed my M1015 SAS RAID controller with the IT firmware, which turns it into a huge JBOD controller, and then used a FreeNAS VM for ZFS. It leans MUCH harder on the CPU and RAM, but I like the universal nature of it compared to proprietary hardware RAID solutions, and it's actually a little bit more secure.
 
So my RAID5 array finished building and the results are pretty impressive... perhaps too impressive? (I thought RAID10 was supposed to be faster than RAID5 especially for writes?)

405 MBytes/sec (reading)
397 MBytes/sec (writing)

PS: All my benchmarks referenced above in this thread had Block Read/Block Write completely reversed (I misread the Bonnie++ output).

Is there a reason you went with FreeNAS over OpenIndiana+napp-it? I am exploring the idea of having a separate ZFS VM/OS, but not sure I want the extra CPU overhead and software things that can break. But if the performance is better or I am able to better-utilize my space, I might do it.
 
Is there a reason you went with FreeNAS over OpenIndiana+napp-it? I am exploring the idea of having a separate ZFS VM/OS, but not sure I want the extra CPU overhead and software things that can break. But if the performance is better or I am able to better-utilize my space, I might do it.

Honestly?

FreeNAS was the first one I came across in my research, based on a lot of people running it like I am.

It has a large user community with an active forum, was easy to set up and works. BSD doesn't really hotswap, which is a downside, but otherwise I'm happy with it.

I don't really care for BSD CLI syntax when compared to modern Linux distributions, but Solaris and I have a long history of not getting along at all (stemming back to how I found out that killall actually means "kill all" in Solaris), so I never really even considered anything Solaris/OpenSolaris/OpenIndiana.

Napp-it looks pretty cool, but thus far I just haven't had a reason to switch things up. Maybe some day I will :p
 
Zarathustra[H];1039865139 said:
I just read online that the free version of ESXi only supports up to 32GB of RAM... so maybe you aren't so bad off in that scenario. As for ECC, yes, I am going that route off the bat.
Interesting. I did not realize that. That is unfortunate, but I guess they have to do something to prevent organizations from relying on the free version and not licensing it. Even as it is I suppose there is probably a lot of that going on...

I kind of wish they had a paid home/small office license, in addition to their super expensive major corporate licenses. I'd pay a consumer type software price to unlock all of the single server features ESXi provides. I don't need all the multi-server stuff. But I guess those of us who actually use ESXi for "production" home use are still relatively rare.

That only applies to ESXi 5. The free ESXi 4 version does not have a memory limit, and for a single-server setup it should be enough.
 
Hello all,

Been a while; figured I'd show today's current (and freshly rebuilt) lab status!

ESX Host
---------------
VMware ESXi 5.1
Dell PowerEdge 1950 (with spare above in pic!)
2 x Xeon E5345 @ 2.33GHz
32GB RAM
272GB Local SCSI RAID 5
QLogic 405Xc iSCSI Host Bus Adapter

NAS/Storage
---------------
2 x 1000GB iSCSI LUNs via Buffalo Terastation TS-RIXL

Networking
---------------
Cisco WRVS4400N - 2 vlan config: Guest VM's | home network, wifi, NAS, VMmgmt
Linksys SE2800 (8 port GigE Unmanaged)


iom68h.jpg


j653rl.jpg
 
Upgrades to my lab. :D
Pair of Dell XS23-TY3 nodes with 2x L5520s and 48GB of ram.
vcenter.png


Funny thing about this server hardware: it didn't really seem to care about the fan swaps. It's quiet enough to be sitting 2 meters away from me in my living room without bothering me :D
 