All-In-One (ESXi with virtualized Solarish-based ZFS-SAN in a box)

_Gea
All-In-One (ESXi Server with virtualized high-speed ZFS-SAN solution in a box)

How I have done it

Modern IT services are mostly based on virtualization and SAN storage servers.
Usually you have one or more ESXi servers for virtualization and one or more SAN storage or backup servers,
connected via a fast FC or IP-based network, with a lot of expensive switches and adapters and expensive software.

I'm from the edu sector. In my case we have a lot of different services, systems and applications,
and mostly less money, so we prefer open source and low-cost/free solutions whenever possible. We have critical services,
but we have defined that a 30 min period from any kind of failure until service is restored is always acceptable.
Beside this, we have maintenance windows in the morning/evening. So we do not need 24/7 service with
no interruption allowed.

Until the beginning of last year we had separate VMware and SAN storage servers. With the then new feature of
real hardware virtualisation/passthrough of real hardware to guests (Intel VT-d and AMD IOMMU), instead of the former
virtualisation based only on emulated old standard hardware (sometimes improved a little with optimized ESXi drivers),
I developed the concept of an All-In-One, based on free ESXi and a free virtualized ZFS storage SAN in the same box.

See my mini-HowTo:
http://www.napp-it.org/doc/downloads/all-in-one.pdf



If you are thinking about a similar config, you may be interested in my thoughts about:

1. VMware ESXi
If you think about high-speed virtualization, you very quickly reach the point where you must say
that the best way is a type-1 hypervisor (runs directly on bare-metal hardware) like Xen or ESXi.
I use ESXi because it is the market leader and because XenServer currently lacks pass-through, which is needed for
performance and needed to have ZFS-formatted disks and all ZFS failure-handling features.

Main problem with ESXi: if you ask about the storage features of a free ESXi, the best answer is
that there are none. Even the licensed versions offer no more than some HA and backup features,
nothing comparable to the features of a modern SAN, so in any serious ESXi setup a SAN is involved.

I use the free ESXi 4.1 for my solution.


2. SAN Storage
In former times we used Windows. We then moved to Unix systems, mainly because of uptime.
It's not funny to have that many security patches with required reboots...

Especially the immense ZFS feature list and the availability of free versions like OpenSolaris or
NexentaCore convinced us to use this as the base for storage (ESXi datastores, SMB filers and backup systems).

Currently I mostly use OpenIndiana 148 or NexentaCore (both are free and open source).


3. Separate ESXi and SAN servers

This is what is mostly used. Main problem: you have a lot of machines, expensive switches, energy consumption,
cables and parts that can fail. Especially if you want high speed like 10 Gb between SAN and VMware,
it gets very expensive.

Pro: if one fails, the other remains intact. You have no dependencies between them and you are free in your choice of SAN.
If you really need 24/7, you should use this and care about an HA SAN, not only about high availability of ESXi.


4. All-In-One
With modern hardware, and especially with RAM comparable to a separate solution, you can virtualize
the SAN itself on ESXi, like you do with any other guest. You have the same logical configuration as
with separate machines. From the outside, there is no difference.

The best feature is software-based high-speed internal transfer (for example 10 Gb with vnics based on the ESXi vmxnet3 driver)
between SAN storage and ESXi guests, or the use of ESXi virtual switches with VLANs to your hardware switch.
I use, for example, one physical 10 Gb VLAN-tagged connection from ESXi to my HP switch for all of my VMs and LANs (manage, san, lan1, lan2, lan3, internet, dmz, ...).
The rest is internal 10 Gb transfer over virtual nics.
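
As a rough sketch of the VLAN side (vSwitch and portgroup names and VLAN IDs are only examples, not taken from the setup above), on ESXi 4.1 the tagged portgroups can be created from the console like this:

Code:
# create a vSwitch and link the physical 10 Gb uplink (example nic name)
esxcfg-vswitch -a vSwitch1
esxcfg-vswitch -L vmnic1 vSwitch1

# add portgroups for the internal networks and tag them with example VLAN IDs
esxcfg-vswitch -A san vSwitch1
esxcfg-vswitch -v 10 -p san vSwitch1
esxcfg-vswitch -A lan1 vSwitch1
esxcfg-vswitch -v 20 -p lan1 vSwitch1

# check the result
esxcfg-vswitch -l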

Cons:
You have to care about ESXi updates and SAN updates (-> complete outage during maintenance).
You have nearly the same SAN performance as on real hardware, but you can get more out of separate machines.
With my preferred ZFS OS, you can only use a maximum of 2 virtual CPUs, or it is unstable on booting.
Reduced choice of hardware and disk controllers (ESXi and Solaris are stable and good only on certain hardware).

If you want to build your own system, I can recommend the following sample config:
Use a mainboard with an Intel server chipset 5520, 3420 or 202/204.
I always prefer Supermicro mainboards from the X8.. or X9.. series ending with -F for remote management.
Always use Xeons and a minimum of 12 GB RAM; with a Supermicro X9 you may need an additional Intel NIC.
If you use/add a SAS controller based on the LSI 1068 chipset or LSI 2008 (e.g. LSI 9211-8i), you have it running without pain.


Conclusions
You should have two All-In-Ones. The second is the backup and failover system for ESXi and SAN service.
It is optimal if you have the possibility to physically move a disk pool to the second machine, either
through enough free disk slots or an external SAS storage box (plug the SAS cable into either machine).

You should also keep both systems up to date. I use ZFS replication between them to have the same data on the
second system with a few minutes delay. (Always do additional backups; this is only for availability.)
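
For reference, replication like this can also be done manually with ZFS snapshots and incremental zfs send/receive; a minimal sketch (pool, filesystem and host names are only placeholders):

Code:
# on the primary system: take a new snapshot of the VM filesystem
zfs snapshot tank/vmstore@repli-2

# send only the changes since the last replicated snapshot to the backup box
zfs send -i tank/vmstore@repli-1 tank/vmstore@repli-2 | ssh root@backup-host zfs receive -F backup/vmstore

# first run (no previous snapshot): send a full stream instead
# zfs send tank/vmstore@repli-1 | ssh root@backup-host zfs receive backup/vmstore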


My experience
In my use case it works very well, with fewer overall problems than our former separate solution. I now have three pairs of All-In-Ones plus
an additional 4 dedicated SAN boxes, used as SMB filers in our domain and for backups, in use for a year now. On hardware or software problems we
could restart any VMware guest in that time, usually within 30 min, and in all cases without the need to use any backup, working with the original pool data.
 
I am using vmxnet3 in OI 151... installing VMTools was very quick and it didn't seem like it installed everything, but vmxnet3 is working.
 
I am using vmxnet3 in OI 151... installing VMTools was very quick and it didn't seem like it installed everything, but vmxnet3 is working.

You are right. I have retried on another OI 151 installation and
the installation of VMware Tools worked without problems.
 
I would beware of vmxnet3 driver for OI in virtualized environment. I and others have seen performance issues (burstiness, minihangs, etc...) If it's working, good, but just something to keep an eye on. I became convinced it wasn't worth it, since even with e1000 driver, I can get 3mb/sec sustained thruput.
 
I would beware of vmxnet3 driver for OI in virtualized environment. I and others have seen performance issues (burstiness, minihangs, etc...) If it's working, good, but just something to keep an eye on. I became convinced it wasn't worth it, since even with e1000 driver, I can get 3mb/sec sustained thruput.

Would this explain slow CIFS reads/ fast CIFS writes? I disabled the e1000 network card in OI in my all in one and am exclusively using the vmxnet3. (Installed VMware tools)
 
That I can't say - I honestly don't remember what specifically was having issues.
 
Yes. The openindiana VM has an e1000 nic, as do all of the other VMs (well, I think the XP one has pcnet, but I don't care about performance for that one...)
 
Yes. The openindiana VM has an e1000 nic, as do all of the other VMs (well, I think the XP one has pcnet, but I don't care about performance for that one...)

Okay. That's good to know. I just tried switching it to the e1000 and got pretty much the same read speeds as with the vmxnet3 nic.
 
Hmmm, this sounds very familiar. I think I remember someone else reporting slow windows <= OI but fast windows => OI speeds. Don't recall it was ever figured out...
 
I have been searching all over for this issue and I see it popping up all over the place. I just can't seem to find a solution in any of the forums I have been reading regarding OI/Solaris.
 
VMware vmxnet3 is the most sophisticated virtual network driver with the best available performance,
but of course e1000 is more than good enough; if you have problems with vmxnet3 on your config, use e1000 instead.

see http://www.vmware.com/pdf/vsp_4_vmxnet3_perf.pdf
which compares it to vmxnet2, itself already more powerful than the basic e1000
 
I am using vmxnet3 in OI 151... installing VMTools was very quick and it didn't seem like it installed everything, but vmxnet3 is working.

About the installation of the ESXi 4.1 VMware Tools in OpenIndiana 151:

I just discovered that the installation is ok in OpenIndiana with the English language setting
but broken when you set German (and maybe other languages).
You only get a lot of perl errors.
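
A possible workaround (only an assumption based on the locale-related perl errors, not verified on every setup) is to run the installer with a neutral locale:

Code:
# inside the unpacked vmware-tools-distrib directory
LANG=C LC_ALL=C ./vmware-install.pl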
 
_Gea, I just ordered the parts for my second ESXi host and will be giving this a try. Hopefully my Perc 6i works ok, otherwise I will just use the onboard controller of my X8SIL-F.
 
I set this up tonight and it is working really well. The guide needs some work though.

I have no idea what you meant by this step:

After First-time install, you must reenter root-pw to create a smb-password too
Enter: passwd root, then reboot
name the folder nfs, keep defaults like share smb (see 1.)
in menu folder share it also via nfs (see 2.)

I never did any of that, unless you mean when you first access the web gui you have to type in new passwords?

Also, if I set my IP to static I cannot get DNS to work. I found one post on Google about there being a bug where the /etc/resolv.conf file gets overwritten with the DHCP settings, but if there is no DHCP it changes the file to blank? If I manually enter a name server it does not work and the file returns to blank upon restart.
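
For reference, a manual static configuration on OpenIndiana that keeps NWAM from rewriting resolv.conf would look roughly like the sketch below (interface name, addresses and name server are only example values):

Code:
# switch from automatic (nwam) to manual network configuration
svcadm disable network/physical:nwam
svcadm enable network/physical:default

# static address and persistent default route (example values)
# (if needed first: ipadm create-if e1000g0)
ipadm create-addr -T static -a 192.168.1.20/24 e1000g0/v4
route -p add default 192.168.1.1

# DNS: name server plus name service switch
echo "nameserver 192.168.1.1" >> /etc/resolv.conf
cp /etc/nsswitch.dns /etc/nsswitch.conf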
 
I am getting really low write speeds. Both the dd and Bonnie benchmarks report pretty much the same thing. Is this normal? I have a disk pool with 4 disks in mirror mode. The disks are going through my Perc 6/i, each one in its own RAID-0 array, because that is the only way I could get them to show up.

Also, this is using the ESXi all-in-one with the Perc 6/i being passed through to a VM, but I don't think that should make a difference with speeds unless it is a CPU problem?

(benchmark screenshot)
 
Those speeds look normal. In a mirror setup the write speed is equal to the write speed of a single drive, and the read speed is roughly the read speed of a single drive x the number of drives in the mirror.
If you don't need a 4-way mirror, try setting up 2x 2-way mirrors; your write speed should about double.
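
On the command line (outside of napp-it; disk names are only examples) such a pool of two striped mirrors would be created like this:

Code:
# raid10-like layout: two 2-way mirror vdevs in one pool
zpool create tank mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0
zpool status tank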
 
Those speeds look normal. In a mirror setup the write speed is equal to the write speed of a single drive, and the read speed is roughly the read speed of a single drive x the number of drives in the mirror.
If you don't need a 4-way mirror, try setting up 2x 2-way mirrors; your write speed should about double.

Ok, I will need to learn how to use this ZFS stuff then. I followed the guide and it said to use the mirror option.

Create a pool
Menu pool - create
We want to use this pool to store VMs, so create a mirrored vdev
(opt. add more mirrors, add a hotspare)

Can you change the setup after it is made?
 
No, once created, a vdev is done.

Just great! Good thing I did not make too many VMs for my new setup! :p

I am going to have to learn this ZFS stuff. Since I have 4x 500GB drives, the 2x 2-way mirror should be perfect and give me some good disk space. Is there any kind of raid 5 option with ZFS for more disk space?
 
Okay, so going with raid10? That would be fine. What do you mean by 'raid 5 options for more disk space'? Also, if you have a large enough spare disk, you can create a vanilla pool, and copy the VMs to it from the old pool?
 
Okay, so going with raid10? That would be fine. What do you mean by 'raid 5 options for more disk space'? Also, if you have a large enough spare disk, you can create a vanilla pool, and copy the VMs to it from the old pool?

Use striping across all 4 of my drives like RAID 5 would do.

I found this explanation of the different levels here http://www.zfsbuild.com/2010/05/26/zfs-raid-levels/
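
For completeness, the raid5-like layout in ZFS is raidz1 (single parity, roughly 3 of 4 disks usable). A sketch with example disk names; note that an existing vdev cannot be converted, the pool would have to be destroyed and recreated:

Code:
# raidz1 across 4 disks
zpool create tank raidz1 c2t0d0 c2t1d0 c2t2d0 c2t3d0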
 
I think I'm not understanding this based on your previous comment. Even if raid5 stripes across all 4 drives, so there is no dedicated parity drive, you still only have 3 drives of usable space. The site you referenced does provide a good read on zfs levels.
 
That doesn't look right. raid10 should be faster on reads than 4-disk raidz1. Are you sure you created the pools right?
 
That doesn't look right. raid10 should be faster on reads than 4-disk raidz1. Are you sure you created the pools right?

People in the data storage forum said it was good.

I created a mirror of two selected drives, then did ADD Vdev and selected mirror with the remaining two drives.

(pool layout screenshots)
 
Hmmm, I didn't see that post, but it flies against my experience and everything I've read. My pool is 6x 600GB SATA drives in a 3x2 raid10. Here are my 10GB results:

write: 10737418240 bytes (11 GB) copied, 48.6611 s, 221 MB/s
read: 10737418240 bytes (11 GB) copied, 25.5804 s, 420 MB/s

This is because there are 3 spindles participating in writes (mirrors are effectively one) but 6 in reads, and they are alternated. It is typical, from what I've seen, that raidz1 has faster writes than raid10 but slower reads. Something is off here...
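
The numbers above look like output from simple sequential dd runs; roughly equivalent commands would be the following (file and pool names are examples, and compression should be off on the test filesystem, otherwise a zero-filled file gives unrealistic results):

Code:
# sequential write test, 10 GiB of zeros
dd if=/dev/zero of=/tank/dd.tst bs=1024k count=10240

# sequential read test of the same file
dd if=/tank/dd.tst of=/dev/null bs=1024k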
 
Just read the following news:
Integrating KVM into the Illumos kernel (the base of OpenIndiana and the next Nexenta)

from http://smartos.org/
SmartOS: The Complete Modern Operating System:

SmartOS incorporates the four most revolutionary OS technologies of the past decade - Zones, ZFS, DTrace and KVM - into a single operating system, providing an arbitrarily observable, highly multi-tenant environment built on a reliable, enterprise-grade storage stack. With the introduction of KVM in SmartOS, you no longer have to give up the power of an enterprise-grade operating system in order to run legacy applications and stacks.

SmartOS turns any server into a highly efficient hosting platform for multi-tenant, machine-to-machine, or storage applications. SmartOS offers unique, innovative tools for application developers, service providers and data center operators - tuned and hardened for modern datacenter deployment....

Sounds very interesting.

more
http://www.joyent.com/products/smartos/smartos-faq/
http://dtrace.org/blogs/bmc/2011/08/15/kvm-on-illumos/
http://www.readwriteweb.com/enterprise/2011/08/joyent-brings-kvm-to-smartos-f.php
 
This is strange, getting some weird console errors(?) after installing vmware tools in Solaris Express 11, and installing napp-it with the vmxnet3s adapter.

Is this normal?

 
Dunno, I'd switch to e1000. I think the vmxnet3 driver for OS is crap, personally.
 
Is it possible to use VMware Fault Tolerance with a setup like this?
So, say, rather than having a totally all-in-one setup, you had the drives in a separate enclosure connected via miniSAS. Then you could use MPIO with dual-port HDDs/backplane and have the same drives hooked up to both servers. You would run the ZFS OS virtual machine on both servers in lock-step with FT (and potentially do the same with whatever other VMs you have).
Thoughts on the feasibility and setup of this? :)
 
You have an ESXi server with a shared NFS datastore from a SAN server. The SAN has native access
to disks and SAS controllers. Whatever you can do with such a config built from two separate servers, you can
do with an all-in-one.

Main difference: you need only one box and you have a multi-Gb connection between VMs and ESXi internally in software.
With two boxes, you must spend a lot on SAN network hardware to achieve the same performance between VMs and the SAN.

Connectivity from the SAN OS to the storage disks is independent of all-in-one or separate configs and the same with both.
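
The basic plumbing of such an NFS datastore looks like the sketch below (filesystem names, addresses and the datastore label are only examples):

Code:
# on the storage VM: share the VM filesystem via NFS, with root access for the ESXi host network
zfs create tank/nfs
zfs set sharenfs=rw=@192.168.10.0/24,root=@192.168.10.0/24 tank/nfs

# on the ESXi console: mount the share as an NFS datastore
esxcfg-nas -a -o 192.168.10.10 -s /tank/nfs nfs01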
 
You have an ESXi server with a shared NFS datastore from a SAN server. The SAN has native access
to disks and SAS controllers. Whatever you can do with such a config built from two separate servers, you can
do with an all-in-one.

Main difference: you need only one box and you have a multi-Gb connection between VMs and ESXi internally in software.
With two boxes, you must spend a lot on SAN network hardware to achieve the same performance between VMs and the SAN.

Connectivity from the SAN OS to the storage disks is independent of all-in-one or separate configs and the same with both.

OK :)
Well, I'm essentially suggesting a variation of the dual all-in-ones with both servers having access to all the disks (MPIO) and then an active-active VMware FT setup of the Solaris/ZFS OS. So, for the server running the primary VM, it would be virtual NIC access, and for the server running the secondary VM, it would be physical NIC access for storage for its VM's--this would swap in the event of a failure.
Is that workable, i.e. that transparency of disk storage sharing etc.?
 
OK :)
Well, I'm essentially suggesting a variation of the dual all-in-ones with both servers having access to all the disks (MPIO) and then an active-active VMware FT setup of the Solaris/ZFS OS. So, for the server running the primary VM, it would be virtual NIC access, and for the server running the secondary VM, it would be physical NIC access for storage for its VM's--this would swap in the event of a failure.
Is that workable, i.e. that transparency of disk storage sharing etc.?

Hmm?
 
Is it possible to create this solution in a 1U rack server? I have an HP ProLiant DL360 G7. I would like to have virtual machines on it and SAN storage at the same time. If I install ESXi on a flash card and use the 6 disk slots for a SAN solution on the controller, which is an HP P410i Raid Controller with 512 MB cache RAM, will that be fast enough for websites?

The server has: 46 GB RAM
2 x Intel 2.4 GHz quad-core E5620 CPUs
4 x 72 GB SAS 15K HP disks - will add 2 more, not sure about the size, maybe cheap disks.
HP P410i Raid Controller with 512 MB cache RAM (should I buy an extra one for the 2 extra disks I will buy, or should I buy a controller card for all of them no matter what and skip the HP controller?)

Need some input on this one. Would like to build this _Gea all-in-one box.

Claus
 
All-In-One (ESXi Server with virtualized high-speed ZFS-SAN solution in a box)
[...]

Hi _Gea,

What kind of speeds are you seeing on your NFS? I'm currently using a separate NAS but speeds are not as good as I was hoping for so I'm debating whether I should move to a virtualized NAS/SAN. Are you using NFS in sync or async mode?

Thanks
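
For reference, whether the ZFS side honors NFS sync writes can be checked and changed with the sync property (on current pool versions; the filesystem name is an example, and sync=disabled trades safety on power loss for speed):

Code:
# show the current setting for the NFS filesystem
zfs get sync tank/nfs

# standard = honor sync requests (safe), disabled = treat everything as async (fast, but risky)
zfs set sync=disabled tank/nfs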
 