OpenSolaris-derived ZFS NAS/SAN (OmniOS, OpenIndiana, Solaris and napp-it)

Heya Gea, loving napp-it so far - any chance of a better configuration of the mail system?
Something like a better configuration of the SMTP server (authentication). Or is that already in there somewhere and I'm looking in the wrong place? I'd love to use my Gmail :)

Thanks !

I have included the missing SASL libraries in the current napp-it 0.415g nightly.
It now supports SMTP with authentication - at least with my own mailer.



Gea
 
Hmm, updated to the newest nightly to try the authentication for mail, but I get:
Code:
No SASL mechanism found
 at /var/web-gui/data/napp-it/CGI/Authen/SASL.pm line 77
 at /usr/perl5/5.8.4/lib/Net/SMTP.pm line 124
 
Hmm, never mind - changed the settings to my Comcast email account and it works now. Originally I tried my work Exchange server.
 

Thanks for the info.
It seems that the Perl SASL module does not support every auth config.
I will add a note about this at the auth settings.

Gea
 
Wow, ZFS has matured quite a bit since I last looked into it. De-duplication support is awesome! I just built a new server to host iSCSI and SMB shares 6 months ago using Linux + an Areca controller, but it sounds like this might be the way to go for my next build. Anyone have any experiences to share with using this approach for hosting datastores on ESX? I'm currently using SCST with Linux to do all of this, but I'm not sure if the OpenSolaris equivalent performs just as well, and how would you set up multiple paths? In Linux, if I have two NICs on the same subnet the traffic will usually just route to one NIC, so I have to create separate subnets; not sure if that's the case with Solaris also.
 
From the limited open talk about dedup and virtualization, they go great together if you have lots of virtual machines that are cloned or templated. There are lots of enterprise storage solutions that have dedup, and the performance considerations they talk about can be applied to ZFS, if you have the hardware to back it up.

One post I've used before to talk about dedup:
20 TB of unique data stored in 128K records or more than 1TB of unique data in 8K records would require about 32 GB of physical memory. If you need to store more unique data than what these ratios provide, strongly consider allocating some large read optimized SSD to hold the deduplication table (DDT). The DDT lookups are small random I/Os that are well handled by current generation SSDs.
(from http://hub.opensolaris.org/bin/view/Community+Group+zfs/dedup )

If you don't have the memory and/or SSD caching, dedup can severely hinder performance. For home use you can still try it out, but most likely performance will be horrible. I use Solaris 11 to serve up iSCSI targets to VMware, but have not played with dedup or compression (which can also increase performance) for iSCSI.
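If you want to estimate the DDT cost before turning dedup on, zdb can simulate it against an existing pool (the pool name "tank" is just a placeholder, and the ~320 bytes per entry is only a rough rule of thumb):
Code:
# read-only simulation of dedup on an existing pool: prints a DDT histogram,
# the expected dedup ratio and the number of table entries (can take a while)
zdb -S tank

# on a pool that already runs with dedup, show the real DDT statistics
zdb -DD tank

# rough in-core estimate: DDT entries (from zdb) x ~320 bytes each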


In Solaris you can bond/aggregate network connections, either for failover or for LACP/link aggregation. If you give more details about which services you are trying to spread over multiple network links, I can be more specific.
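A rough sketch of what that looks like on the Solaris side (interface names e1000g0/e1000g1, the aggregation name and the address are placeholders; older releases use the key-based dladm syntax and ifconfig instead of ipadm):
Code:
# combine two physical NICs into one aggregation; -L active enables LACP
# (the switch ports must be configured for LACP as well)
dladm create-aggr -L active -l e1000g0 -l e1000g1 aggr0

# give the aggregation an address (Solaris 11 Express / OpenIndiana style)
ipadm create-ip aggr0
ipadm create-addr -T static -a 192.168.1.10/24 aggr0/v4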
 
Wow, ZFS has matured quite a bit since I last looked into it. De-duplication support is awesome! I just built a new server to host iSCSI and SMB shares 6 months ago using Linux + an Areca controller, but it sounds like this might be the way to go for my next build. Anyone have any experiences to share with using this approach for hosting datastores on ESX? I'm currently using SCST with Linux to do all of this, but I'm not sure if the OpenSolaris equivalent performs just as well, and how would you set up multiple paths? In Linux, if I have two NICs on the same subnet the traffic will usually just route to one NIC, so I have to create separate subnets; not sure if that's the case with Solaris also.

You should read/google about the Sun/Oracle Solaris technologies like the built-in Crossbow virtual switches (comparable to the virtual network switch options in ESXi) and COMSTAR, the built-in enterprise-ready iSCSI stack.

example:
http://hub.opensolaris.org/bin/view/Project+crossbow/Docs
http://download.oracle.com/docs/cd/E19963-01/821-1459/fncoz/index.html
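
For illustration, the Crossbow basics boil down to a few dladm commands (the link and VNIC names are just examples):
Code:
# create an internal virtual switch (etherstub) - no physical NIC involved
dladm create-etherstub stub0

# create virtual NICs on top of it; zones or other guests can be attached to them
dladm create-vnic -l stub0 vnic0
dladm create-vnic -l stub0 vnic1

# a VNIC can also sit on a physical link, optionally with a bandwidth limit
dladm create-vnic -l e1000g0 -p maxbw=500M vnic2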

Gea
 
*Solaris* is SPARC and Intel only.
PPC was EOL years ago - why would you do that?
ZFS is high end, power and RAM hungry, and not the right
thing for reusing old hardware.

PPC is still in use in the PS3. ;)

I know it's old, I just had some systems I might be able to experiment with that were the PPC architecture, but if not, I might be able to get my hands on some SPARC systems.

I guess I like to take RISCs. :D
 
Thanks for all the articles. Right now I'm not sure exactly what I want because I'm building out new servers and am playing with a few options. I was planning on buying 3 new servers, making two of them an ESXi cluster and having all their storage come from the third server, which will be a ZFS server with a few pools. I was thinking of using a mirrored or maybe a raidz pool (depending on performance) and then making another raidz pool to serve as a place to store backups. I was thinking of using dedup more for the backup pool, because I don't really need the performance for storing backups, and probably wouldn't use it on the ESXi side because I wouldn't want to lose performance. Most of my applications on the ESX side aren't very I/O demanding, so I figured I could get away with hosting everything off iSCSI, assuming I can get the performance to host ~50 VMs running at the same time.

Reading up more on the snapshot capability though, it seems like going NFS might make it easier to restore individual VMs. I guess I could just as easily snapshot the entire iSCSI LUN, clone it and then present it back to the ESX server, since NFS doesn't perform as well. So many options! This is a project that probably won't happen until a few months from now, so I have plenty of time to decide, but I just installed Solaris 11 and napp-it and it is very easy to use. Much easier than my current setup (Areca RAID, CentOS and SCST) to serve out iSCSI/Samba shares. With the savings from not buying the Areca cards I might even be able to buy a fourth server! Looking forward to playing around more with all of this; I will probably come back with more questions :)
 

Hello dailo,

some thoughts about your configs.

about the concept
I would always build pairs of ESXi and storage servers
(redundancy), so 2 x ESXi + 2 x storage (NFS/iSCSI).

In my own config I added two tiered backup systems.


All-In-One (ESXi with integrated, virtualized storage server)
You can consider using All-In-One systems:
with modern hardware, pass-through and a lot of RAM,
you can also virtualize the SAN storage server within an ESXi box.

Advantage: fewer systems, less power, less cabling, better reliability
and better performance due to internal 10 GbE high-speed links with the
ESXi vmxnet3 net drivers.

This config is best with 8-core 5520 systems and > 24 GB RAM
(I use 48 GB RAM to have enough RAM for the VMs and the storage server).

Option: 3 x All-In-One
box1 and box2 are the production systems, box3 the common backup;
the backup box can also do all the tasks.

more: http://napp-it.org/napp-it/all-in-one/index_en.html


about iSCSI
iSCSI and NFS are about the same speed. My problem with iSCSI is that
it is a block-based disk. You can only go back to a snap with the whole drive.

NFS is file-based. I share it in parallel with SMB. If I need a snap to copy/clone,
I can do it from Windows with Previous Versions.
-> always keep it easy
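
For illustration, the commands behind that workflow (pool/filesystem names are only examples):
Code:
# share one ZFS filesystem over NFS for ESXi and over SMB in parallel
zfs set sharenfs=on tank/vm
zfs set sharesmb=on tank/vm

# take a snapshot; over SMB it shows up in Windows as a "Previous Version",
# locally it is visible read-only under /tank/vm/.zfs/snapshot/monday
zfs snapshot tank/vm@monday

# clone a snapshot into a new writable filesystem to copy a single VM back
zfs clone tank/vm@monday tank/vm-restore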

about cases
In case of problems (crash of a system), you can restore a backup or boot
up a backed-up system on your second server. It is possible to keep two ZFS
pools nearly in sync, so that is one way.

Personally, I do backups, but the last time I had a crash of my All-In-One system
(boot-drive problem), I pulled the entire ZFS pool, plugged it into the second system,
imported the pool, reshared it, imported the datastore and booted the VMs.

It is uncomplicated, has few ways to go wrong, none of them fatal, and once you have done it a few times,
it is done in five minutes - and you have the original data, not time-delayed backups.

-> always use cases with enough free slots to hold a pool from another system

about pools for ESXi
Always use raid-10 (or more stripes of mirrors), add SSD read caches, optionally mirrored SSD write caches.
Do not use dedup or compress.
I use SSD-only pools.

Use dedup, compress and raid-z for backup systems.
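
As a sketch of those two pool layouts (disk names c1t0d0 etc. are placeholders):
Code:
# VM pool: two mirrors striped ("raid-10"), no dedup/compress
zpool create vmpool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

# optional: SSD read cache and mirrored SSD write log for the VM pool
zpool add vmpool cache c3t0d0
zpool add vmpool log mirror c3t1d0 c3t2d0

# backup pool: raid-z with compression and dedup enabled
zpool create backup raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
zfs set compression=on backup
zfs set dedup=on backup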


Gea
 
All-In-One

ESXi bare-metal hypervisor + included ZFS NAS/SAN storage server
+ other guests like Windows, Linux and others in one box


See the draft copy of
part 2 of my ESXi All-In-One mini HowTo:
http://www.napp-it.org/doc/downloads/all-in-one.pdf
Please report errors.


from the content:

use case: multi-user / multiple OS / server use / mostly always on


You can run typical server applications on your personal computer, either on its base OS or by virtualizing some other operating systems on top of your base OS with a type-2 hypervisor. But mostly they need to run 24/7, or they have special demands in terms of data or application security, available CPU or RAM resources, or they need direct hardware access. So a lot of people need one or more extra "servers" to do the following.

Storage:

You can use a computer as a storage system or as an archive or backup system for your business data, your media files or any other sort of data with high demands on data security. Snapshots allow you to go back to former versions of your data. The ability to expand capacity without problems is mandatory. Indeed, the storage part is the base and most important aspect of all server tasks. In case of a crash you can reinstall software without problems, but your data is lost if you do not have a current and working backup.

Applications

For a working IT infrastructure, you need a lot of common server applications like web, mail, DNS, DHCP, VPN, firewall, database or special applications like a render server. Very often you want to isolate these applications from others for security reasons, or they have to run on different operating systems. You could use a lot of computers; usually you virtualize all these systems instead. You need features like best resource usage, isolation, security and availability.

Developer use, test installations and education

In these use cases you need virtualization. It should support all or nearly all available operating systems. A lot of bootable system snapshots to go back to former states are always needed. You need a good memory and resource sharing mechanism to have these systems running at adequate performance, especially if you do not have dozens of GB of RAM.

Home Use

Besides the above use cases, you may have a DLNA media server to distribute your media library to your TV or other audio equipment. You may have a VDR digital sat or cable tuner and video recorder to share live TV. You may have a torrent client running or a local mail server to collect mails from different accounts. The problem with these use cases: your preferred DLNA media server runs on Windows, your preferred VDR server on Linux. You can use a computer for each task, or you can virtualize.

napp-it All-In-One (ESXi server with virtualized NAS/SAN storage server)
how many computers do I need and why - an overview

use case: single user / single OS / personal use / powered on when needed

You can use a modern computer for a lot of tasks. You can use it with your office or business apps, you can lay out documents and create media files, you can use it to play games, listen to music, play movies or browse the internet. For these tasks you have a desktop computer, a laptop, a netbook, a tablet, a media client or an organizer. Each of these systems has its own operating system.


Virtualization with a type-1 hypervisor

A type-1 hypervisor like VMware ESXi (the alternative is Xen) is also often called a bare-metal hypervisor. It is a small mini-OS, under 100 MB in size, used only for the virtualization task. Think of it more as an extended BIOS than an operating system. You can boot it from a 4 GB disk or USB stick. On top of such a hypervisor you can run your operating systems as virtual machines. Access to real hardware is possible through a set of commonly used emulated hardware drivers delivered by the hypervisor, or exclusively by a single guest via pass-through on vt-d capable hardware. Virtual machines are stored in disk-image files, either on a local hard disk or on a NAS/SAN server via NFS or iSCSI, so they can be accessed from different machines. You can boot these virtual machines without problems on any other ESXi server, independent of the real hardware used (aside from vt-d options). Hardware resources like CPU, RAM and disk capacity can be allocated individually between running guests. A single virtual machine runs with nearly the same performance as it would without a hypervisor, apart from the video adapter (video-adapter sharing is in development). If you have more than one guest, resources are shared optimally. Because of the low graphics performance it is currently not useful for desktop use, but it is best for servers. The only settings you can enter or change locally on an ESXi box are the IP address and the network adapter to use, and you can set your admin password there.
To manage an ESXi server, you need Windows. After installing ESXi, you can connect from a Windows browser to download and install the Windows management software, the vSphere Client. With it you can create and manage virtual network switches and virtual machines, allocate resources, mount NFS or iSCSI datastores, and remote-control the machine. If you start a copied virtual machine, you are asked whether it was copied or moved. That's all - no hardware dependencies - and these basic features are free.


How to build an All-In-One system (ESXi with embedded NAS/SAN server + other virtual machines) based on ESXi and OpenIndiana
...



Gea
 
I think I love you <3

Got tired of my ESXi + FreeBSD ZFS install since my PSU failed and the VMs got screwed up ... just imported and upgraded my zpool and all seems to be peachy! Going to try VirtualBox at some point for messing around with VMs, but this is definitely better for all my NAS needs - thank you!

Now, anyone else got SABnzbd+ running on Solaris Express 11? Not tried it yet, but setting a static IP was fun, so daemonising that probably requires a virgin sacrifice or something!
 
Thanks for all the advice! I was considering the all-in-one option, but the problem is what do I do when I want to upgrade my ESX servers? The only reason I wanted to have a separate iSCSI machine was so I could vMotion my VMs to another box while I service one and then move them back. The other reason why I wanted to have it on a separate box was cost. Most of my VMs, once powered on, use very little I/O, so I was thinking I wouldn't really need to create that many pools. With the all-in-one route I would need to create a ZFS VM on each box, which is potentially safer, since if my one ZFS box goes down it takes out my entire ESXi cluster.

I could see how going with the all-in-one solution would show a negligible difference between NFS and iSCSI using the vmxnet driver, but from my experience NFS always showed higher latency compared to iSCSI in my lab when presenting datastores from the same disks. Perhaps more tuning can be done on my side to prevent that, but I was playing with Windows Previous Versions and the Time Slider in Solaris 11, so I think NFS is going to be the best option.

As for the pools, I didn't know raid-10 was a ZFS option (still more reading to do), but if it does support it then that's what I would definitely do, since that's my current datastore configuration with my Areca. Since I can't afford an entire pool of SSDs, I would probably just have at least one SSD as a disk cache and probably a pool of 6 Hitachi 7K3000 in raid-10. I might have two raid-10 pools depending on how many VMs I end up making. Then my last pool will be raidz with Hitachi 7K3000 drives that will hold all of my backups from NetBackup, with dedup and compress enabled. Can you create ZFS snapshots onto a different pool, and would it be recommended? Not really sure if I would need to, since I should have plenty of space in my pools, but just curious if you would ever want to store the snapshots from your raid-10 pool on your raidz pool.

Right now my typical build usually consists of:
Norco 20/24 bay case
Supermicro MB
2 x Xeon quad core
32 GB ECC memory
Areca 1880
1 x SSD for boot
6 x Hitachi 2TB disks

Gea - In all of your examples you mention that you are using Nexenta. Any reason why you chose to use Nexenta over Solaris or any other ZFS OS?
 
@dailo

All-In-One is not the best choice in every case,
but there are a lot of ways to handle the ESXi updates.

How I have done a pool move + VM move:

I have always bought my boxes (16 x 3.5" or 24 x 2.5" backplanes) with enough free slots
to additionally plug in complete pools. Last week, for example, I had a crash of one of my All-In-Ones
(boot raid error). I then pulled the complete pool, plugged it into another All-In-One,
imported the pool, shared it via NFS, imported the NFS datastore in ESXi, added the VMs
to the inventory and booted them up without any problem (all machines have the same VLANs).

Time: 10 min, so if you have a 10 min maintenance window it's not a problem
(you need enough free disk slots or a spare machine).

PS: I use the free version of ESXi only.
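
For reference, the pool move itself is only a handful of commands (pool and filesystem names are examples):
Code:
# on the old box, if it is still running (otherwise just pull the disks)
zpool export tank

# on the second box, after plugging the disks in
zpool import            # lists importable pools found on the new disks
zpool import tank       # add -f if the pool was not exported cleanly
zfs set sharenfs=on tank/vm

# then mount the NFS share as a datastore in ESXi and add the VMs to the inventory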


about NFS

I do not expect a lot of performance difference between NFS and iSCSI.
The former is much easier, and you have file access to clone/copy a VM - even a ZFS-snapped one - via SMB.

about Raid-10

ZFS does not really talk about Raid-10; it is much more flexible.
You could create a pool from a mirrored vdev -> mirror
If you add a second mirror -> Raid-10
If you add a third mirror -> Raid ??

Each mirror adds capacity and performance
... and so on
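
In commands (disk names are placeholders), growing such a pool looks like this:
Code:
# start with a single mirror
zpool create tank mirror c1t0d0 c1t1d0

# add a second mirror -> data is striped over both ("raid-10")
zpool add tank mirror c1t2d0 c1t3d0

# add a third mirror -> capacity and performance grow again
zpool add tank mirror c1t4d0 c1t5d0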
 
_Gea,

Great interface you've written here. I've been using it for about a month now with my setup. Thank you for adding authenticated SMTP support. My only suggestion at this point is to add an update button under the system tab. Call me lazy, but I'd really love to have a button for that.
 

An update option for the napp-it menus will be included in one of the next versions in the system menu.


Gea
 
Just wanted to say thanks for the great interface _Gea - I just migrated over from a WHS v1 setup to a Solaris 11/ZFS setup, and your package made it a snap.

Everything seems to be up and working well; I just wanted to run the Bonnie++ numbers by people here to see if they look reasonable.

The server is a Xeon 3440/16GB (Dell T110).
The pool "datas" is a raidz with 5 Hitachi 7K3000 2TB drives (on the Intel 3400 chipset controller).
The L2ARC is a 60GB partition on an X25-M (originally I intended to install to USB drives (mirrored) and use the entire SSD as a cache, but booting was taking ~30 min from USB (the install took 12 hours), so now the other 20GB partition is the boot drive/rpool).


Code:
NAME 	SIZE 	Bonnie 	Date(y.m.d) 	File 	Seq-Wr-Chr 	%CPU 	Seq-Wr-Blck 	%CPU 	Seq-Rewr 	%CPU
datas 	9.06T 	start 	2011.02.21 	32G 	58 MB/s 	99 	341 MB/s 	36 	206 MB/s 	31 

Seq-Rd-Chr 	%CPU 	Seq-Rd-Blck 	%CPU 	Rnd Seeks 	%CPU 	Files 	Seq-Create 	Rnd-Create
46 MB/s 	99 	482 MB/s 	25 	1058.8/s 	3 	16 	+++++/s 	29007/s
 
How do you address the issue of fragmentation with ZFS?

Is it something that happens more so than with a traditional file system, or have any of you with experience encountered fragmentation issues?

If you have... how do you deal with them?

Assume a decent pool size of 20TB of which 10TB is in use, with most of this being media storage, and let's say 2TB of that 10TB facilitates extremely heavy VM usage. What kind of fragmentation would one expect to encounter?

Also, have any of you considered passing through a PCI-Express SSD for ARC and L2ARC in virtualized OpenSolaris etc.?

It should be clearly obvious to everyone that an ESXi "all-in-one" solution using an EVGA SR-2 would be "one system to rule them all" for something like this.

You could pass the following:

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118134

to a virtualized ZFS VM of your choosing and, along with 10GbE VMXNET3 NICs, create a completely virtualized storage solution that could rival the throughput of a $500,000 EMC SAN solution.

I'm sold if someone can educate me on the fragmentation pitfalls of ZFS.
 
Good read thanks - couple of questions.


Recently zfs has added support for using a SSD as a synchronous write log, and this allows zfs to turn synchronous writes into more ordinary writes which can be written more intelligently while returning to the user with minimal latency.

Which builds are applicable here and support this?

It is said that a ZFS build should have 2GB of RAM per 1TB of storage - is this just for "deduping" purposes, or does the file system require this to address and allocate pool pointers?

I looked but could not find anything on how big your L2ARC or cache vdevs (SSDs in this case) should be per TB of storage - does anyone know?

Thanks for the spoon-feeding in advance, this is interesting stuff.
 
Mini HowTo part 02:
How to build a high speed All-In-One Server
(ESXi VM-Server + virtualized ZFS Storage Server in a box)

download:
http://www.napp-it.org/doc/downloads/all-in-one.pdf






@thachief

How do you address the issue of fragmentation with ZFS?

Is it something that happens more so than with a traditional file system, or have any of you with experience encountered fragmentation issues?

If you have... how do you deal with them?

Assume a decent pool size of 20TB of which 10TB is in use, with most of this being media storage, and let's say 2TB of that 10TB facilitates extremely heavy VM usage. What kind of fragmentation would one expect to encounter?

Also, have any of you considered passing through a PCI-Express SSD for ARC and L2ARC in virtualized OpenSolaris etc.?

It should be clearly obvious to everyone that an ESXi "all-in-one" solution using an EVGA SR-2 would be "one system to rule them all" for something like this.

You could pass the following:

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118134

to a virtualized ZFS VM of your choosing and, along with 10GbE VMXNET3 NICs, create a completely virtualized storage solution that could rival the throughput of a $500,000 EMC SAN solution.

I'm sold if someone can educate me on the fragmentation pitfalls of ZFS.

From what I have heard until now, fragmentation seems not to be a real problem with ZFS - at least if your pool is, let's say, below 80% of its capacity.

Performance is more a matter of pool design, like using striped mirrors only for performance, and of using as much RAM as possible.

Passing through a storage adapter is only a problem if ESXi has a problem with that specific controller. (When I first tried, ESXi claimed only a few as supported; I was happy that the LSI 1068e-based ones were among them.)

read more:
http://www.solarisinternals.com/wiki/index.php/Solaris_Internals_and_Performance_FAQ

Gea
 
As a general rule, a busy Server 2008 R2 will consume around 15-30 IOPS. How many IOPS would you estimate the actual ZFS virtual appliance itself will demand under, let's say, a busy load?

Edit*

How much space did you allocate, in general, for the ZFS virtual appliance?
 
@Gea

Have you had any problems exposing the pools to systems outside of the ESXi environment? I'd imagine the bottleneck would just be the network, but I'm not sure if there is anything I'm missing. I don't want to create a pool on each ESXi server because I don't need all the space, so I was thinking I could just create one large ESXi box where the ZFS VM and all other I/O-intensive VMs would run, and then create two smaller ESXi boxes that will have their datastores hosted from the ZFS VM.
 
As a general rule, a busy Server 2008 R2 will consume around 15-30 IOPS. How many IOPS would you estimate the actual ZFS virtual appliance itself will demand under, let's say, a busy load?

Edit*

How much space did you allocate, in general, for the ZFS virtual appliance?

about IOPS

Your ZFS OS has physical access via pass-through to the disk controller and disks where you build your pool. You get 100% of the performance compared to a dedicated storage server. The storage-OS VM itself is located on the local ESXi datastore; besides start-up, there is only minimal load on this datastore.

So if you count needed IOPS, you only have to count your other guests, like your Windows or database servers.

High IOPS you usually get with a lot of fast spindles in raid-10+ pools. With ZFS you can create a mirror and add as many mirrors as you need, so you can go far beyond a pure raid-10 in sequential reads/writes and IOPS.

For best IOPS, far beyond pure disk pools, you may use ZFS hybrid pools. Just add a fast SSD (MLC, 50 GB+) as a read cache and a mirrored SSD (2 x RAM+, best are SLC or DRAM-based ones) as a cache for synchronous writes, and your IOPS jump from a few hundred to a few thousand.
Read more about hybrid storage: ftp://download.intel.com/design/flash/NAND/SolarisZFS_SolutionBrief.pdf

For the very best IOPS, use SSD-only pools for VMs (all my newer pools are based on Sandforce 120 GB MLC SSDs
- don't forget spare drives - and yes, I intend to throw them out in three years).


about needed space for VMs

If you use NexentaCore or an OI/SE11 text-only install, 5 GB is the minimal size, 10 GB is suggested.
If you use the Live version with Time Slider, you should use at least 15 GB.

Gea
 
@Gea

Have you had any problems exposing the pools to systems outside of the ESXi environment? I'd imagine the bottleneck would just be the network, but I'm not sure if there is anything I'm missing. I don't want to create a pool on each ESXi server because I don't need all the space, so I was thinking I could just create one large ESXi box where the ZFS VM and all other I/O-intensive VMs would run, and then create two smaller ESXi boxes that will have their datastores hosted from the ZFS VM.


about All-In-One and networks

Old style

With a conventional scenario you have ESXi servers and separate storage servers. You often have a dedicated SAN storage network to connect your servers,
SAN and backup servers (FC or IP). Additionally you have a LAN for your intranet and a WAN/Internet network. To connect these networks, you use FC and IP switches.

All-In-One style

With All-In-One you also use switches with separate networks, ESXi servers and storage servers. You use virtual VLAN switches,
built from ESXi features, to manage internal data transfers on these networks (e.g. SAN, LAN, Internet).

In this example, you could either use three virtual switches, each connected to one of three physical network adapters, to handle these networks internally and externally,
or you could use one virtual switch with VLANs.

Use VLANs

I have done this via VLANs. I have connected each of my All-In-One servers with one CX4 10 GbE tagged VLAN trunk
(with the VLANs SAN, LAN, WAN) to an HP ProCurve switch to switch all LANs over one high-speed uplink.

Within the All-In-One server, the 10 GbE network adapter is connected to an ESXi virtual switch with the needed VLANs defined.
On my guests I have created up to three virtual NICs and connected each to the needed VLAN. If you use the vmxnet3 network driver, your guests are connected via 10 GbE links.
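
As an illustration of the guest side (the vmxnet3 interface names and addresses are only examples; older releases use ifconfig instead of ipadm), each virtual NIC simply gets an address in its VLAN:
Code:
# inside the OpenIndiana/Solaris storage VM: one vmxnet3 vNIC per VLAN
ipadm create-ip vmxnet3s0                                   # vNIC in the SAN vlan
ipadm create-addr -T static -a 10.0.10.5/24 vmxnet3s0/v4
ipadm create-ip vmxnet3s1                                   # vNIC in the LAN vlan
ipadm create-addr -T static -a 192.168.1.5/24 vmxnet3s1/v4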

external view

From any other computer connected to one of these three networks, you can see and connect to all other computers on that network, whether they are inside an All-In-One or on dedicated servers.

Gea
 
Gea,

You're a gentleman and a scholar - exactly what I needed to know.

I had purchased an SR-2 for a new ESX monster.

I now have something to do with the vast number of unused PCI-E slots.

ZFS, say hello to OCZ PCI-E SSDs - I will post benchmarks when I am finished with the build.
 
My current file server is a Supermicro that has the Intel 3210 chipset, and I have a pair of Supermicro AOC-SAT2-MV8 SATA controllers. How well will these play with something like Solaris Express 11 in an ESXi environment?
 
Here is the build I am planning. I noticed in your builds that you didn't use any SAS expanders - anything to be concerned about when using SAS expanders? I thought I could use the HP SAS expander to easily get support for 24 drives with dual-linking support, or should I just buy additional controllers?

Motherboard: X8DTi-F
CPU: 2 x Intel Xeon E5620
Memory: 48GB - 6 x 8GB - KVR1333D3D4R9S
NIC: Intel dual-port Gig-E (iSCSI VLAN)
SAS Controller: Intel RS2WC080
SAS Expander: HP SAS Expander
Case: Norco 4244
Data Disks: 5 x Hitachi 2TB 7K3000 (Backup pool)
Data Disks: 4 x Hitachi 2TB 7K3000 (ESXi pool)
Cache Disk: OCZ PCI SSD 120GB
Power Supply: Corsair AX750

I am going to go with the All-In-One approach and create a VM with 24GB memory and OpenIndiana. Then I will make a pool with two sets of mirrors for my VM guests and another pool in raidz for doing backups. The SSD will be allocated to the ESXi pool as a write-cache disk. For the backup pool I'll create SMB shares so users can back up their data there. As the demand for capacity grows I plan on just adding more vdevs to either pool until all 24 disk slots are used.
 
:mad:

If you haven't read up on the issues with dedup, google "ZFS dedup performance" and you'll see what I'm talking about. I thought I knew about the problems so I just made a duplicate copy using ZFS SEND and ZFS RECEIVE of about 1TB of data, with DEDUP turned on for the second copy.... it eventually slowed down to about 5MB/sec (from my normal performance of around 100MB/sec), so I thought... no problem, I'll delete the new copy and move on. Except I've been deleting this copy for the last eight hours, and it's still deleting. In the meantime my entire server is unusable.

There is a story on the ZFS.DISCUSS mailing list of someone who lost their server for NINE DAYS doing a ZFS delete of dedup-ed data. Ouch.

Be very careful with dedup if you don't have a seriously hardcore machine (I'm talking 16GB RAM MINIMUM; discussions on the mailing list often involve 32... 64!!!). I'm using 6GB of RAM + a 32GB SSD L2ARC + a dual-core 3.6GHz CPU, and it obviously couldn't handle even 1TB of data.

Lesson learned, I hope I get my server back soon.

EDIT: DAILO, you might be able to run dedup with good performance :)
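
For reference, the kind of copy described above is just (dataset names are examples):
Code:
# dedup is a property of the receiving filesystem, not of the send stream
zfs create tank/copy
zfs set dedup=on tank/copy

# replicate a snapshot of the source into the deduped filesystem
zfs snapshot tank/data@dup-test
zfs send tank/data@dup-test | zfs receive tank/copy/data

# deleting tank/copy/data later forces a DDT update for every freed block -
# that is where the hours-long destroys come from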
 
My current file server is a Supermicro that has the Intel 3210 chipset, and I have a pair of Supermicro AOC-SAT2-MV8 SATA controllers. How well will these play with something like Solaris Express 11 in an ESXi environment?

I suppose it may work with Solaris Express (I have not tried it myself):
http://www.sun.com/bigadmin/hcl/data/components/details/3009.html

If you want to virtualize it within ESXi with pass-through, you need vt-d
(may be included with your chipset) and you have to see whether you get trouble
with your PCI-X MV8 cards.

You have a chance - try it.



Gea
 
Here is the build I am planning. I noticed in your builds that you didn't use any SAS expanders - anything to be concerned about when using SAS expanders? I thought I could use the HP SAS expander to easily get support for 24 drives with dual-linking support, or should I just buy additional controllers?

Motherboard: X8DTi-F
CPU: 2 x Intel Xeon E5620
Memory: 48GB - 6 x 8GB - KVR1333D3D4R9S
NIC: Intel dual-port Gig-E (iSCSI VLAN)
SAS Controller: Intel RS2WC080
SAS Expander: HP SAS Expander
Case: Norco 4244
Data Disks: 5 x Hitachi 2TB 7K3000 (Backup pool)
Data Disks: 4 x Hitachi 2TB 7K3000 (ESXi pool)
Cache Disk: OCZ PCI SSD 120GB
Power Supply: Corsair AX750

I am going to go with the All-In-One approach and create a VM with 24GB memory and OpenIndiana. Then I will make a pool with two sets of mirrors for my VM guests and another pool in raidz for doing backups. The SSD will be allocated to the ESXi pool as a write-cache disk. For the backup pool I'll create SMB shares so users can back up their data there. As the demand for capacity grows I plan on just adding more vdevs to either pool until all 24 disk slots are used.

about expanders:
Expanders are good if you need them and you use only hardware that works well together.
But it's not always trouble-free. The HP Expander thread here is one of the largest ever - have a look at it.

If you can avoid an expander - avoid it.
If you use 3 x 8-port SAS adapters, it's not more expensive than an expander solution, but more trouble-free
and faster. Possibly use 3 x 1068E adapters instead; your drives do not need SAS2 adapters.
You may use SAS1 - they are cheaper and easier to use (they show the drive id instead of port-independent disk WWNs,
which are hard to map to a physical drive). See http://www.nexenta.org/boards/1/topics/823

If you use 3 x SAS adapters, you still have one slot left for a further 10 GbE card.
Or use 1 x LSI 2008 and 2 x LSI 1068 if you want to have a SAS2 option.

read also:
http://www.nexenta.org/issues/214

From Garrett D'Amore (illumos) about a problem with LSI SAS Expanders:
"So, our investigation shows that the most common source of these problems is due to a problem in the expander itself -- scrub and resilver operations particularly trigger the problem, which appears to be due to excessive amounts of small synchronous I/O operations (lots of cache syncs).
Best advice: get rid of SAS/SATA expanders and move to pure SAS or pure SATA."



Gea
 
:mad:

If you haven't read up on the issues with dedup, google "ZFS dedup performance" and you'll see what I'm talking about. I thought I knew about the problems so I just made a duplicate copy using ZFS SEND and ZFS RECEIVE of about 1TB of data, with DEDUP turned on for the second copy.... it eventually slowed down to about 5MB/sec (from my normal performance of around 100MB/sec), so I thought... no problem, I'll delete the new copy and move on. Except I've been deleting this copy for the last eight hours, and it's still deleting. In the meantime my entire server is unusable.

There is a story on the ZFS.DISCUSS mailing list of someone who lost their server for NINE DAYS doing a ZFS delete of dedup-ed data. Ouch.

Be very careful with dedup if you don't have a seriously hardcore machine (I'm talking 16GB RAM MINIMUM; discussions on the mailing list often involve 32... 64!!!). I'm using 6GB of RAM + a 32GB SSD L2ARC + a dual-core 3.6GHz CPU, and it obviously couldn't handle even 1TB of data.

Lesson learned, I hope I get my server back soon.

EDIT: DAILO, you might be able to run dedup with good performance :)

Yes, dedup can become really slow, especially when deleting snaps or ZFS folders.
But remember: block-based deduplication is a high-end feature, not a feature for low-end servers.

Most of the absolutely worst stories are from the first developer releases that supported dedup, or from systems with low RAM.
With current versions like OpenIndiana or Solaris Express, I would say it is usable if:

- you really, really need it (it always costs performance)
- you have an ADDITIONAL 2.5 GB RAM per TB of data
- you also add an SSD read-cache drive

otherwise the deduplication tables have to be fetched from disk
- the reason why it can take days to delete a snap.

better way:
-> disks are cheap, buy more disks

Gea
 
Thanks, I'll stick with just buying adapters. I don't want to deal with any problems, so it sounds like going with a few LSI 1068s is a better idea. The only reason I was going to go with the 2008 is because the 7K3000 drives support a 6 Gb/s interface, but to your point I don't know if the drives are fast enough to really get any benefit.

If you create a single write-cache disk and that disk dies, what happens? I did some Google searching and found some articles that said if this happens I would lose the entire pool or I wouldn't be able to import the pool again. I'll probably end up doing a write-cache mirror instead, but was just curious what would happen.
 

about write-cache (ZIL log device) failures

I tried it with NexentaCore (ZFS v26).
The pool was accessible and working after the failure, but I can't import such a pool.
I suppose it's the same with OpenIndiana and Solaris Express 11.

So it's potentially dangerous (I use ZFS because of its data security).
I would only use mirrored write-cache drives.


For myself, I do not use SSD write caches any more.

In those cases (like NFS storage for ESXi or databases) where a write cache for
synchronous writes would be a good thing, I also want good sequential transfer rates
and short access times. In these cases, pools do not need to be very large, so
I use SSD-only pools. (Far faster than disk pools with cache drives -
not in sequential rates, but in IOPS and access time.)

On pools for backup or media servers, a write cache does not help, but a single 50 GB+ read cache is a nice thing, and it can fail without problems.


Gea
 
Thanks Gea! Unfortunately, my ESX pools need to be large and I doubt I'll have that much budget for a TB of SSD. My current ESX RAID10 datastore is already at 2TB with thin provisioning enabled on every VM. My current write times with a 4-disk RAID10 on my Areca 1680 are decent without any sort of caching (I guess a little by the controller), so I may try it without the SSD first and add it later if performance sucks. I forgot about the reads on the backup pool - that's a great idea :).

I'm pretty excited to build out a few of these machines, just hope I can get the budget for it!
 
Hi Gea. I have tried installing napp-it on both OpenIndiana and Solaris Express 11. In both cases the install basically worked, but some commands were not set up correctly. This seems to be due to the fact that vanilla installs of both OSes do not have any compilers loaded :( I know my way around Linux and FreeBSD, but have pretty much zero Solaris knowledge. Can you clue me in on how to fix this? Thanks!
 

Maybe you will find something at:

http://gcc.gnu.org/install/binaries.html
http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html
http://www.c0t0d0s0.org/pages/lksfbook.html
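
If it is really just a missing gcc, the IPS package tools may already be enough (package names differ between OpenIndiana and Solaris Express, so check first - the package name below is only an example):
Code:
# find out how the compiler package is named on your release
pkg search -r gcc

# then install it (example name only - use whatever the search returned)
pkg install developer/gcc-3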


Gea
 
Hm, does sound really interesting, but downloading the document does not work for me:
Code:
Forbidden
You don't have permission to access /lksfbook/lksfbook_21022010.pdf on this server.
Apache/2.2.9 (Debian) DAV/2 SVN/1.5.1 proxy_html/3.0.0 mod_ssl/2.2.9 OpenSSL/0.9.8g Server at www.c0t0d0s0.org Port 80
Anyone else having success?

TLB
 
Are you running the install as the root user? When I installed napp-it on Solaris 11 and OpenIndiana, the installer automatically downloaded and installed the necessary compilers for me. It might help if you post some of the error messages.
 