My ZFS / Nexenta HA cluster build

stevebaynet's the man, he can set you straight on the slotmap issue.

Zeus DRAM, SLC, or MLC SSD? Which one?

Who is your supplier?
 
I'd like to think the lines between home and data center are blurred nowadays, with how cheap you can get certain parts, plus open source. I'd love to set something up like this for my home media server, as I have seen others on here do; I'd just have to get some different parts, obviously, as all this stuff is LOUD, heh.

As for generating a slotmap: you can have Nexenta do it for you (sans the fancy image, unless you are using a supported JBOD) by logging into the console and running:

Code:
setup lun slotmap

It will then blink each drive and ask you which slot it's in. Very handy when trying to locate a failed drive. This is in the manual, but it essentially just says to run the above command.

If you are running some sort of small Norco box or something similar, you can then just snap a pic of the front, add some #'s next to each bay, and upload it to /var/lib/nza/slotmap.jpg; it will show above the slotmap grid/text.
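If it helps, here's roughly what getting the picture onto the box looks like (a minimal sketch assuming you have SSH access to the appliance; the host name below is made up, the path is the one above):

Code:
# from your workstation; "mynexenta" is a hypothetical host name
scp chassis-front.jpg root@mynexenta:/var/lib/nza/slotmap.jpg

Reload the slotmap page in NMV afterwards and the image should appear above the grid.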
 
Noob ZFS question (I was playing around in NMV)

let's say you have 1 head + 2 JBODs
JBOD1: 8 x 1TB SAS 2.5"
JBOD2: 8 x 2TB SAS 3.5"
Since we have 2 different types of drives, I assume (best practice) they should not be in the same volume?
With my "noob" thinking, I had convinced myself that you could "share" ZIL and L2ARC between both JBODs, but it looks like you need one set of ZIL & L2ARC devices for JBOD1 and another set for JBOD2. Is that correct?
 
ZIL and L2ARC are on a per-pool basis.

So if both JBODs are in the same pool, then no, you don't need one set for each JBOD.
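In zpool terms it looks something like this (a sketch; pool and device names below are made up). Log and cache devices hang off the pool itself, not off a JBOD:

Code:
zpool add tank log c4t0d0     # SLOG (ZIL) device, serves the whole pool
zpool add tank cache c4t1d0   # L2ARC device, serves the whole pool
zpool status tank             # both show up once, under the single pool

The devices can physically live in whichever enclosure is convenient; ZFS only cares about the pool.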

 
Different drive types in a vdev, no; but a pool can be made up of several vdevs ;)

E.g. a pool made up of, say:

vdev: raidz of 6 x 1TB drives
vdev: raidz of 6 x 500GB drives
vdev: raidz of 6 x 1.5TB drives

making a pool of roughly 15TB usable (each 6-drive raidz gives 5 drives' worth of space: 5 + 2.5 + 7.5 TB).

ZFS will decide how to lay the stripes across the lot, with the ZIL and L2ARC serving the whole pool.

Later you can swap out the 500s for larger drives when money allows and set the autoexpand option on to grow the pool.
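A rough sketch of that layout and the later swap (device names are made up):

Code:
# one raidz vdev per drive size, all in the same pool
zpool create tank \
    raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
    raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0

# later, to grow the pool: turn on autoexpand, then replace the small
# drives one at a time (letting each resilver finish between swaps)
zpool set autoexpand=on tank
zpool replace tank c2t0d0     # repeat per 500GB drive after the physical swap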

 
Yep, that makes sense. Thanks for the info.

Just realized my design has a slight flaw: 2 of my JBODs have 6Gb/s drives, while the 3rd JBOD has 3Gb/s drives.
 
Nada as of right now, just the CrystalDiskMark tests I ran on a few VMs connected via NFS (a page or two ago in this thread). Results were good enough for me to continue without worrying too much about that. This week I will be setting up the HA part. Then I will load this thing up with a ton of VMs (150, give or take) and bang away at it and see what happens. That part is probably a few weeks out.
 
How is the "clustering" coming along?

Did you receive your ZIL device (OCZ Talos)?
 
Hello,

I've been reading this topic with much interest and would like to hear if any updates are available.

Thank you.
 
Project got put on hold, but I anticipate starting it back up again in the next few weeks. I paused right before the clustering part, so that is where things are headed next.

The Talos drives will be for L2ARC; haven't ordered them yet, but I will do so in the next few weeks (unless there is something new/better out while this stuff was paused).

 
Got sucked in by other projects, so nothing new to report, but I expect to start moving forward again in a few weeks and will post updates then.

 
Thanks Steve!

I'm using an SC847 and I just got that slotmap uploaded and working in about 2 minutes; I can't believe how easy that was! No restart of any kind needed!
 
to op:

What backplane configuration did you get with your JBOD chassis?

What type of cables are you using to connect to the head units?

Would you be interested in consulting for a fee to help me build our first unit?
 

Sorry for the lag; I am just getting killed with projects lately, so this is still on the back burner for a few more weeks.

To answer your questions (better late than never, I hope):

Backplanes: We have the E26 version (dual expander chips), so that means we're using the BPN-SAS2-846EL2 and BPN-SAS2-847EL2 backplanes (see the sketch below for a quick multipath sanity check).

Cables: We are using standard SFF-8088 to SFF-8088 (SMC PN: CBL-0166) to connect to the head unit.

Consulting: Assuming you still even need help, I'm just too swamped, but I'm happy to answer questions here, plus it's free :eek:) I'm finally subscribing to this thread (can't believe I didn't do that earlier), so when someone posts it won't take months for me to notice.
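One thing worth checking with the dual-expander backplanes, if you end up cabling both expander paths to the same head: with MPxIO enabled, each disk should show up once, with two paths. A quick sanity check from a root shell on the head (the device name below is made up):

Code:
mpathadm list lu                                     # one entry per disk, not two
mpathadm show lu /dev/rdsk/c0t5000C500035A1234d0s2   # look for "Total Path Count: 2"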

Will post more info in a few weeks when I start reviving this (need to update Nexenta to the current version, set up the cluster, and add the new L2ARC disks).
 
stevebaynet, what's been happening? Hope all is well (aside from being busy at work).

How is the Nexenta project coming along?

Any progress with clustering?
 

Things have gone well; this project got shelved for several months due to other things in the pipeline getting bumped up.

Recently got the HA cluster up. I initially tried to do it myself, to no avail: I followed the HA cluster installation/setup guide and tried both the command-line and GUI versions, and both errored out during the process.

However, this was no biggie. When you buy the HA plugin, you also pretty much have to pay for the professional services to install it and get it working, so I scheduled Nexenta support for the help and they came through fine.

Before the scheduled appointment, I cleaned out all the test pools I had created, to start with a clean slate. I also hooked up a serial cable between the two nodes for the heartbeat.

This is my HA setup status screen post-installation:

[screenshot: HA Cluster status screen]
In the next week we are going to be doing a lot of testing: things like forcing a failure to see how long it takes to move to the other node, whether it interrupts operations on the ESX hosts, etc.
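(For anyone following along: the HA plugin adds an rsf-cluster group you can query from NMC to watch the service move between nodes. I believe the command is as below, though the exact syntax may vary by version.)

Code:
show group rsf-cluster    # cluster/service state as seen from the current node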

More to follow.
 
Thanks for the honest update.

I remember asking Nexenta if clustering was as easy to set up as it is with Microsoft Failover Clustering; they said yes... most everyone else says otherwise.
 
Hey Steve,

Great post on your experience with Nexenta. I would love to hear how the HA worked out and learn about your experience with performance and scalability during your VM testing.
 
Thanks, it's been a long road (nothing to do with this project, just other stuff coming up).

I have HA all set up and running, and am slowly moving over dev/test VMs to see how perf is, how well failover works, etc. I'm pleased so far, but will post a more in-depth update soon.
 
steve,

I highly recommend you run Nexenta 3.1.4.1 on your cluster. Some very painful bugs/'features' were fixed in the latest version as it pertains to HA failovers. This is mostly applicable to high-memory systems, but I've seen volume failovers take over 10 minutes on 3.1.3.x. On the most current version you would have to work really hard, creating thousands of datasets (or zvols), for a failover to go beyond 90s. On my test boxes, which have anywhere from 48 to 128GB of RAM, failovers on the 3.1.4.x code take ~20s.

Note: I mean maintenance-window-type failovers here. Kernel panic events have always failed over quickly.
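(If you want to see where a box stands on that, here's a quick count of the datasets and zvols on a volume; "tank" is a placeholder name:)

Code:
zfs list -H -r -t filesystem,volume tank | wc -l    # rough dataset/zvol count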
 