production vm storage?

what do you guys use for production vm storage?

i mean explicitly not labs / home setups, but something you roll out in an office you don't own yourself :)

would you put into production some free solution with allowed commercial use (no eula violation!) but with community-only / limited vendor support?

what would be a game changer for you? say, a freenas (free) -> truenas (paid) upgrade?

tnx!! :)
 
EMC, Hitachi and NetApp.
I would absolutely not run production systems on solutions with limited support. When something dies in the middle of the night, or a firmware bug wipes the array... I want someone to call.
 
EMC VMAX (with VPLEX) and Isilon at the HQ, VNXe at branch locations.

Not all "production" is created equal. It all has to make sense for its intended use. At my old job we ran a lot of JetStor arrays because they worked just fine, and we mitigated the risk that comes with JetStor's support model being different from EMC's.

If it's a small business then you may get away with "free" if you control the risk by having appropriate backups and a Plan B when things go south. The question is always "how much money will be lost when the storage is down?"

Personally, I would protect myself by either buying a solution that has commercial support, using a managed services vendor, or contracting with some local company that can provide support on a "pay as you go" basis should things go south.

Commercial use without some level of support is most certainly going to end in a resume-generating event.
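
To put rough numbers on the "how much money will be lost" question, here's a back-of-the-envelope sketch - every figure in it is a made-up assumption, so plug in your own:

```python
# Toy comparison: DIY (community-supported) storage vs. paid support.
# All numbers are illustrative assumptions, not vendor data.

def expected_annual_downtime_cost(outages_per_year, hours_per_outage, cost_per_hour):
    """Expected yearly loss from storage outages."""
    return outages_per_year * hours_per_outage * cost_per_hour

# Assumption: the unsupported box fails twice a year and takes 8 hours
# to bring back on your own; with vendor support, the same failures
# resolve in 2 hours.
diy = expected_annual_downtime_cost(2, 8, cost_per_hour=5_000)
supported = expected_annual_downtime_cost(2, 2, cost_per_hour=5_000)

support_contract = 15_000  # assumed yearly cost of the support contract
print(f"DIY: ${diy:,.0f}/yr vs supported: ${supported + support_contract:,.0f}/yr")
```

If the first number dwarfs the second, the support contract pays for itself; if not, "free plus a solid Plan B" may be defensible.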
 
Tintri!

But in all honesty, if you asked me to build a "perfect" datacenter:

Tintri (VMs)
Nimble (anything physical / old fashioned clustering)
Tegile (ZFS for file shares).

There is no way on God's green earth, or any other planet, that I'm putting a solution of ANY kind into production that doesn't have real support behind it, and an enterprise solution at that. I KNOW what happens from the vendor side when something goes fundamentally wrong with an unsupported solution, and I also know precisely what happens when something goes fundamentally wrong with an "enterprise solution."

It has nothing to do with there not being a chance of a critical error wiping everything out on an enterprise solution - it's what those enterprise companies (be it EMC, Nimble, Tintri, or any other) can bring to bear, and force the OTHER vendors to bring to bear, in order to solve the problem, that makes the difference. Same reason that I'm hesitant to roll white-box builds into critical production, even though they're built on great boards - just watch how fast, and how many, engineers Dell or HP can bring to bear on a problem with an Intel or AMD chip versus how many your individual manufacturer can, and what kind of resources (and ATTENTION) they can command. It's a different world between the two; billions of dollars of difference, to be precise.

Disclaimer: I worked for VMware for 7 years as a Staff Engineer/Sr. Architect, and was the one involved in many issues between smaller vendors and bigger ones, as well as the big vs. big. I now work for Tintri.
 
I'm a consultant so nothing! :)

If I had to build a datacenter, I'd be looking at EMC, Hitachi, Tintri, Pure, and Nimble

I'd also echo what lopo said about free software solutions with paid support. Nope!
 
Dell EqualLogic on 10Gb fiber. We are not a large shop by any means, or we would be in a higher-bracket storage model. The support is great, which is what I like, and like many of those mentioned above, we have a dedicated manager that can get us an engineer in under an hour.
 
The biggest part of this discussion is: what are you going to be running in production, and what features do you need?

NetApp served us well for a while, but we run many different types of VMs, from large Oracle databases to websites to VDI-ish workloads. Eventually the NetApp couldn't keep up with the workload. And with multiple RAID groups in a single aggregate, and the fact that performance degrades as the aggregate exceeds 80-85% capacity, it just is not appealing to me. No inline deduplication either, so as aggregates grew, and with Veeam running backups, it was getting increasingly difficult for deduplication to complete overnight.
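
For what it's worth, the usual mitigation is alerting well before that watermark; a trivial sketch (the aggregate names and numbers here are invented - a real version would pull them from the filer rather than hardcode them):

```python
# Toy capacity-watermark check; values are invented placeholders.
AGGREGATES = {"aggr0": (14.2, 16.0), "aggr1": (9.1, 16.0)}  # used TB, total TB
WATERMARK = 0.80  # flag before the 80-85% zone where performance falls off

for name, (used, total) in AGGREGATES.items():
    pct = used / total
    note = "  <-- plan to grow or migrate" if pct >= WATERMARK else ""
    print(f"{name}: {pct:.0%} full{note}")
```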

It is unfortunate that so many people are hung up on the big storage vendors. Although they are at the top of Gartner's list, that is for enterprise storage. They were in use for years and years before virtualization got huge. The tide has turned, and now storage needs to focus on working well with virtualization. EMC, NetApp, etc. just bolt new features onto what already exists, trying to make their product better. Tintri was designed for VMs. Use the right tool for the job. But again: what are you running in production, what features do you need, and how much capacity?
 
And that's NOT to say that the enterprise folks don't have their place. I love NetApp - if you wanted to swap NetApp in for Tegile in my example, or use it for physical systems, I'd be perfectly fine with that. Same for EMC - the VNX does file shares decently well, and the fibre connections are good for physical as well. I don't like running VMs off of either much anymore, given that they're both pretty limited in how they scale and manage things there, but that's a different discussion.

That being said, if you asked me for a setup to run 5,000 servers of mixed kinds (ESX/physical/whatever), I'm going to point you at a VMAX and that's that - but you pay in terms of management, cost, and simplicity for a solution like that as well. In fact, I have several customers looking to migrate their VMs off of VMAX to something else and keep the VMAX for their large Oracle deployments, for that exact reason. VMs let you be REALLY dynamic, and a traditional array may not be able to keep up with that.
 
That being said, if you asked me for a setup to run 5,000 servers of mixed kinds (ESX/physical/whatever), I'm going to point you at a VMAX and that's that - but you pay in terms of management, cost, and simplicity for a solution like that as well. In fact, I have several customers looking to migrate their VMs off of VMAX to something else and keep the VMAX for their large Oracle deployments, for that exact reason. VMs let you be REALLY dynamic, and a traditional array may not be able to keep up with that.

If it weren't for Oracle's pricing model, you would probably see a lot more of it being moved to virtualization. We have moved several servers with large Oracle databases to VMs, and it makes managing and troubleshooting much easier. Now I only wish I could move those DBs off of NetApp and onto Tintri, especially to take advantage of the QoS.
 
what do you guys use for production vm storage?

i mean explicitly not labs / home setups, but something you roll out in an office you don't own yourself :)

would you put into production some free solution with allowed commercial use (no eula violation!) but with community-only / limited vendor support?

what would be a game changer for you? say, a freenas (free) -> truenas (paid) upgrade?

tnx!! :)

It all depends on your budget, your knowledge, how high-end it needs to be, and the allowed downtime.

With enough money you can buy NetApp or similar and you are ok.
If something happens with your storage, there is always someone you can call and shift responsibility to.

But as you ask about open-source ZFS solutions based on BSD or on a free Solaris fork, it seems that budget, vendor lock-in, or flexibility is a concern.

Yes, you can use open-source solutions in a production environment with some basic knowledge of what you should and should not do, plus a good hardware provider with hardware support. If you look around, you will always find the same suggested hardware for a general storage system; it does not matter whether one uses ZFS with any BSD, NexentaStor or any other free Solaris fork, or a Linux distribution. The common choice is ZFS, as it offers production quality similar to high-end solutions in the open-source area. You can add software/OS support when needed (e.g. NexentaStor, OmniOS, Oracle Solaris, TrueNAS).

What I would recommend: do not use anything that is not widely used - stick to hardware like Dell or Supermicro with LSI HBAs and Intel NICs, from a local provider that you can trust. Order at least one backup and one spare system where you can move your disks in case of problems.

An average high-quality storage server for most use cases is nothing special: just standard hardware and standard software, combined with ZFS.
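
To illustrate how un-special it is: a minimal sketch of bringing up a mirrored pool with an NFS export using the standard ZFS tools (run as root; the pool name, dataset name, and disk device IDs are placeholders - check your own device IDs first):

```python
# Minimal sketch: mirrored ZFS pool plus an NFS export, driven through
# the standard CLI tools. Disk device names are placeholders.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Two mirrored vdevs - the usual "general storage" layout.
run(["zpool", "create", "tank",
     "mirror", "c1t0d0", "c1t1d0",
     "mirror", "c1t2d0", "c1t3d0"])
run(["zfs", "create", "tank/vmstore"])
run(["zfs", "set", "compression=lz4", "tank/vmstore"])
run(["zfs", "set", "sharenfs=on", "tank/vmstore"])  # Solaris/illumos-style NFS export
```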
 
Currently using NetApp. So long as it's NetApp NAS, it's not too bad. NetApp FCP has not worked so well for us. Every OS has issues during filer failovers. Those "non-disruptive upgrades" are almost never non-disruptive on any FCP filer. NetApp has yet to come up with a combination of configuration, drivers, firmware, patching, and software that results in consistent success with NDU processes over FCP.

Makes me dream of the days of high-end EMC arrays, where our problems were few and far between.
 
Currently using NetApp. So long as it's NetApp NAS, it's not too bad. NetApp FCP has not worked so well for us. Every OS has issues during filer failovers. Those "non-disruptive upgrades" are almost never non-disruptive on any FCP filer. NetApp has yet to come up with a combination of configuration, drivers, firmware, patching, and software that results in consistent success with NDU processes over FCP.

Makes me dream of the days of high-end EMC arrays, where our problems were few and far between.

Crazy. I run NetApp, and FCP on it has been the most stable through any of my NDUs or failovers. Usually the blip in NFS will cause a VM or two to bomb out, while my fiber-attached systems truck right along.
 
Currently using NetApp. So long as it's NetApp NAS, it's not too bad. NetApp FCP has not worked so well for us. Every OS has issues during filer failovers. Those "non-disruptive upgrades" are almost never non-disruptive on any FCP filer. NetApp has yet to come up with a combination of configuration, drivers, firmware, patching, and software that results in consistent success with NDU processes over FCP.

Makes me dream of the days of high-end EMC arrays, where our problems were few and far between.

Crazy. I run NetApp, and FCP on it has been the most stable through any of my NDUs or failovers. Usually the blip in NFS will cause a VM or two to bomb out, while my fiber-attached systems truck right along.

You're doing it wrong :p

Although NetApp's FCP implementation, to put it simply, blows goats.
 
We run NFS on our NetApps as the backend for ESXi and don't have any issues.
 
Tintri!

But in all honesty, if you asked me to build a "perfect" datacenter:

Tintri (VMs)
Nimble (anything physical / old fashioned clustering)
Tegile (ZFS for file shares).

If only any of this would fit behind a VPLEX Metro.

Active/active datacenters and/or failover is where the above just can't compete with the EMCs of this world. That's neither good nor bad, it's just different.
 
For my standard workload that deduplicates well - Nimble.
For large scale storage/mission critical apps - Hitachi VSP
For ultra high performance workloads (VDI, data analytics) - Pure
 
If only any of this would fit behind a VPLEX Metro.

Active/active datacenters and/or failover is where the above just can't compete with the EMCs of this world. That's neither good nor bad, it's just different.

The number of customers that need a true active/active setup is very small - most would be fine with 10-15 seconds of RPO, and that can be done in software. :)
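
A minimal sketch of what "in software" can look like with plain ZFS - incremental snapshot shipping on a short interval (the host, pool, and dataset names are placeholders, and a real job would also prune snapshots on the target and cope with a broken chain):

```python
# Crude ~15-second-RPO replication loop via incremental zfs send/recv.
# "backuphost" and the dataset names are placeholders.
import subprocess
import time

DATASET, TARGET, INTERVAL = "tank/vms", "root@backuphost", 15

def snap(i):
    return f"{DATASET}@repl-{i}"

# Seed the target with a full copy of the first snapshot.
subprocess.run(["zfs", "snapshot", snap(0)], check=True)
subprocess.run(f"zfs send {snap(0)} | ssh {TARGET} zfs recv -F tank/vms-copy",
               shell=True, check=True)

i = 0
while True:
    time.sleep(INTERVAL)
    i += 1
    subprocess.run(["zfs", "snapshot", snap(i)], check=True)
    # Ship only the delta since the previous snapshot.
    subprocess.run(f"zfs send -i {snap(i - 1)} {snap(i)} "
                   f"| ssh {TARGET} zfs recv -F tank/vms-copy",
                   shell=True, check=True)
    subprocess.run(["zfs", "destroy", snap(i - 1)], check=True)  # keep the chain short
```

Worst case you lose whatever changed in the last interval - which is exactly the 10-15 second RPO argument.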
 
The number of customers that need a true active/active setup is very small - most would be fine with 10-15 seconds of RPO, and that can be done in software. :)

Yeah, I have never seen a place that is active/active like that. I can only imagine what it costs. I guess that means you have to have the same amount of resources on both sides?

Usually I see BC set up for a small number of apps, and a lot of DR. Although with all the new technologies, a BC setup with a low RPO like you mentioned is becoming more common. Still, the recovery site usually has much-reduced capacity compared to all the production servers.
 
The number of customers that need a true active/active setup is very small - most would be fine with 10-15 seconds of RPO, and that can be done in software. :)

Hey, I am with you, but try telling a healthcare provider that.

Yeah, I have never seen a place that is active/active like that. I can only imagine what it costs. I guess that means you have to have the same amount of resources on both sides?

Yup, not just storage but compute as well.

2N is actually a lot less complex than N+something or <2N. The hardware costs more, but the lower complexity makes up for it.
 
Except the stretched networking, HA configuration, and all the SAN side (ugh). It's amazing when it works, but talk about a time sink.
 
thanks to everybody for your feedback, i really appreciate your efforts

here's the thing

i work for a major telecom operator but the IT budget is being cut

billing and processing run on emc, corporate files belong to netapp, customers & their profiles are safe on oracle and oracle zfs storage. so far so good

we have departments with individual budgets, branch offices, customer care centers, tiny datacenters doing call logging for retransmission towers etc etc etc

the company outsources non-critical IT to a consulting company, and these guys offered to put some free storage solution into production for branch offices and depts, with the consulting company supporting it instead of a storage vendor

i'm a bit scared of this business model and i don't like the idea of putting something into production without support and the ability to reach a vendor

do you see this as a show-stopper?
 
i'm a bit scared of this business model and i don't like the idea of putting something into production without support and the ability to reach a vendor

do you see this as a show-stopper?

We have a tier of storage that is home-brew OpenIndiana ZFS; it's cheap and fills a need. We also have a VNX, EQL, and NetApps in our storage tiers.

What I'm getting at is: if it fits the need and the business model, then in the grand scheme of things you still have somebody to yell at when it fails. So it's not all that different from having a storage vendor.
 
Pretty much this. EMR downtime is life and death, literally.

Also, some of the financial customers. Brokerages and market makers lose literally millions of dollars every minute they are down. In some countries, ANY outage requires a major dog-and-pony show for the government bureaucracy overseeing their sector.
 
the company outsources non-critical IT to a consulting company, and these guys offered to put some free storage solution into production for branch offices and depts, with the consulting company supporting it instead of a storage vendor

do you see this as a show-stopper?

Depends on the use case. Branch office for non-production or non-critical production? Eh, I wouldn't be happy, but it'd fly. Mass backups? Have at it. Object store? Sure thing. Production tier 3+ - not a chance in hades.
 
who does support for your zfs setup?

We have a tier of storage that is home-brew OpenIndiana ZFS; it's cheap and fills a need. We also have a VNX, EQL, and NetApps in our storage tiers.

What I'm getting at is: if it fits the need and the business model, then in the grand scheme of things you still have somebody to yell at when it fails. So it's not all that different from having a storage vendor.
 
branch office and main office

both production

non mission-critical

is it ok if a contractor installs a free solution without vendor support and provides the support for it themselves?

Depends on the use case. Branch office for non-production or non-critical production? Eh, I wouldn't be happy, but it'd fly. Mass backups? Have at it. Object store? Sure thing. Production tier 3+ - not a chance in hades.
 
who does support for your zfs setup?

So for background: I work for a largish research university, so we have a group that maintains the HPC clusters. They also maintain the ZFS stuff, though they are part of a storage "functional group" that maintains and sets the direction for all storage on campus.
 
What I can say about professional ZFS napp-it/OmniOS/Solaris users (free or with OS support):

- many are from the university/edu area
- many are web providers or media producers
- many are medium-size enterprises with an IT division
- some are system houses/resellers
- some are "large data", even .gov

besides the SoHo users
 
branch office and main office

both production

non mission-critical

is it ok if a contractor installs a free solution without vendor support and provides the support for it themselves?

How bad will it be if it goes down and they just shrug and say "well, that sucks"?

Decide based on that. If the result is an RGE of some kind, then don't do it. If it's a "well, that's mildly inconvenient" - then maybe. It all depends on whether your job (or someone else's) relies on those systems or not.
 
How bad will it be if it goes down and they just shrug and say "well, that sucks"?

Decide based on that. If the result is an RGE of some kind, then don't do it. If it's a "well, that's mildly inconvenient" - then maybe. It all depends on whether your job (or someone else's) relies on those systems or not.

I should also mention that our ZFS tier is "best effort" storage. It's cheap, and ZFS protects the data integrity.

But if you absolutely can't lose it, we are going to point you at the NetApps or something else. Most of it is the working set of research data; if it's down or gone, it's just a "well, that sucks, copy it back."
 
nobody will die

cell phones would re-register with another nearby tower, so customers will see no denial of service and we're not going to lose money from cancelled billing

but multiple outages may trigger higher load on the whole network

customer care depts will survive as well, we'll just grow some new users for our competitors

and so on

ok, let me re-phrase my question

when is it worth going with a self-supported storage solution, assuming it's still not home / lab use?

How bad will it be if it goes down and they just shrug and say "well, that sucks"?

Decide based on that. If the result is an RGE of some kind, then don't do it. If it's a "well, that's mildly inconvenient" - then maybe. It all depends on whether your job (or someone else's) relies on those systems or not.
 
ok, let me re-phrase my question
when is it worth going with a self-supported storage solution, assuming it's still not home / lab use?

Assuming you use professional-grade standard storage hardware, e.g. Dell, HP or the ZFS champion Supermicro, in the usually suggested default configurations with the usual hardware support, you should (in case of ZFS):

- prefer a solution similar to configurations with full support (e.g. a certified NexentaStor)
- avoid "the newest"; use what others have been using for some time
- avoid complex configurations (e.g. HA setups)
- use an expander + SAS, or discrete LSI HBAs + SATA
- use it for basic NAS/SAN setups like NFS, CIFS or FC/iSCSI

As you cannot expect hardware replacements within a short time when something fails:
- have spare parts like disks
- have a second spare system with empty bays where you can move disks and import the pool
- have a backup system with replication

AND
- have some knowledge of your OS of choice (does not matter if Unix or Windows)
- read the mailing list for your solution (e.g. the OmniOS-discuss list plus the OmniOS IRC channel, when using OmniOS) to stay informed about known problems and to know where you can ask
- think about OS support if available

- know the actions to regain service when anything fails, from a single disk to the whole system (see the sketch below for the spare-system case)
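
For the move-disks-to-a-spare-chassis case, regaining service is essentially one command; a small sketch (the pool name "tank" is an example):

```python
# Regain service after moving the disks into the spare chassis.
# "tank" is an example pool name.
import subprocess

# Show pools visible on the newly attached disks (non-fatal if none found).
subprocess.run(["zpool", "import"], check=False)

# -f is needed when the pool was not cleanly exported on the dead system.
subprocess.run(["zpool", "import", "-f", "tank"], check=True)
subprocess.run(["zpool", "status", "tank"], check=True)  # verify vdev health
```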

If you can live with these restrictions, you get enterprise-grade storage hardware with an enterprise-grade free storage OS like the Solaris fork OmniOS or a BSD variant, at a fraction of the cost of similar systems with full support. For example: 1-2 terabytes of high-speed, high-IOPS NFS storage with 10GbE and Intel enterprise SSDs from about 3-4k Euro/$, with dual PSUs and a 26-bay 2.5" backplane that is expandable up to, say, 30-40TB of raw pure SSD.

But as said - no support besides your own (besides hardware replacements under warranty). The problem is not the server; that is not space technology, just a 0/8/15 (run-of-the-mill) server with some disks. It's the knowledge and the capability to regain service in case of problems.

Whether it's worth it - the answer can be yes or no.
 
That is not space technology, just a 0/8/15 (run-of-the-mill) server with some disks.

"Nullachtfufzn'" - now that one makes me really homesick! :)

I, naturally, agree with Gea. Just protect yourself from the risk by having spares on site, and have some in-house knowledge of how all of this works so you aren't left out in the cold if/when your consultant decides they don't want to work your outage.
 