Thoughts on VMware Virtual SAN

I'm not attacking the service level your employer provides. I'm simply trying to help guide where I can in correcting the issue, both immediately and long term. I completely understand the situation and it's a difficult one to be in. On one side you don't have the funds to correct the issue by adding hardware. On the other side, if you offer slower-than-promised service, you have paying customers leave. Something I just thought of that may be beneficial is to sell the lower-end hosts (the 44GB RAM ones, etc.) and purchase a couple of used C1100s w/ 128GB RAM from eBay. While not ideal or elegant, if you can sell the old hosts for 50-75% of the cost of a C1100 and get 3x the RAM and performance, that could be a win-win situation.

Nothing you said, peanuthead. But memory isn't exactly cheap with our setup. The servers are fully populated with 18 x 8 GB sticks for 144 GB. Ideally we would buy 18 new 16 GB sticks for about $3k.

But again, this is all stuff being worked on. Just need to figure out the best areas to address first, and people were helpful by pointing out the swapped memory usage.
 
Above all you most likely will see the most improvements if you add more RAM to existing servers rather than buy new servers, though you ought to check the CPU Ready values to see whether there is CPU contention as well.

But personally I think some of the comments here are uncalled for. Not all businesses have the money to pay for constant upgrades. Until the past year this company has been running in the red. The two co-founders don't even get paychecks and are very dedicated to their customers.

I have a tough time with companies that don't deliver what the customer pays for because I personally have been on the customer side of that more than once. In my opinion it's not OK to provide inadequate or less than paid for service (no matter how "dedicated" one is to the customer) because one doesn't have the money to upgrade to a level that provides the client with exactly what the client paid for.

The following is a value-free statement that applies to all companies: the first objective of a business is to be profitable. If a business is not profitable, then there's no reason to run it. Existing hardware has to be able to support existing customers. New customers should not be added if the existing hardware doesn't support the current customers, and likewise, existing customers shouldn't be able to expand if the current hardware doesn't support current needs.
 
Above all you most likely will see the most improvements if you add more RAM to existing servers rather than buy new servers, though you ought to check the CPU Ready values to see whether there is CPU contention as well.

CPU is not as bad, but we could definitely use more. We are going to do memory upgrades too, but being that we are so far stretched with memory as is, it is hard to take servers down to do memory upgrades while offloading on the rest of the cluster. That is why I would like two new servers and then work on the rest of the cluster.

Just to be clear, what kind of values are considered bad for CPU Ready Time say in milliseconds?
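For what it's worth, vCenter's real-time charts report CPU Ready as a summation in milliseconds over a 20-second sample interval, so it's usually converted to a percentage before judging it. A common rule of thumb (hedged, workloads vary): under roughly 5% per vCPU is generally fine, while sustained values around 10% or more suggest real contention. A minimal sketch of the conversion; the function name is mine:

```python
# Hypothetical helper: convert a vCenter CPU Ready "summation" value (ms)
# into a percentage of the sample interval. Real-time charts use 20 s samples.
def cpu_ready_percent(ready_ms, interval_seconds=20):
    """Percentage of the interval a vCPU spent ready to run but not scheduled."""
    return ready_ms / (interval_seconds * 1000) * 100

# e.g. 1000 ms of ready time within one 20 s real-time sample:
print(cpu_ready_percent(1000))  # 5.0 -> borderline; sustained ~10%+ per vCPU is trouble
```

Note the interval matters: the same millisecond figure pulled from a chart with a longer rollup interval (e.g. 5-minute historical stats) corresponds to a much lower percentage.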
 
I attempted to start a discussion on Twitter...should know better :rolleyes:, not enough characters...lol. Anyway..the suggested capacity of SSD for VSAN is 10% of the total storage for each node.

I was looking more into the Policies as I was taking the updated VSAN HoL @ PEX and was playing with the flash read cache reservation. This policy is meant to alleviate read cache misses on specific workloads (VMs) that you select by policy.

Since this is a reservation, it's dedicated to the VM's that are assigned that Storage Policy.

Now, I understand that for specific use cases where you know the workload you'll be running on VSAN (VDI, Hadoop, DR, etc.), a read cache reservation may not be needed, but let's look at the standard data center workload where there are all types of virtual machine workloads.

In this case we may not know what workload will have read cache hit misses until they are placed on VSAN, so we may have to adjust.

If I add a reservation to a VM where I want to try to make sure that 100% of the reads are handled in flash, AND I have followed the guideline for SSD sizing (10%), I have now taken SSD resources away from the shared pool and may have skewed the percentage.

My point is while I understand the guidelines for SSD Capacity, I do think that I would size SSD a bit more to accommodate for unknowns, like the possibility of needing additional capacity for a read cache reserve.
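To make the skew concrete, here's a back-of-the-envelope sketch of the argument above (the numbers, the function name, and the per-node framing are my own assumptions, not official sizing guidance):

```python
# Rough sketch of how a flash read cache reservation eats into the 10%
# SSD sizing guideline for a node. All figures are illustrative.
def effective_shared_cache(hdd_capacity_gb, ssd_ratio, reservations_gb):
    """SSD capacity left for the shared pool after per-VM reservations."""
    ssd_gb = hdd_capacity_gb * ssd_ratio     # sized per the 10% guideline
    return ssd_gb - sum(reservations_gb)     # what remains for everyone else

# 4 TB of HDD per node at 10% SSD = 400 GB of flash; reserving 150 GB for
# one VM leaves 250 GB of shared cache -- effectively ~6% of HDD, not 10%.
print(effective_shared_cache(4000, 0.10, [150]))  # 250.0
```

Which is exactly the case for over-provisioning the flash tier a bit if you expect to use reservations at all.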

Even if it's not used for that specific purpose, that's not to say it wouldn't be used at all, especially for things like the write buffer...so to me...it's still a win.

I have a VSAN best practice sizing class tomorrow...I'll pose the question to our proctors and see what comes out of it.

Until then, chime in...interested to hear your thoughts.
 
If you want 100% of reads to be guaranteed out of flash you put it on an all flash array. Think of VSAN like a Tintri hybrid array. Scale the spinning disks to handle cache misses. Don't over allocate dedicated cache.

You'll be amazed at how small the working sets of VMs are.
 
Yeah, that makes sense. However, when would I use that setting...never?
 
Ok...humor me...I have a very as-needed-basis VM...now I have a reservation...now I've skewed the 10%, right?
 
Ok...humor me...I have a very as-needed-basis VM...now I have a reservation...now I've skewed the 10%, right?

Most likely. Have you played with a Nutanix or Simplivity? Really good performance letting the system do what it does. Run it without adjusting anything first. Trust it. :)
 
What I am curious about is what happens when you fill your flash cache? Meaning what happens from an IO latency perspective? Conceivably you are in latency hell at that point?

Re: Nutanix
Looking forward to when they divorce their software from their hardware (which they are apparently working on). As a hardware appliance their solution is a lot less attractive imho.
 
Yeah, that makes sense. However, when would I use that setting...never?

VMware does not recommend changing that setting unless you have a demonstrable performance issue you are trying to resolve.
 
Yeah, I get that. My point is that if you do change that setting and you follow the SSD sizing guidelines, that skews the recommended 10%, which could impact performance across the board for the other VMs running on that host.
 
Yeah, I get that. My point is that if you do change that setting and you follow the SSD sizing guidelines, that skews the recommended 10%, which could impact performance across the board for the other VMs running on that host.

Yep, which is precisely why we don't recommend changing it. :)
 
Re: Nutanix
Looking forward to when they divorce their software from their hardware (which they are apparently working on). As a hardware appliance their solution is a lot less attractive imho.

They could de-couple their software right now if they wanted. Nothing special. They just don't want to...yet.
 
Yeah I mean there are reasons to offer a hardware appliance and there are reasons to offer a software-only appliance.

If you offer hardware there is more of a value perception, since you are actually selling a tangible good (at a very significant markup). Once it's software-only, some (many?) customers may balk at the price (even if the price per node is lowered). On the other hand, the hardware appliance makes support a lot easier since the hardware is well known and well tested. Likewise, some customers may prefer to purchase an integrated solution so that they know the hardware is 100% compatible, tested, and worry-free.

Still, there are plenty of small (i.e. fewer than 10 hosts) deployments that would likely buy the software solution to run on their own hardware. Hopefully they will get all of that sorted out soon.
 
Excited for tomorrow's announcements. I've been holding off deploying VSAN on my physical hosts until GA.
 
Hopefully get our VSAN lab built tomorrow. Had to order cables from LSI because the ones we had weren't the right ones for our C200s....
 
We recently had to tear down our VSAN lab because Cisco wanted the blades back, but we did testing with Fusion IO as the flash. A whitepaper should come out soon with the results. I'll post it here when it does.
 
Just found out today we have Dell providing 4 R720's w/SSD's and 15K Drives for our Dell Solution Center for VSAN. Looking forward to setting that up before our March Madness Event....
 
I have 6 myself burning a hole in my Dev environment's pocket. Can't wait.
 
Just found out today we have Dell providing 4 R720's w/SSD's and 15K Drives for our Dell Solution Center for VSAN. Looking forward to setting that up before our March Madness Event....

Copying our event? :)
 
Just found out today we have Dell providing 4 R720's w/SSD's and 15K Drives for our Dell Solution Center for VSAN. Looking forward to setting that up before our March Madness Event....

Be interesting to see which SSDs these OEMs ship. I think a lot of these guys are going to be in for a surprise. We've hit a number of issues with many OEM controllers and most OEM SSDs are not up to the task of VSAN or server-side caching.
 
Copying our event?

Nahh...March Madness is huge here...Syracuse Orange..right down the road.

Even the last VAR I worked for did a March Madness event.

Nothing as big as yours, however.

On a side note..listened to the Podcast for Geek Whisperer..great stuff. Passed it on to my boss.

Be interesting to see which SSDs these OEMs ship. I think a lot of these guys are going to be in for a surprise. We've hit a number of issues with many OEM controllers and most OEM SSDs are not up to the task of VSAN or server-side caching.

This was actually sponsored by Intel...and I think they will be S3700s. The controllers are PERC H310s, which I believe are rebranded LSIs.
 
Ready Nodes seem more for orgs that still want to keep that OEM name in their DC. My guess is the better VSANs are going to come from orgs that don't mind building their own version with best-of-breed 3rd party parts. To me the only thing an OEM brings is (debatably) quicker part replacement and an OEM centralized hardware monitoring system. All completely circumvented by the 3rd party monitoring you probably already have and keeping a stash of parts lying around. A Supermicro FatTwin with 12Gbps SAS cards and Intel S3700s or Kingston E100s is going to decimate a Ready Node. And without the OEM name-brand markup.
 
I don't know..I think there are going to be several options..and i'm willing to bet some of those options will have PCI-E Based Flash like Fusion IO which will put up some stiff competition to any SSD offering.

After the VSAN "Sizing" PEX session, ehem..no comment Jason..lol...I had a chance to talk with Wade Holmes a bit. He alluded to a special Dell offering that's "purpose built" for VSAN...haven't really heard anything on that through the Partner channel...exciting to see what some of the Ready Nodes will look like.
 
Ready Nodes seem more for orgs that still want to keep that OEM name in their DC. My guess is the better VSANs are going to come from orgs that don't mind building their own version with best-of-breed 3rd party parts. To me the only thing an OEM brings is (debatably) quicker part replacement and an OEM centralized hardware monitoring system. All completely circumvented by the 3rd party monitoring you probably already have and keeping a stash of parts lying around. A Supermicro FatTwin with 12Gbps SAS cards and Intel S3700s or Kingston E100s is going to decimate a Ready Node. And without the OEM name-brand markup.

Sure. But I can beat every major server manufacturer on price/performance right now and build my own...but very few do. I think we'll see the same here, especially since it's also going to be your storage. Most people also don't understand the "all parts aren't equal" thing. I spend a good bit of time when talking VSAN or PernixData FVP on why you rarely buy OEM SSDs and only buy certain others. Same goes for controllers. It all applies here too.

A lot of my engineers want us to create a VSAN package...but I'm not really up for being a systems integrator. Let someone else do that.
 
Yeah but any option, even PCIe options, are available 3rd party style and most likely with less markup. I mean let's be serious, even on Cisco's build-to-order site (which already contains a 40% discount I might add) they list the 785GB FusionIO cards for a C240 @ $14,000. I know for a fact that I can get those cards for about 60% of that cost from a 3rd party vendor.

I'm interested to see as well. As a VAR I don't see you guys sitting around building 3rd party VSANs. You're most likely gonna be selling OEMs because you're partners with these OEMs, so I can imagine that's where your focus may be. As a customer though, all bets are off.
 
Yeah but any option, even PCIe options, are available 3rd party style and most likely with less markup. I mean let's be serious, even on Cisco's build-to-order site (which already contains a 40% discount I might add) they list the 785GB FusionIO cards for a C240 @ $14,000. I know for a fact that I can get those cards for about 60% of that cost from a 3rd party vendor.

I'm interested to see as well. As a VAR I don't see you guys sitting around building 3rd party VSANs. You're most likely gonna be selling OEMs because you're partners with these OEMs, so I can imagine that's where your focus may be. As a customer though, all bets are off.

VSAN licensing + FusionIO cards in each + HDs = Not really that cheap. We're starting to see some cheaper options than the FIO cards for sure...but are they really that much better than a couple Intel 3700s? Eh. We'll see.

You're in the minority if you're a customer that is going to build their own. That's a lot of risk to assume.
 
I think we'll see the same here, especially since it's also going to be your storage.

While I'm all drooling over the thought, this is still the biggest thing that comes back to mind: is your org comfortable sticking its critical data on an 'untested' platform? I think the hard answer will be no, and that's where the OEMs will flourish.

A lot of my engineers want us to create a VSAN package...but I'm not really up for being a systems integrator. Let someone else do that.

Sounds like a conflict of interest to me, and I get that. Would seem to put you in direct competition with your partner relationships.
 
Sounds like a conflict of interest to me, and I get that. Would seem to put you in direct competition with your partner relationships.

Not really. We'd do it using Cisco C-Series. We already use 3rd party SSDs for FVP and sometimes RAM. I have zero interest in being a systems integrator because it's a lot of work and overhead for VERY little profit.
 
Well you alluded to it already, most customers I would think are not going to go the "build your own" route for the reasons you listed. I think there will only be a small subset of customers that will do that, and these are the same customers who go Supermicro..and have spare parts..etc...just don't think that will be the norm for the same reasons why most go with OEM.

We aren't planning on rolling out our own solution either; too much support headache, etc. It's all really commodity anyway...as you stated, no $ in it for the small players.
 
You're in the minority if you're a customer that is going to build their own. That's a lot of risk to assume.

Anyone rushing to put this in as their production storage environment is assuming a lot of risk regardless of Ready Nodes or build-your-own. This absolutely would be our dev environment. I still have maintenance agreements and a significant investment with my incumbent storage vendor. They're not going anywhere for a while.
 
Anyone rushing to put this in as their production storage environment is assuming a lot of risk regardless of Ready Nodes or build-your-own. This absolutely would be our dev environment. I still have maintenance agreements and a significant investment with my incumbent storage vendor. They're not going anywhere for a while.

Agreed, it will take time, but I think you will see adoption rates quicker than most 1.0 releases..the amount of interest is fairly significant. We have customers that we helped with Beta...and they were testing prior to GA....that's pretty significant.
 
I haven't done the research myself yet, but I heard that you can only have up to 8 servers for storage in VSAN. Is that true? You can have more ESXi servers in the VSAN, but they can't provide storage to the cluster. That would be kind of limited especially if you want to have dual parity.
 