Are you Eager or Lazy?

shnelson

Limp Gawd
Joined
Feb 10, 2012
Messages
145
Trying to standardize on our vmdk formatting here and finding online documentation to be a bit lacking when it comes to recent vSphere & storage array capabilities.

Assuming you have an array that performs thin provisioning (LUNs via FC, no NFS here), what is your preference for a thick vmdk?

I've thrown out the idea of a thin provisioned vmdk, or 'thin on thin', as I don't like the idea of having to monitor over-committed storage against two separate pieces of infrastructure.

To be able to take immediate advantage of thin provisioning on the array, you'd need to present it with a lazy zeroed vmdk, as an eager zeroed vmdk will consume the full capacity up front.
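For anyone following along, these disk types map to the `-d` option of `vmkfstools` on the ESXi CLI (datastore path and sizes here are made up for illustration):

```shell
# Lazy zeroed thick: allocates all blocks up front but zeroes them on
# first write. This is the default thick type, so -d is optional here.
vmkfstools -c 40G -d zeroedthick /vmfs/volumes/datastore1/vm1/vm1.vmdk

# Eager zeroed thick: allocates AND zeroes every block at creation time,
# which is why it consumes the full capacity on a thin array immediately.
vmkfstools -c 40G -d eagerzeroedthick /vmfs/volumes/datastore1/vm2/vm2.vmdk

# Thin: allocates nothing up front ('thin on thin' when the LUN is thin too).
vmkfstools -c 40G -d thin /vmfs/volumes/datastore1/vm3/vm3.vmdk
```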

Seems like an obvious choice, but I am interested in the performance benefit of an eager disk. Is it a noticeable impact to have a lazy disk wait for a datastore lock before it can zero & write to newly required blocks on demand? In a busy environment, I think you would see some impact to latency - but what about fewer than 100 VMs, with only 10-15% of those heavy (ok, maybe moderate) on IO?

From my testing, it appears I can gain back all of the empty eager zeroed blocks after a deduplication scan against the volume - so if the performance impact is marginal, it really comes down to how I want to logically manage the over-committed storage.

Interested in thoughts & experiences from the storage experts out there. Ultimately, it is looking like I will go the lazy route to take advantage of immediate storage savings & let deduplication do its magic on top of it.
 
You certainly SHOULD do thin on thin. There are cases where you shouldn't, however, like a VDI desktop environment using Linked Clones, or going thin on the array side only for full-clone View desktops, so as not to put too much overhead on the virtualization layer.
 
You certainly SHOULD do thin on thin.

Could you expand on your reasoning a bit? Not trying to challenge it or anything, simply looking to gain more perspective.

In terms of environment, this is strictly server based - no VDI on this array.
 
If you thin provision on the storage array, you are only decreasing wasted space at the array level, so you can still 'over-provision' at the VM level and still waste space.

Running thin on both the array and virtualization layers alleviates wasted space across the board.

If you're like me, and manage both the storage and virtualization layers, it can get complicated, but with the latest reporting tools, etc., you should be able to manage it comfortably.
 
I find thin provisioning to be problematic and am no longer using it. Thin provisioning is in theory a good idea to save disk space, so the ONE good thing about it is that you can save on storage cost. How much you save depends on a fairly large number of variables, not least of which is how much you pay for storage in the first place.

Then there is, from my perspective, a LONG list of downsides. What is the risk to your organization if thin provisioning goes wrong? How do you express this risk in dollars, and how does that amount compare to the money you saved on storage by thin provisioning?

I find thin on thin to be especially problematic because the storage has no way to know what's actually unused space, as it isn't talking to the host and guest OS about what space is free and used. This becomes very apparent as you decrease actual usage as seen by the OS: that "free" space will not be reclaimed by the VMDK or the storage array.

Thin, and especially thin on thin, introduces a lot of complexity, a lot of potential problems, a lot of effort to constantly monitor usage, and the requirement that if you are starting to use the oversubscribed space you need to be able to add physical storage quickly enough to not bring the whole cluster down. All that to save the above calculated amount on disk space.

Yes, if you are Amazon or Bank of America then thin provisioning maybe makes sense (probably not), but in the SMB market I see nothing but problems and risk for insignificant financial gain.

I am down to approximately $800/TB of usable iSCSI storage. At that price I have no legitimate reason to thin provision. YMMV.
 
If you thin provision on the storage array, you are only decreasing wasted space at the array level, so you can still 'over-provision" at the vm level and still waste space.

Running thin on both the array and virtualization layers, alleviates wasted space across the board.

If you're like me, and mange both the storage and virtualization layers, it can get complicated, but with the latest reporting tools, etc, you should be able to manage it comfortably.

Sorry, but I think this is an AWFUL idea, and you won't find much support from any vendor for doing thin on thin. It's a great way to REALLY get yourself in a bind. If storage is tight enough that thin on thin is needed, then you're going to hit a wall and have a real hard time getting out of it. Only provision thin in one place... and usually that place is where you can get the best reporting and trending info.
 
I don't do eager zero unless something requires it. The loss of performance is VERY minor and as you said, if you thin provision on the array and then eager zero you are defeating the purpose.
 
Sorry, but I think this is an AWFUL idea, and you won't find much support from any vendor for doing thin on thin. It's a great way to REALLY get yourself in a bind. If storage is tight enough that thin on thin is needed, then you're going to hit a wall and have a real hard time getting out of it. Only provision thin in one place... and usually that place is where you can get the best reporting and trending info.
To each his own. EMC supports thin on thin and recommends it, as do VMware storage best practice guidelines. I use it on my server pools and it works fine, and I save space on the array and virtualization side. There is no reason not to, other than the fact that you may have a problem managing it, and I manage it with no issues.

There is no technological barrier here. What most people are alluding to is that it's hard to manage. I'm not denying that, but technically it works fine and saves on both sides of the spectrum. As you know, storage is not cheap... and the fact is businesses need to squeeze every bit out of what they have.
 
You certainly SHOULD do thin on thin. There are cases where you shouldn't, however, like a VDI desktop environment using Linked Clones, or going thin on the array side only for full-clone View desktops, so as not to put too much overhead on the virtualization layer.

Every person I've heard say that, has ended up talking to me about recovering their environment within the next 6 months, with VERY rare exceptions.

I get to laugh. And they get to write a resume. No one spends that much time monitoring their infrastructure, it turns out.
 
Every person I've heard say that, has ended up talking to me about recovering their environment within the next 6 months, with VERY rare exceptions.

I get to laugh. And they get to write a resume. No one spends that much time monitoring their infrastructure, it turns out.

Well, I guess I'm fortunate then, since it's been over six months and I don't have any problems managing it.
 
To each his own. EMC supports thin on thin and recommends it, as do VMware storage best practice guidelines. I use it on my server pools and it works fine, and I save space on the array and virtualization side. There is no reason not to, other than the fact that you may have a problem managing it, and I manage it with no issues.


Please point me to the written best practice guides that recommend thin on thin.
 
There is no technological barrier here. What most people are alluding to is that it's hard to manage. I'm not denying that, but technically it works fine and saves on both sides of the spectrum. As you know, storage is not cheap... and the fact is businesses need to squeeze every bit out of what they have.

Storage isn't cheap, but downtime is even less cheap. To actually squeeze every bit out you're going to have to walk a VERY thin line and what happens if it goes unmonitored for a while? You're right, there is no technology reason but I still state it's a terrible idea.
 
To each his own. EMC supports thin on thin and recommends it, as do VMware storage best practice guidelines. I use it on my server pools and it works fine, and I save space on the array and virtualization side. There is no reason not to, other than the fact that you may have a problem managing it, and I manage it with no issues.

There is no technological barrier here. What most people are alluding to is that it's hard to manage. I'm not denying that, but technically it works fine and saves on both sides of the spectrum. As you know, storage is not cheap... and the fact is businesses need to squeeze every bit out of what they have.

Where is that documented? I do support presentations on this constantly, and I'll never recommend it.

Pick one place to thin provision. You're not going to save anything of significance with thin LUNs and thin disks, especially since no one has UNMAP working yet in the VAAI layer that I know of. As a result, if you DO run out of space on your LUNs, you get a nice thermonuclear explosion that takes out everything on the LUN.
 
Well, I guess i'm fortunate then since it's been over six months, and I don't have any problems managing it.

You do realize that on all current arrays that do not support UNMAP, a thin LUN filling up results in a failed block write, which will corrupt all disks on that LUN, right?

Massive, unrecoverable corruption.
 
You asked for a document, I provided it. I hear so much conflicting info it's unreal in this community (VMware in general, not Hardforum). On one side I have the EMC guys touting yes yes, do thin on thin, then I read white papers that specifically state it's a good idea, then I get hammered for doing it..it's utterly unreal.

From the VMware doc:

If all of the storage in a single array is used for virtualization environments then one might argue that the use of both thin provisioning at the array and the virtual disk levels might not yield as much space savings beyond what would result from using just the array-based thin provisioning. However, one should consider that wasted space in the guest OS use of storage is what is saved by over committing a datastore. And that datastore will grow over time on the underlying storage allocation. So use of both array-based and virtual disk thin provisioning is complementary.

I would say that's a glaring recommendation.
 
I'll be honest. EMC guys rarely understand how to run proper workloads on their arrays.

Do as you wish. You obviously know the downsides. I don't think there is THAT much conflicting info here. Ask anyone that really does vSphere designs and every single one is going to tell you this is a terrible idea. You are literally the first person I've ever seen defending it.

Just because you can doesn't mean you should...but as long as you constantly monitor your environment I agree you SHOULD be fine...assuming you don't get hit by something like a runaway process or virus that causes rapid data growth as you have nothing to backstop you before you hit 100% and corrupt data.
 
I'll be honest. EMC guys rarely understand how to run proper workloads on their arrays.

Do as you wish. You obviously know the downsides. I don't think there is THAT much conflicting info here. Ask anyone that really does vSphere designs and every single one is going to tell you this is a terrible idea. You are literally the first person I've ever seen defending it.

Just because you can doesn't mean you should...but as long as you constantly monitor your environment I agree you SHOULD be fine...assuming you don't get hit by something like a runaway process or virus that causes rapid data growth, as you have nothing to backstop you before you hit 100% and corrupt data.
I wouldn't say I'm "defending" it. What I AM doing is going by the information that is out there for all to see. Unfortunately, I don't have the insight of a consultant or design engineer. I am an end user who supports a couple of environments that don't change all that often, and gets their info from product websites, some blogs and books I read, and, of course, building/supporting my environments. Of course I can't read all the blogs; I don't have time.

And yes, of course, if you guys tell me that it's a bad idea, and there are legitimate reasons why, then I absolutely will take your advice. Like any other customer, I don't want to deal with corruption, data loss, downtime, etc. However, like any other customer, I expect the vendors to provide accurate information as well.
 

From page 10:
The answers are that the ESX server does know if the underlying LUN is thin provisioned or not.

This is only true if UNMAP is present. As I stated prior, no one has this implemented yet that I am aware of. In fact, it is disabled in ESX 5 as of U1, for the foreseeable future (due to issues).

As a result, ESX is completely unaware, till suddenly everything goes "kaboom".
 
I will be resolving this in the morning. It's my understanding that my VNX R31 supports UNMAP. Currently I'm running 4.1 U2... and UNMAP wasn't introduced until vSphere 5, so I'm still running at risk?

I'm sure I'll look back at this and think I'm glad that this post came up...lol
 
You asked for a document, I provided it. I hear so much conflicting info it's unreal in this community (VMware in general, not Hardforum). On one side I have the EMC guys touting yes yes, do thin on thin, then I read white papers that specifically state it's a good idea, then I get hammered for doing it..it's utterly unreal.

From the VMware doc:

If all of the storage in a single array is used for virtualization environments then one might argue that the use of both thin provisioning at the array and the virtual disk levels might not yield as much space savings beyond what would result from using just the array-based thin provisioning. However, one should consider that wasted space in the guest OS use of storage is what is saved by over committing a datastore. And that datastore will grow over time on the underlying storage allocation. So use of both array-based and virtual disk thin provisioning is complementary.

I would say that's a glaring recommendation.


Wow, that's glaringly wrong. I'll see about getting that corrected immediately. If the guest says it's used, it shows as used to us, and to the array. Then, if the guest deletes it, it's not freed from us (in current ESX implementations), and will not be freed from the array either. This also relies on the UNMAP feature, as well as a guest tools feature that has not been released yet, as well as PUNCH_ZEROS (which is present, but not utilized yet).

With the appropriate array, you can use vmkfstools -Y to free space from deleted vmdk files, but inside the guest we cannot currently do anything, and vmkfstools -Y is currently not a documented feature.
 
I will be resolving this in the morning. It's my understanding that my VNX R31 supports UNMAP. Currently I'm running 4.1 U2, so it seems I'm at risk.

I'm sure I'll look back at this and think I'm glad that this post came up...lol

4.X doesn't have UNMAP at all :) It's only in 5.0, and it's currently disabled by default.

EMC had it slated for some time this year, but I haven't heard if they got there or not. It ~WAS~ on the VMAX at one point, but had problems, so it was disabled there as well (corruption). EMC had a powerlink article on it, but I don't have the link handy, and my powerlink account is b0rked. :)
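As a hedged aside: on ESXi 5.0 U1 you can check whether the automatic block-delete (UNMAP) path is disabled from the host CLI. The advanced option name below is the one VMware used in KBs of that era; treat this as a sketch, not gospel for every build:

```shell
# Show the current state of automatic block delete (UNMAP) on this host.
# An Int Value of 0 means it is disabled (the 5.0 U1 default discussed above).
esxcli system settings advanced list --option /VMFS3/EnableBlockDelete
```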
 
I will be resolving this in the morning. It's my understanding that my VNX R31 supports UNMAP. Currently I'm running 4.1 U2, so it seems I'm at risk.

I'm sure I'll look back at this and think I'm glad that this post came up...lol

UNMAP doesn't work on VNX (or any other array as lopoetve just said) right now. It was pulled back in vSphere due to bad problems.
 
Yeah... I was editing my post when you guys responded; I remembered that UNMAP wasn't introduced until 5.

Either way, I'll be looking at making some changes and I'll report back.
 
Either way, I'll be looking at making some changes and I'll report back.
I think much of this discussion centered around the wrong topic, really. I find that the technology is entirely secondary to the thin provisioning topic. The question is: why do you thin provision? What do you gain by doing so in your environment? Not theoretically, but what real-life problem do you solve? How much would it cost to solve that real-life problem by adding storage, perhaps even lower-tier (less expensive) storage?

Storage is like memory, it's supposed to be used. I know of a couple people who did thin provisioning not because they had to save storage space but because they liked to see lots of available space on their hardware. You didn't buy it to be available, you bought it to be used, so use it while maintaining a reasonable amount of free space for growth.

Once I talked to my peers about this and got to the bottom of why they used thin provisioning (to look at lots of free space available) it became clear that they had no business requirement for thin provisioning and they abandoned it.
 
I think much of this discussion centered around the wrong topic, really. I find that the technology is entirely secondary to the thin provisioning topic. The question is: why do you thin provision? What do you gain by doing so in your environment? Not theoretically, but what real-life problem do you solve? How much would it cost to solve that real-life problem by adding storage, perhaps even lower-tier (less expensive) storage?

Storage is like memory, it's supposed to be used. I know of a couple people who did thin provisioning not because they had to save storage space but because they liked to see lots of available space on their hardware. You didn't buy it to be available, you bought it to be used, so use it while maintaining a reasonable amount of free space for growth.

Once I talked to my peers about this and got to the bottom of why they used thin provisioning (to look at lots of free space available) it became clear that they had no business requirement for thin provisioning and they abandoned it.

No good business reason, but often at a number of shops I have been at you have a developer, admin, or someone in power request 400GB of space when in all reality they only use 40GB. I prefer to thin provision on one side to help me save the space. On the hosting side we used array-based thin provisioning, knowing we would buy additional space as needed and had the funds to do so, but once again, we would sell 200GB to a customer who often used 20GB or under. If I did thick, I would have needed to buy additional arrays long before I would actually use the space. It is a balance, much like everything in our jobs.
 
Well for my group we do use thin provisioning on the VMs as almost all of the VMs we utilize are for QA testing. So the VMs are constantly being loaded up, tests are done, then the VM is reverted to a 'clean' state for the next round of testing. Lots of data gets loaded and unloaded from VMs constantly in our shop.
 
No good business reason, but often at a number of shops I have been at you have a developer, admin, or someone in power request 400GB of space when in all reality they only use 40GB.
I am well familiar with that scenario. I was able to change the culture though I will admit that it took a couple of years to do so. Our standard Windows servers were created with a 100GB disk "just because". We now provision them with 40GB disks and the average use is 32GB.

What I did was to personally get in touch with those who requested large disks, listened to what they had to say about why they thought they needed the space and then made a deal that we would provision 40GB and I would add additional space within 5 minutes of being notified that it is needed, day or night. We only once ran out of space because someone screwed up and SQL logs grew till the disk was full. Even then the system chugged along albeit slowly.

My suggestion for those who have to deal with "those in power" requesting large disks is to just personally go to their office, listen to their concerns, and then offer them a deal that works for them and you.
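For what it's worth, the "provision small, grow within 5 minutes" deal above is easy to honor from the ESXi CLI. A hedged sketch (VM name and sizes are hypothetical), after which you still have to extend the partition/volume inside the guest:

```shell
# Grow an existing vmdk from its current size to 60G.
# vmkfstools traditionally wants the VM powered off for this; growing via
# the vSphere Client can be done hot on recent versions.
vmkfstools -X 60G /vmfs/volumes/datastore1/vm1/vm1.vmdk

# Then extend the volume inside a Windows guest, e.g. with diskpart:
#   DISKPART> select volume 2
#   DISKPART> extend
```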
 
I can't say I didn't expect this thread to blow up about thin provisioning - but somewhere in there NetJunkie gave me a firm response on (thick) eager vs lazy, so thank you for that :).


It's great to see such an in-depth discussion around the logistical challenge of thin on thin. What I think some of us fail to plan for is what happens when you're not personally overseeing the storage operations (for whatever reason): will the other admins be aware of the risk & diligent about managing an unexpected increase in use?
 
I think much of this discussion centered around the wrong topic really. I find that the technology is entirely secondary to the thin provisioning topic. The question is; Why do you thin provision? What do you gain by doing so in your environment? Not theoretically but what real life problem do you solve? How much would it be to solve that real life problem by adding storage, perhaps even lower tier (less expensive) storage?

Storage is like memory, it's supposed to be used. I know of a couple people who did thin provisioning not because they had to save storage space but because they liked to see lots of available space on their hardware. You didn't buy it to be available, you bought it to be used, so use it while maintaining a reasonable about of free space for growth.
You must deal with people that have IT budgets. While my shop does, they get cut, and often. It's business leaders with no concept of IT budgeting who make those decisions here, even when the risk is presented; very backwards.

We use what we have, believe me. On top of that, there are always needs that come up last minute. I'm not saying that thin provisioning has saved me, yet, but it's getting there. In my environment, there will NEVER be IT/business alignment; sad, but true. You would cringe if you knew what I dealt with pertaining to alignment/strategy and the almighty dollar. One of the main reasons I'm getting out.

The thin on thin discussion, well, that's another story. NetJunkie and lopoetve provided high-risk reasons not to do that, and therefore I'm going to make a change to thin on one side; however, thin provisioning certainly has its place in my environment and will be used.
 
What I think some of us fail to plan for is what happens when you're not personally overseeing the storage operations (for whatever reason): will the other admins be aware of the risk & diligent about managing an unexpected increase in use?

That's a great point. This is why I pointed this out earlier; in my case, I oversee SAN, Virtualization, Compute, and Network (related to Virtualization), so I have visibility into all the technology supporting my vSphere environments.
 
Seems like an obvious choice, but I am interested in the performance benefit of an eager disk. Is it a noticeable impact to have a lazy disk wait for a datastore lock before it can zero & write to newly required blocks on demand? In a busy environment, I think you would see some impact to latency - but what about fewer than 100 VMs, with only 10-15% of those heavy (ok, maybe moderate) on IO?

Unless there's an absolute requirement for eager zero (i.e. FT), you're fine with flat lazy zeroed disks. The performance hit is negligible.

Not to pull away from the great thin on thin topic, but I don't know if I saw that question hit.

Personally, I prefer thin provisioning on the arrays (when available). With array deduplication and reporting, it just makes sense to only have to monitor one possible area that can be over-subscribed rather than two. But yes, when it's not available on the array there are definite reasons to do it at the VMDK level, as long as you know the risks and how to monitor.

I do clearly remember a time years ago when I had a NetApp engineer on site say it was okay to do thin on thin, at that point in my career I trusted that advice as gold.
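On the "monitor one over-subscribed area rather than two" point: whatever tools you use, the numbers you end up watching at each layer are just provisioned-vs-backed ratios. A trivial sketch with made-up figures (all values hypothetical):

```shell
# Hypothetical figures for one datastore backed by one thin LUN.
vmdk_provisioned_gb=1200   # sum of thin vmdk sizes presented to guests
datastore_gb=800           # VMFS datastore capacity
lun_gb=800                 # thin LUN size presented to ESXi
pool_backed_gb=500         # blocks actually allocated in the array pool

# Over-commit ratio at the vSphere layer (thin vmdks).
awk -v p="$vmdk_provisioned_gb" -v c="$datastore_gb" \
    'BEGIN { printf "vSphere over-commit: %.2fx\n", p / c }'

# Headroom at the array layer before the thin LUN can no longer grow -
# when this hits zero, you get the failed-block-write scenario above.
awk -v l="$lun_gb" -v b="$pool_backed_gb" \
    'BEGIN { printf "Array headroom: %d GB\n", l - b }'
```

Thin on thin simply means both of these can go negative on you independently, which is the monitoring burden being debated in this thread.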
 
That's a great point. This is why I pointed this out earlier and in my case, I oversee SAN, Virutalization, Compute, and Network (related to Virtualization) so I have visibility into all the technology supporting my vSphere environments.

But what if you're gone for a week, and one of your users goes nuts? (It's happened) ;) That's the trick - even if you DO oversee it, there's always the rogue user (or idiot user) that mucks things up for everyone else :p

You must deal with people that have IT budgets. While my shop does, they get cut and often. It's business leaders that have no concept of IT budgeting, etc even when the risk is presented that make those decisions, very backwards here.

We use what we have, believe me. On top of that, there are always needs that come up last minute. I'm not saying that Thin provisioning has saved me, yet, but it's getting there. In my environment, there will NEVER be IT/Business alignment, sad, but true. You would cringe if you knew what I dealt with pertaining alignment/strategy, and the almighty dollar. One of the main reasons i'm getting out.

The thin on thin discussion, well that's another story. Netjunkie and lopoetve provided high risk reasons not to do that, and therefore i'm going to make a change to Thin on one side, however Thin provisioning certainly has it's place in my environment and will be used.

You'll never see me argue against using thin, especially at the VMDK level - if we fill up a datastore, we'll save your shit (or at worst, the VM will crash), but we can't control the LUN portion. I use thin provisioning all the time, both for software that WANTS to validate it has X space (when it needs only a fraction of that), and for things that I don't know the growth rate for (I'm installing it, I know it'll need space, but I don't know how much, so I hand it 30 and wait to see afterwards). Also for things that will need very little space now, but in the future just might need more - rather than add, I can make them thin and go from there.

I'm also massively constrained on storage space, so when you've got 20-30G LUNs, you do what you can by thin provisioning to stuff bits on there :)

I just don't like seeing people destroyed because their array gleefully ate their environment :p I hate those calls.
 
But what if you're gone for a week, and one of your users goes nuts? (It's happened) ;) That's the trick - even if you DO oversee it, there's always the rogue user (or idiot user) that mucks things up for everyone else :p
Well, that's not a 'technology' problem, that's a staffing problem...:p Another reason why I'm outtie. Probably going to take the position with the VMware/NetApp/Cisco VAR; just working out the compensation package now.
 
In our VNX environment we do thick on the array side and thin on the VM side.
 
Well, that's not a 'technology' problem, that's a staffing problem...:p Another reason why I'm outtie. Probably going to take the position with the VMware/NetApp/Cisco VAR; just working out the compensation package now.

If you begin working as a consultant you will very quickly find that even though you trust yourself to implement thin on thin, there are few of your customers that should be trusted to use it.

Seriously, you will be AMAZED at how badly some companies are run.
 
Digging up an old thread...

I do thin/thin all day long in my current shop, only because I know I will be able to buy more storage on a regular basis.

At my old company I refused to do it because they didn't understand the concept.
 
I'll defer back to Child Of Wonder's previous post. I've been working as a consultant now for 3 months at a Cisco/Dell/VMware VAR, and his post could not be more true. I'm afraid to implement thin, period, at some of the companies I've been to thus far.
 
I'll defer back to Child Of Wonder's previous post. I've been working as a consultant now for 3 months at a Cisco/Dell/VMware VAR, and his post could not be more true. I'm afraid to implement thin, period, at some of the companies I've been to thus far.

SEE! TOLD YA! :D
 