VCB + BE 12.5 + FC[4Gb] = ?Throughput?

Just curious what VCB backup throughput looks like in the wild. I'm setting up a new environment and I'm not all that impressed with the speed. The average is 1.25 GB/min, but it fluctuates between 0.6 and 2.9 GB/min. I'd be happy overall with 3+, but I know the system is capable of 5+ from the local file system.
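(For reference in MB/s: 1.25 GB/min is about 21 MB/s, the 0.6-2.9 GB/min swing is roughly 10-50 MB/s, and the 3 and 5 GB/min marks work out to about 51 and 85 MB/s.)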

Hardware:
IBM HS21 blades - 2 x 5440 Xeon w/16GB RAM + 4Gb QLogic w/dual HBAs
Brocade 4Gb switches
IBM DS3400 SAN with 12 15K 300GB SAS disks, 2 luns now, 1TB each. VM datastore is RAID 5, Backup lun is RAID 0. Also dual SPs, one responsible for each lun.
IBM TS3100 with 1 drive sled, again - 4Gb FC
IBM x3350 Backup/Virtual Center Server - yes VC and BE on one box. Small deployment so far, only two blades (hosts)

For software: VI3, including the latest patched ESX 3.5.x and VC 2.5.x, and Backup Exec 12.5 with the VMware agent.

What kind of throughput are you guys seeing using VCB?
 
No idea on the VCB issue, but I am quite curious about your FC setup. It seemed to me that most everyone was trying to get away from FC at this point and go iSCSI instead. I would be curious to hear whether you actually bought a new FC setup or are just reusing existing equipment.
 
That all depends on how you have VCB configured -- also, Symantec ruined VCB support in 12.5 so that you have no choice but to take full image-level backups all the time. You'd be better off using a product like vRanger (formerly ESX Ranger) by Vizioncore.

If VCB is configured to use "SAN" mode and can pull over fibre, the throughput is generally quite respectable. If it's configured to use NBD mode and has to rely on the network, throughput will be markedly reduced.
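If you want to prove which transport a job is really using, run vcbmounter by hand and force it - assuming your VCB framework version supports the -m switch (the server, VM, and path names below are just placeholders):

vcbmounter -h vchost -u vcadminuser -p password -a name:nameofvm -r c:\vcb-temp\nameofvm -t fullvm -m san

If the same export is dramatically faster with -m san than with -m nbd, then the BE-driven jobs are quietly falling back to the network.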

So there are several problems you may be facing, not the least of which is the fact that 12.5 copies the whole VM to local disk every time it wants to perform a nightly backup. 11d let you use the VCB integration module (a collection of scripts put out by VMware) and back up using file-level full/differential or image mode for DR. You could then capture an image-mode backup monthly for quick DR and use your standard full/differential method during the month, but not anymore. :(
 
No idea on the VCB issue, but I am quite curious about your FC setup. It seemed to me that most everyone was trying to get away from FC at this point and go iSCSI instead. I would be curious to hear whether you actually bought a new FC setup or are just reusing existing equipment.

On the right hardware, FC is massively faster no matter how you spin it. All the top-end SANs are FC, and whoever told you different was trying to sell you something.

FC is for the big guys, NFS and iSCSI (either of which can be faster depending on the filer and app) are for the medium and small.

There's NO iSCSI SAN that can keep pace with a DMX.


edit: Yet. 10gbE may change this.
 
21 MB/s is a bit slow, even with the VCB compression going on. Can you give me the actual vcbmounter command they're running? Symantec should be able to help you get that from their scripts.

12.5 is still using the integration module, it's just built into the program now instead of scripts you have to modify by hand. There's still a way to do file-level, it's just buried in there a bit farther, or so they've told me.

And do NOT use vDanger. That program is nothing but trouble. I wish your data the best in their hands. Especially when you lose all of it to an errant snapshot.
 
We are already invested in BE, so we're not going to vRanger. It is configured for SAN mode, which is why I'm so disappointed.

In my own searching, I'm having trouble finding the vcbmounter command being used.

Also, in my testing - watching disk reads/writes to the VCB temp dir for the job - both reads and writes average around 100MB/s. This is on a 4-disk RAID 0 lun with its own storage processor (for now, as only 2 luns exist).

Disk benchmarks are also hard to run on that lun, as most have a hard time getting past the cache, so the speeds are inaccurate.

Any other ideas?
 
Yeah, BE hides the command pretty well. Ask them what they're calling in the pre-backup and post-backup scripts, they'll tell you. Or file an SR with VMware - you'll get my group and we'll figure it out :)

What's your fibre setup to the VCB proxy? Any multipathing agents installed on the proxy? Is the VCB temp dir local, or san?
 
VC + VCB proxy + BE are all on one server. Yes, it does multipath, but only using the Microsoft multipathing driver. The temp dir is on the SAN, on the RAID 0 lun. Backup-to-disk is on the same lun - so now I'm trying to create a third lun just for temp, w/ 2 disks in RAID 0.

So that's 3 luns, one for the VM datastore, one for B2D, one for temp.

Fibre setup is pretty simple. Two switches, and every fibre device has 2 hbas (minus tape library) with one to each switch. The switches are not connected to each other. Tape is zoned exclusively to the VC/VCB/BE server, as are the B2D and Temp luns. VM Datastore lun is zoned to both blades and the VC/VCB/BE server.

For what it's worth, this setup is not in production - yet. We have an outside vendor coming to help us set it up (our first), but I'm trying to learn as much as possible about it before it goes into production.
 
edit: Yet. 10gbE may change this.

I've got a SAN I built myself for home use with my ESX server. It's got 2 10GbE ports and wipes the floor with our FC SAN at work. I think when 10GbE port pricing comes down, it'll slowly eat away more and more at FC's market share.
 
I've got a SAN I built myself for home use with my ESX server. It's got 2 10GbE ports and wipes the floor with our FC SAN at work. I think when 10GbE port pricing comes down, it'll slowly eat away more and more at FC's market share.

How many IOPS / what's your throughput?

VC + VCB proxy + BE are all on one server. Yes, it does multipath, but only using the Microsoft multipathing driver. The temp dir is on the SAN, on the RAID 0 lun. Backup-to-disk is on the same lun - so now I'm trying to create a third lun just for temp, w/ 2 disks in RAID 0.

So that's 3 luns, one for the VM datastore, one for B2D, one for temp.

Fibre setup is pretty simple. Two switches, and every fibre device has 2 hbas (minus tape library) with one to each switch. The switches are not connected to each other. Tape is zoned exclusively to the VC/VCB/BE server, as are the B2D and Temp luns. VM Datastore lun is zoned to both blades and the VC/VCB/BE server.

For what it's worth, this setup is not in production - yet. We have an outside vendor coming to help us set it up (our first), but I'm trying to learn as much as possible about it before it goes into production.

Try it to local disks.

Your san fabric should be criss-crossed. Either 4 or 8 paths to each lun. Is that how you have it set up? What SAN are you using?

will get back with more in a sec
 
Right, I suspect that we will criss-cross it, but I don't have the hardware for it now (SFPs and fiber).

Here is an example:

Backing up directory from C: (local 4 disk raid 5) -
Set type : Backup
Set status : Completed
Set description : Backup Speed TEST
Resource name : \\XX.XX.XX.XX\C:
Logon account : System Logon Account
Encryption used : None
Agent used : Yes
Advanced Open File Option used : No

Byte count : 16,176,992,707 bytes
Rate : 5,644.00 MB/Min
Files : 3,773
Directories : 233
Skipped files : 0
Corrupt files : 0
Files in use : 0
Start time : Wednesday, December 03, 2008 12:50:06 PM
End time : Wednesday, December 03, 2008 12:52:51 PM
Media used : B2D000016

Backing up VM -
Backup Set Information
Family Name: "Media created 12/3/2008 10:24:09 AM"
Backup of "VMVCB::\\myservername\VCGuestVm\(DC)BC1BladeCenter(DC)\vm\Test VM" as ""
Backup set #7 on storage media #1
Backup set description: "Backup Speed TEST"
Backup Method: Full - Back up virtual machines

Backup started on 12/3/2008 at 12:54:02 PM.

Backup Set Detail Information
VMware vcbMounter job started to export virtual machine 'Test VM'. Wednesday, December 03, 2008 at 12:54:14 PM
VMware vcbMounter job to export virtual machine 'Test VM' ended successfully. Wednesday, December 03, 2008 at 12:55:06 PM

Backup completed on 12/3/2008 at 12:55:41 PM.

Backup Set Summary
Backed up 12 files in 1 directory.
Processed 2,128,248,003 bytes in 1 minute and 39 seconds.
Throughput rate: 1230 MB/min
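(Reading the timestamps above: the vcbMounter export alone took about 52 seconds for roughly 2 GB - on the order of 2.3 GB/min, or ~39 MB/s - while the whole set averaged 1230 MB/min over the full 99 seconds. So a fair chunk of the wall time is BE overhead around the export, not the VCB copy itself.)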
 
Sorry to go a little OT on this and introduce the "FC or not to FC" discussion in here as well - it's really more of a storage discussion - but since we are here already, perhaps you don't mind keeping a parallel discussion going.

I am not suggesting that iSCSI is as capable as FC when it comes to performance. I was just wondering whether the extra performance provided by FC is needed. In my limited experience most people overestimate by orders of magnitude what performance they actually need.

Considering that iSCSI is almost infinitely easier to administer, and has a lesser price tag, I was just curious as to what prompted FC deployment on a new system.
 
Politics really. Don't forget that in some cases, you have to take what you can get, when you can. In my case, we don't get much often (read "red tape" and $$) so when we do, we get the best we can knowing that it'll be quite a while until it gets replaced. The first server that will be migrated in this case is approaching 9 years old if that gives you an idea.
 
Sorry to go a little OT on this and introduce the "FC or not to FC" discussion in here as well - it's really more of a storage discussion - but since we are here already, perhaps you don't mind keeping a parallel discussion going.

I am not suggesting that iSCSI is as capable as FC when it comes to performance. I was just wondering whether the extra performance provided by FC is needed. In my limited experience most people overestimate by orders of magnitude what performance they actually need.

Considering that iSCSI is almost infinitely easier to administer, and has a lesser price tag, I was just curious as to what prompted FC deployment on a new system.

Well, one of my more recent customers had 2.2 PB of storage across 4 DMX-4's, running something close to 5000 virtual machines... Don't remember the host count but it was a LOT.

So yes, they needed the performance ;) They made those Symmetrix' scream. For many people no, the FC performance isn't needed, but there are a few. I deal with lots of very large customers that have extreme performance requirements, so I see lots of times that FC is the only way to get the performance needed. I also run into lots of people with an MSA1000 that could get by with a good NFS server or iSCSI SAN that would blow the cheap FC out of the water too :)

Heck, in the job I had before this one I was working on the Supers at NCAR - we could saturate 8gbit InfiniBand when we tried ;)
 
frak. My post got lost, so I'll do a fast one.

run vcbmounter from the command line and time it - I want to see if the delay is in VCB or in BE.

command is vcbmounter -h vchost -u vcadminuser -p passwordforuser -a name:nameofvm -r c:\location\toputvm-fullvm -t fullvm
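If you want a rough wall-clock number without a stopwatch, wrap it in a quick batch file, something like this (server, VM, and path names are placeholders):

echo %time%
vcbmounter -h vchost -u vcadminuser -p passwordforuser -a name:nameofvm -r c:\vcb-temp\nameofvm-fullvm -t fullvm
echo %time%

Then divide the size of the exported directory by the elapsed time to get a GB/min figure for the VCB step by itself.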
 
frak. My post got lost, so I'll do a fast one.

run vcbmounter from the command line and time it - I want to see if the delay is in VCB or in BE.

command is vcbmounter -h vchost -u vcadminuser -p passwordforuser -a name:nameofvm -r c:\location\toputvm-fullvm -t fullvm

Approx 5 minutes. 20 GB of data.
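(That works out to roughly 4 GB/min, or about 68 MB/s, for the export alone - quite a bit better than the 1.25 GB/min I'm averaging end-to-end through BE.)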
 
For that test it was NTFS - XP SP3.

For SAN hardware, info is posted above.

I've also tested with NetWare - using an agent (not VCB) - at 4GB/m. A Linux VM (ext3) has similar speeds to XP, if not a bit slower (VCB). And Win2K3 R2 is again slow (VCB).
 
Ok, I'm starting to narrow in on the problem - I think. This damn DS3400 writes really slowly - compared to reading, at least. Anyone know of an obvious reason that write speeds, no matter the raid level, average about 1/4 of read speeds?

Very frustrating!
 
Well this one is right up my alley. In the last 6 months, I have become both System X technical and VCP certified. In looking at your configuration, I am surprised that you have come to the determination that the bottleneck is the DS3400. I can tell you that the max theoretical throughput on a TS3100 LTO Gen 4 FC tape drive is 122MB/sec, but most of the time your realistic throughput (write to tape) is going to be around 40MB/sec (given your configuration).

We are also using DS3k/4ks and BCH with HS21s for our ESX environment (I set it up) and using VCB to dump to tape. However, we're using CommVault, not BE, so there is a slight difference. You are using VMFS on the VM storage LUN, right (which, if I read correctly, is RAID-5)? How many disks comprise your production LUN for the VMs? Yes, RAID-5 writes are slower than reads (this is universal), and if you think that's bad, try RAID-6 (which is now available on all DS3k and DS4700 products through a firmware upgrade).
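The arithmetic behind that, for what it's worth: a small random write on RAID-5 costs four disk I/Os (read old data, read old parity, write new data, write new parity), so the array's random-write ceiling is roughly (per-spindle IOPS x number of disks) / 4, while reads get the full spindle count. RAID-6 adds a second parity read and write, pushing it to six I/Os per write.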

What is your host OS type set to in SM? You should have it set to the ESX OS type, or at least Linux, and make sure you pick the multipath-capable Linux version if you go that route. There's a specific and preferred OS type when you create the LUNs for ESX usage, but I forget exactly what it's called in SM.
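If you've got the Storage Manager CLI on a management station, I believe you can dump the whole config, host types included, with something along these lines (the array name is a placeholder - double-check the exact syntax against the SMcli reference for your firmware):

SMcli -n DS3400_Array -c "show storageSubsystem profile;"

Then search the output for the host type assigned to the ESX host group.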
 
Ok, I'm starting to narrow in on the problem - I think. This damn DS3400 writes really slowly - compared to reading, at least. Anyone know of an obvious reason that write speeds, no matter the raid level, average about 1/4 of read speeds?

Very frustrating!

other than it's an old IBM Shark, iirc? ;)
 
Given the reads he's getting, and the fact he's not having massive reservation conflicts, it's not the host type. Those really don't like being set wrong ;)

He is using VMFS. He's writing to a storage lun before he writes to the tape drive, so it's not the tape.

Open an SR - I know who you'll get for these, and once we really dig in we'll be able to tell what's up.
 
I came from the EMC Clariion world, but one thing I would check for write speeds is to make sure the write cache on the storage processors is enabled and active. Since I don't have any IBM equipment available to play with, I would imagine somewhere in the array's interface there are options to enable/disable/modify the read/write cache parameters.
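If the GUI doesn't expose it, the Storage Manager CLI should; I think the per-LUN toggle looks roughly like this (the LUN and array names are placeholders, and the exact syntax may differ by firmware, so check the SMcli command reference):

SMcli -n DS3400_Array -c "set logicalDrive [\"Backup_LUN\"] writeCacheEnabled=TRUE;"

Also worth checking whether cache mirroring or a "write cache without battery" setting is quietly forcing the controllers into write-through.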
 
Are you getting any "Volume not on preferred path" conflicts in SM? If you don't have your SAN cabled or zoned properly, or the host type is wrong, you could be seeing the result of path failover between ESX hosts, which plays out like this:

* ESX hosts (two) both mapped to their own boot LUNs, and they share VMFS LUN for vm storage
* ESX host #1 requests data from the shared VMFS LUN through controller A (we'll assume controller A is the preferred path for this LUN, for argument's sake).
* ESX host #2 requests access to shared VMFS LUN through controller B (non-preferred path), causing a path failover (due to either improper cabling of drive channels to the attached trays, host type, or ESX managed paths settings)

Performance is significantly degraded when this happens, and it causes LUN thrashing. We experienced this in our own environment when I started our ESX build, and ended up having to upgrade to a higher firmware revision on the DS3400 (and thus flashing NVSRAM, etc.), set the host type on the DS3400 to one that natively supports multipathing on the connected host side, and then set our ESX servers to use their last path instead of doing it automatically, as the auto setting was causing failovers at a rate of about 1 per 5 minutes.
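A quick way to see whether the two hosts agree on paths is to run this from the ESX 3.5 service console on each blade:

esxcfg-mpath -l

Both blades should show the shared VMFS LUN active on the same controller; if one is active on controller A and the other on controller B, you're looking at exactly the thrashing described above.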

Just an idea, but there's a lot to consider when setting this stuff up, and we haven't even discussed your potential zoning config issues yet (which will be difficult without screenshots). I'd say call IBM, they're going to want the logs from the DS3400, and it'll take some time to figure out, for sure.
 
Let me ask this - what speeds should I be expecting on writes, especially to the RAID 0 luns?

Write caching is enabled, though I don't see a way to disable it in SM. Oh, and I did upgrade to the latest firmware/nvsram, and even hba drivers, to no avail.

Zoning is quite simple, especially on the 2 RAID 0 luns - only this server is allowed. Host type is set correctly - though now I'm going to test with different types to see what I get.

More to come...
 