Trying to find the bottleneck on a 100Mb/s LAN

Shadowhaxor

I'm trying to identify an issue with my 10/100Mb LAN. I noticed for a while that my transfer speeds from my ESXi server and my FreeNAS server were crawling at 3-5MB/s, but I always chalked it up to the crappy FreeNAS box I had built. I recently upgraded to a Synology DS212 and popped in two 1TB drives until I can get my 3TB drives from Newegg.

However, after making a LUN for the ESXi server and a volume for CIFS, I noticed I'm seeing the same pitiful file transfer speeds. My initial thought was the network, but everything on the LAN is wired except for the Android tablets, which are wireless and not a factor in this issue.

Now I've done several tests to narrow down the cause:

1. I transferred an 8GB ISO (ripped from my Blu-rays) from one PC to another. They are connected via two switches, since they're on opposite sides of the house. Now when the file transfers, I get 11MB/s, which comes out to 88Mb/s, close to the max throughput of 100Mb/s. The transfer took 6-7 minutes, or a little over 1GB per minute.

2. Transfer from PC to DS212. I took the same file and transferred it over to the DS212, and I'm getting the same speed, 11MB/s. So I'm not seeing a network-related issue thus far.

3. Transfer from PC to an ESXi 5 hosted guest. This is where I see the issue. Transferring from either the PC or the DS212 to an ESXi 5 guest, the transfers are halved or worse. At one point I saw speeds of 9MB/s, but they dropped to 3-5MB/s, and every transfer since then has been just as slow.

The guest is hosted on a Supermicro server with 8GB of memory and two quad-core Xeons. So I'm at the point where I believe the speed issue is either the two NICs (onboard Intel 80003ES2LAN Gigabit) on the server or a configuration problem. However, I've verified that both the main and standby adapters on the server are running at 100Mb (I changed them from auto-negotiate for testing) and the speeds still don't change.
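
For reference, speed and duplex can also be confirmed from the ESXi shell (a rough sketch for ESXi 5; the vmnic0 name is just an example, yours may differ):

# list physical NICs with their negotiated link speed and duplex
esxcli network nic list

# show full settings for one adapter
esxcli network nic get -n vmnic0

# force 100/Full explicitly instead of relying on auto-negotiate
esxcli network nic set -n vmnic0 -S 100 -D full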

I've made several changes to the Win2k8 server's network settings, such as disabling TCP Chimney Offload and QoS, and I dropped the anti-virus for testing in case it was inspecting my network traffic; still no change. I also checked the ESXi network performance chart and it's not even close to peaking; max usage is 9800Kbps, which is just about 10Mb/s.
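
For anyone following along, this is roughly what I ran on the Win2k8 guest (standard netsh commands; QoS I disabled by unchecking the QoS Packet Scheduler binding on the adapter):

C:\>netsh int tcp show global
C:\>netsh int tcp set global chimney=disabled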

Going crazy here trying to figure out what may be happening; it's likely I've missed something silly.

Any ideas?
 
If you're trying to store VMs on the NAS, you really should have a GigE network.

Otherwise, use iperf to test actual network throughput, so you take the disk speed out of the picture.

Look at the stats on your switches to make sure they're not logging errors.

If your devices are set to 100Mb/Full, make sure the switch ports are as well.
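
For example (iperf 2 syntax; the server address below is a placeholder for whichever box you run the listening side on):

On the receiver: iperf -s
On the sender: iperf -c <server_ip> -t 30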
 
+1 on gigabit. If you can't afford a GS108E, you can't afford to have an ESXi box.

My guess: an ESXi driver issue, or the ESXi box is running half-duplex.
 
If you're trying to store VMs on the NAS, you really should have a GigE network.

Otherwise, use iperf to test actual network throughput, so you take the disk speed out of the picture.

Look at the stats on your switches to make sure they're not logging errors.

If your devices are set to 100Mb/Full, make sure the switch ports are as well.

I was in the process of testing with iperf, and it confirmed what I initially thought: the network isn't the issue.

Client:

C:\>iperf -c 192.168.1.27
------------------------------------------------------------
Client connecting to 192.168.1.27, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[156] local 192.168.1.90 port 49306 connected with 192.168.1.27 port 5001
[ ID] Interval Transfer Bandwidth
[156] 0.0-10.0 sec 95.2 MBytes 79.9 Mbits/sec

C:\>iperf -c 192.168.1.27
------------------------------------------------------------
Client connecting to 192.168.1.27, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[156] local 192.168.1.90 port 49341 connected with 192.168.1.27 port 5001
[ ID] Interval Transfer Bandwidth
[156] 0.0-10.0 sec 95.0 MBytes 79.7 Mbits/sec

Server:

------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
OpenSCManager failed - Access is denied. (0x5)
[248] local 192.168.1.27 port 5001 connected with 192.168.1.90 port 49306
[ ID] Interval Transfer Bandwidth
[248] 0.0-10.0 sec 95.2 MBytes 79.9 Mbits/sec
[252] local 192.168.1.27 port 5001 connected with 192.168.1.90 port 49341
[ ID] Interval Transfer Bandwidth
[252] 0.0-10.0 sec 95.0 MBytes 79.7 Mbits/sec
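
One thing worth flagging in my own output above: the 8KB default TCP window on Windows iperf caps a single stream below line rate, which likely explains the ~80Mb/s readings. Re-testing with a larger window and parallel streams (standard iperf 2 flags) should get closer to the true ceiling:

C:\>iperf -c 192.168.1.27 -w 64K -P 4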

The ESXi box is currently using two drives: local storage for the hosted VMs on a 250GB 7200rpm drive with 16MB cache, and the NAS, which is using 7200rpm drives as well.

I verified that the switch ports are set to 100Mb, and I even removed the additional port groups I had made originally.

And I agree, I really should get off my butt and upgrade this network to gigabit before worrying about storing LUNs for the VMs on the NAS. Until then I think I'll just stick with keeping the VMs on local storage and use CIFS/NFS, since those speeds are faster.
 
The last ESXi white box I built used an HP P400 controller off eBay with four RE4 drives in RAID 10.

I did experience the issue you are seeing: from inside the VM, my transfers were showing approximately 30-60MB per second.

That's lower than native, as a P400 in RAID 10 will normally show 180MB/s in benchmarks and can saturate a gigabit connection with no issues.
 
You have one active NIC on the ESXi box and are using it for both iSCSI and VM communication? Your data is hairpinning.
 
You have one active NIC on the ESXi box and are using it for both iSCSI and VM communication? Your data is hairpinning.

That was my next thought, and after talking to some of the virtualization engineers at my job, they said the same thing: every byte crosses that single 100Mb link twice, once inbound to the guest and once back out as iSCSI writes, which would explain the halved speeds. So I'll run another drop from the server to the switch, configure it, and see if magic happens.

*And of course I'm all out of RJ-45 plugs... ARGH*
 
I talked to our local VMware genius. This is what's happening: you are trying to transfer data through a virtual file system and through a virtual NIC.


To fix this, you need to mount the datastore from the SAN at the ESXi level.


Mount the LUN and the datastore using iSCSI in the VMware machine setup interface, and map the drive to the VM you want to use.

So in your VM you should have your C: boot drive and your D: drive, which was mounted via the VMware console machine configuration page. Do not map the storage drive using the iSCSI initiator inside the VM itself; if you do, the performance will be horrible.

If you follow my recommendation above, you should see near-native (NAS-level) speed from the D: drive.
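
For the command-line inclined, the ESXi-side mount looks roughly like this (a sketch for ESXi 5's software iSCSI initiator; the vmhba33 adapter name and <nas_ip> are placeholders, yours will differ):

# enable the software iSCSI initiator
esxcli iscsi software set --enabled=true

# point it at the NAS target
esxcli iscsi adapter discovery sendtarget add -A vmhba33 -a <nas_ip>

# rescan so the LUN appears, then add it as a datastore in the vSphere Client
esxcli storage core adapter rescan --all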
 
Created a new VMkernel port and moved the iSCSI traffic to it; same result. The transfer is fast for about 20-30 seconds, then drops back down. Running iperf during the transfer shows throughput is definitely cut in half.

Both NICs are connected to the same switch, which is plugged directly into the router. Will check next whether the VMkernel port is imposing default restrictions on transfer speed.
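
In case anyone needs it, this is how I'm planning to check (ESXi 5 commands; vSwitch0 is just my default switch name):

# list VMkernel interfaces and their port groups
esxcfg-vmknic -l

# show whether traffic shaping is enabled on the vSwitch
esxcli network vswitch standard policy shaping get --vswitch-name=vSwitch0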
 
Created a new VMkernel port and moved the iSCSI traffic to it; same result. The transfer is fast for about 20-30 seconds, then drops back down. Running iperf during the transfer shows throughput is definitely cut in half.

Both NICs are connected to the same switch, which is plugged directly into the router. Will check next whether the VMkernel port is imposing default restrictions on transfer speed.

What kind of router? VLANs?
 
Now when the file transfers, I get 11MB/s, which comes out to 88Mb/s
11MB/s is ~92.2Mb/s. Transfer speeds in bits are always in multiples of 1000.

Edit: and by 11MB/s I mean 11,534,336B/s. Data signal rates were always powers of 10, even before that SI prefix nonsense came into place.
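
Spelled out, using the binary megabyte that Windows reports:

11 MB/s = 11 × 1,048,576 bytes/s × 8 bits/byte = 92,274,688 b/s ≈ 92.27 Mb/s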
 
What kind of router? VLANs?

Crappy Verizon FiOS Actiontec; I haven't been able to get my Netgear router to work in conjunction with it.

Only using 1 VM; I'll eventually move up to 2 to segment my Windows/Linux networks.

11MB/s is ~92.2Mb/s. Transfer speeds in bits are always in multiples of 1000.

Right, I was estimating; since 8Mb/s = 1MB/s, 11MB/s looked good. But that was only without the VM in the equation.

Well, long story short: after getting transfers at 9MB/s, they dropped back to 4MB/s. As for the datastore, it's mounted on the NAS; only the primary VMDK is housed on local storage, so the data isn't stored with the VM.
 
Crappy Verizon FiOS Actiontec; I haven't been able to get my Netgear router to work in conjunction with it.

Only using 1 VM; I'll eventually move up to 2 to segment my Windows/Linux networks.

Right, I was estimating; since 8Mb/s = 1MB/s, 11MB/s looked good. But that was only without the VM in the equation.

Well, long story short: after getting transfers at 9MB/s, they dropped back to 4MB/s. As for the datastore, it's mounted on the NAS; only the primary VMDK is housed on local storage, so the data isn't stored with the VM.

I wonder if your firewall/router is causing the speed issue.
 
I wonder if your firewall/router is causing the speed issue.

Except that it's only affecting the ESXi server; everything else works fine. At least for now I've resorted to using CIFS for shares. I guess my next move is wiring the place for GigE.
 
Except that it's only affecting the ESXi server; everything else works fine. At least for now I've resorted to using CIFS for shares. I guess my next move is wiring the place for GigE.

New wiring would probably do your gig switch some good.
 
Just to confirm: your datastore is being mounted by ESXi and NOT by the virtualized OS?

What you are saying may be true, as block-level storage hates 100Mb Ethernet. It will also beat the tar out of your network.
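
If it helps, one quick way to confirm from the ESXi shell (ESXi 5) is to list the VMFS extents; an iSCSI-backed datastore will show up here when ESXi, not the guest, owns the mount:

esxcli storage vmfs extent list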
 
Just to confirm: your datastore is being mounted by ESXi and NOT by the virtualized OS?

What you are saying may be true, as block-level storage hates 100Mb Ethernet. It will also beat the tar out of your network.

Right. I never mount the VMDK with the VM, even though it's the default option.
 