Help Identifying Network Transfer problem

nitrobass24

[H]ard|DCer of the Month - December 2009
Joined
Apr 7, 2006
Messages
10,466
My main fileserver has been atrociously slow over the last few months.

It's a dual Xeon E5530 with 12GB ECC, a 14TB RAID 6 array, a 1200MB RAID 10 array, and various single disks.
It has dual Intel gigabit NICs, teamed.
I'm using a Dell 2716 GbE switch.
All of my desktops have Intel GbE NICs.

Any sequential file transfer to/from either array or the single disks is dog slow... 25MB/s or less.

Is there a tool I can use to simulate a transfer between two machines? I am trying to figure out whether my network is somehow to blame or whether it's my server.
I'm not going to blow away 14TB, reinstall Windows, and restore from backups unless I'm sure it will help.
 
Use iperf. It measures raw network throughput to another machine, which takes disk, Windows, and all the other pieces that slow down network transfers out of the equation. If your speed is low here, then you should look at your NICs, your switch, or Windows. Driver updates, firmware updates, and jumbo frames are your friends here.
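A minimal run looks something like this (iperf 1.7/2.x-style flags; the IP address below is just a placeholder for your server's address):

```shell
:: On the machine you're testing against, start iperf in server mode:
iperf -s

:: From the other machine, run the client for 30 seconds with 1-second reports:
iperf -c 192.168.1.100 -t 30 -i 1
```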

If your network looks fine (the output should look something like this)

------------------------------------------------------------
Client connecting to 10.1.2.3, TCP port 5001
TCP window size: 8.00 KByte (default)
------------------------------------------------------------
[244] local 10.1.3.1 port 53916 connected with 10.1.2.3 port 5001
[ ID] Interval Transfer Bandwidth
[244] 0.0-10.0 sec 236 MBytes 198 Mbits/sec

but you are still getting crappy file transfers, grab HD Tune Pro and test the read speeds of your arrays. If they are low, start looking at driver/firmware updates for your RAID card.

Let us know if it helps any.
 
This is exactly what I was looking for.

Will report back with results.
 
OK, so I performed this test from my main workstation to the fileserver and from my HTPC to the fileserver. I also performed the test from my main workstation to the HTPC and got the following result. This tells me that I have a network issue. The only problem is that there is only a Dell 2716 between me, the fileserver, and the HTPC.

What could I possibly configure on that thing to make it better? I have the latest LAN drivers installed on each machine and get similar results.

Code:
bin/iperf.exe -c 192.168.1.189 -P 1 -i 1 -p 5001 -f M -t 30
------------------------------------------------------------
Client connecting to 192.168.1.189, TCP port 5001
TCP window size: 0.01 MByte (default)
------------------------------------------------------------
[128] local 192.168.1.197 port 50675 connected with 192.168.1.189 port 5001
[ ID] Interval       Transfer     Bandwidth
[128]  0.0- 1.0 sec  50.6 MBytes  50.6 MBytes/sec
[128]  1.0- 2.0 sec  49.6 MBytes  49.6 MBytes/sec
[128]  2.0- 3.0 sec  49.9 MBytes  49.9 MBytes/sec
[128]  3.0- 4.0 sec  49.6 MBytes  49.6 MBytes/sec
[128]  4.0- 5.0 sec  50.0 MBytes  50.0 MBytes/sec
[128]  5.0- 6.0 sec  50.3 MBytes  50.3 MBytes/sec
[128]  6.0- 7.0 sec  50.1 MBytes  50.1 MBytes/sec
[128]  7.0- 8.0 sec  49.8 MBytes  49.8 MBytes/sec
[128]  8.0- 9.0 sec  49.9 MBytes  49.9 MBytes/sec
[128]  9.0-10.0 sec  50.6 MBytes  50.6 MBytes/sec
[128] 10.0-11.0 sec  50.4 MBytes  50.4 MBytes/sec
[128] 11.0-12.0 sec  49.9 MBytes  49.9 MBytes/sec
[128] 12.0-13.0 sec  50.1 MBytes  50.1 MBytes/sec
[128] 13.0-14.0 sec  49.4 MBytes  49.4 MBytes/sec
[128] 14.0-15.0 sec  49.8 MBytes  49.8 MBytes/sec
[128] 15.0-16.0 sec  49.8 MBytes  49.8 MBytes/sec
[128] 16.0-17.0 sec  50.1 MBytes  50.1 MBytes/sec
[128] 17.0-18.0 sec  49.8 MBytes  49.8 MBytes/sec
[128] 18.0-19.0 sec  50.2 MBytes  50.2 MBytes/sec
[128] 19.0-20.0 sec  50.7 MBytes  50.7 MBytes/sec
[ ID] Interval       Transfer     Bandwidth
[128] 20.0-21.0 sec  50.4 MBytes  50.4 MBytes/sec
[128] 21.0-22.0 sec  50.2 MBytes  50.2 MBytes/sec
[128] 22.0-23.0 sec  49.8 MBytes  49.8 MBytes/sec
[128] 23.0-24.0 sec  50.9 MBytes  50.9 MBytes/sec
[128] 24.0-25.0 sec  50.5 MBytes  50.5 MBytes/sec
[128] 25.0-26.0 sec  50.4 MBytes  50.4 MBytes/sec
[128] 26.0-27.0 sec  50.3 MBytes  50.3 MBytes/sec
[128] 27.0-28.0 sec  50.4 MBytes  50.4 MBytes/sec
[128] 28.0-29.0 sec  50.0 MBytes  50.0 MBytes/sec
[128] 29.0-30.0 sec  50.9 MBytes  50.9 MBytes/sec
[128]  0.0-30.0 sec  1504 MBytes  50.1 MBytes/sec
Done.
 

Lose the switch, retest, report back

EDIT: actually, try adding -w 64k to the end of that command first
 
I would use jperf. It's a GUI frontend for iperf, and much, much easier to use for quickly trying different options to narrow down the issue.

http://code.google.com/p/xjperf/

I'd also lose the teaming; you're very, very likely not gaining anything from it except possible negative consequences.

Your Dell 2716 does support jumbo frames (9000 MTU max); see here: http://pages.uoregon.edu/joe/jumbo-clean-gear.html

However, I've had hit-or-miss luck with jumbo frames actually enabling higher throughput.
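If you do enable jumbo frames, you can sanity-check that the whole path supports them from Windows with a don't-fragment ping sized for a 9000-byte MTU (the address below is just an example; substitute your server's IP):

```shell
:: Windows ping: -f = set the don't-fragment bit, -l = payload size in bytes.
:: 8972 bytes of payload + 8 bytes ICMP header + 20 bytes IP header = 9000-byte MTU.
ping -f -l 8972 192.168.1.100

:: If any hop replies "Packet needs to be fragmented but DF set.",
:: jumbo frames aren't enabled somewhere along the path.
```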

I've been working on tweaking throughput for the last few days; here are some of my tests with the relevant info:

CLIENT -
Code:
netsh interface tcp>show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
NetDMA State                        : enabled
Direct Cache Acess (DCA)            : disabled
Receive Window Auto-Tuning Level    : normal
Add-On Congestion Control Provider  : none
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled

netsh interface ipv4>show subinterface

   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0      85053  Loopback Pseudo-Interface 1
  1500                1  7807903614  5113705283  Local Area Connection 2

SERVER (Server 2008; note it has CTCP turned on. Server 2008 has it on by default and Win 7 has it off by default, so this is normal.)

Code:
netsh interface tcp>show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
NetDMA State                        : enabled
Direct Cache Acess (DCA)            : disabled
Receive Window Auto-Tuning Level    : normal
Add-On Congestion Control Provider  : ctcp
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled

netsh interface ipv4>show subinterfaces

   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0          0  Loopback Pseudo-Interface 1
  9000                1  152361140683  9849688779  Local Area Connection


Code:
bin/iperf.exe -c 10.10.1.51 -P 1 -i 1 -p 5001 -f m -t 10
------------------------------------------------------------
Client connecting to 10.10.1.51, TCP port 5001
TCP window size: 0.01 MByte (default)
------------------------------------------------------------
[148] local 10.10.1.50 port 65219 connected with 10.10.1.51 port 5001
[ ID] Interval       Transfer     Bandwidth
[148]  0.0- 1.0 sec  38.8 MBytes   325 Mbits/sec
[148]  1.0- 2.0 sec  38.3 MBytes   321 Mbits/sec
[148]  2.0- 3.0 sec  38.5 MBytes   323 Mbits/sec
[148]  3.0- 4.0 sec  38.7 MBytes   325 Mbits/sec
[148]  4.0- 5.0 sec  37.9 MBytes   318 Mbits/sec
[148]  5.0- 6.0 sec  36.5 MBytes   306 Mbits/sec
[148]  6.0- 7.0 sec  37.7 MBytes   316 Mbits/sec
[148]  7.0- 8.0 sec  37.3 MBytes   313 Mbits/sec
[148]  8.0- 9.0 sec  38.0 MBytes   319 Mbits/sec
[148]  9.0-10.0 sec  34.9 MBytes   293 Mbits/sec
[148]  0.0-10.0 sec   377 MBytes   316 Mbits/sec
Done.

Notice how much faster this is from changing one single parameter: it saturates the gigabit link with only one connection because the buffer length was set to 1MB.
Code:
bin/iperf.exe -c 10.10.1.51 -P 1 -i 1 -p 5001 -l 1.0M -f m -t 10
------------------------------------------------------------
Client connecting to 10.10.1.51, TCP port 5001
TCP window size: 0.01 MByte (default)
------------------------------------------------------------
[148] local 10.10.1.50 port 65223 connected with 10.10.1.51 port 5001
[ ID] Interval       Transfer     Bandwidth
[148]  0.0- 1.0 sec   109 MBytes   914 Mbits/sec
[148]  1.0- 2.0 sec   110 MBytes   923 Mbits/sec
[148]  2.0- 3.0 sec   112 MBytes   940 Mbits/sec
[148]  3.0- 4.0 sec   112 MBytes   940 Mbits/sec
[148]  4.0- 5.0 sec   112 MBytes   940 Mbits/sec
[148]  5.0- 6.0 sec   110 MBytes   923 Mbits/sec
[148]  6.0- 7.0 sec   112 MBytes   940 Mbits/sec
[148]  7.0- 8.0 sec   112 MBytes   940 Mbits/sec
[148]  8.0- 9.0 sec   112 MBytes   940 Mbits/sec
[148]  9.0-10.0 sec   110 MBytes   923 Mbits/sec
[148]  0.0-10.0 sec  1112 MBytes   931 Mbits/sec
Done.

If I use 4 parallel streams at the default settings I can get about 800 Mbit/s; using 5 gets me to around 850. Here's the run with 4:

Code:
bin/iperf.exe -c 10.10.1.51 -P 4 -i 1 -p 5001 -f m -t 10
------------------------------------------------------------
Client connecting to 10.10.1.51, TCP port 5001
TCP window size: 0.01 MByte (default)
------------------------------------------------------------
[172] local 10.10.1.50 port 49310 connected with 10.10.1.51 port 5001
[164] local 10.10.1.50 port 49309 connected with 10.10.1.51 port 5001
[156] local 10.10.1.50 port 49308 connected with 10.10.1.51 port 5001
[148] local 10.10.1.50 port 49307 connected with 10.10.1.51 port 5001
[ ID] Interval       Transfer     Bandwidth
[172]  0.0- 1.0 sec  22.9 MBytes   192 Mbits/sec
[164]  0.0- 1.0 sec  24.3 MBytes   203 Mbits/sec
[156]  0.0- 1.0 sec  24.2 MBytes   203 Mbits/sec
[148]  0.0- 1.0 sec  23.6 MBytes   198 Mbits/sec
[SUM]  0.0- 1.0 sec  95.0 MBytes   797 Mbits/sec
[156]  1.0- 2.0 sec  23.8 MBytes   199 Mbits/sec
[172]  1.0- 2.0 sec  21.2 MBytes   178 Mbits/sec
[164]  1.0- 2.0 sec  25.0 MBytes   210 Mbits/sec
[148]  1.0- 2.0 sec  21.6 MBytes   181 Mbits/sec
[SUM]  1.0- 2.0 sec  91.6 MBytes   768 Mbits/sec
[148]  2.0- 3.0 sec  23.3 MBytes   196 Mbits/sec
[156]  2.0- 3.0 sec  24.2 MBytes   203 Mbits/sec
[172]  2.0- 3.0 sec  23.1 MBytes   193 Mbits/sec
[164]  2.0- 3.0 sec  24.1 MBytes   202 Mbits/sec
[SUM]  2.0- 3.0 sec  94.7 MBytes   794 Mbits/sec
[156]  3.0- 4.0 sec  25.7 MBytes   216 Mbits/sec
[164]  3.0- 4.0 sec  25.6 MBytes   215 Mbits/sec
[172]  3.0- 4.0 sec  23.4 MBytes   196 Mbits/sec
[148]  3.0- 4.0 sec  23.7 MBytes   199 Mbits/sec
[SUM]  3.0- 4.0 sec  98.4 MBytes   826 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[172]  4.0- 5.0 sec  22.3 MBytes   187 Mbits/sec
[148]  4.0- 5.0 sec  22.6 MBytes   190 Mbits/sec
[164]  4.0- 5.0 sec  25.9 MBytes   217 Mbits/sec
[156]  4.0- 5.0 sec  25.6 MBytes   215 Mbits/sec
[SUM]  4.0- 5.0 sec  96.5 MBytes   810 Mbits/sec
[164]  5.0- 6.0 sec  24.7 MBytes   207 Mbits/sec
[156]  5.0- 6.0 sec  25.0 MBytes   210 Mbits/sec
[148]  5.0- 6.0 sec  23.9 MBytes   200 Mbits/sec
[172]  5.0- 6.0 sec  23.7 MBytes   199 Mbits/sec
[SUM]  5.0- 6.0 sec  97.2 MBytes   816 Mbits/sec
[148]  6.0- 7.0 sec  24.1 MBytes   202 Mbits/sec
[172]  6.0- 7.0 sec  23.9 MBytes   200 Mbits/sec
[164]  6.0- 7.0 sec  25.2 MBytes   212 Mbits/sec
[156]  6.0- 7.0 sec  25.1 MBytes   211 Mbits/sec
[SUM]  6.0- 7.0 sec  98.3 MBytes   824 Mbits/sec
[164]  7.0- 8.0 sec  25.0 MBytes   210 Mbits/sec
[156]  7.0- 8.0 sec  25.2 MBytes   211 Mbits/sec
[172]  7.0- 8.0 sec  23.8 MBytes   200 Mbits/sec
[148]  7.0- 8.0 sec  24.0 MBytes   202 Mbits/sec
[SUM]  7.0- 8.0 sec  98.0 MBytes   822 Mbits/sec
[ ID] Interval       Transfer     Bandwidth
[164]  8.0- 9.0 sec  24.5 MBytes   206 Mbits/sec
[148]  8.0- 9.0 sec  22.8 MBytes   191 Mbits/sec
[156]  8.0- 9.0 sec  24.6 MBytes   206 Mbits/sec
[172]  8.0- 9.0 sec  22.6 MBytes   189 Mbits/sec
[SUM]  8.0- 9.0 sec  94.4 MBytes   792 Mbits/sec
[164]  9.0-10.0 sec  23.8 MBytes   199 Mbits/sec
[172]  9.0-10.0 sec  22.9 MBytes   192 Mbits/sec
[156]  9.0-10.0 sec  23.9 MBytes   201 Mbits/sec
[148]  9.0-10.0 sec  23.2 MBytes   194 Mbits/sec
[SUM]  9.0-10.0 sec  93.8 MBytes   787 Mbits/sec
[164]  0.0-10.0 sec   248 MBytes   208 Mbits/sec
[172]  0.0-10.0 sec   230 MBytes   193 Mbits/sec
[156]  0.0-10.0 sec   247 MBytes   207 Mbits/sec
[148]  0.0-10.0 sec   233 MBytes   195 Mbits/sec
[SUM]  0.0-10.0 sec   958 MBytes   803 Mbits/sec
Done.

Windows 7/2008 also has an automatic TCP window size adjustment capability; you'll notice below how it says the autotuning level is the result of Windows Scaling heuristics overriding any user-defined settings. Turn it off while you experiment with different window sizes to see what works for you. To do that, issue "set heuristics disabled" from the netsh interface tcp prompt; set it back to enabled to turn it back on.
Code:
netsh interface tcp>show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
NetDMA State                        : enabled
Direct Cache Acess (DCA)            : disabled
Receive Window Auto-Tuning Level    : normal
Add-On Congestion Control Provider  : none
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled
** The above autotuninglevel setting is the result of Windows Scaling heuristics
overriding any local/policy configuration on at least one profile.

netsh interface tcp>set heuristics disabled
Ok.

netsh interface tcp>show global
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : automatic
NetDMA State                        : enabled
Direct Cache Acess (DCA)            : disabled
Receive Window Auto-Tuning Level    : normal
Add-On Congestion Control Provider  : none
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled

After setting that to off, you can also try what greengolftee said and set the TCP window size to 64K. Here are my results with heuristics off, everything at defaults including the buffer size, and only the window size changed to 64K. As you can see, it's a pretty dramatic improvement!

Code:
bin/iperf.exe -c 10.10.1.51 -P 1 -i 1 -p 5001 -w 64.0K -f m -t 10
------------------------------------------------------------
Client connecting to 10.10.1.51, TCP port 5001
TCP window size: 0.06 MByte
------------------------------------------------------------
[148] local 10.10.1.50 port 50174 connected with 10.10.1.51 port 5001
[ ID] Interval       Transfer     Bandwidth
[148]  0.0- 1.0 sec   107 MBytes   894 Mbits/sec
[148]  1.0- 2.0 sec   103 MBytes   861 Mbits/sec
[148]  2.0- 3.0 sec   103 MBytes   866 Mbits/sec
[148]  3.0- 4.0 sec   105 MBytes   878 Mbits/sec
[148]  4.0- 5.0 sec   109 MBytes   911 Mbits/sec
[148]  5.0- 6.0 sec   107 MBytes   900 Mbits/sec
[148]  6.0- 7.0 sec   106 MBytes   893 Mbits/sec
[148]  7.0- 8.0 sec   105 MBytes   881 Mbits/sec
[148]  8.0- 9.0 sec   108 MBytes   902 Mbits/sec
[148]  9.0-10.0 sec   104 MBytes   873 Mbits/sec
[148]  0.0-10.0 sec  1056 MBytes   886 Mbits/sec
Done.
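For anyone wondering why the window size has such a dramatic effect: a single TCP stream can never move more than (window size / round-trip time), no matter how fast the link is. Here is a rough sketch of that ceiling, assuming an illustrative 0.5 ms LAN round-trip time (not measured from this thread):

```python
def max_tcp_throughput_mbits(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on a single TCP stream: window / RTT, in Mbit/s."""
    return window_bytes * 8 / rtt_seconds / 1e6

# iperf's old 8 KB default window over an assumed 0.5 ms LAN RTT:
# the ceiling is roughly 131 Mbit/s, nowhere near gigabit.
small = max_tcp_throughput_mbits(8 * 1024, 0.0005)

# A 64 KB window over the same RTT lifts the ceiling past 1 Gbit/s,
# which is why the link saturates in the run above.
large = max_tcp_throughput_mbits(64 * 1024, 0.0005)

print(f"8 KB window:  {small:.0f} Mbit/s max")
print(f"64 KB window: {large:.0f} Mbit/s max")
```

The real RTT on your LAN may differ, but the shape of the math is the same: until the window covers the bandwidth-delay product, the sender sits idle waiting for ACKs.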
 
Yeah, I like the teaming because with Hyper-V I can only assign one external network to one physical NIC, so this provides me some failover capability.

I don't think it's hurting anything, because I have the same network issue without the server in the mix. I'm trying to figure out how I can test this without my 2716. Might go borrow a different switch from a buddy.
 
Oh, I didn't realize you were using virtual servers and the NICs for failover; that's different and should be fine. I wouldn't bother removing the switch before you run the tests with jperf. It'll take all of 15 minutes.

But by "lose the switch" I think he meant connect the two machines with a crossover cable, to eliminate variables.
 