Speeding up Windows file sharing w/ gigabit ethernet?

Stereodude

2[H]4U
Joined
Oct 20, 2000
Messages
3,285
I have a home network connecting my various PCs with gigabit ethernet through a HP ProCurve 1800-24. I have a server running Windows XP x64 SP2 in the basement. It has a RAID-5 array in it capable of >300MB/sec reads and writes. The other computers run Windows XP x86 SP2 or SP3.

I get ~45MB/sec max through my network to any of my computers with Windows file sharing. My buddy tells me this is typical for gigabit and Windows file sharing and that I need to use FTP or another protocol to get closer to the mythical 1000Mbit/sec number.

I did some testing with Iperf and found that without running parallel tests the bandwidth maxed out at about 350-400Mbit/sec, which is ~45MB/sec. Running 5 or more parallel streams I can get >900Mbit/sec each way.
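
For reference, the parallel runs used iperf's -P option, something along these lines (5 streams shown just as an example, against my server's IP):

iperf.exe -c 192.168.1.151 -t 30 -P 5

The single-stream case is the same command without -P.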

What can I do to improve my file transfer performance? I see there are articles on TCP tuning for XP, but there don't seem to be any suggestions on what parameters to change and to what.
 
The server uses an on-board Realtek 8111C (PCIe) NIC. The other machines use a variety of NICs [Realtek 8111B (PCIe), Intel 82547GI (CSA), Broadcom 5702 (PCI)]. I have an Intel Pro/1000 PT Server NIC and an Intel Gigabit CT Desktop NIC on the way to test in the server (mostly to see if they lower CPU usage), but I don't anticipate either speeding anything up.
 
Hard drives? They are likely your limitation.

FTP is a bit more of a pure transfer protocol; I would also suggest it, but it really shouldn't matter.

Your RAID5 aside, many hard drives can only pull sequential average read rates of 40-110 MB/s depending on bit density and rpm. Older drives and laptop drives down there on the low end, brand new 4 platter 2TB drives on the high end and everything else roughly in between.

Iperf is a synthetic benchmark: it generates its data in RAM rather than reading from the hard disks.
 
Hard drives? They are likely your limitation.

Your RAID5 aside, many hard drives can only pull sequential average read rates of 40-110 MB/s depending on bit density and rpm. Older drives and laptop drives down there on the low end, brand new 4 platter 2TB drives on the high end and everything else roughly in between.
I have another system with a RAID-5 array capable of >180MB/sec, and two Seagate 1.5TB (7200.11) drives in yet another system (>110MB/sec at the front of the disk). I can write a lot faster than ~45MB/sec on several of the systems. The network is definitely the limiting factor, not the HD's.

Besides, the Iperf benchmarks basically match up with what I'm seeing (excluding the parallel testing).
 
Iperf is a throughput tester that uses data straight from ram to isolate the test to pure network. In this case it's clearly showing that the hard drives are not the bottleneck.

I know this might be a long shot, but do you have ICS enabled?
Network performance and data throughput may be significantly slower after installing Windows XP Service Pack 2: http://support.microsoft.com/?kbid=842264

Some TCP window modding info available here:
http://www.enterprisenetworkingplanet.com/nethub/article.php/3485486

Windows XP/2000 Server/Server 2003
The magical location for TCP settings in the registry editor is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

We need to add a registry DWORD named TcpWindowSize and enter a sufficiently large size; 131400 (make sure you click on 'decimal') should be enough. Tcp1323Opts should be set to 3, which enables both RFC 1323 window scaling and timestamps.

And, similarly to Unix, we want to increase the TCP buffer sizes:

ForwardBufferMemory 80000
NumForwardPackets 60000
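
Pulled together into a .reg file, those suggestions would look something like this (dword values are hex in .reg files; the numbers are just the article's, so back up the key and double-check before importing):

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
; 131400 decimal
"TcpWindowSize"=dword:00020148
; RFC 1323 window scaling + timestamps
"Tcp1323Opts"=dword:00000003
; 80000 decimal
"ForwardBufferMemory"=dword:00013880
; 60000 decimal
"NumForwardPackets"=dword:0000ea60

A reboot is generally needed before TCP parameter changes like these take effect.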

Good luck with the testing and adjusting, let us know how it goes!
 
Windows XP has relatively terrible network performance. I've never been able to have much luck saturating GbE with it, but maybe with some tuning it can be achieved.

Vista and 7, as well as Server 2003+ seem to work well out of the box.
 
Iperf is a throughput tester that uses data straight from ram to isolate the test to pure network. In this case it's clearly showing that the hard drives are not the bottleneck.

I know this might be a long shot, but do you have ICS enabled?
Nope, ICS is not enabled.
Some TCP window modding info available here:
http://www.enterprisenetworkingplanet.com/nethub/article.php/3485486

Good luck with the testing and adjusting, let us know how it goes!
Ok, thanks. I'll take a look at them and see if I can make any improvements.

Without any tweaks, here are the numbers back from iperf run this way:

iperf.exe -c 192.168.1.151 -r -t 30 -w xxxxx

4k window:
c: 352 Mbit/sec
s: 493 Mbit/sec

8k window:
c: 352 Mbit/sec
s: 493 Mbit/sec

16k window:
c: 401 Mbit/sec
s: 670 Mbit/sec

32k window:
c: 884 Mbit/sec
s: 926 Mbit/sec

64k window:
c: 939 Mbit/sec
s: 930 Mbit/sec

128k window:
c: 922 Mbit/sec
s: 934 Mbit/sec
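
(If anyone wants to repeat this, the window sweep is easy to script from a batch file; something like:

for %%W in (4K 8K 16K 32K 64K 128K) do iperf.exe -c 192.168.1.151 -r -t 30 -w %%W

with %%W written as %W if you type it straight at a cmd prompt.)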
 
Ok, so I did some benchmarking and have come to the conclusion you can't speed up Windows File Sharing via SMB with TCP Window tuning.

I used the following registry tweaks in an attempt to speed things up:

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"TcpWindowSize"=dword:00080000
"Tcp1323Opts"=dword:00000003

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters]
"DefaultSendWindow"=dword:00010000

The tweaks were applied to both systems, or to neither, for each run. I pushed to both systems and pulled from both systems, using a 5.5GB file and timing how long it took to copy from one system to the other. I did 3 successive runs and averaged the times (though the times were quite consistent between runs). Server1 runs XP x64 SP2; Server2 and Desktop run XP x86 SP3 and SP2 respectively. Server1 has a Realtek 8111C (PCIe x1), Server2 has an Intel 82547GI (CSA), and Desktop has a Realtek 8111B (PCIe x1).
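
(The MB/sec figures are just file size divided by wall-clock copy time: 5.5GB is roughly 5,632MB, so a copy that takes about 95 seconds works out to ~59MB/sec.)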

First test case:

Baseline:
Server2 push to Server1 - 59.14MB/sec
Server1 pull from Server2 - 52.85MB/sec

Server2 pull from Server1 - 47.51MB/sec
Server1 push to Server2 - 14.87MB/sec

Tweaked:
Server2 push to Server1 - 59.34MB/sec
Server1 pull from Server2 - 52.04MB/sec

Server2 pull from Server1 - 35.91MB/sec
Server1 push to Server2 - 17.56MB/sec


Second test case:

Baseline:
Desktop push to Server1 - 66.07MB/sec
Server1 pull from Desktop - 51.72MB/sec

Desktop pull from Server1 - 59.34MB/sec
Server1 push to Desktop - 22.73MB/sec

Tweaked:
Desktop push to Server1 - 65.30MB/sec
Server1 pull from Desktop - 51.72MB/sec

Desktop pull from Server1 - 49.02MB/sec
Server1 push to Desktop - 22.79MB/sec


In both cases the tweaks slowed down copying data from the server and showed no noteworthy improvement anywhere else. I'll give this a big thumbs down.
 
Try Teracopy (pretty seamless user interface) or Fastcopy (UI is a toilet, but app is fast).

With Teracopy, a pair of onboard gigabit nics, and good RAID arrays on both ends I was hitting around 90-100MB/s throughput. Both machines run Vista x64.

(Areca 1231ML 12x Hitachi 1TB R5 1GB cache, Areca 1261ML 14x Seagate 7200.10 500GB R5 1GB cache)
 
Yeah, I will never do any SMB work without TeraCopy again. I love how it replaces the default copy, cut, and paste in Windows.
 
The network is definitely the limiting factor, not the HD's.

Without any tweaks, here are the numbers back from iperf run this way:

iperf.exe -c 192.168.1.151 -r -t 30 -w xxxxx

<snip>

32k window:
c: 884 Mbit/sec
s: 926 Mbit/sec

64k window:
c: 939 Mbit/sec
s: 930 Mbit/sec

128k window:
c: 922 Mbit/sec
s: 934 Mbit/sec

OK. This tells me your network (everything between each NIC's transceivers) is OK. The limit of Gb ethernet is going to be around 900Mb/s.
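
(For reference: with standard 1500-byte frames you get at most 1460 bytes of TCP payload out of 1538 bytes on the wire, so the theoretical ceiling is roughly 1460/1538 x 1000 = ~949Mbit/sec, or about 118MB/sec. 900+Mbit/sec in iperf is essentially line rate.)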

Your HDD subsystems certainly seem fast enough. So what's left?

"How To Build a Really Fast NAS - Part 6: The Vista (SP1) Difference"
 
With Teracopy, a pair of onboard gigabit nics, and good RAID arrays on both ends I was hitting around 90-100MB/s throughput. Both machines run Vista x64.
Vista uses SMB 2.0, which is reportedly faster than the SMB 1.0 that XP uses, so I'm not really surprised that you get better numbers.

I'll have to take a look at Teracopy.
 
Ok, well Teracopy gets a big F for failure!!!

Windows Explorer Copy: (from before)
Server2 push to Server1 - 59.14MB/sec
Server2 pull from Server1 - 47.51MB/sec

Teracopy 2.06beta:
Server2 push to Server1 - 42.93MB/sec
Server2 pull from Server1 - 32.78MB/sec
 
Ironically, I have had the same experience with Teracopy. It seems to copy slightly slower than the stock interface, but the convenience makes for better ease of use. If you want to maximize transfer speeds, try Fastcopy. The interface is not nearly as polished or as convenient, but it's given me the fastest SMB transfers I've seen on my network.

http://www.ipmsg.org/tools/fastcopy.html.en

When are your Intel NICs supposed to arrive? If you have time, can you do some testing when they arrive with and without jumbo frames? From my memory that gave me a decent speedup with the Intels, but my memory is fuzzy.
 
I'm not sure when my Intel NICs will arrive. :(

Jumbo frames are a bit of a problem for me. My switch and all the gigabit devices on the network support them, but I have three 100Mbit devices and one 10Mbit device that obviously don't. One of those is my Linksys WRT-54G. As I understand it, you can't mix jumbo and non-jumbo frames on a network without creating a lot of problems.

I messed around with creating a jumbo frame gigabit only VLAN for filesharing between the machines, and a 2nd VLAN for the 10/100Mbit stuff and internet a while back, but I found out most of the on-board gigabit solutions couldn't handle that setup which forced me to put an Intel 32-bit PCI Pro/1000 MT card in most of the systems. By the time I did all that I didn't see any real performance boost (not sure if the idea is flawed, or if the 32-bit PCI bus held all the systems back). I guess I should enable jumbo frames on two of my systems and test again.
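
If I do retest, a quick sanity check that a 9k MTU actually survives end-to-end is a don't-fragment ping sized just under it, e.g.:

ping -f -l 8972 192.168.1.151

(8972 = 9000 minus 28 bytes of IP/ICMP headers; the exact payload depends on what jumbo size the NICs are set to.)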
 
I think you'll find that jumbo frames might reduce CPU load slightly but probably won't speed throughput much. Try FTP, I bet you'll find you get much closer to the performance you're expecting. As far as I can tell (and I see the same behaviour on every WinXP box I've used), the problem is in SMB, not the TCP/IP stack itself. There might be some tuning you can do to that somewhere, but I have never found any.

With good NICs I see very little difference in either performance or CPU usage with jumbo frames on and off, so I just leave it disabled as I too have some 100mbit devices still on my network, and it's just not worth the hassle.
 
I enabled jumbo frames and did a quick test and found that transfers were much slower with 9k jumbo frames turned on than the default 1.5k variety. My 59MB/sec transfers dropped to 37MB/sec. :eek:
 
Someone else I know is going to trunk / team two 1Gbit/sec links in an attempt to get better throughput. I'm not sure how much that will improve things since SMB 1.0 seems very inefficient.
 
Someone else I know is going to trunk / team two 1Gbit/sec links in an attempt to get better throughput. I'm not sure how much that will improve things since SMB 1.0 seems very inefficient.

In most cases aggregation is useless for host->host traffic. It becomes more useful as the number of clients increases. There are ways around this, but I don't think I've ever seen any NIC drivers that actually support anything other than basic L2 or L3 round robin - only high end switches can do better.
 
There's usually a hash function involved (XOR / MOD of the src-dst IP or MAC pair), so host-to-host you'd always be using the same link in the bundle.
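
For example, with a two-link trunk and a simple hash like (last octet of src MAC XOR last octet of dst MAC) mod 2, a fixed pair of hosts always hashes to the same member link (say 0x5A XOR 0x3C = 0x66, and 0x66 mod 2 = 0, for every frame), so only traffic between different host pairs gets spread across both links.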
 
When are your Intel NICs supposed to arrive?
I got the Intel Pro/1000 PT Server card today but I haven't installed it yet.
If you have time, can you do some testing when they arrive with and without jumbo frames?
I can try it once I get done redoing my RAID array. I'm moving from 6 x 1.5TB in RAID-5 to 8 x 1.5TB in RAID-6 shortly. :cool:
 
Realtek 8111C in Server1:

Case1:
Desktop push to Server1 - 66.07MB/sec
Server1 pull from Desktop - 51.72MB/sec

Desktop pull from Server1 - 59.34MB/sec
Server1 push to Desktop - 22.73MB/sec

Case2:
Server2 push to Server1 - 59.14MB/sec
Server1 pull from Server2 - 52.85MB/sec

Server2 pull from Server1 - 47.51MB/sec
Server1 push to Server2 - 14.87MB/sec


Intel Pro/1000 PT Server in Server1:

Case1:
Desktop push to Server1 - 63.11MB/sec
Server1 pull from Desktop - 56.19MB/sec

Desktop pull from Server1 - 56.56MB/sec
Server1 push to Desktop - 22.76MB/sec

Case2:
Server2 push to Server1 - 59.34MB/sec
Server1 pull from Server2 - 53.18MB/sec

Server2 pull from Server1 - 48.32MB/sec
Server1 push to Server2 - 14.74MB/sec

So, uh... not a lot changed by replacing the Realtek 8111C in Server1 with an Intel Pro/1000 PT Server card. In Case1, two of the four tests slowed down by 3MB/sec, one sped up by 3MB/sec, and one stayed the same. There was basically no change in Case2 with Server2 (which already has an Intel NIC in it).
 
Have you tried multiple copies at the same time? I found at times I couldn't get full saturation without doing 2+ copies at the same time.
 
Here are the results from Case1 in a graph. The fastest NIC in one scenario is also the slowest in the other three scenarios. Classic...



The results are the average of 3 copies of a 5.5GB file using Windows Explorer. Clearly the NIC has a far greater impact on file transfer speeds than any TCP "optimizations" (which made things worse in my testing).
 
Have you tried multiple copies at the same time? I found at times I couldn't get full saturation without doing 2+ copies at the same time.
That's an interesting idea, but as I understand it, writing multiple files to the drive simultaneously leads to file fragmentation.
 