I read so many posts where people bad-mouth NFS because they can't seem to get the throughput they expect from it. I've been doing hardcore protocol analysis for almost 20 years now. That said, I've done very little NFS analysis, and heck, I don't even feel comfortable with all the ins and outs of 1000BASE-T. My suspicion, though, has been that folks having NFS performance problems haven't taken the time to understand all the variables at play and to make sure everything is set optimally.
So today I set up a test to see exactly what NFS file-copy performance I could get between a new Ubuntu box I was setting up and one of my ZFS servers. Here's the network layout:
Ubuntu (10.10)
| -Cat 6 cable
Linksys SRW2024 Switch
| - Cat 6 cable
ZFS (Solaris 10 u9)
I started this exercise because a simple dd read test wasn't getting anywhere near the ~100 MB/s (1 Gbps) I was expecting:
Code:
root@kvm330:~# dd if=/mnt/cytel/temporary/llb_sda.dd of=/dev/null
^C22057633+0 records in
22057632+0 records out
11293507584 bytes (11 GB) copied, 169.848 s, 66.5 MB/s
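As an aside: dd reads in 512-byte records by default (note the huge records-in count above), which adds per-syscall overhead on the client side. A larger block size is a more realistic sequential-read test. This is a hedged sketch reusing the path from the test above, not a change I made in the original run:

```shell
# dd defaults to 512-byte records; reading in 1 MiB blocks cuts
# syscall overhead and better resembles a real bulk file copy.
dd if=/mnt/cytel/temporary/llb_sda.dd of=/dev/null bs=1M
```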
So then I started checking things. Initial configuration of the Ubuntu box:
Code:
root@kvm330:~# ifconfig
eth1 Link encap:Ethernet HWaddr 00:e0:81:4b:b8:ea
inet addr:192.168.2.110 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::2e0:81ff:fe4b:b8ea/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:5651497 errors:0 dropped:0 overruns:0 frame:0
TX packets:1708607 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8530745609 (8.5 GB) TX bytes:123711656 (123.7 MB)
Interrupt:26
root@kvm330:~# ethtool -k eth1
Offload parameters for eth1:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
ntuple-filters: off
receive-hashing: off
root@kvm330:~# ethtool -a eth1
Pause parameters for eth1:
Autonegotiate: on
RX: on
TX: on
root@kvm330:~# ethtool -g eth1
Ring parameters for eth1:
Pre-set maximums:
RX: 511
RX Mini: 0
RX Jumbo: 0
TX: 511
Current hardware settings:
RX: 200
RX Mini: 0
RX Jumbo: 0
TX: 511
The glaring issue was that TX and RX flow control were enabled on the NIC, yet I had flow control *disabled* on the switch port it was connected to. If you research this setting, you'll find people making the case that it's better to just let packets drop and let TCP retransmit as necessary: recovery from congestion is typically faster with TCP than with Ethernet-level (pause-frame) flow control. I'd like to understand this better, but for now I opted to disable it.
So here I am disabling TX and RX flow control on the Ubuntu eth1:
Code:
root@kvm330:~# ethtool -A eth1 autoneg off rx off tx off
root@kvm330:~#
root@kvm330:~# ethtool -a eth1
Pause parameters for eth1:
Autonegotiate: off
RX: off
TX: off
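If you want to see whether pause frames were actually being exchanged before the change, many NIC drivers expose pause/flow-control counters through `ethtool -S`. The counter names vary by driver (e1000-family drivers report names like rx_flow_control_xon, for example), so the grep pattern below is a sketch:

```shell
# Dump NIC statistics and pull out anything pause/flow-control related.
# Counter names differ per driver, so cast a wide net with the pattern.
ethtool -S eth1 | grep -i -E 'pause|flow'
```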
And here's another test. I'm switching to a different file to keep file caching from skewing the result.
Code:
root@kvm330:~# dd if=/mnt/cytel/temporary/ymco_freebsd.img of=/dev/null
^C14314145+0 records in
14314144+0 records out
7328841728 bytes (7.3 GB) copied, 83.4662 s, 87.8 MB/s
How's that for a roughly 30% improvement (66.5 to 87.8 MB/s) from one simple but important adapter setting?
Next, notice that jumbo frames ARE enabled on the ZFS server but are NOT enabled on the Ubuntu box:
Code:
# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
e1000g0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 9000 index 2
inet 192.168.2.3 netmask ffffff00 broadcast 192.168.2.255
ether 0:30:48:dc:e0:6a
(See the earlier ifconfig output for the Ubuntu side: MTU 1500.)
So now let's enable jumbos on the Ubuntu server:
Code:
root@kvm330:~# ifconfig eth1 mtu 9000
root@kvm330:~# ifconfig
eth1 Link encap:Ethernet HWaddr 00:e0:81:4b:b8:ea
inet addr:192.168.2.110 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::2e0:81ff:fe4b:b8ea/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:2187784 errors:0 dropped:0 overruns:0 frame:0
TX packets:1109459 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:19320371115 (19.3 GB) TX bytes:82271651 (82.2 MB)
Interrupt:26
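Note that both the `ifconfig eth1 mtu 9000` change and the earlier `ethtool -A` pause change are lost on reboot. On Ubuntu 10.10 (ifupdown, pre-Netplan) one common way to persist them is /etc/network/interfaces. The stanza below is an illustrative sketch assuming a static address matching the output above; your interface configuration may well differ:

```shell
# /etc/network/interfaces (illustrative sketch, not my actual config)
auto eth1
iface eth1 inet static
    address 192.168.2.110
    netmask 255.255.255.0
    mtu 9000
    # re-apply the pause settings each time the interface comes up
    post-up ethtool -A eth1 autoneg off rx off tx off
```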
Now let's run snoop on the ZFS server during the test to make sure we're actually using jumbo frames. Yup; note the Len=8948 segments (9000-byte MTU minus 20 bytes IP header, 20 bytes TCP header, and 12 bytes of TCP timestamp options):
Code:
# snoop -r -d e1000g0 -c 10 ether host 00:E0:81:4B:B8:EA
Using device e1000g0 (promiscuous mode)
192.168.2.3 -> 192.168.2.110 RPC R XID=3047194340 Success
192.168.2.110 -> 192.168.2.3 TCP D=2049 S=978 Ack=990785284 Seq=1341721328 Len=0 Win=24576 Options=<nop,nop,tstamp 236608 106894330>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990776336 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990785284 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
192.168.2.110 -> 192.168.2.3 TCP D=2049 S=978 Ack=990803180 Seq=1341721328 Len=0 Win=24576 Options=<nop,nop,tstamp 236608 106894330>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990794232 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990803180 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
192.168.2.110 -> 192.168.2.3 TCP D=2049 S=978 Ack=990821076 Seq=1341721328 Len=0 Win=24576 Options=<nop,nop,tstamp 236609 106894330>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990812128 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
192.168.2.3 -> 192.168.2.110 TCP D=978 S=2049 Ack=1341721328 Seq=990821076 Len=8948 Win=53688 Options=<nop,nop,tstamp 106894330 236587>
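Another quick way to confirm the whole path (NIC, switch port, and peer) handles 9000-byte frames is a don't-fragment ping from the Linux side. The payload size is 8972 bytes because the 9000-byte MTU must also hold the 20-byte IPv4 header and the 8-byte ICMP header:

```shell
# Send don't-fragment ICMP echoes sized to exactly fill a 9000-byte frame:
# 9000 (MTU) - 20 (IPv4 header) - 8 (ICMP header) = 8972 payload bytes.
ping -M do -s 8972 -c 3 192.168.2.3
# If any hop's MTU is smaller, you get a "Frag needed" error instead of replies.
```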
And the result:
Code:
root@kvm330:~# dd if=/mnt/cytel/temporary/ymco_freebsd.img of=/dev/null
^C4029089+0 records in
4029088+0 records out
2062893056 bytes (2.1 GB) copied, 19.2819 s, 107 MB/s
Nice!
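Frame size is only one of the variables at play; the NFS mount options matter too. As a final hedged sketch (the export path and rsize/wsize values here are illustrative, not taken from my setup), it's worth checking that the mount is using TCP with large read/write sizes; on Linux, `nfsstat -m` shows what was actually negotiated with the server:

```shell
# Illustrative /etc/fstab entry -- the client and server negotiate the final
# rsize/wsize, so verify the effective values with 'nfsstat -m' after mounting:
# 192.168.2.3:/export/temporary  /mnt/cytel  nfs  tcp,rsize=32768,wsize=32768  0  0
nfsstat -m
```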