4xGigE - max throughput, two hosts?

dune

My Google-Fu on this topic seems to be failing to produce a workable solution (if one exists). I have about 25TB of data I need to move as quickly as possible between two HP servers (Windows 2008 R2 and 2012 R2). 10G isn't a viable option right now.

Server 1 - HP DL360 G6, NC364T Quad-port GigE
Server 2 - HP DL360 G7, NC382i (onboard) Quad-port GigE
Switch - Cisco Catalyst 4948

I started out trying LACP but quickly realized its limitations (a single flow still only rides one link) and moved on to other methods. I tried to emulate the setup from this article, though I didn't really see how it would help, and couldn't get traffic through more than one interface at a time.

Moved on to experimenting with round-robin, which did distribute traffic, but the out-of-order packets nuked performance. I also tried src-dst-port hashing with 4 concurrent FTP streams, but for some reason only two interfaces were utilized, at 50%. I also think I'm fighting the differences between the HP teaming utility on 2008 R2 and the built-in teaming on 2012 R2.
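
For reference, the hash policy on the 4948 is a global setting; what I was testing with was roughly this (exact keywords can vary by IOS version):

port-channel load-balance src-dst-port
show etherchannel load-balance

The second command just confirms which hash method is actually active.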

SMB 3.0 multichannel would be nice, except only one server is on 2012 R2. I've also been running multiple iperf instances to try to get more distribution, but that showed similar results to FTP.
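
The iperf runs have been along these lines (addresses are just examples), using parallel streams so the hash has multiple flows to work with:

iperf -s
iperf -c 192.168.10.2 -P 4 -t 60

The first goes on the receiving server, the second on the sender; -P 4 opens 4 parallel TCP streams.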

I think I'm looking for the right combination of teaming/switch configuration and a multi-stream transfer protocol I can use within Windows. Any suggestions are welcome, thanks.
 
I know it's not what you asked for, but can you approach this from a different angle? Make 4 completely separate networks, manually divide the workload across them, and copy each chunk over its own network. So 192.168.1.1 and .2, then 192.168.2.1 and .2, and so on. Then FTP the A-G data over the 1.0 network, the H-O data over the 2.0 network, and so forth.

I know it's not the most elegant solution, but it should accomplish what you want.
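
Roughly something like this (addresses, adapter names, share names, and paths are just examples; robocopy over SMB shown here instead of FTP, same idea): give each NIC pair its own subnet, then split the copy jobs by folder range.

netsh interface ip set address "NIC1" static 192.168.1.1 255.255.255.0
netsh interface ip set address "NIC2" static 192.168.2.1 255.255.255.0
robocopy D:\data\a-g \\192.168.1.2\backup\a-g /E /MT:8
robocopy D:\data\h-o \\192.168.2.2\backup\h-o /E /MT:8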
 
The most you will ever get on a single TCP connection is 1 Gbit/s. Have you thought about moving the SAS controller and drives over to the new system?

As stated above, you could put each connection into its own subnet and then do 4 simultaneous transfers.
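
Back-of-the-envelope, assuming ~110 MB/s of usable throughput per gigabit link: 25 TB / 110 MB/s works out to roughly 63 hours over a single link, versus roughly 16 hours if all four links can be kept busy the whole time.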
 
Thanks for the feedback. There is an immediate need to transfer data but a longer term need to keep a subset of the data in sync between the servers. Thus temporary options like moving the controller/disks are not as attractive.

Completely understood that I'll need a transfer protocol that can do simultaneous TCP streams; I'm likely looking at GridFTP for this in the hope of saturating all the interfaces.

I'm back to an LACP config on both ends with per-TCP-connection distribution and am testing with 4-6 simultaneous iperf instances. The problem is that the results are inconsistent: sometimes 3 of the 4 links get to ~90% utilization while the 4th sits idle, and a subsequent run will show all 4 being utilized but only at around 50-60%.

Going to try putting each interface pair (i.e. nic1 on server1 and server2) in its own subnet (and possibly VLAN) and retesting with iperf.
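
The plan for that test, roughly (addresses are just examples), is one iperf pair per subnet, with each instance bound to the address on its NIC:

iperf -s -B 192.168.1.2
iperf -c 192.168.1.2 -B 192.168.1.1 -t 60
iperf -s -B 192.168.2.2
iperf -c 192.168.2.2 -B 192.168.2.1 -t 60

...and so on for the third and fourth pairs, with the servers started on server2 and the clients on server1.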
 
Have you watched your disk IO? Is it possible you're saturating your array on either end causing the bottleneck?
 
Both servers have decent arrays but I've been testing exclusively with iperf to avoid any disk bottlenecks.

I may get a couple of 10G NICs for the servers but haven't worked with 10G much. Can anyone recommend a good NIC/chipset? I presume I'll need the cards, SFPs, and cables to go directly between the servers.
 
I'm using two of these cards to go directly from my FreeNAS server to my ESXi hypervisor. Performance is fantastic. But that's for my home network, they don't let me play with such toys at work.
 
I haven't run any serious benchmarks on my cards, and I'm not in a position where I'll be able to post benchmarks for a few weeks :( If I remember, I'll post figures for ya at the end of the month xD
 
Should've read that post further; looks like they got more throughput after adjusting the MTU and enabling jumbo frames. Picked up a couple of Intel E15729 cards, will see how they do for my use case.
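
If it comes to MTU tweaking here, on the 2012 R2 side it would be something like this (adapter name is a placeholder; the 2008 R2 box would get the same setting through the Intel driver's advanced properties), with a don't-fragment ping afterwards to confirm the path handles the larger frames:

Set-NetAdapterAdvancedProperty -Name "10G" -RegistryKeyword "*JumboPacket" -RegistryValue 9014
ping -f -l 8000 192.168.10.2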
 
Update... picked up a couple of those Intel XF 10G cards, but iperf testing is only showing around 1.9 Gbps. Not sure if it's a PCIe or OS issue, but going to keep at it. Any suggestions?
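
One thing I want to rule out on the 2012 R2 box is the negotiated PCIe link; something along these lines should show it (adapter name is a placeholder):

Get-NetAdapterHardwareInfo -Name "10G" | Format-List PcieLinkSpeed, PcieLinkWidth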
 
Put them both on 2012 R2 and use NIC teaming to load-balance the Windows side. Then create simple LAGs on the 4948 (perfect switch for this, btw) via channel groups and you will see it open right up, or at least as much as it will. I have a few servers with 6-port LAGs that I have pushed up to 4 Gb/s.

Here are the commands on the switch:

config t
! Gi = the member ports for the team, X = the channel-group number
interface range Gi
 switchport mode access
 switchport access vlan 1
 channel-group X mode active
! "mode active" for an LACP team on the Windows side, or "mode on" for a static team; "mode auto" is PAgP, which Windows doesn't speak
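
On the Windows side it's roughly this in PowerShell (team and adapter names are placeholders; use -TeamingMode Static instead if the switch side is a static channel-group with mode on):

New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1","NIC2","NIC3","NIC4" -TeamingMode Lacp -LoadBalancingAlgorithm Dynamic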
 
Actually the 10G cards are directly linked between the hosts, no switch in the middle (..yet).
 
Another update... it seems the Intel 10GbE XF SRs in BOTH servers are only negotiating PCIe v1.1 x1, which is ~2.5 Gbps (per HWiNFO64). I'm not sure why, since they should be v2.0 cards. Going to see if there are firmware updates for the cards.
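
That would explain the numbers: PCIe 1.1 runs at 2.5 GT/s per lane, and after 8b/10b encoding overhead a single lane carries about 2 Gbit/s (~250 MB/s), which lines up with the ~1.9 Gbps iperf results. The cards need to come up at a wider link and/or gen 2 before 10G line rate is even possible.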
 
Do the servers have the latest BIOS?

Could also be something as simple as a setting in the BIOS that is set to 1.1 instead of 2.0
 
Working on updates across the board, focusing on the newer server for now. The onboard array controller is showing up as 2.0 and another add-on array card is showing 3.0 so I don't think it's BIOS related.

Can't find any firmware updates for the Intel card. Will try moving slots but pretty sure that's not the issue either.
 
If you can get them both on Server 2012 R2 you can use the built-in dynamic teaming without getting into LACP.

You'll see close to 2 Gbps with that setup over SMB.

Option 2: set up two NICs on each server, each in a different subnet.

Use FTP to transfer across on one network and SMB on the other.

Throughput should be higher than a single SMB connection.
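
For the first option, the team itself is a one-liner in PowerShell, roughly (team and adapter names are placeholders; SwitchIndependent means nothing has to change on the switch):

New-NetLbfoTeam -Name "Team1" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic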
 