Looking for >gigabit between 2 PCs, I'm just at the 10m range of SFP+ DAC

Ok, so I decided to try something a little different: copying a file from one array to the other over the network. Locally on the server, copying a 32GB file from O: to R: took 1:29 at 368MB/sec. Doing the same transfer from my PC over the network took 1:08 at 480MB/sec, and I saw network activity past 4gbit up and down simultaneously. To verify this result, I tried a few more 32GB files transferred from my PC over the network, all within the margin of error (1:14 and 1:10). Then I verified it locally and transferred a few more 32GB files there too (burst plot files). Here we get 1:27 and 1:29, again very similar.

So the average transfer rate locally was 368.67MB/sec, while doing the same thing from my PC over the network averaged 462.33MB/sec.
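For anyone checking the math, here's a quick sanity check of those rates (assuming the "32GB" files are really 32GiB and the times are mm:ss):

```python
# Quick sanity check of the transfer rates above,
# assuming "32GB" means 32 GiB and times are mm:ss.
size_mib = 32 * 1024  # 32 GiB expressed in MiB

def rate(minutes, seconds):
    """Average throughput in MiB/s for the given elapsed time."""
    return size_mib / (minutes * 60 + seconds)

print(f"local   1:29 -> {rate(1, 29):.0f} MiB/s")  # ~368
print(f"network 1:08 -> {rate(1, 8):.0f} MiB/s")   # ~482
```

That lands right on the 368 and 480 figures TeraCopy reported, so the units line up.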

During this time my PC was seeing about 14% CPU use from TeraCopy, while the server would see up to 8%. It looks like I may be CPU limited, because my 8700K's single-thread performance is greater than my Ryzen 1700X's? I did not expect to see this limit reached so easily.

This also leads me to believe that my internal SSD is actually my limiting factor right now, even though CrystalDiskMark showed otherwise. I suppose this gives me an excuse to look at an NVMe drive again. Very peculiar.
 
Haha, would you look at NVMes just to be able to saturate the 10G connection?! :)
I think you push yourself too much toward needless purchases. If you weren't able to reach 9+Gbps with iperf, there is something else impeding your results. As I told you, I'm able to reach 9.5 with a much older CPU on my server, a quad-core Phenom II 965 at 3.4GHz.
To tell the truth, I'm not entirely following your real-world copy results because you aren't saying from which drive/array to which drive/array you are actually transferring. Those disks/arrays have their own rates that you can measure independently of any network connection.
 
I apologize for not being clear. In every case it was from the O: drive to the R: drive. So that makes sense: O: is made up of the 8x 4TB HGST disks, and R: is made up of the 8x 8TB WD Red disks. I thought it was interesting because it shows that it is possible to transfer a single file over the fiber at higher speeds than I was seeing when copying to my SSD.

There may very well be other factors at play here, especially considering that doing this locally on the server is actually slower. I agree there's something else odd going on with iperf being slow, but I feel like I've tried pretty much everything there. I just wanted to share my results from a different test. The only other drive I have on this PC is an old 300GB 10K RPM VelociRaptor, and that only writes at 60MB/sec, so it's not useful to test with.
 
How do you attach those 8+4 disks to the system? The Areca controller?
What comes to mind is PCIe lanes that may be shared between bandwidth-hungry devices, and likely a bottleneck on the chipset. But I'm not entirely sure about that.
For example, my mobo has four 3.0 lanes that connect the CPU to the chipset. The chipset then has six 2.0 lanes used together by the USB controller, SATA devices (those that are attached to the chipset, of course), and PCIe slots (apart from the two primary slots that are attached directly to the CPU with 3.0). All these devices and controllers "share" the bandwidth of the four 3.0 lanes (32Gbps) allocated to the chipset.
If the Areca card (or some USB devices) shares some of that bandwidth (not sure if only when under load), it might be just this. I don't know.

Edit: I have to correct myself. The six 2.0 lanes from the chipset are allocated only to the PCIe slots that extend beyond the two CPU-bound x16 slots, e.g. two or three PCIe x1 slots and one x16 (wired for only x4 lanes). So the four 3.0 lanes are shared between all chipset-bound devices (PCIe slots, USB, SATA, etc.).
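To put rough numbers on that sharing (my own back-of-the-envelope figures, not measured on either of our boards):

```python
# Approximate usable PCIe throughput per lane after encoding overhead.
# Real-world numbers are a bit lower still due to protocol overhead.
GBPS_PER_LANE = {
    "2.0": 5.0 * 8 / 10,     # 5 GT/s with 8b/10b encoding    -> 4.0 Gbit/s per lane
    "3.0": 8.0 * 128 / 130,  # 8 GT/s with 128b/130b encoding -> ~7.9 Gbit/s per lane
}

def link_gbps(gen, lanes):
    return GBPS_PER_LANE[gen] * lanes

print(f"chipset uplink, 3.0 x4: {link_gbps('3.0', 4):.1f} Gbit/s")  # ~31.5, the '32Gbps' above
print(f"a 2.0 x4 slot:          {link_gbps('2.0', 4):.1f} Gbit/s")  # still above 10GbE line rate
```

So a 2.0 x4 slot on its own has headroom over a single 10GbE port, but everything hanging off the chipset really does squeeze through that ~31.5Gbit/s uplink.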
 
I appreciate you trying to figure this out with me! I'll lay out what I have going on in the system.

The Areca card is in the second PCI Express slot on the motherboard, which is 3.0 operating at x8 (the GPU is taking up the first slot, operating at the same x8). Then we have the Mellanox card in the third slot, which is PCI Express 2.0 x4.

All 16 drives are connected to the Areca card, which is what handles the RAID 6 arrays.

I have a 128GB SATA boot SSD, a 60GB SATA SSD for Plex, and a 3TB SATA drive connected internally for my HDHomeRun to dump its DVR recordings to.

I have two USB devices. One is an 8TB WD Easystore external drive; the other is a hub that currently just has the UPS's communication cable connected to it.


So after all that, I don't think I have anything going on that would be hungry for bandwidth. It IS worth noting that I just decided to run iperf again. Starting with my PC as the server: 5.41gbit. Then I threw in the -P 2 flag: 8.5gbit. -P 3 gets me 9.7gbit! OK, let's have the server be the host... 3.88gbit, -P 2 gets 5.46gbit, -P 3 gets 4.46gbit, and -P 4 gets 5.32gbit. So there's certainly something fishy going on with the server not being able to keep up. Using -P 4 with my PC as the server gets me a constant 9.6gbit.
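In case anyone wants to repeat that -P sweep without retyping commands, here's a rough Python wrapper; it assumes the iperf (v2) binary is on the PATH, and the server address is a placeholder to swap in:

```python
# Rough sketch: sweep iperf parallel-stream counts against an iperf server.
# Assumes iperf 2 is on PATH; 192.168.1.10 is a placeholder address, not mine.
import subprocess

SERVER = "192.168.1.10"  # replace with the machine running "iperf -s"

for streams in (1, 2, 3, 4):
    cmd = ["iperf", "-c", SERVER, "-t", "10", "-P", str(streams)]
    print(f"\n--- {' '.join(cmd)} ---")
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout.strip())  # with -P > 1 the [SUM] line is the aggregate throughput
```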

I think the next step is to swap the cards, to rule out the card in the server being (partially) defective.
 
That's something already :) .
You can try with no -P flag but with the -w flag (window size): -w 2M or something like that.
Again, it shouldn't be a problem without the -P flag, as I don't need it; I just increase the window size a little to get to 9.5.
I thought you had already tested -P with more threads before, but maybe only one way around (your PC as the iperf client).
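For what it's worth, the window size matters because a single TCP stream is roughly capped at window / round-trip time. A quick back-of-the-envelope calc (the 0.2ms LAN RTT is an assumed typical value, not measured on either of our setups):

```python
# Bandwidth-delay product sketch: single-stream TCP ceiling ~= window / RTT.
# The 0.2 ms RTT is an assumed typical LAN value, not a measurement.
rtt_s = 0.0002

for window_bytes in (64 * 1024, 512 * 1024, 2 * 1024 * 1024):
    gbps = window_bytes * 8 / rtt_s / 1e9
    print(f"window {window_bytes // 1024:>4} KiB -> ~{gbps:5.1f} Gbit/s ceiling")
```

So a small window can cap a single stream well under 10Gbit, while anything from about 0.5M up should be plenty on a LAN.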
 
I had tested that, but that was before I had the bright idea to increase my jumbo packet size (MTU?) to 9614.
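As a side note on why jumbo frames can help a little: the per-frame header overhead shrinks, and the CPU handles fewer packets per second. A rough efficiency comparison (using the usual Ethernet/IP/TCP header sizes and ignoring preamble, FCS, and TCP options):

```python
# Rough payload efficiency per TCP segment at different MTUs.
# MTU covers the IP (20 B) + TCP (20 B) headers plus payload; Ethernet adds 14 B on the wire.
IP_TCP = 20 + 20
ETHERNET = 14

for mtu in (1500, 9000, 9614):
    payload = mtu - IP_TCP
    on_wire = mtu + ETHERNET
    print(f"MTU {mtu:>5}: ~{payload / on_wire:.1%} of each frame is payload")
```

Only a few percent either way, so the bigger win is usually the reduced packet rate rather than the raw efficiency.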

So with my PC as the server, -w 2M gets me 7.9gbit, 4M gets me 8.43gbit, and 8M gets me 9.90gbit. Flipping it around with my server as the server, it's 8.8gbit, 8.75gbit, and 7.72gbit. Window size sure has made a big difference, although oddly the server likes 2M the best. So what does that tell us now?

Also attached is a screenshot; doing the test over and over gets inconsistent results. I wonder what's really going on?
[Screenshot: iperf -w 2M, server as server, 09-26-18]
 
I also have some inconsistencies when I test successively.
But during my tests, anything beyond -w 2M doesn't make any difference for me. My MTU is normal (1514 I think), no Jumbo frames, as they did not make any difference either.
What does all this tell us?! I don't know :) . I think it's all a matter of the combination of firmware + drivers, obviously.
For me it was enough that I saw the venerable 9.5 bandwidth, so I'm not messing with it anymore. Now it's enough that the link is not a bottleneck for anything "real-world" I throw at it.
 
Even 0.5M on the window got me up to 9+ on iperf, so my overall setup is obviously capable of the throughput. So I did one more test: I copied a file from the SSD to itself. I know it's not the best test, but as an SSD it should still be pretty fast. It came in at 225MB/sec, which is pretty close to the 240-ish I was seeing when trying things over the network. I'm really starting to think my SSD is the culprit here, and I've been chasing the wrong problem! I do have another SSD in another machine I can pull for some testing later, but I suspect this is the problem.
 
Ok, I have confirmed that my SSD is the limiting factor. I remembered that Asus has a RAM disk utility, so I fired it up, made a 12GB disk, and then copied over an 8GB file. It copied at 510MB/sec. So the network is capable of fast file transfers (as is my array); my SSD is not. I think that wraps things up? Thanks a lot for all the help!
 
510MB/s is what a SATA SSD should be doing... and some NVMe SSDs stop at around 500MB/s writes too, like my OG WD Black NVMe.

Of course it reads at >1500MB/s.
 
Pretty sure at this point the 510 is the limit of my array's read capabilities (it was 3x that when it was empty lol)
 

Inner tracks of spinners?

Quite likely, and that's best-case. I have a pair of SATA SSDs that will be caching for my duty array, whenever I can figure out FreeNAS Samba permissions.

[which I'm using because Server 2016 won't let me set up Storage Spaces the way I'd like, for whatever reason]
 
Copying an 11GB file from my SSD to the ramdisk went at 461MB/sec.
Copying that same file back from the ramdisk to the SSD went at 399MB/sec.
Copying that file from the SSD to the array via the network went at 462MB/sec (same as copying to the ramdisk, so probably the limit of my SSD's reads).
Copying that file back across the network from the array to the SSD went at 277MB/sec.

Yet above I showed that the network can handle transferring to my machine at 510MB/sec, going from array to ramdisk.

So the SSD can be written to faster than that, as the ramdisk-to-SSD copy shows. For some reason, writing to the SSD over the network is limited to the mid-to-high 200s. I can't imagine why this is.
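For anyone who wants to reproduce these numbers without watching a copy dialog, here's a minimal timed-copy sketch; the paths are placeholders, not my actual drive letters or shares:

```python
# Minimal timed file copy to measure throughput between two drives or shares.
# Both paths are placeholders; point them at the source file and destination of interest.
import os
import shutil
import time

SRC = r"C:\test\bigfile.bin"           # hypothetical source path
DST = r"\\server\share\bigfile.bin"    # hypothetical destination (UNC path or local drive)

size_mib = os.path.getsize(SRC) / (1024 * 1024)
start = time.perf_counter()
shutil.copyfile(SRC, DST)
elapsed = time.perf_counter() - start
print(f"{size_mib:.0f} MiB in {elapsed:.1f} s -> {size_mib / elapsed:.0f} MiB/s")
```

Note that shutil goes through the OS cache, so results can differ a bit from an unbuffered copier like TeraCopy.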
 
I dunno how the RAID card is allocating the data, but 510 is quite fine to me for the array. It's my PC's SSD's slow performance that I'm trying to figure out.
 
Ok, I have confirmed that my SSD is the limiting factor. I remembered that Asus has a RAM disk utility, so I fired it up, made a 12GB disk, and then copied over an 8GB file. It copied at 510MB/sec. So the network is capable of fast file transfers (as is my array); my SSD is not. I think that wraps things up? Thanks a lot for all the help!
Not sure I understand this.
When copying from and to the same drive, it is perfectly obvious that the total bandwidth of about 500-550MB/s is split in half by the read and simultaneous write operations on the same drive. When you copy from an SSD to an SSD across the network, you should get 500+MB/s.

P.S. Another thing: is your array capable of more than 500MB/s writes? Because when you copy from the ramdisk to the server array (across the network) you got the same 460MB/s, so this tells us your array is the limiting factor here (if the network can do 9Gbps).

Also keep in mind the limitations of the shell (Explorer) file copy, which is flawed when dealing with huge files over SMB. How do you copy files: using Windows Explorer and shared folders/UNC paths?
But if you use TeraCopy, I guess it uses unbuffered copying, which is the better way.
 
Sorry, you're right, but see the better analysis I did above, which uses the SSD and ramdisk along with the network. It's specifically going from array to SSD that's slow; even writing from SSD to array is fast.
 
I should mention, for desktop users, I don't think you would see an appreciable difference in speed between the ConnectX-2 and ConnectX-3. If you have lots of connections or lots of VMs hitting the network at the same time, then sure... but for desktop use I challenge anyone to see the difference.
 
I know it's ridiculous, but I got a Samsung 500GB 970 Evo NVMe drive during the recent eBay sale for $127, and I just got the old SSD cloned over to it and rebooted. Copying that same 26.7GB of data I was originally testing with (which usually went at about 256MB/sec and around 1:50 in transfer time) just completed on the NVMe drive at 426MB/sec and 1:06. So it's not a driver issue or a configuration issue; the SSD really was the limitation, even though CrystalDiskMark tests it faster. I also just did a 32GB test file from my 'slower' array and it copied at 400MB/sec.

It looks like one core of my PC's CPU was pegged pretty heavily during the transfer, so that's the limitation. Which in itself confuses the hell out of me: where do I find a CPU with 3x the single-core performance of a 5.1GHz 8700K just to fully saturate 10 gigabit for a file transfer? I am using the Windows drivers again. Realistically though, I'm pretty close to the speed limits of the arrays, so I think I'm wrapping things up. Just wanted to give final closure.
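Sanity-checking those before/after numbers against each other (treating the 26.7GB as GiB):

```python
# Cross-check of the old-SSD vs. NVMe transfer times, assuming 26.7 GB means GiB.
size_mib = 26.7 * 1024

for label, rate_mib_s in (("old SATA SSD", 256), ("970 Evo NVMe", 426)):
    seconds = size_mib / rate_mib_s
    print(f"{label}: ~{int(seconds // 60)}:{int(seconds % 60):02d} at {rate_mib_s} MiB/s")
```

Both come out within a few seconds of the ~1:50 and 1:06 TeraCopy reported, so the numbers hang together.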
 
If you can get 9+ with iperf just live with it :) .
Install an FTP server at one end and test with a transfer through FTP (plain FTP). Windows SMB shares are not very efficient. Maybe Win10 is the limiting factor. See the video above: there they used WinServer 2016 to squeeze out 4-5+ GBytes/sec (or maybe more, I watched it a few days ago) :) .
I also use Win2016 at both ends, though I have dual-boot Win10 at one end. Maybe there are TCP/IP stack parameters that can be tuned, but I don't know.
You might just get 100Gbps cards and a cable; maybe they would be more hardware-assisted :) . At the end of that video I expected the guy to mention his next challenge: a 1 terabit/s LAN card.
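If you want to time the plain-FTP comparison from a script rather than a client, something like this would do; the host, credentials, and file name are placeholders, not details from this thread:

```python
# Rough sketch: time a plain-FTP download with ftplib and report throughput.
# Host, credentials, and file name are placeholders.
import os
import time
from ftplib import FTP

HOST, USER, PASSWD = "192.168.1.10", "user", "password"
REMOTE_FILE = "bigfile.bin"

with FTP(HOST) as ftp:
    ftp.login(USER, PASSWD)
    start = time.perf_counter()
    with open(REMOTE_FILE, "wb") as out:
        ftp.retrbinary(f"RETR {REMOTE_FILE}", out.write, blocksize=1024 * 1024)
    elapsed = time.perf_counter() - start

size_mib = os.path.getsize(REMOTE_FILE) / (1024 * 1024)
print(f"{size_mib:.0f} MiB in {elapsed:.1f} s -> {size_mib / elapsed:.0f} MiB/s")
```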
 
Heh, no, this went beyond silly. I already had an FTP server on my server (it had an expired certificate too, stupid 1-year things) and I did see faster speeds when transferring several files at once versus TeraCopy. When transferring a lone file, however, the rate was no better, around 460MB/sec. The problem of course is that my use case is generally working with single large files at a time. Still much, much faster than gigabit!
 
Yeah, it's about f** time for something better than 1Gbps to come to the masses. Even older HDDs are considerably faster at sequential transfers than 1Gbps.
To tell the truth, I rarely use the 10G link (backups, some movies...) and it sits idle maybe 99.5% of the time, but when I move something big it's just different at 180-200MB/s versus 105MB/s. Yeah, I don't use RAID; I just wanted the network to be able to keep up with at least the HDDs' speeds (up to 200MB/s). And at one point, while a transfer was ongoing, I was also using some VMs stored on the server on another physical HDD.
 
I'm guessing your array isn't fast enough, and I'm wondering if adding additional write cache would solve your immediate problem.

Also, Samsung drivers are typically better than the Windows drivers in my experience (for NVMe), so you may want to run those.
 
arnemetis your thread here inspired me to try the same thing myself, although my setup seems to have gone a bit easier than yours.

I was looking to network three PCs with a 10Gb connection; two are in the same room, but one is in the basement, so I needed at least a 10 meter run for that one. eBay and Amazon to the rescue!

[Photo: the ConnectX-3 cards, transceivers, and OM3 fiber]


That's one dual port and two single port Mellanox ConnectX-3s, some pretty cheap transceivers, and the smaller 3m run of OM3 fiber. Unfortunately one of the single port cards was DOA, so I can only connect two PCs for now, but the eBay seller is giving me an exchange soon.

Getting the network set up and running at full speed was relatively easy. The only mistake I made was putting the dual port card in a PCIe x16 slot that was actually running at only x1. After fixing that, the iperf test looked pretty good:

[Screenshot: iperf result]


9.27 Gbits/sec? That'll do, especially at around $170 for everything.

So thanks again arnemetis and everyone who participated in this thread.
 
Glad it worked out easier for you than it did for me!
 