ochadd said: Once you get them all set up I'd like to see what your transfer speeds are. Maybe some iperf tests if you're willing. Never used 40gbps stuff. I've read things that it's really 4x10 gbps bonded in some way, other things said it should work like a single 40 gbps link.

The 25 gbps stuff I've worked on gets to a little over 21 gbps w/ some tweaking in iperf3. 43 gbps w/ a team of 2x25gbps.

iperf3.exe -P 8

iperf3.exe -P 8 -w 512M

Code:
$ iperf3 -c 10.0.2.10 -P 8
Connecting to host 10.0.2.10, port 5201
[ 5] local 10.0.2.116 port 55074 connected to 10.0.2.10 port 5201
[ 7] local 10.0.2.116 port 55080 connected to 10.0.2.10 port 5201
[ 9] local 10.0.2.116 port 55084 connected to 10.0.2.10 port 5201
[ 11] local 10.0.2.116 port 55100 connected to 10.0.2.10 port 5201
[ 13] local 10.0.2.116 port 55110 connected to 10.0.2.10 port 5201
[ 15] local 10.0.2.116 port 55118 connected to 10.0.2.10 port 5201
[ 17] local 10.0.2.116 port 55126 connected to 10.0.2.10 port 5201
[ 19] local 10.0.2.116 port 55136 connected to 10.0.2.10 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.01 sec 266 MBytes 2.22 Gbits/sec 383 239 KBytes
[ 7] 0.00-1.01 sec 260 MBytes 2.17 Gbits/sec 291 168 KBytes
[ 9] 0.00-1.01 sec 258 MBytes 2.16 Gbits/sec 388 502 KBytes
[ 11] 0.00-1.01 sec 135 MBytes 1.13 Gbits/sec 403 211 KBytes
[ 13] 0.00-1.01 sec 262 MBytes 2.19 Gbits/sec 307 645 KBytes
[ 15] 0.00-1.01 sec 258 MBytes 2.15 Gbits/sec 154 215 KBytes
[ 17] 0.00-1.01 sec 259 MBytes 2.16 Gbits/sec 289 184 KBytes
[ 19] 0.00-1.01 sec 263 MBytes 2.19 Gbits/sec 494 127 KBytes
[SUM] 0.00-1.01 sec 1.92 GBytes 16.4 Gbits/sec 2709
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 1.01-2.00 sec 235 MBytes 1.98 Gbits/sec 220 137 KBytes
[ 7] 1.01-2.00 sec 239 MBytes 2.01 Gbits/sec 184 161 KBytes
[ 9] 1.01-2.00 sec 252 MBytes 2.12 Gbits/sec 170 478 KBytes
[ 11] 1.01-2.00 sec 237 MBytes 2.00 Gbits/sec 437 156 KBytes
[ 13] 1.01-2.00 sec 230 MBytes 1.94 Gbits/sec 467 235 KBytes
[ 15] 1.01-2.00 sec 227 MBytes 1.91 Gbits/sec 360 105 KBytes
[ 17] 1.01-2.00 sec 253 MBytes 2.13 Gbits/sec 567 158 KBytes
[ 19] 1.01-2.00 sec 234 MBytes 1.97 Gbits/sec 324 232 KBytes
[SUM] 1.01-2.00 sec 1.86 GBytes 16.1 Gbits/sec 2729
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 2.00-3.00 sec 237 MBytes 1.99 Gbits/sec 272 170 KBytes
[ 7] 2.00-3.00 sec 227 MBytes 1.91 Gbits/sec 406 247 KBytes
[ 9] 2.00-3.00 sec 244 MBytes 2.05 Gbits/sec 436 167 KBytes
[ 11] 2.00-3.00 sec 237 MBytes 1.99 Gbits/sec 269 311 KBytes
[ 13] 2.00-3.00 sec 240 MBytes 2.01 Gbits/sec 430 163 KBytes
[ 15] 2.00-3.00 sec 246 MBytes 2.06 Gbits/sec 281 228 KBytes
[ 17] 2.00-3.00 sec 238 MBytes 2.00 Gbits/sec 443 334 KBytes
[ 19] 2.00-3.00 sec 238 MBytes 2.00 Gbits/sec 295 173 KBytes
[SUM] 2.00-3.00 sec 1.86 GBytes 16.0 Gbits/sec 2832
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 3.00-4.00 sec 232 MBytes 1.94 Gbits/sec 381 322 KBytes
[ 7] 3.00-4.00 sec 234 MBytes 1.95 Gbits/sec 282 91.9 KBytes
[ 9] 3.00-4.00 sec 232 MBytes 1.94 Gbits/sec 347 146 KBytes
[ 11] 3.00-4.00 sec 242 MBytes 2.02 Gbits/sec 243 479 KBytes
[ 13] 3.00-4.00 sec 231 MBytes 1.93 Gbits/sec 283 191 KBytes
[ 15] 3.00-4.00 sec 240 MBytes 2.01 Gbits/sec 284 233 KBytes
[ 17] 3.00-4.00 sec 235 MBytes 1.96 Gbits/sec 242 130 KBytes
[ 19] 3.00-4.00 sec 233 MBytes 1.95 Gbits/sec 206 119 KBytes
[SUM] 3.00-4.00 sec 1.84 GBytes 15.7 Gbits/sec 2268
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 4.00-5.00 sec 268 MBytes 2.25 Gbits/sec 627 96.2 KBytes
[ 7] 4.00-5.00 sec 270 MBytes 2.27 Gbits/sec 623 136 KBytes
[ 9] 4.00-5.00 sec 263 MBytes 2.21 Gbits/sec 760 69.3 KBytes
[ 11] 4.00-5.00 sec 264 MBytes 2.22 Gbits/sec 560 76.4 KBytes
[ 13] 4.00-5.00 sec 261 MBytes 2.19 Gbits/sec 340 129 KBytes
[ 15] 4.00-5.00 sec 265 MBytes 2.22 Gbits/sec 1113 243 KBytes
[ 17] 4.00-5.00 sec 254 MBytes 2.14 Gbits/sec 363 36.8 KBytes
[ 19] 4.00-5.00 sec 256 MBytes 2.16 Gbits/sec 454 84.8 KBytes
[SUM] 4.00-5.00 sec 2.05 GBytes 17.7 Gbits/sec 4840
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 5.00-6.00 sec 269 MBytes 2.26 Gbits/sec 707 191 KBytes
[ 7] 5.00-6.00 sec 215 MBytes 1.81 Gbits/sec 353 39.6 KBytes
[ 9] 5.00-6.00 sec 269 MBytes 2.26 Gbits/sec 524 235 KBytes
[ 11] 5.00-6.00 sec 269 MBytes 2.26 Gbits/sec 495 252 KBytes
[ 13] 5.00-6.00 sec 252 MBytes 2.12 Gbits/sec 636 252 KBytes
[ 15] 5.00-6.00 sec 269 MBytes 2.26 Gbits/sec 500 206 KBytes
[ 17] 5.00-6.00 sec 264 MBytes 2.22 Gbits/sec 641 146 KBytes
[ 19] 5.00-6.00 sec 268 MBytes 2.25 Gbits/sec 535 359 KBytes
[SUM] 5.00-6.00 sec 2.03 GBytes 17.4 Gbits/sec 4391
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 119 270 KBytes
[ 7] 6.00-7.00 sec 242 MBytes 2.03 Gbits/sec 207 223 KBytes
[ 9] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 208 173 KBytes
[ 11] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 205 174 KBytes
[ 13] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 50 181 KBytes
[ 15] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 141 174 KBytes
[ 17] 6.00-7.00 sec 251 MBytes 2.10 Gbits/sec 76 403 KBytes
[ 19] 6.00-7.00 sec 250 MBytes 2.09 Gbits/sec 111 209 KBytes
[SUM] 6.00-7.00 sec 1.95 GBytes 16.7 Gbits/sec 1117
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 7.00-8.00 sec 254 MBytes 2.13 Gbits/sec 805 368 KBytes
[ 7] 7.00-8.00 sec 245 MBytes 2.06 Gbits/sec 601 182 KBytes
[ 9] 7.00-8.00 sec 266 MBytes 2.23 Gbits/sec 564 150 KBytes
[ 11] 7.00-8.00 sec 236 MBytes 1.98 Gbits/sec 515 205 KBytes
[ 13] 7.00-8.00 sec 230 MBytes 1.93 Gbits/sec 191 284 KBytes
[ 15] 7.00-8.00 sec 229 MBytes 1.92 Gbits/sec 323 215 KBytes
[ 17] 7.00-8.00 sec 240 MBytes 2.01 Gbits/sec 799 156 KBytes
[ 19] 7.00-8.00 sec 229 MBytes 1.93 Gbits/sec 399 256 KBytes
[SUM] 7.00-8.00 sec 1.88 GBytes 16.2 Gbits/sec 4197
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 8.00-9.00 sec 228 MBytes 1.91 Gbits/sec 123 225 KBytes
[ 7] 8.00-9.00 sec 221 MBytes 1.86 Gbits/sec 152 191 KBytes
[ 9] 8.00-9.00 sec 229 MBytes 1.92 Gbits/sec 275 485 KBytes
[ 11] 8.00-9.00 sec 226 MBytes 1.89 Gbits/sec 171 366 KBytes
[ 13] 8.00-9.00 sec 220 MBytes 1.84 Gbits/sec 56 238 KBytes
[ 15] 8.00-9.00 sec 226 MBytes 1.90 Gbits/sec 166 76.4 KBytes
[ 17] 8.00-9.00 sec 222 MBytes 1.86 Gbits/sec 140 216 KBytes
[ 19] 8.00-9.00 sec 220 MBytes 1.85 Gbits/sec 112 123 KBytes
[SUM] 8.00-9.00 sec 1.75 GBytes 15.0 Gbits/sec 1195
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 9.00-10.00 sec 251 MBytes 2.11 Gbits/sec 147 230 KBytes
[ 7] 9.00-10.00 sec 238 MBytes 2.00 Gbits/sec 133 14.1 KBytes
[ 9] 9.00-10.00 sec 251 MBytes 2.11 Gbits/sec 153 216 KBytes
[ 11] 9.00-10.00 sec 250 MBytes 2.10 Gbits/sec 209 199 KBytes
[ 13] 9.00-10.00 sec 251 MBytes 2.11 Gbits/sec 105 260 KBytes
[ 15] 9.00-10.00 sec 246 MBytes 2.07 Gbits/sec 213 122 KBytes
[ 17] 9.00-10.00 sec 251 MBytes 2.11 Gbits/sec 50 240 KBytes
[ 19] 9.00-10.00 sec 248 MBytes 2.08 Gbits/sec 105 191 KBytes
[SUM] 9.00-10.00 sec 1.94 GBytes 16.7 Gbits/sec 1115
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.43 GBytes 2.09 Gbits/sec 3784 sender
[ 5] 0.00-10.00 sec 2.43 GBytes 2.09 Gbits/sec receiver
[ 7] 0.00-10.00 sec 2.33 GBytes 2.01 Gbits/sec 3232 sender
[ 7] 0.00-10.00 sec 2.33 GBytes 2.00 Gbits/sec receiver
[ 9] 0.00-10.00 sec 2.46 GBytes 2.11 Gbits/sec 3825 sender
[ 9] 0.00-10.00 sec 2.46 GBytes 2.11 Gbits/sec receiver
[ 11] 0.00-10.00 sec 2.29 GBytes 1.97 Gbits/sec 3507 sender
[ 11] 0.00-10.00 sec 2.29 GBytes 1.97 Gbits/sec receiver
[ 13] 0.00-10.00 sec 2.37 GBytes 2.04 Gbits/sec 2865 sender
[ 13] 0.00-10.00 sec 2.37 GBytes 2.04 Gbits/sec receiver
[ 15] 0.00-10.00 sec 2.40 GBytes 2.06 Gbits/sec 3535 sender
[ 15] 0.00-10.00 sec 2.40 GBytes 2.06 Gbits/sec receiver
[ 17] 0.00-10.00 sec 2.41 GBytes 2.07 Gbits/sec 3610 sender
[ 17] 0.00-10.00 sec 2.41 GBytes 2.07 Gbits/sec receiver
[ 19] 0.00-10.00 sec 2.38 GBytes 2.05 Gbits/sec 3035 sender
[ 19] 0.00-10.00 sec 2.38 GBytes 2.05 Gbits/sec receiver
[SUM] 0.00-10.00 sec 19.1 GBytes 16.4 Gbits/sec 27393 sender
[SUM] 0.00-10.00 sec 19.1 GBytes 16.4 Gbits/sec receiver

iperf Done.
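If the per-interval output above is more than you want to read, iperf3 can emit the whole run as JSON with its -J (--json) flag, and a few lines of Python will boil that down to the totals. This is only a rough sketch: the function names are made up for illustration, and the field names (end.sum_sent, end.sum_received, end.streams) are what I'd expect from a TCP test on Linux, so check them against your own iperf3 version.

```python
import json
import subprocess


def summarize(report: dict) -> dict:
    """Reduce a parsed iperf3 JSON report to the headline numbers.

    Assumes a TCP test; 'retransmits' may be missing on some platforms,
    so it is read with .get() and can come back as None.
    """
    end = report["end"]
    per_stream = [
        s["sender"]["bits_per_second"] / 1e9 for s in end.get("streams", [])
    ]
    return {
        "sent_gbps": end["sum_sent"]["bits_per_second"] / 1e9,
        "received_gbps": end["sum_received"]["bits_per_second"] / 1e9,
        "retransmits": end["sum_sent"].get("retransmits"),
        "per_stream_gbps": per_stream,
    }


def run_iperf3(server: str, streams: int = 8) -> dict:
    """Hypothetical helper: run the client with -J and summarize the result."""
    out = subprocess.run(
        ["iperf3", "-c", server, "-P", str(streams), "-J"],
        capture_output=True, text=True, check=True,
    ).stdout
    return summarize(json.loads(out))
```

Calling run_iperf3("10.0.2.10") would then return a small dict instead of a wall of per-second lines.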

Code:
$ iperf -c 10.0.2.10 -d -P 8
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------
[ 6] local 10.0.2.116 port 34760 connected with 10.0.2.10 port 5001
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 8] local 10.0.2.116 port 34800 connected with 10.0.2.10 port 5001
[ 3] local 10.0.2.116 port 34796 connected with 10.0.2.10 port 5001
[ 7] local 10.0.2.116 port 34798 connected with 10.0.2.10 port 5001
[ 4] local 10.0.2.116 port 34746 connected with 10.0.2.10 port 5001
[ 2] local 10.0.2.116 port 34768 connected with 10.0.2.10 port 5001
[ 1] local 10.0.2.116 port 34730 connected with 10.0.2.10 port 5001
[ 9] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37944
[ 10] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37950
[ 11] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37960
[ 12] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37966
[ 13] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37980
[ 14] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37982
[ 15] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 37994
[ 16] local 10.0.2.116 port 5001 connected with 10.0.2.10 port 38008
[ 5] local 10.0.2.116 port 34782 connected with 10.0.2.10 port 5001
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0058 sec 1.66 GBytes 1.43 Gbits/sec
[ 6] 0.0000-10.0058 sec 1.68 GBytes 1.44 Gbits/sec
[ 2] 0.0000-10.0058 sec 1.78 GBytes 1.53 Gbits/sec
[ 4] 0.0000-10.0058 sec 1.57 GBytes 1.35 Gbits/sec
[ 5] 0.0000-10.0058 sec 1.74 GBytes 1.50 Gbits/sec
[ 10] 0.0000-10.0051 sec 5.61 GBytes 4.81 Gbits/sec
[ 8] 0.0000-10.0058 sec 1.69 GBytes 1.46 Gbits/sec
[ 7] 0.0000-10.0058 sec 1.53 GBytes 1.31 Gbits/sec
[ 3] 0.0000-10.0065 sec 1.75 GBytes 1.50 Gbits/sec
[ 9] 0.0000-10.0102 sec 5.98 GBytes 5.13 Gbits/sec
[ 12] 0.0000-10.0042 sec 5.91 GBytes 5.08 Gbits/sec
[ 14] 0.0000-10.0049 sec 5.35 GBytes 4.59 Gbits/sec
[ 16] 0.0000-10.0031 sec 4.60 GBytes 3.95 Gbits/sec
[ 11] 0.0000-10.0051 sec 5.42 GBytes 4.65 Gbits/sec
[ 15] 0.0000-10.0037 sec 4.53 GBytes 3.89 Gbits/sec
[ 13] 0.0000-10.0038 sec 5.46 GBytes 4.69 Gbits/sec
[SUM] 0.0000-10.0098 sec 56.3 GBytes 48.3 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) = 0.123/0.167/0.243/0.044 ms (tot/err) = 8/0

Code:
$ iperf -c 10.0.2.10
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 1] local 10.0.2.116 port 41822 connected with 10.0.2.10 port 5001
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0134 sec 20.5 GBytes 17.6 Gbits/sec

$ iperf -Rc 10.0.2.10
------------------------------------------------------------
Client connecting to 10.0.2.10, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 1] local 10.0.2.116 port 60754 connected with 10.0.2.10 port 5001 (reverse)
[ ID] Interval Transfer Bandwidth
[ *1] 0.0000-10.0026 sec 22.4 GBytes 19.3 Gbits/sec

I can only give you iperf numbers between two directly connected Linux boxes right now. My delayed MTP cable has not yet arrived, so I don't have a second 40gig box to iperf to.

I've also never used iperf3 before, just the regular iperf that comes with pretty much every distribution. I just installed iperf3 for testing.

All of that said, with the direct connected box, iperf3 -c 10.0.2.10 -P 8 gives the output in the first code block above.

Yikes, that's a lot more output than I am used to.

The -w 512M gives me an error message that the socket buffer size is not set correctly.

I've used regular iperf for most of my testing, which has looked something like the second code block above (iperf -c 10.0.2.10 -d -P 8).

If I just do a single thread (third code block above), we can see that to properly use this bandwidth you really need a lot of parallelism, as a single thread seems to peak at about 17.6 or 19.3 Gbit/s (depending on direction, probably due to the different CPUs on each side).

This is between my workstation and my main server as follows:
- Workstation: Threadripper 3960X, 64GB non-ECC UDIMMs, quad-channel DDR4-3600, running Linux Mint 21.3
- Main Server: Epyc 7543 with 512GB octa-channel registered ECC DDR4-3200, running Proxmox VE based on Debian Bookworm

In both of these machines, the NICs are using all 8 of their Gen3 lanes, going directly to the CPU (so no chipset shenanigans going on here).

The Intel XL710-QDA2 NICs are dual port 40gig NICs though, and an astute observer would notice that 8x Gen3 is not enough PCIe bandwidth to saturate both ports at the same time. Gen3 is 985MB/s per lane, and we have 8 lanes, so 7880MB/s, which is about 63 Gbit/s. So even before you factor in protocol overhead across the PCIe bus, we are limited to only about 1.5 ports' worth of bandwidth. While doing these tests the other ports were at near idle though, so I don't think that was a factor.

I had hoped that a 40gig NIC would be a way for me to get near native NVMe performance off of remote storage across the network, but that doesn't look like it will happen. Raw iperf, as noted, seems to have a worst case performance of 16-17 Gbit/s, so roughly 2.0 to 2.1GB/s. Once you use a network file sharing protocol like SMB or NFS this drops further to about 1.6GB/s. Complaining about this is definitely a first world problem, but it is evident I will not be seeing near native remote M.2 performance this way.

I'm not quite sure what the limiting factor is, but something is kicking in and limiting things before we saturate 40Gbit Ethernet bandwidth. QSFP is essentially hardware link aggregation, done in a way that is supposed to eliminate the issues with traditional link aggregation. I'd blame that, but my single threaded performance is higher than what one of the four 10Gbit links could muster on its own, so that is obviously not it. I tried messing with jumbo frames, but this does not appear to have accomplished anything at all.

I mean, I bought the switch, so I am going to be using these as is one way or another for some time, but I'd be curious how 40Gbit QSFP+ performance compares to single link SFP28 (25Gbit) performance in this regard. I wonder if the latter is actually a little faster, or if they have the same limitations, which would suggest the bottleneck is somewhere else (PCIe subsystem? Software not optimized for these transfer rates? Kernel/OS? Something else?).

If you are curious how having the Mikrotik switch in the middle of things impacts performance, I'd be happy to post that once the second cable arrives.

But yeah, for the direct host to host connection it very much seems like anything above 15-20 Gbit networking really requires some serious parallelism.
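For anyone who wants to check the bandwidth arithmetic in this post, here is a quick Python sketch. It assumes decimal units throughout (1 Gbit = 1000 Mbit, 8 bits per byte) and takes the 985MB/s per-lane Gen3 figure as given; the function names are just for illustration. With decimal units, eight lanes work out to roughly 63 Gbit/s (a 1024-based conversion lands nearer 61.5), which is still only about 1.5 ports' worth of 40GbE.

```python
# Assumed per-lane PCIe Gen3 throughput, as quoted in the post (MB/s).
PCIE_GEN3_MB_PER_LANE = 985
PORT_GBIT = 40  # one 40GbE port


def pcie_gbit_per_s(lanes: int, mb_per_lane: float = PCIE_GEN3_MB_PER_LANE) -> float:
    """Slot bandwidth in Gbit/s, using decimal units (MB -> Mbit -> Gbit)."""
    return lanes * mb_per_lane * 8 / 1000


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert a line rate in Gbit/s to GB/s (8 bits per byte)."""
    return gbit_per_s / 8


slot = pcie_gbit_per_s(8)
print(f"x8 Gen3 slot: {slot:.2f} Gbit/s, or {slot / PORT_GBIT:.2f} ports of 40GbE")
for rate in (16.4, 17.6, 19.3):  # iperf results reported above
    print(f"{rate} Gbit/s is about {gbit_to_gbyte(rate):.2f} GB/s")
```

Neither number includes PCIe or TCP/IP protocol overhead, so real-world ceilings will sit somewhat lower.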