infiniband performance?

Discussion in 'Networking & Security' started by sweloop64, Nov 14, 2010.

  1. sweloop64

    sweloop64 Limp Gawd

    Messages:
    202
    Joined:
    Nov 28, 2009
    As I understand it, IPoIB is far from optimal due to its intensive CPU usage...
    Is this still an issue, or is there efficient hardware offload, or are CPUs simply fast enough these days?

    As a little sidetrack, is anyone running nfs-rdma on fairly up-to-date hardware who has performance figures to share?
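
    For context, the kind of setup I mean is roughly the following (the server name and paths are just placeholders; the rdma mount option and port 20049 are what the kernel's nfs-rdma documentation describes, so treat this as a sketch rather than a recipe):

    # on the NFS server: load the RDMA transport for nfsd and have it listen
    modprobe svcrdma
    echo rdma 20049 > /proc/fs/nfsd/portlist

    # on the client: load the RDMA client transport and mount over it
    modprobe xprtrdma
    mount -o rdma,port=20049 server:/export /mnt/export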
     
  2. mattjw916

    mattjw916 [H]ard|Gawd

    Messages:
    1,290
    Joined:
    Mar 10, 2005
    I haven't kept up with IB but why would you want to run IP over it? TCP/IP wasn't really designed for those speeds. Native Infiniband (or Myrinet which is what I set up a few times) would be so much more efficient...

    Just curious what you're doing.
     
  3. cymon

    cymon Limp Gawd

    Messages:
    453
    Joined:
    Apr 16, 2009
    What kind of application is this? I'm looking at setting up a high-speed network with used IB gear myself. (I know a few people around here have looked at that; xphil3 has an Infiniband switch, although I haven't heard whether it's actually running.)
     
  4. pissboy

    pissboy Gawd

    Messages:
    515
    Joined:
    Feb 2, 2003
    I have a pair of machines that had PCI-E 10Gb InfiniBand host channel adapters running IPoIB, and I was hitting ~170-180MB/s with file copies in Windows. The 20Gb HCAs hit the same numbers. At that time I thought I was CPU-bound with an E2180 in one machine. I have since updated the CPU, and newer drivers have been released, but I haven't had time to reinstall the IB HCAs. Linux does give better performance due to RDMA, but you're probably limited by your disks' write performance in most cases. With a pair of 10GBase-CX4 Ethernet NICs in the same machines, I was hitting ~250MB/s transfers for some time, but regularly saw ~190MB/s. I never bothered with ramdisks or any other nonsense; I only needed it to back up one array to another, and my numbers were real-world performance.

    As far as trying out IB cheaply: all you need is an IB CX4 cable, a pair of HCAs, drivers, and a subnet manager. Normally one of the switches will act as the subnet manager, but you can just run one in software. Make sure you get real InfiniBand cables, not cables meant for 4x SATA or SAS; they have different ends. You should be able to get a pair of IB HCAs and a cable for around $100 on eBay if you can wait a week or so for a deal. When looking for a cable, keep in mind the best deals will be on the ~4-7M cables, which no one really needs; you'll pay a premium for short (0.5-1M) cables. Don't buy a used one that looks like it's been tangled up; just move on to a good used one or a new one.
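
    On the software side, bringing a two-node, no-switch setup to life on Linux is roughly this (tool names are from the OFED packages; your exact device and port numbers will differ, so take it as a sketch):

    # start a software subnet manager on ONE of the two nodes
    opensm &

    # check that the port trained (State should read Active once the SM has run)
    ibstat

    # a couple of sanity checks on the fabric
    ibv_devinfo
    ibhosts

    The subnet manager only needs to run on one side; the other node just needs the drivers loaded.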

    You can find deals on 10GbE adapters and cables on eBay, but they're few and far between. I was persistent, and over a few weeks I was able to get two NICs and a pack of five cables cheap.
     
    Last edited: Nov 14, 2010
  5. mattjw916

    mattjw916 [H]ard|Gawd

    Messages:
    1,290
    Joined:
    Mar 10, 2005
  6. sweloop64

    sweloop64 Limp Gawd

    Messages:
    202
    Joined:
    Nov 28, 2009
    AFAIK there is a limited selection of programs that work at the native IB level...
    Myrinet might be better in some setups, but the price is many times higher :(
    What am I doing? Trying to find an acceptable replacement for GbE...

    nfs-rdma? It's NFS taking advantage of RDMA, boosting performance many times over in common setups...

    170MB/s... SMB1 or SMB2? Windows to Windows?
    Performance with new CPU?
    InfiniHost III Lx/Ex?
    The two-adapters-no-switch setup is what I had in mind as switches cost a bit too much just for testing stuff...
    Yes, 10GbE adapters at bargain prices are not that common; IB adapters, on the other hand...

    hmm... why?
     
  7. mattjw916

    mattjw916 [H]ard|Gawd

    Messages:
    1,290
    Joined:
    Mar 10, 2005
    Someone mentioned disk write speed as a possible bottleneck and that was my silly solution for it... :p
     
  8. pissboy

    pissboy Gawd

    Messages:
    515
    Joined:
    Feb 2, 2003
    SMB2; the setup was a pair of 64-bit Windows Vista Pro machines, and I haven't had any need/reason to upgrade those to Win7. The machines had large SATA Areca arrays (12x 1TB, 10x 1TB) and plenty of cache (2GB). Both machines were S775-based, with an E2180 in the lower-end one and an E6750 in the other. The E2180 was replaced; I'd have to check what is in it now. Both the HCAs and the RAID cards sat in x8 slots. Both machines were using Intel S3210SHLC boards and had ~4GB of memory.

    MHEA28-XTC. I'm almost positive that's what I used, as I had 40-50 of them at last check.
    MHGA28-XTC cards for the 20Gb DDR setup.

    As for the 10GBase-CX4 cards, I had a NetXen card as well as a Chelsio (maybe?). I'm not at home for a while, but I can check and get back in a week or so.

    I did not get to try out any RDMA with Linux, but RDMA is where the performance of IB is.
     
  9. sweloop64

    sweloop64 Limp Gawd

    Messages:
    202
    Joined:
    Nov 28, 2009
    Performance is a bit lower than I would have expected :(
    Connected Mode I presume? MTU 65k?

    MHEA28-XTC is what one usually finds at around $50 on ebay...

    Do you have any experience with (or seen any real-world stats for) the ConnectX or ConnectX-2 series, which feature TCP/UDP/IP stateless offload?
     
  10. pissboy

    pissboy Gawd

    Messages:
    515
    Joined:
    Feb 2, 2003
    I didn't have anything that could use RDMA, so that's a big part of it, I'm sure. Looking at my past posts, I'm almost positive I was running the 2.0.2 or 2.1 OpenFabrics software, which was from Feb or Sept 2009. Newer versions have been released (2.2, 2.2.1, 2.3rc5, 2.3rc6) which I haven't had experience with. The OpenFabrics roadmap shows some IPoIB fixes and features due out: DHCP support (which I wasn't using) in the next release, and IPoIB Connected Mode to be added in 2011.

    I haven't owned any ConnectX cards.

    I may have time to play around with this again next week, but I'll be busy most likely with holiday stuff until the week after. If you're looking to buy on ebay, find one of the sellers selling them for ~$40 or best offer and submit an offer for a pair of them. You should be able to get a good deal and combine shipping.
     
  11. Jon98064

    Jon98064 n00bie

    Messages:
    11
    Joined:
    Sep 15, 2010
    I have two PCs, each with a RAID 0 array capable of a 600 MB/s sustained read rate with large (e.g. 10GB) files.
    In effect, these two PCs form a RAID 10 system (mirrored RAID 0).

    File transfers between the two systems easily saturate Gbit Ethernet.
    Even at a sustained (and verified) 120 MB/s saturating the link, it takes about 25 minutes to copy 150GB between the two PCs.

    However, I have been hoping that baseline 10 Gbit/s Infiniband would keep up with 600 MB/s RAID 0 transfers and drop the time to copy 150GB down to about 4 minutes, for example.
    Sort of like the benefit when we all went from 100 Mbit to Gbit Ethernet.

    I see reasonably priced ($99) Infiniband cards available on eBay.
    Questions:
    (1) Can a DIRECT cable connection be made between two Infiniband HCAs?
    Sort of like using an Ethernet crossover cable (since Infiniband switches are very expensive)?

    (2) How is Infiniband set up in Win 7 or Linux?
    The same as, or similar to, a regular network connection?
    Does drag-and-drop of files work, or must the command line be used?
     
  12. sweloop64

    sweloop64 Limp Gawd

    Messages:
    202
    Joined:
    Nov 28, 2009
    Got a pair of MHEA28-XTC cards for less than $80; will do some testing within the next couple of days...
     
  13. pissboy

    pissboy Gawd

    Messages:
    515
    Joined:
    Feb 2, 2003
    Yes, so long as one of the clients is running a subnet manager application.

    On Windows, getting the machines set up to drag and drop files would take this:

    1) Install the hardware, go to http://www.openfabrics.org/ for the latest drivers, and install them.
    2) Run the subnet manager application on one of the clients to bring the fabric up.
    3) Set up the IPoIB network connection with a static IP on both sides and share out a folder.
    4) Hit the Start menu, type in \\10.10.10.1 (or whatever your IP is), and click on the folder.

    Everything else will be transparent.

    You can have the subnet manager run on startup on both machines; if one is already running, the second will go into standby mode.
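
    If one end is Linux instead, the same bring-up is only a couple of commands (the ib0 interface name and the addresses below are just examples, and the cifs mount assumes the usual Samba/cifs client tools are installed):

    # one node runs the subnet manager
    opensm &

    # give the IPoIB interface a static IP on both sides
    ifconfig ib0 10.10.10.2 netmask 255.255.255.0 up

    # then hit the Windows share over IPoIB, or use NFS/scp as usual
    mount -t cifs //10.10.10.1/share /mnt/share -o user=youruser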
     
  14. meelick

    meelick n00bie

    Messages:
    4
    Joined:
    Feb 8, 2011
    Hi all,
    Thanks for all the good info in this thread so far.
    I've got myself a couple of Mellanox MHEA28-XTC cards and a 3m cable. I'm running two machines with IPoIB directly connected (no switch): Ubuntu 10.10 (64-bit) with opensm on one end, Windows 7 (64-bit) with OFED on the other. Both machines are pingable, with IPs configured statically. The Ubuntu machine is a 3GHz Athlon X2, the Win7 machine an i7 950. So the setup is working.

    Using iperf with the stock settings (I haven't tweaked anything yet), I was getting 0.8 Gbps.
    Using iperf with "-w 65535", that increased to 1.2 Gbps.

    Finally, my question: does anyone have some links or HOWTOs for optimising this setup for maximum throughput? I've about 10 drives in the Linux box which I'm going to stripe/RAID 5 for performance.

    Rgds,
    Dave.
     
  15. meelick

    meelick n00bie

    Messages:
    4
    Joined:
    Feb 8, 2011
    OK, initially I wasn't able to run netperf on the Linux box, but after I updated the firmware on both cards (5.1 up to 5.3), netperf ran fine.

    root@raid:~# netperf -H 10.4.12.1
    TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.12.1 (10.4.12.1) port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    10.00    7239.95

    That's about 7 Gbps, if I'm reading it right.
    Compared to the same test over my gigabit Ethernet:

    bytes  bytes   bytes    secs.    10^6bits/sec
     8192  16384  16384    10.00    580.92

    iperf still shows about 1.3 Gbps, though. I guess I'll have to configure the RAID drives and see if I can get decent real-life throughput with that.

    Still, it's nice to have an Infiniband Fabric at home. Interesting (geeky) conversation piece. :)

    Cheers,
    Dave.

    --EDIT--
    FYI, I found the easiest way to upgrade the firmware was to get the MFT from Mellanox onto the Windows 7 box, run a cmd prompt as administrator, and then follow the Mellanox instructions from there. I had to provide the -skip_is parameter to the flint tool, which then worked fine; I rebooted the machine and the card came up with firmware 5.3.
    That way I didn't have to go through the pain of compiling the firmware burner on Ubuntu.
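
    From memory, the burn command was something along these lines (the device name and firmware file name here are just placeholders; mst status lists the real device name, and the image has to match your exact card):

    mst status
    flint -d mt25208_pci_cr0 -i fw-25208-5_3_000-MHEA28-XTC.bin -skip_is burn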
     
    Last edited: Feb 9, 2011
  16. pissboy

    pissboy Gawd

    Messages:
    515
    Joined:
    Feb 2, 2003
    When I used some of the bandwidth testing tools, I didn't see numbers as nice as the real-world performance.

    I have a couple of IB switches and dozens of cards and cables I need to get rid of; I've just been lazy lol.
     
  17. meelick

    meelick n00bie

    Messages:
    4
    Joined:
    Feb 8, 2011
    For those of you attempting to get Infiniband working on Ubuntu 10.10, there's a very quick way to enable the Infiniband fabric, as it does not come up without a little help in the form of some config file changes and package installations. The blog article is here

    Also, can anyone give me some pointers as to why my speed has gone down to 25 Mbps, when my previous Linux install was giving me 7200 Mbps? It's the same machine; the Ubuntu box was completely cleaned and ib0 brought up using the procedure in the blog article. The remote machine has not changed.
    I don't recall having to do any major tweaks to the previous install to get the speed up to 7 Gbps, apart from putting the link into connected mode and increasing the MTU to 65520.
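
    For the record, those two tweaks on the Ubuntu side were roughly the following (ib0 being the IPoIB interface; I'm writing these from memory, so double-check before relying on them):

    # switch IPoIB from datagram to connected mode
    echo connected > /sys/class/net/ib0/mode

    # connected mode allows the large IPoIB MTU
    ifconfig ib0 mtu 65520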

    Thanks,
    Dave.
     
  18. Alexdi

    Alexdi n00bie

    Messages:
    37
    Joined:
    Jul 21, 2010
  19. MotionBlur

    MotionBlur [H]ard|Gawd

    Messages:
    1,640
    Joined:
    Mar 27, 2001
    Figured I'd post my findings today (using the latest firmware and drivers available for the Mellanox MHEA28-XTC cards):

    -Storage Server 2008 R2 SP1 x64 and Windows 7 SP1 x64 (server and client respectively).
    -The server's RAID array gets ~800MB/sec reads and ~800MB/sec writes.
    -The client's RAID array gets ~200MB/sec reads and ~200MB/sec writes.
    -iperf tests only got around 2Gbps, nowhere near 10Gbps, even after making sure the MTU was increased (single-stream runs; see the note below).
    -Was not able to smoothly stream (i.e. play) a 1920x1080 uncompressed 30-second clip of a video project I am working on from the server to the client. The bit rate for that is about 180MB/sec.
    -A 1280x720 uncompressed 30-second clip, however, did play back smoothly, albeit that is only around 67MB/sec. Most of my clients are looking for 1080p, not 720p, so this is unacceptable for my work.
    -CPU usage on the server was between 60-70% during file transfers and 30-40% on the client. The server has an Athlon II X2 260 and the client a Phenom II X6 1090T, both with 16GB of RAM.
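
    For completeness, those iperf numbers were single-stream runs with just the MTU bumped. A multi-stream run (standard iperf flags; the address is a placeholder and I haven't gone back to try this) would look something like:

    iperf -c 10.10.10.1 -w 256K -P 4 -t 30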

    So overall I am upset with it, to eBay the cards and cable go....
     
  20. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,759
    Joined:
    May 7, 2007
    It really sounds like the server's processor is a huge bottleneck. Infiniband should be leagues ahead of 2Gbps; that's what I get on FC.

    Seriously, I would upgrade the CPU, then try it again imo.
     
  21. MotionBlur

    MotionBlur [H]ard|Gawd

    Messages:
    1,640
    Joined:
    Mar 27, 2001
    hmm, maybe get a Phenom II X4 955 (3.2ghz)?

    I only have Sempron 140s in my HTPC and wife's computer and Opterons in my ESXi server so nothing readily available to test.
     
  22. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,759
    Joined:
    May 7, 2007
    Yeah, a dual-core may hold a 10Gbps infiniband connection back a bit. If you can afford it, definitely go for the processor you listed.
     
  23. Blue Fox

    Blue Fox [H]ardForum Junkie

    Messages:
    11,628
    Joined:
    Jun 9, 2004
    You're not going to get much better than that when encapsulating Ethernet traffic over InfiniBand. If you want 10Gbit speeds, get a 10Gbit Ethernet card.
     
  24. MotionBlur

    MotionBlur [H]ard|Gawd

    Messages:
    1,640
    Joined:
    Mar 27, 2001
    Blue Fox -

    Yeah I understand that would be the preferred method, I was just trying to explore other avenues (ie FC, Infiniband) before saving up for true 10gbit ethernet. I don't like spending a lot of money if I don't have to :)
     
  25. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,759
    Joined:
    May 7, 2007
    Oh, I didn't realize you were trying to push Ethernet traffic through the Infiniband link. Yes, Blue Fox is correct; you're going to need 10Gb Ethernet card(s) if you want that kind of speed.
     
  26. MotionBlur

    MotionBlur [H]ard|Gawd

    Messages:
    1,640
    Joined:
    Mar 27, 2001
    Cool thanks for the heads up, I'll continue doing what I've been doing then:

    -Copy 1080p footage from my Compact Flash cards to my Workstation
    -Edit, apply post production, burn to blu-ray/dvd
    -Copy all of the work to the SAN for storage

    For most projects I have plenty of space on my workstation, but the longer hour long projects take up TBs of space :)
     
  27. thefreeaccount

    thefreeaccount Gawd

    Messages:
    832
    Joined:
    Aug 8, 2010
    Have you considered running remote desktop services (aka terminal server) on your SAN box and RDPing into your virtual "workstation"? You would need a much beefier CPU, but it would solve the network connectivity issue.
     
  28. sj3101

    sj3101 n00bie

    Messages:
    1
    Joined:
    Mar 23, 2012
    We have designed a 72-bay SAN with 2x LSI 9265 RAID controllers and a Mellanox QDR adapter. There is a Supermicro IB switch connected to eight Supermicro blade servers, each with 64GB of RAM and a 120GB boot SSD. The server blades have QDR IB modules and run Win 2008 Enterprise, and two of them are clustered for a SQL DB. I would like to know the ideal OS for the SAN and which protocol to use for optimal storage network throughput and throughput between the blades: SRP over IB, iSCSI Extensions for RDMA (iSER), NFS over RDMA, RoCE, or SMB2. What storage throughput can we expect out of the entire solution, 20GB++...? Please help.
     
  29. archenroot

    archenroot n00bie

    Messages:
    2
    Joined:
    Dec 25, 2012
    Thanks, all, for this discussion. Since QDR switches are still a little bit expensive, I will start with a direct connection using just the cards. This is not the only place where I've seen that CPU performance plays a crucial role in the overall speed when using TCP over Infiniband.