Infiniband Help

402blownstroker

[H]ard|DCer of the Month - Nov. 2012
Joined
Jan 5, 2006
Messages
3,242
I picked up a pair of Mellanox 4X dual-port HCA cards that are rebranded HP 483513-B21 cards. For the life of me I can not find a Server 2008 R2 driver for them. I found the 'generic' Windows driver package for them on the Mellanox site, but after installing the driver the card is still not recognized. The HP site only has firmware :( Does anyone know where a driver is, or has anyone been able to get one of these cards to work in Server 2008?
 
Thanks for the links. From the Mellanox site, version 4.7 did not find the adapter with either driver. The OpenFabrics driver did find the adapter, but it generated an error when initializing it. Do cables have to be connected to the adapter for it to work correctly? Still new to the whole InfiniBand technology.
 
OK, I figured out why the driver is not installing. Device Manager shows a "PCI Simple Communications Controller" with a "No Driver Found" error. What the hell does that mean?

Is it an issue with a motherboard driver, the motherboard, or the card? I tried swapping the card out for another one and got the same thing. Both cards are new..... from crapBay. Maybe that is my issue.
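
Next thing I am going to try is pulling the hardware IDs off whatever is erroring out, to see whether the Mellanox INF even lists this card. If I have the WMI query right, something like this should dump them (Device Manager -> Properties -> Details -> Hardware Ids shows the same info):

Code:
> wmic path Win32_PnPEntity where "Status='Error'" get Name,DeviceID,HardwareID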
 
I found the install guide on the Mellanox site, but there is no troubleshooting section :(

Guide

I have also tried updating the firmware, and that errors out too: Firmware updating

Code:
> mst status
MST devices:
------------
mt25218_pci_cr0
mt25218_pciconf0

> flint -d mt25218_pci_cr0 -i fw-25218-5_3_000-483513-B21_A1.bin burn
-E- Cannot open Device: mt25218_pci_cr0. Invalid Image signature.

> vstat
ib_open_al failed status = IB_ERROR

I am thinking now I should have just gone the 10GbE route. The old 'save a few bucks and rebuy everything' game :(
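
Before I give up on it, though, I am going to query both the card and the image file, to check that the tools can actually talk to the device and that the image PSID matches the board. This is the flint workflow as I understand it from the MFT docs, so the exact flags may be off for this older tool version:

Code:
> flint -d mt25218_pci_cr0 query
> flint -i fw-25218-5_3_000-483513-B21_A1.bin query
> flint -d mt25218_pci_cr0 -i fw-25218-5_3_000-483513-B21_A1.bin burn

If the query of the .bin fails too, then the image itself is probably the problem (bad download or wrong format); if it looks fine, then it is more likely the tool just can't open the card.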
 
I have no idea what the hell it is referring to??!?!?

This is what I was referring to:

pci.jpg (screenshot of the Device Manager entry)
 
I am not seeing that entry in Device Manager. The only thing showing up in Device Manager with a question mark is the InfiniBand card. Everything else is assigned a working driver.
 
Which Mellanox driver did you try installing?

MLNX WinOF Ethernet v4.70 or MLNX WinOF v4.70.50000?

Which version of the OpenFabrics driver (OFED_3-2_win7_x64.zip)?
 
I will get those answers in a bit, guys, as I will have to boot Windows again. Funny thing is, the card just works in Linux pretty much right out of the box.

Code:
> yum install infiniband-diags perftest qperf opensm
> systemctl start rdma
> systemctl start opensm
> ifconfig ib0 10.0.0.50/24
> ping 10.0.0.53
PING 10.0.0.53 (10.0.0.53) 56(84) bytes of data.
64 bytes from 10.0.0.53: icmp_seq=1 ttl=64 time=0.041 ms
64 bytes from 10.0.0.53: icmp_seq=2 ttl=64 time=0.022 ms
64 bytes from 10.0.0.53: icmp_seq=3 ttl=64 time=0.020 ms
64 bytes from 10.0.0.53: icmp_seq=4 ttl=64 time=0.035 ms
^C
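
For anyone else trying this, the infiniband-diags package installed above should let you sanity-check the link from either end once opensm is up (at least this is my understanding of the tools):

Code:
> ibstat                # port State should show Active and Physical state LinkUp
> ibhosts               # both HCAs should be listed on the fabric
> ibping -S             # run on one node to start the ibping responder
> ibping -G 0x...       # run on the other node, using the port GUID reported by ibstat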
 
The Windows drivers have been a bit hit or miss in my experience. I recall having to modify INF files and use various combinations of drivers from different packages to make my Mellanox HBAs work on Windows 7 before there was official Windows 7 support. These older cards are a bit of a mess, but I did find that support from Mellanox was excellent. At one point I ended up e-mailing directly with one of their engineers.
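
The INF edit itself was basically just adding the card's hardware ID (from Device Manager -> Details -> Hardware Ids) to the models section of the driver INF. Going from memory, the general shape is something like this; the section and string names here are made up, and the DEV value has to come from your actual card:

Code:
; hypothetical fragment -- copy the real section/string names from the Mellanox INF
[Mellanox.NTamd64]
%MT25218.DeviceDesc% = MT25218_Install, PCI\VEN_15B3&DEV_xxxx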
 
I guess when I have to boot Windows I will just have to be happy with 1Gb connectivity.
 
OK, I have been able to play around with the setup under Linux. The basic setup is that each machine has a 483513-B21 card, and the cards are connected directly with a 10Gb/s CX4 SFF-8470 cable. I have set the MTU to 65520, which is the highest value supported by the cards. I am seeing rather low latency, but the bandwidth is around 120MB/s using ftp to transfer a large 8GB file. Both machines are using a ramdisk for the source and target directories of the transfer. With a 1Gb connection using Cat6 cable and an MTU of 9000, I see about a 112MB/s transfer rate. I was kind of expecting the InfiniBand to have a lot more bandwidth. Are there any other settings that need to be set? Is the card the issue?
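
One thing I still need to double-check myself: from what I have read, IPoIB only allows the 65520 MTU when the interface is in connected mode, so I want to make sure ib0 did not stay in datagram mode with a much smaller effective MTU:

Code:
> cat /sys/class/net/ib0/mode      # should say "connected"; datagram mode caps the MTU far lower
> echo connected > /sys/class/net/ib0/mode
> ifconfig ib0 mtu 65520
> ifconfig ib0 | grep -i mtu       # confirm the MTU actually took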
 
The OpenFabrics software fixed that for me; however, I had stability issues with it, so in the end I gave up and went back to 2 x 1Gb links (it was only lab stuff.. but it was really good when it worked..)

That was on Server 2008 (not R2).
 
OK, I followed a fair number of things from this post: Monster ZFS Build. Using iperf, it is reporting:

Code:
------------------------------------------------------------
Server listening on TCP port 5555
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.100.50 port 5555 connected with 192.168.100.50 port 50300
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  14.3 GBytes  12.3 Gbits/sec

Definitely an improvement. But using sftp or going across NFS I am still seeing only about a 120MB/s transfer rate. Shouldn't it be closer to 1.5GB/s? Or is there a limitation of sftp and/or NFS? Would enabling/using RDMA with NFS improve things?
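
If it would, my understanding of the rough shape of NFS over RDMA is below (untested on my end; /export and /mnt/ib are just placeholders, and the server address is whichever box exports the share):

Code:
# on the NFS server
> modprobe svcrdma
> echo rdma 20049 > /proc/fs/nfsd/portlist
# on the client
> modprobe xprtrdma
> mount -t nfs -o rdma,port=20049 10.0.0.50:/export /mnt/ib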
 
Another follow-up question. Currently I have the two machines connected together using a single cable, no switch. The issue is, when one of them reboots, the other one loses its IP settings: no IP address, netmask, etc. Is this because they are connected directly with a cable? If I add in a switch, would the settings persist when the other one reboots?
 
Yes, with a switch the settings will persist through a reboot.

Oh, and an FYI on the HP HBA card you are using: I have two of the same cards. When I could get them to work in Windows 7 or Server 2008 they were really slow and would then cause blue screens. I even loaded a Linux VM on Hyper-V and got better performance than I did using Windows 2008 directly. How sad is that.
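
One other thought: since the address earlier in the thread was set with a one-off ifconfig command, it is probably just getting dropped whenever the interface resets. If these are RHEL/CentOS-style boxes (guessing from the yum/systemctl commands above), a persistent ifcfg file should bring ib0 back up with its settings on its own, switch or not. Something like:

Code:
# /etc/sysconfig/network-scripts/ifcfg-ib0  (adjust the address per machine)
DEVICE=ib0
TYPE=InfiniBand
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.50
PREFIX=24
CONNECTED_MODE=yes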
 