IBM ServeRAID M1015 Solaris JBOD?

OK, I finally got my system put together: a Xeon E3-1220 on a Supermicro X9SCM-F motherboard, with 3 IBM M1015 cards, 1 Dell XR997 10 GbE Ethernet card, and 4x4GB of Kingston ECC RAM.

ESXi 5.0 is installed, with the 3 M1015s in passthrough to OI. I disabled the Option ROM for all cards, including the XR997. Note: the XR997 does not work at all if I put it in the x8 slot closest to the CPU - the lights never come on after boot. ESXi reports the card is there, but it is always Down. Moving it to the adjacent x8 slot makes it work again.

OI is given 8 GB by ESXi. It is configured as 1 CPU with 4 cores.

Some thoughts:

1) I don't know how you guys who didn't flash the M1015s to IT mode keep your systems stable. When I reboot OpenIndiana 151, 50% of the time I get the same kernel panics that other people have reported with the M1015s on the default IBM firmware. I am using the imr_sas 3.03 drivers from LSI's website.

So I don't know what you guys (piglover, odditory) are doing to keep your systems stable with the 3.03 imr_sas drivers. Maybe you rarely reboot your servers and just don't notice it? Or do you have special access to secret drivers LSI has given you? Or maybe you are using older motherboard hardware that is more stable?

EDIT: Hmmm, it could be that I need the latest M1015 firmware - I haven't checked yet, but the cards were pulled from systems more than a year ago, so whatever is on them is probably old. Could this be it? Has IBM made any reference to firmware fixes for similar problems?
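In case it's useful for comparison: with the stock MegaRAID-style firmware still on the card, I believe something like this should print the firmware package level (assuming LSI's MegaCli utility is installed in the OI guest - the binary may be named MegaCli or MegaCli64 depending on the package; I haven't run it on this box yet):

# dump adapter info and pick out the firmware-related lines
./MegaCli64 -AdpAllInfo -aALL | grep -i fw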

2) HDD spindown on the M1015s does NOT work when using power.conf and a device-thresholds line. Even disabling the fmd service, as recommended in the all-in-one PDF on napp-it's site, does not help.

According to this post, it IS possible to get HDD spindown, but he flashed to IT mode first; power.conf can then be used:

http://forums.overclockers.com.au/showpost.php?p=13320952&postcount=132
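To be concrete about 2), this is roughly the setup I'm trying. The device path below is just a placeholder (I take the real physical paths from the output of format), and none of it gets the disks to spin down for me:

# /etc/power.conf - one line per disk, path copied from format's output
device-thresholds <physical-device-path-from-format> 30m

# re-read /etc/power.conf
pmconfig

# and the fmd disable suggested in the napp-it PDF
svcadm disable -t svc:/system/fmd:default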

3) VMXNET3 does not work. On OI bootup, I see those same "getcap" messages on the console, and then a final "detach" message.

Again, what magic is done to get VMXNET3 to at least stay attached?
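For what it's worth, this is how I'm checking whether the vmxnet3s driver is even staying attached after those console messages (plain OI commands, nothing exotic):

# is the vmxnet3s module loaded at all?
modinfo | grep -i vmxnet

# does the interface show up as a link?
dladm show-link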

This reminds me of those overclockers who can always get 3x the GHz on any CPU they get, while everyone else (mortals) can only get 1.5x overclocks! :p
 
That spindown problem is what I've read about the most, and it's what keeps me from flashing to IT... Looking forward to replies to this thread.

I just read in the German hardwareluxx forum that the disks have to be addressed using the /scsi_vhci/ path in power.conf to get it working [1].
There is also a comment about editing the scan interval of the FMD service in /usr/lib/fm/fmd/plugins/disk-transport.conf.

[1] http://www.hardwareluxx.de/community/f101/zfs-stammtisch-570052-55.html#post17764771
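If I read that thread correctly, the edit is something along these lines - I haven't tried it myself, so treat the property name as an assumption and check the comments in the file first:

# /usr/lib/fm/fmd/plugins/disk-transport.conf
# lengthen the polling interval so fmd doesn't wake the disks as often (value is an example)
setprop interval 24h

# restart the fault manager so it picks up the change
svcadm restart svc:/system/fmd:default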
 
Hey hotzen - thanks for that link!

I flashed my M1015s with the IT firmware and now spindown works. Oddly, there seems to be a default spindown setting (30 minutes? I am not sure) after flashing. I could not find a spindown setting in the LSI BIOS configuration utility.

I can confirm that I can change the spindown times by editing power.conf as described here:

http://www.nexenta.org/boards/1/topics/105

It also uses the /scsi_vhci path as shown by the format command, but it is very important that autopm is set to enable. If you use napp-it, it will set autopm to default, so watch out for that!
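For anyone else trying this, my /etc/power.conf ends up looking roughly like the snippet below. The disk paths here are made-up placeholders - use the /scsi_vhci/... physical paths exactly as format prints them for your own disks:

# /etc/power.conf (excerpt)
autopm enable
device-thresholds /scsi_vhci/disk@g5000c5001234abcd 30m
device-thresholds /scsi_vhci/disk@g5000c5001234abce 30m

# apply the changes
pmconfig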

At least there are no more random kernel panics when booting OI 151, since it is no longer using the 3.03 imr_sas drivers!
 
Good to know, thanks for that.

Perhaps you could disable the M1015's BIOS to kill the default 30 minutes? But I am not sure about the impact of disabling the BIOS...
 
hotzen, disabling the Option ROM for the card in the motherboard's BIOS causes a problem where my 10 GbE card (Dell XR997) does not appear properly to the system. For some reason, it has a higher chance of not being enabled unless I enable the Option ROM for at least one of the M1015s! It's very strange. Once the 10 GbE card refuses to enable itself, I have to take the card out, power cycle the system once, turn the system off, and then put the 10 GbE card back in!

I'll need to do more tests now to see if using a timeout greater than 30 minutes in power.conf works.
 
Hi guys, I've run into some problems with an M1015 with IT firmware. Let's start with some background info:

-Core 2 Quad Q9550
-Asus P5Q-E
-8GB ram
-rpool: 320G hdd
-pool 1: 4x 640G raidz on ICH10R
-pool 2: 8x 2TB raidz in 2 vdev on M1015

The system is running native (not virtualized) Solaris 11 Express with napp-it. I started out running the card with the default IR firmware, and everything ran without issue. Then I read about the ability to flash the card to IT firmware, so I thought I'd give it a whack.

I gathered all the tools, ran through the .bat files and steps, and everything looked absolutely fabulous. Then I noticed a problem: one of the 2TB drives would report hardware errors for the device (the S:X H:X T:X counters, where the H:X value would be in the hundreds). If I put a heavy load on the pool, read or write, or if I ran a scrub on the pool, the errors would rapidly increase. The counter would reset on system restart, then climb back up again once the pool was accessed.
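For reference, I'm watching the counters with plain iostat - I assume these are the same Soft/Hard/Transport counts that napp-it summarizes as S:X H:X T:X:

# per-device error counters (Soft/Hard/Transport) plus vendor/model info;
# re-run it during a scrub to watch the hard-error count climb on that one disk
iostat -En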

I thought I might have a bad SFF-8087 -> SATA cable, so I swapped it with another, but the problem did not go away. I thought it might be that specific drive causing the problem, so I swapped which SATA connector was connected to which device; that didn't help. Swapping the drive out for a cold spare didn't help either. I've noticed that it is always the highest-numbered device that reports the error (say the devices are c9t0d1 through c9t7d1; it is always c9t7d1).

Now, I don't have another M1015 on hand to check whether it's only mine that doesn't like to be flashed to IT, so I'm wondering if any of you guys have run into this before. I thought the errors might have scrambled the test data I put into the pool, but when I compared it against the control data, it was bit-accurate. So perhaps it is a false alarm?

In the end, I flashed the card back to the latest IR firmware, and it has been running rock solid since then. But I would still love to get to the bottom of why the card didn't take to the IT firmware, just in case some new ZFS feature comes along that won't play with IR firmware at all.

Looking forward to hearing what you guys think.
 
Have you inspected the connector on the card? Maybe a pin is bent or dirty? Yeah, it's a long shot, but it could just be a flaky connection on that part of the connector.

Also, it isn't clear to me: early in your post it sounds like you successfully flashed the card to IT mode, but later you said it didn't take the flash? I am confused.
 
I have checked the two SFF connectors on the card; I didn't see any problems with the pins, and I've also cleaned them with contact cleaner. I've done the same for the cables. But I think the best evidence that it isn't a physical connection problem is that it runs perfectly fine in IR mode.

The card was successfully flashed to IT mode, but once I had completed the testing and run into the errors, I flashed it back to the 9240-8i firmware to check whether the same errors occurred, and I have left it like that ever since.

Thanks for your suggestions!
 
Wow, that is concerning -- I hope that doesn't happen on my M1015s... (flashed to IT as per a post on lime-technology.com)
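For anyone who hasn't seen that guide, the sequence is roughly the one below, run from a DOS boot stick. This is from memory, so treat the exact file names as assumptions and double-check them against the guide and the files in the download package:

# wipe the IBM SBR and clear the existing flash
megarec -writesbr 0 sbrempty.bin
megarec -cleanflash 0

# reboot, then flash the 9211-8i IT firmware
# (add -b mptsas2.rom only if you want the boot ROM / BIOS on the card)
sas2flsh -o -f 2118it.bin

# restore a SAS address (use the one printed on the card's sticker)
sas2flsh -o -sasadd 500605bxxxxxxxxx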
 
I am pretty new to UNIX..

Can someone give me a step-by-step guide to installing the LSI 3.03 driver for the 9240-8i? I'm currently stuck at a live USB stick, and I'm not sure if there is a way to "inject" the driver during the full install or what...

Thanks in advance!
 