MegaRaid SAS-9280 not compatible with SuperMicro SAS2 expander backplanes?

pclausen

I have a SAS-9280-4i4e RAID controller sitting in a Supermicro 826 chassis, connected internally to a BPN-SAS2-826EL1 backplane. The 9280 is running the latest firmware from Avago's site (dated August 2015). The backplane is running the 55.14.18.0 firmware from June 2013, which is the latest I could find.

The controller is seeing the 12 disks on the backplane just fine, but I keep getting the following error message every 10 seconds in the MegaRAID Storage Manager console:

Controller ID: 0 Unexpected sense: PD = 70-Invalid command operation code, CDB = 0x1c 0x01 0x01 0x00 0x20 0x00, Sense = 0x70 0x00 0x05 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x20 0x00 0x00 0x02 0x00 0x00

avago-02.PNG


It does not appear too serious, since it is logged at the Information level. However, the Dashboard shows that I have no Enclosures and no Backplanes. Also, on reboot, I get a critical alert stating that I have an Unstable Enclosure.

avago-05.PNG


I proceeded to connect my 2 external Supermicro 846 chassis (with BPN-SAS2-846EL1 backplanes also running firmware 55.14.18.0) that are daisy-chained together since the 9280 only has a single external SFF-8088 connector.

Anyway, after adding those, I'm now getting entries for each of those backplanes in the log as well:

Controller ID: 0 Unexpected sense: PD = 83-Invalid command operation code, CDB = 0x1c 0x01 0x01 0x00 0x20 0x00, Sense = 0x70 0x00 0x05 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x20 0x00 0x00 0x02 0x00 0x00

And:

Controller ID: 0 Unexpected sense: PD = 108-Invalid command operation code, CDB = 0x1c 0x01 0x01 0x00 0x20 0x00, Sense = 0x70 0x00 0x05 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x20 0x00 0x00 0x02 0x00 0x00

I went ahead and built my 60-disk RAID 60 array (5 spans with 12 disks each):

avago-07.PNG


The init is chugging along and is close to 50% done this morning after starting it last night.
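
As a quick sanity check on the layout described above (5 RAID 6 spans of 12 disks each), the arithmetic can be sketched in a few lines of Python. The function names are made up for illustration, and the 2 TB disk size is an assumption used only to get a round number:

```python
# Rough capacity arithmetic for a RAID 60 array built from RAID 6 spans.
PARITY_PER_SPAN = 2  # RAID 6 gives up two disks' worth of capacity per span


def raid60_data_disks(spans, disks_per_span):
    """Number of disks' worth of usable capacity across all spans."""
    if disks_per_span <= PARITY_PER_SPAN:
        raise ValueError("a RAID 6 span needs more than two disks")
    return spans * (disks_per_span - PARITY_PER_SPAN)


def raid60_usable_tb(spans, disks_per_span, disk_tb):
    """Usable capacity in decimal TB, before any filesystem overhead."""
    return raid60_data_disks(spans, disks_per_span) * disk_tb


# 5 spans x 12 disks = 60 disks in total.
print(raid60_data_disks(5, 12))
# Assumed disk size of 2 TB (an assumption, for illustration only).
print(raid60_usable_tb(5, 12, 2.0))
```

With these numbers, 50 of the 60 disks carry data and 10 disks' worth of capacity goes to parity.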

So should I be worried about the log entries that are coming in every 10 seconds, or about the critical error about "Unstable Enclosures" (I have 2 of them now since adding the external ones)?

Is there a newer firmware available for the Supermicro Expander backplanes that will correct this?

Thanks
 
You are getting a SCSI sense error in response to command 0x1C from each of the enclosures. Here is a link to the Intel decode document, which utilizes an LSI codebase.
The specific error you are seeing (with the same CDB and sense data from each of the enclosures) is for opcode 0x1C, RECEIVE DIAGNOSTIC RESULTS, which is listed as optional. Most likely your controller is receiving an SES code from the enclosures that the card does not have in its table.
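
For reference, the CDB and sense bytes from the log entries can be decoded by hand. The following is a minimal Python sketch, assuming fixed-format sense data (response code 0x70). It uses only the byte values quoted in the log; the lookup tables are deliberately tiny and the function names are made up for illustration:

```python
# Decode the 6-byte CDB and the fixed-format sense data from the log entry.
# The lookup tables only cover the values that appear in this thread.
OPCODES = {0x1C: "RECEIVE DIAGNOSTIC RESULTS"}
SENSE_KEYS = {0x05: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x20, 0x00): "INVALID COMMAND OPERATION CODE"}


def parse_hex_bytes(text):
    """Turn a string like '0x1c 0x01 ...' into a list of ints."""
    return [int(token, 16) for token in text.split()]


def decode_cdb(cdb):
    """Decode a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB."""
    return {
        "opcode": OPCODES.get(cdb[0], hex(cdb[0])),
        "pcv": cdb[1] & 0x01,                  # page code valid bit
        "page_code": cdb[2],
        "alloc_len": (cdb[3] << 8) | cdb[4],   # big-endian 16-bit length
    }


def decode_fixed_sense(sense):
    """Decode fixed-format sense data (response code 0x70 or 0x71)."""
    if sense[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = sense[2] & 0x0F
    asc, ascq = sense[12], sense[13]
    return {
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "additional": ASC_ASCQ.get((asc, ascq), f"ASC {asc:#04x} ASCQ {ascq:#04x}"),
    }


cdb = parse_hex_bytes("0x1c 0x01 0x01 0x00 0x20 0x00")
sense = parse_hex_bytes(
    "0x70 0x00 0x05 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 "
    "0x00 0x00 0x20 0x00 0x00 0x02 0x00 0x00"
)
print(decode_cdb(cdb))
print(decode_fixed_sense(sense))
```

Read this way, the log entry is a request for diagnostic page 0x01 with a 32-byte allocation length, answered with sense key Illegal Request and additional sense "Invalid command operation code", which matches the text in the log message itself.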
 
The backplane is running the 55.14.18.0 firmware from June 2013, which is the latest I could find.

Did you upgrade the expander to that firmware? If so, did you get the firmware from an FTP site and NOT Supermicro? Did you also upgrade the MFG?

What I've found is that that FTP site is offering some Supermicro backplane firmwares that were sent to a specific 826EL1 user and not meant for public distribution. The firmware could be for a specific revision of the backplane only, and you're lucky that you actually have an 826 -- another user on this forum flashed those MFGs on his 836 and now can only see 12 ports instead of 16!

Short of disabling SES2, I would email Supermicro with the problem and request the latest firmware/MFG for your backplane revision. If they can't solve the problem, talk to LSI -- it's their SAS2x36 expander chip and should be fully compatible with their own controllers.
 
Yes, I did upgrade the backplanes on both of my 846 expanders and the one in the 826 using the FW_55.14.18.0.zip file from the ftp site. It included a Release note file containing the following:

--- cut ---
Super Micro Computer Inc.

Firmware
FW 55.14.18.0
Release Note

June 2013

Expander Backplanes being affected:
1. BPN-SAS2-213EL1
2. BPN-SAS2-216EL1 and BPN-SAS2-216EL2
3. BPN-SAS2-826EL1 and BPN-SAS2-826EL2
4. BPN-SAS2-836EL1 and BPN-SAS2-836EL2
5. BPN-SAS2-846EL1 and BPN-SAS2-846EL2
6. BPN-SAS2-847EL1 and BPN-SAS2-847EL2
7. BPN-SAS2-936EL1 and BPN-SAS2-936EL2
8. BPN-ADP-E16-L (alias SC217)

Bug fix / Improvements:
1. FW 55.14.11.0 set number of power supplies to zero in Server configuration, and set to two in JBOD configuration.
The SES-2 diagnostic page length will therefore be variable. It is found that some application software cannot
support this, so FW 55.14.18.0 set number of power supplies to two in all circumstances while power supplies status
will be shown as "not applicable" in Server configuration.

2. FW 55.14.11.0 allow primary and secondary expanders to monitor all SES elements at same time for HA application.
However, it has a FW bug that only backplanes with certain SAS address could support this. FW 55.14.18.0 get this
fixed.

3. Fix incorrect cross-reference link in few SES-2 diagnostic pages.

4. Fix compatibility issue with Adaptec HBA / RAID controllers.
--- cut ---

So it appears to be applicable to the expanders in both my 826 and 846 chassis.

EDIT: I just realized that the zip file within the main zip file is called sc826.zip and contains the following:

Code:
02/05/2013  12:34 AM           269,172 sas2xfw_55.14.18.0.fw
02/14/2013  04:30 PM            11,680 sc826mfg_primary_fw55_14_18_0_mfg_2_38.bin
02/14/2013  04:30 PM            11,616 sc826mfg_primary_fw55_14_18_0_mfg_2_38_nofan.bin
02/14/2013  04:30 PM            11,684 sc826mfg_secondary_fw55_14_18_0_mfg_2_38.bin
02/14/2013  04:30 PM            11,620 sc826mfg_secondary_fw55_14_18_0_mfg_2_38_nofan.bin

So yeah, let me contact SuperMicro about firmware files.

A friend of mine has an LSI-9266-8i in an 826 chassis that already had the 55.14.11.0 firmware on the expander. He's not seeing the issue I am.

I did contact Avago about this issue and they had me run this "LSIget" batch file, which created a .7z file containing tons of information about my expanders, HBA and MegaRAID hardware.

Among the info it captured was the following:

Code:
            Expander: SAS2X36 (SAS2x36) B3
         SAS Address: 50030480:00EC717F
Enclosure Logical Id: 50030480:0000007F
          IP Address: 0.0.0.0
Component Identifier: 0x0223
  Component Revision: 0x05

	Product Id			:	02
	Platform Id			:	55
	FW Version			:	55.04.18.00
	MFG Version			:	01.01
	Platform Version		:	02.27
	Product Name			:	LSI SAS2X36         
	Platform Name			:	SC846EL2

Code:
            Expander: SAS2X36 (SAS2x36) B3
         SAS Address: 50030480:01A2137F
Enclosure Logical Id: 50030480:01A2137F
          IP Address: 0.0.0.0
Component Identifier: 0x0223
  Component Revision: 0x05

	Product Id			:	02
	Platform Id			:	55
	FW Version			:	55.07.23.00
	MFG Version			:	02.27
	Platform Version		:	01.01
	Product Name			:	LSI SAS2X36         
	Platform Name			:	SC846EL2

Code:
            Expander: SAS2X28 (SAS2x28) B3
         SAS Address: 50030480:015852FF
Enclosure Logical Id: 50030480:015852FF
          IP Address: 0.0.0.0
Component Identifier: 0x0221
  Component Revision: 0x05

	Product Id			:	02
	Platform Id			:	55
	FW Version			:	55.07.23.00
	MFG Version			:	02.29
	Platform Version		:	01.01
	Product Name			:	LSI SAS2X28         
	Platform Name			:	SC826EL2

Hmm, looking at the above, I'm realizing that not all 3 expanders are running 55.04.18.00 which is very odd since I validated the version after the upgrade. Will need to look into that. I'll also have my friend with the 9266 run this on his to see what he gets.

The Supermicro xflash command is showing 55.14.18.00 on all the expanders:

Code:
c:\xflash>xflash -i get avail

********************************************************************************
    Xflash
    LSI SAS Expander Flash Utility
    Version: 7.0.0.0
    Copyright (c) 2010 LSI Corporation.  All rights reserved.
********************************************************************************

Initializing Interface.
Expander: SAS2x28

1) SAS2x28 (50030480:015852FF)  (0.0.0.0)
2) SAS2x36 (50030480:01A2137F)  (0.0.0.0)
3) SAS2x36 (50030480:00EC717F)  (0.0.0.0)

C:\xflash>xflash -i 50030480015852FF get ver

Initializing Interface.
Expander: SAS2x28

Firmware Region Version: 55.14.18.00

C:\xflash>xflash -i 5003048001A2137F get ver

Initializing Interface.
Expander: SAS2x36

Firmware Region Version: 55.14.18.00

C:\xflash>xflash -i 5003048000EC717F get ver

Initializing Interface.
Expander: SAS2x36

Firmware Region Version: 55.14.18.00

C:\xflash>

Go figure...
 
So it appears to be applicable to the expanders in both my 826 and 846 chassis.

You'd think that, right? And the firmware most probably is, although I know for a fact (in spite of all the cross-flashing) that there is no reference binary firmware. LSI provides an SDK to the OEM who builds the firmware. They can use the default configuration (which most appear to do), but they can also customize the firmware per backplane/enclosure/card/whatever. The MFG on the other hand is a configure-at-boot region of flash which is most definitely model-specific.

Did you know that the SAS2x chips feature a TCP/IP stack and a 10/100 Ethernet MAC, giving OEMs the option of adding a PHY to allow telnet/SSH access to the expander for diagnostics/management? That appears to be one of the many "mix-and-match" firmware options.

I realized later that xflash (at least on Linux) lets you "upload" (back up) existing firmware and MFG to a file. Those files are padded, meaning you may need to cut off a few bytes at the end to match the size of OEM-provided firmware, but I verified that they can then be successfully flashed to restore the original version.

Go figure...

Did you flash both "regions" on all expanders? SAS2x has one region (1) for "active" firmware and another (2) for "updated" firmware. It may be that the LSIget utils are reading one region and xflash.exe is reading the other.

Try xflash -i SAS_ADDRESS get ver 1 and xflash -i SAS_ADDRESS get ver 2
 
Also, to add to your original post, what's happening is that the controller is sending the expander the RECEIVE DIAGNOSTIC RESULTS SCSI command, requesting that it return diagnostic page 1 (the Configuration diagnostic page). The expander is returning an error instead, causing the warning.

Here's an overview of the configuration diagnostics page:
The Configuration diagnostic page returns information about the enclosure, including the list of elements in the enclosure. The element list shall include all elements with defined element status or controls and may list any other elements in the enclosure. The Configuration diagnostic page provides enclosure descriptor information and parameters. The Configuration diagnostic page may provide descriptive text identifying element types in more detail.

The Configuration diagnostic page is read by the RECEIVE DIAGNOSTIC RESULTS command with a PCV bit set to one and a PAGE CODE field set to 01h.
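
To make the PCV and PAGE CODE fields concrete, here is a minimal Python sketch that builds the 6-byte CDB for RECEIVE DIAGNOSTIC RESULTS. The function name is made up for illustration; the byte layout (opcode 1Ch, PCV in bit 0 of byte 1, page code in byte 2, a big-endian 16-bit allocation length in bytes 3 and 4) is the one that matches the CDBs shown in this thread:

```python
def receive_diagnostic_results_cdb(page_code, alloc_len, pcv=True):
    """Build a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB."""
    if not 0 <= page_code <= 0xFF:
        raise ValueError("page code must fit in one byte")
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in two bytes")
    return bytes([
        0x1C,                      # operation code
        0x01 if pcv else 0x00,     # PCV (page code valid) bit
        page_code,                 # PAGE CODE, e.g. 01h for Configuration
        (alloc_len >> 8) & 0xFF,   # allocation length, high byte
        alloc_len & 0xFF,          # allocation length, low byte
        0x00,                      # control byte
    ])


# Configuration diagnostic page with a 32-byte buffer.
print(receive_diagnostic_results_cdb(0x01, 0x20).hex(" "))
# Supported Diagnostic Pages page with the largest possible buffer.
print(receive_diagnostic_results_cdb(0x00, 0xFFFF).hex(" "))
```

The first call reproduces the CDB from the MegaRAID log entries (0x1c 0x01 0x01 0x00 0x20 0x00), so the controller is asking each enclosure for exactly this Configuration page.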

Can you run Linux (even in Live Demo mode) on the system? You'd then have access to the sg_ses utilities allowing direct querying of the expander, although if the 9280 doesn't expose the expander to the system, this may require an HBA instead.
 
Did you flash both "regions" on all expanders? SAS2x has one region (1) for "active" firmware and another (2) for "updated" firmware. It may be that the LSIget utils are reading one region and xflash.exe is reading the other.

Try xflash -i SAS_ADDRESS get ver 1 and xflash -i SAS_ADDRESS get ver 2

So doing a get ver 0 and get ver 1 returns 55.14.18.00 for all 3 expanders.

Get ver 2 returns a "Flash region is blank"

Get ver 3 returns the following values:

Code:
SAS2x28 (50030480:015852FF)

Reading the MFG Version Info page.......

        Product Id                      :       02
        Platform Id                     :       37
        FW Version                      :       37.07.17.00
        MFG Version                     :       02.1d
        Platform Version                :       01.01
        Product Name                    :       LSI SAS2X28
        Platform Name                   :       SC826EL2

Code:
SAS2x36 (50030480:01A2137F)

Reading the MFG Version Info page.......

        Product Id                      :       02
        Platform Id                     :       37
        FW Version                      :       37.07.17.00
        MFG Version                     :       02.1b
        Platform Version                :       01.01
        Product Name                    :       LSI SAS2X36
        Platform Name                   :       SC846EL2

Code:
SAS2x36 (50030480:00EC717F)

Reading the MFG Version Info page.......

        Product Id                      :       02
        Platform Id                     :       37
        FW Version                      :       37.04.12.00
        MFG Version                     :       01.01
        Platform Version                :       02.1b
        Product Name                    :       LSI SAS2X36
        Platform Name                   :       SC846EL2

So not sure what to make of that.

Did you know that the SAS2x chips feature a TCP/IP stack and a 10/100 Ethernet MAC, giving OEMs the option of adding a PHY to allow telnet/SSH access to the expander for diagnostics/management? That appears to be one of the many "mix-and-match" firmware options.
I figured as much since the xflash utility has an option to access via ip address. Pretty cool.

Can you run Linux (even in Live Demo mode) on the system? You'd then have access to the sg_ses utilities allowing direct querying of the expander, although if the 9280 doesn't expose the expander to the system, this may require an HBA instead
Sure, I can boot Linux from a USB drive if needed. And I do have an LSI SAS9200-8e HBA I can use to access the external enclosures if needed. As for the 826 enclosure, the mobo (Supermicro X8DT6-F) has an embedded LSI SAS9211-8i, so I'm good there as well.
 
I found this in the Supermicro faq:

Code:
Region 0 : boot region 

1 : active region 

2 : update region 

3 : MFG region 

Please use the active (1) or update (2) region to find the firmware version.

So it would appear that both the boot region and the active region contain the most current firmware version after all. I suspect that the MFG region is "hardcoded" and holds the original firmware version from the time of manufacture, which can't be changed.
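
The region numbering from the FAQ can be turned into a small helper that prints the version queries to run against each expander. This is only a sketch: the `get ver <region>` syntax and the SAS addresses are the ones quoted earlier in this thread, while the helper and constant names are made up for illustration.

```python
# Region numbering as given in the Supermicro FAQ quoted above.
REGIONS = {0: "boot", 1: "active", 2: "update", 3: "MFG"}

# Regions the FAQ says to use when checking the firmware version.
FIRMWARE_REGIONS = (1, 2)

# SAS addresses of the three expanders as listed by "xflash -i get avail".
EXPANDERS = ["50030480015852FF", "5003048001A2137F", "5003048000EC717F"]


def version_queries(sas_address, regions=FIRMWARE_REGIONS):
    """Build the xflash command lines that query each region's version."""
    return [f"xflash -i {sas_address} get ver {region}" for region in regions]


for address in EXPANDERS:
    for command in version_queries(address):
        print(command)
```

Printing the commands rather than running them keeps the sketch harmless; the output can be pasted into the same prompt used for the earlier xflash session.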
 
So I received feedback from both Supermicro and Avago.

Supermicro confirmed that 55.14.18.00 is the latest firmware. I have asked them to send it to me again so that I can reflash, just in case the one I got from the ftp site is somehow not the correct version.

Avago also confirmed that I'm running the latest driver and the latest firmware. They also state that they see no problems with any individual drive.

They do say that there appears to be a backplane or cable issue happening.

Code:
------
Physical Devices  : 64
  Disks           : 60
------
seqNum: 0x0000565e ; Time: Mon Feb 08 19:21:13 2016 ; Code: 0x000000ba ; Class: 2 ; Locale: 0x04
Event Description: Enclosure PD 6c(c Port 4 - 7/p2) is unstable

Not sure that makes sense to me. I have asked them to tell me which log file they found the above error in so that I can see about tracking it to a specific expander. But if the issue is with a single expander, then that doesn't align with the log messages I'm seeing: 3 distinct messages, one for each backplane.
 
So despite the errors/issues, the RAID 60 array did complete the init with no issues.

I copied about 40TB of data to it, and while it wasn't able to saturate my 10Gig link, it was fairly decent.

copytoraid60.PNG


Going the other way had higher peaks, but fluctuated widely:

copyfromraid60.PNG


So I suspect that is related to whatever issue I'm fighting at the moment.

Copying in both directions at once I got this:

copytoandfromraid60.PNG


My goal is to get to the point where I can max out the 10Gbit link in either direction.

Argon is the server with the LSI 9280 controller and 60 disks in RAID 60.

Brama has an Areca 1882 with 2 RAID 6 arrays with 12 disk members each. (There is also a RAID 0 array with 4 x 840 PRO drives on this server, but I have not done any testing to/from it yet.)
 
I installed CentOS and the sg3_utils package and then ran this command:

Code:
[peter@00-25-90-64-a2-66 ~]$ sudo sg_ses /dev/sda --verbose
    inquiry cdb: 12 00 00 00 24 00
  ATA       Hitachi HDS72202  A3MA
    disk device (not an enclosure)
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
receive diagnostic results:  Fixed format, current;  Sense key: Illegal Request
 Additional sense: Invalid command operation code
Attempt to fetch Supported Diagnostic Pages diagnostic page failed
    Receive diagnostic results command not supported

This is with the built in LSI2008 talking to the 826 expander and a LSI 9200-8e talking to the external 846 expanders.

So the LSI 9280 is completely out of the picture at this point.

lsblk gives me:

Code:
[peter@00-25-90-64-a2-66 ~]$ lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0   1.8T  0 disk
sdb               8:16   0   1.8T  0 disk
sdc               8:32   0   1.8T  0 disk
sdd               8:48   0 111.8G  0 disk
├─sdd1            8:49   0   500M  0 part /boot
└─sdd2            8:50   0 111.3G  0 part
  ├─centos-root 253:0    0    50G  0 lvm  /
  ├─centos-swap 253:1    0  11.2G  0 lvm  [SWAP]
  └─centos-home 253:2    0  50.1G  0 lvm  /home
sde               8:64   0   1.8T  0 disk
sdf               8:80   0   1.8T  0 disk
sdg               8:96   0   1.8T  0 disk
sdh               8:112  0   1.8T  0 disk
sdi               8:128  0   1.8T  0 disk
sdj               8:144  0   1.8T  0 disk
sdk               8:160  0   1.8T  0 disk
sdl               8:176  0   1.8T  0 disk
sdm               8:192  0   1.8T  0 disk
sdn               8:208  0   1.8T  0 disk
sdo               8:224  0   1.8T  0 disk
sdp               8:240  0   1.8T  0 disk
sdq              65:0    0   1.8T  0 disk
sdr              65:16   0   1.8T  0 disk
sds              65:32   0   1.8T  0 disk
sdt              65:48   0   1.8T  0 disk
sdu              65:64   0   1.8T  0 disk
sdv              65:80   0   1.8T  0 disk
sdw              65:96   0   1.8T  0 disk
sdx              65:112  0   1.8T  0 disk
sdy              65:128  0   1.8T  0 disk
sdz              65:144  0   1.8T  0 disk
sdaa             65:160  0   1.8T  0 disk
sdab             65:176  0   1.8T  0 disk
sdac             65:192  0   1.8T  0 disk
sdad             65:208  0   1.8T  0 disk
sdae             65:224  0   1.8T  0 disk
sdaf             65:240  0   1.8T  0 disk
sdag             66:0    0   1.8T  0 disk
sdah             66:16   0   1.8T  0 disk
sdai             66:32   0   1.8T  0 disk
sdaj             66:48   0   1.8T  0 disk
sdak             66:64   0   1.8T  0 disk
sdal             66:80   0   1.8T  0 disk
sdba             67:64   0   1.8T  0 disk
sdam             66:96   0   1.8T  0 disk
sdbb             67:80   0   1.8T  0 disk
sdan             66:112  0   1.8T  0 disk
sdbc             67:96   0   1.8T  0 disk
sdao             66:128  0   1.8T  0 disk
sdbd             67:112  0   1.8T  0 disk
sdbe             67:128  0   1.8T  0 disk
sdap             66:144  0   1.8T  0 disk
sdbf             67:144  0   1.8T  0 disk
sdaq             66:160  0   1.8T  0 disk
sdbg             67:160  0   1.8T  0 disk
sdar             66:176  0   1.8T  0 disk
sdbh             67:176  0   1.8T  0 disk
sdas             66:192  0   1.8T  0 disk
sdbi             67:192  0   1.8T  0 disk
sdat             66:208  0   1.8T  0 disk
sdau             66:224  0   1.8T  0 disk
sdav             66:240  0   1.8T  0 disk
sdaw             67:0    0   1.8T  0 disk
sdax             67:16   0   1.8T  0 disk
sday             67:32   0   1.8T  0 disk
sdaz             67:48   0   1.8T  0 disk

Not sure what to try next.
 
Code:
[peter@00-25-90-64-a2-66 ~]$ sudo sg_ses /dev/sda --verbose
    inquiry cdb: 12 00 00 00 24 00
  ATA       Hitachi HDS72202  A3MA
    disk device (not an enclosure)
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
receive diagnostic results:  Fixed format, current;  Sense key: Illegal Request
 Additional sense: Invalid command operation code
Attempt to fetch Supported Diagnostic Pages diagnostic page failed
    Receive diagnostic results command not supported
This is with the built in LSI2008 talking to the 826 expander and a LSI 9200-8e talking to the external 846 expanders.

That's the expected result when you throw SCSI commands at an ATA device ;)

The expander won't show as an sdX device. modprobe sg if necessary, and see if the expander shows up as an sgNN device. Pass that to sg_ses and see if it gives you a list of supported pages. If so, sg_ses --page=1 /dev/sgNN and see if it responds politely or simply throws a SENSE error back at you.

Try lsscsi and lsscsi -t to see if the expander "enclosure" shows up. If so, it will also be under /dev/bsg/ . Get smp_utils and run smp_discover against the /dev/bsg/ expander device to see if you get a full list of all drives attached (SMP is a different protocol from SES-2).
 
"modprobe sg" returns nothing unfortunately.

However, lsscsi returns the following:

Code:
[peter@00-25-90-64-a2-66 ~]$ lsscsi
[0:0:0:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sda
[0:0:1:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdc
[0:0:2:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sde
[0:0:3:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdf
[0:0:4:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdh
[0:0:5:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdi
[0:0:6:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdj
[0:0:7:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdk
[0:0:8:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdl
[0:0:9:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdm
[0:0:10:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdo
[0:0:11:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdp
[0:0:12:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdq
[0:0:13:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdr
[0:0:14:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sds
[0:0:15:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdt
[0:0:16:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdu
[0:0:17:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdv
[0:0:18:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdw
[0:0:19:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdx
[0:0:20:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdy
[0:0:21:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdaa
[0:0:22:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdab
[0:0:23:0]   disk    ATA      Hitachi HDS72202 A3MA  /dev/sdac
[0:0:24:0]   enclosu LSI CORP SAS2X36          0e12  -
[0:0:25:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdad
[0:0:26:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdae
[0:0:27:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdaf
[0:0:28:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdah
[0:0:29:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdai
[0:0:30:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdaj
[0:0:31:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdak
[0:0:32:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdal
[0:0:33:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdan
[0:0:34:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdao
[0:0:35:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdap
[0:0:36:0]   disk    ATA      ST2000NM0033-9ZM SN04  /dev/sdaq
[0:0:37:0]   disk    ATA      Hitachi HUA72302 A840  /dev/sdat
[0:0:38:0]   disk    ATA      Hitachi HUA72302 A840  /dev/sdav
[0:0:39:0]   disk    ATA      Hitachi HUA72302 A840  /dev/sdaw
[0:0:40:0]   disk    ATA      Hitachi HUA72302 A840  /dev/sday
[0:0:41:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdba
[0:0:42:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbb
[0:0:43:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbc
[0:0:44:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbd
[0:0:45:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbe
[0:0:46:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbf
[0:0:47:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbg
[0:0:48:0]   disk    ATA      Hitachi HDS72302 A5C0  /dev/sdbh
[0:0:49:0]   enclosu LSI CORP SAS2X36          0e12  -
[1:0:0:0]    disk    ATA      SanDisk SDSSDA12 10RL  /dev/sdd
[7:0:0:0]    disk    ATA      Hitachi HDS72202 A3MA  /dev/sdbi
[7:0:1:0]    disk    ATA      Hitachi HUA72302 A840  /dev/sdb
[7:0:2:0]    disk    ATA      Hitachi HUA72302 A840  /dev/sdg
[7:0:3:0]    disk    ATA      ST2000VN000-1H31 SC42  /dev/sdn
[7:0:4:0]    disk    ATA      ST2000VN000-1H31 SC42  /dev/sdz
[7:0:5:0]    disk    ATA      ST2000VN000-1H31 SC42  /dev/sdag
[7:0:6:0]    disk    ATA      ST2000VN000-1H31 SC42  /dev/sdam
[7:0:7:0]    disk    ATA      ST2000VN000-1H31 SC42  /dev/sdar
[7:0:8:0]    disk    ATA      WDC WD200MFYYZ-0 1K01  /dev/sdas
[7:0:9:0]    disk    ATA      ST32000644NS     SN12  /dev/sdau
[7:0:10:0]   disk    ATA      ST32000644NS     SN12  /dev/sdax
[7:0:11:0]   disk    ATA      ST2000NM0011     SN02  /dev/sdaz
[7:0:12:0]   enclosu LSI CORP SAS2X28          0e12  -

And lsscsi -t gives me:

Code:
[peter@00-25-90-64-a2-66 ~]$ lsscsi -t
[0:0:0:0]    disk    sas:0x5003048001a2134c          /dev/sda
[0:0:1:0]    disk    sas:0x5003048001a2134d          /dev/sdc
[0:0:2:0]    disk    sas:0x5003048001a2134e          /dev/sde
[0:0:3:0]    disk    sas:0x5003048001a2134f          /dev/sdf
[0:0:4:0]    disk    sas:0x5003048001a21350          /dev/sdh
[0:0:5:0]    disk    sas:0x5003048001a21351          /dev/sdi
[0:0:6:0]    disk    sas:0x5003048001a21352          /dev/sdj
[0:0:7:0]    disk    sas:0x5003048001a21353          /dev/sdk
[0:0:8:0]    disk    sas:0x5003048001a21354          /dev/sdl
[0:0:9:0]    disk    sas:0x5003048001a21355          /dev/sdm
[0:0:10:0]   disk    sas:0x5003048001a21356          /dev/sdo
[0:0:11:0]   disk    sas:0x5003048001a21357          /dev/sdp
[0:0:12:0]   disk    sas:0x5003048001a21358          /dev/sdq
[0:0:13:0]   disk    sas:0x5003048001a21359          /dev/sdr
[0:0:14:0]   disk    sas:0x5003048001a2135a          /dev/sds
[0:0:15:0]   disk    sas:0x5003048001a2135b          /dev/sdt
[0:0:16:0]   disk    sas:0x5003048001a2135c          /dev/sdu
[0:0:17:0]   disk    sas:0x5003048001a2135d          /dev/sdv
[0:0:18:0]   disk    sas:0x5003048001a2135e          /dev/sdw
[0:0:19:0]   disk    sas:0x5003048001a2135f          /dev/sdx
[0:0:20:0]   disk    sas:0x5003048001a21360          /dev/sdy
[0:0:21:0]   disk    sas:0x5003048001a21361          /dev/sdaa
[0:0:22:0]   disk    sas:0x5003048001a21362          /dev/sdab
[0:0:23:0]   disk    sas:0x5003048001a21363          /dev/sdac
[0:0:24:0]   enclosu sas:0x5003048001a2137d          -
[0:0:25:0]   disk    sas:0x5003048000ec714c          /dev/sdad
[0:0:26:0]   disk    sas:0x5003048000ec714d          /dev/sdae
[0:0:27:0]   disk    sas:0x5003048000ec714e          /dev/sdaf
[0:0:28:0]   disk    sas:0x5003048000ec714f          /dev/sdah
[0:0:29:0]   disk    sas:0x5003048000ec7150          /dev/sdai
[0:0:30:0]   disk    sas:0x5003048000ec7151          /dev/sdaj
[0:0:31:0]   disk    sas:0x5003048000ec7152          /dev/sdak
[0:0:32:0]   disk    sas:0x5003048000ec7153          /dev/sdal
[0:0:33:0]   disk    sas:0x5003048000ec7154          /dev/sdan
[0:0:34:0]   disk    sas:0x5003048000ec7155          /dev/sdao
[0:0:35:0]   disk    sas:0x5003048000ec7156          /dev/sdap
[0:0:36:0]   disk    sas:0x5003048000ec7157          /dev/sdaq
[0:0:37:0]   disk    sas:0x5003048000ec7158          /dev/sdat
[0:0:38:0]   disk    sas:0x5003048000ec7159          /dev/sdav
[0:0:39:0]   disk    sas:0x5003048000ec715a          /dev/sdaw
[0:0:40:0]   disk    sas:0x5003048000ec715b          /dev/sday
[0:0:41:0]   disk    sas:0x5003048000ec715c          /dev/sdba
[0:0:42:0]   disk    sas:0x5003048000ec715d          /dev/sdbb
[0:0:43:0]   disk    sas:0x5003048000ec715e          /dev/sdbc
[0:0:44:0]   disk    sas:0x5003048000ec715f          /dev/sdbd
[0:0:45:0]   disk    sas:0x5003048000ec7160          /dev/sdbe
[0:0:46:0]   disk    sas:0x5003048000ec7161          /dev/sdbf
[0:0:47:0]   disk    sas:0x5003048000ec7162          /dev/sdbg
[0:0:48:0]   disk    sas:0x5003048000ec7163          /dev/sdbh
[0:0:49:0]   enclosu sas:0x5003048000ec717d          -
[1:0:0:0]    disk    sata:                           /dev/sdd
[7:0:0:0]    disk    sas:0x50030480015852ec          /dev/sdbi
[7:0:1:0]    disk    sas:0x50030480015852ed          /dev/sdb
[7:0:2:0]    disk    sas:0x50030480015852ee          /dev/sdg
[7:0:3:0]    disk    sas:0x50030480015852ef          /dev/sdn
[7:0:4:0]    disk    sas:0x50030480015852f0          /dev/sdz
[7:0:5:0]    disk    sas:0x50030480015852f1          /dev/sdag
[7:0:6:0]    disk    sas:0x50030480015852f2          /dev/sdam
[7:0:7:0]    disk    sas:0x50030480015852f3          /dev/sdar
[7:0:8:0]    disk    sas:0x50030480015852f4          /dev/sdas
[7:0:9:0]    disk    sas:0x50030480015852f5          /dev/sdau
[7:0:10:0]   disk    sas:0x50030480015852f6          /dev/sdax
[7:0:11:0]   disk    sas:0x50030480015852f7          /dev/sdaz
[7:0:12:0]   enclosu sas:0x50030480015852fd          -

So the "enclosures" are seen by lsscsi, but not by modprobe?
 
So the "enclosures" are seen by lsscsi, but not by modprobe?

No, modprobe sg simply loads the sg (generic SCSI) driver (module), if it isn't already. Doesn't do anything else, so no output is good output :)

Please try
Code:
lsscsi -g | grep encl
and that should show you the sg devices for the enclosures.

Then, sg_ses /dev/sgNN (whatever is reported for the enclosures above) and see if it gives you a list of supported pages. If so, sg_ses --page=1 /dev/sgNN and see if it responds politely or simply throws a SENSE error back at you.
 
Ah ok. So I get this with the first command:

Code:
[peter@00-25-90-64-a2-66 ~]$ lsscsi -g | grep encl
[0:0:24:0]   enclosu LSI CORP SAS2X36          0e12  -          /dev/sg24
[0:0:49:0]   enclosu LSI CORP SAS2X36          0e12  -          /dev/sg49
[7:0:12:0]   enclosu LSI CORP SAS2X28          0e12  -          /dev/sg63
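
With more enclosures it can be handy to pull the sg device names out of that listing automatically. A minimal Python sketch, assuming the column layout shown in the output above (the type column truncated to "enclosu" and the sg node as the last field); the function name is made up for illustration:

```python
def enclosure_sg_devices(lsscsi_output):
    """Return (scsi_address, sg_device) pairs for enclosure lines."""
    devices = []
    for line in lsscsi_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # lsscsi truncates the device type, so "enclosure" prints as "enclosu".
        is_enclosure = fields[1].startswith("enclosu")
        has_sg_node = fields[-1].startswith("/dev/sg")
        if is_enclosure and has_sg_node:
            devices.append((fields[0], fields[-1]))
    return devices


SAMPLE = """\
[0:0:24:0]   enclosu LSI CORP SAS2X36          0e12  -          /dev/sg24
[0:0:49:0]   enclosu LSI CORP SAS2X36          0e12  -          /dev/sg49
[7:0:12:0]   enclosu LSI CORP SAS2X28          0e12  -          /dev/sg63
"""

for address, sg_node in enclosure_sg_devices(SAMPLE):
    print(address, sg_node)
```

Each sg node found this way is what gets passed to sg_ses in the next step.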

And then this with the second command:

Code:
[peter@00-25-90-64-a2-66 ~]$ sudo sg_ses /dev/sg24
  LSI CORP  SAS2X36           0e12
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]

And then when I look at each page I get the following:

Code:
[peter@00-25-90-64-a2-66 ~]$ sudo sg_ses --page=sdp /dev/sg24 --verbose
    inquiry cdb: 12 00 00 00 24 00
  LSI CORP  SAS2X36           0e12
    enclosure services device
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
    receive diagnostic results: pass-through requested 65535 bytes but got 5 bytes
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]

And:

Code:
[peter@00-25-90-64-a2-66 ~]$ sudo sg_ses --page=0x0 /dev/sg24 --verbose
    inquiry cdb: 12 00 00 00 24 00
  LSI CORP  SAS2X36           0e12
    enclosure services device
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
    receive diagnostic results: pass-through requested 65535 bytes but got 5 bytes
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]

I get the same results as above with sg49 and sg63.

So this would seem to indicate that all 3 of my enclosures are behaving as they should when spoken to through an LSI 2008 based HBA, no?

I'm going to put my LSI 9280 RAID controller back in (LSI 2108 based), install the LSI CLI tools, and poke around with those some. I wonder if the issue is that I have an embedded LSI 2008 controller on my X8 motherboard that is somehow confusing the MegaRaid Management Console? Even if I disable the LSI 2008 in the BIOS (by disabling PCI slot 4), the MegaRaid Console still picks up its presence.
 
So this would seem to indicate that all 3 of my enclosures are behaving as they should when spoken to through an LSI 2008 based HBA, no?

Argh! :eek:

No, you can see the "got 5 bytes" message hinting there's a problem. And there is.
A "normal" SAS2xXX expander should respond to a request for a list of its configuration pages like this:

Code:
  LSI CORP  SAS2X36           0417
    enclosure services device
Supported diagnostic pages:
  Supported diagnostic pages [0x0]
  Configuration (SES) [0x1]
  Enclosure status/control (SES) [0x2]
  Element descriptor (SES) [0x7]
  Additional element status (SES-2) [0xa]
  Download microcode (SES-2) [0xe]
I put my 846 back together (since I'd also updated to the "FTP" firmware), and found that I was getting exactly the same erroneous result as yours (SAS2008 HBA). I know the enclosure supports diagnostic page 1, but that gives me:

Code:
$ sudo sg_ses -p 1 /dev/sg19 --verbose

    inquiry cdb: 12 00 00 00 24 00
  LSI CORP  SAS2X36           0e12
    enclosure services device
    Receive diagnostic results cmd: 1c 01 01 ff ff 00
receive diagnostic results:  Fixed format, current;  Sense key: Illegal Request
 Additional sense: Invalid command operation code
Attempt to fetch Configuration (SES) diagnostic page failed
    Receive diagnostic results command not supported
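
To make the "cmd:" lines in the verbose output easier to read, here is a small sketch that decodes a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB. It assumes the standard layout (opcode 0x1c, PCV in bit 0 of byte 1, page code in byte 2, big-endian allocation length in bytes 3-4); the function name is only illustrative.

```python
def decode_receive_diagnostic_cdb(cdb_hex):
    """Decode a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 6 or cdb[0] != 0x1C:
        raise ValueError("not a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB")
    return {
        "pcv": bool(cdb[1] & 0x01),          # page code valid bit
        "page_code": cdb[2],                 # which diagnostic page is requested
        "allocation_length": int.from_bytes(cdb[3:5], "big"),
        "control": cdb[5],
    }


# The two commands sg_ses issued in the outputs above
print(decode_receive_diagnostic_cdb("1c 01 00 ff ff 00"))
print(decode_receive_diagnostic_cdb("1c 01 01 ff ff 00"))
```

Read this way, the failing command is simply a request for page 0x01 (Configuration) with the maximum allocation length, and the expander answers it with Illegal Request.
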
Also controller-initiated SES-2 functions such as identify slot (flashing light) wouldn't work:

Code:
$ sudo sas2ircu 0 locate 2:2 on

LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

SAS2IRCU: IocStatus = 4 IocLogInfo = 824180928
SAS2IRCU: SEP write request failed. Cannot perform LOCATE.
SAS2IRCU: Error executing command LOCATE.
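
The IocLogInfo value is easier to search for in hex. Below is a rough sketch that splits it into fields, assuming the layout I recall from the Linux mpt2sas driver (4-bit bus type, 4-bit originator, 8-bit code, 16-bit subcode). That layout is an assumption here, and the sketch does not try to interpret what the code itself means.

```python
def decode_ioc_loginfo(value):
    """Split a 32-bit IocLogInfo value into its assumed bit fields."""
    return {
        "hex": f"0x{value:08X}",
        "bus_type": (value >> 28) & 0xF,    # assumed: top nibble
        "originator": (value >> 24) & 0xF,  # assumed: next nibble
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


# Value reported by sas2ircu in the failed LOCATE attempt above
print(decode_ioc_loginfo(824180928))
```

If I remember the driver correctly, bus type 3 is SAS and originator 1 is the protocol layer, but treat that as a hint rather than a diagnosis.
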
So I flashed it back to the 55.07.23 firmware from the same site, and it seems to have worked:

Code:
$ sudo sg_ses -p 0 /dev/sg9

  LSI CORP  SAS2X36           0717
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
  Configuration (SES) [cf] [0x1]
  Enclosure Status/Control (SES) [ec,es] [0x2]
  Element Descriptor (SES) [ed] [0x7]
  Additional Element Status (SES-2) [aes] [0xa]
  Download Microcode (SES-2) [dm] [0xe]


$ sudo sas2ircu 0 locate 2:2 on

LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

SAS2IRCU: LOCATE command completed successfully.
SAS2IRCU: Command LOCATE Completed Successfully.
SAS2IRCU: Utility Completed Successfully.
For the greater common good, I also tested out the 55.14.11.00 firmware from said site. Turns out it works just fine too. I guess the "fixed" .18 firmware got messed up, either by Supermicro or by the owners of the FTP site, or the problem is specific to the 826:

Code:
$ sudo ./xflash -i 5003048000f934ff get ver

<snip banner>
Initializing Interface.
Expander: SAS2x36
Firmware Region Version: 55.14.11.00


$ sudo sg_ses /dev/sg19

  LSI CORP  SAS2X36           0e0b
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
  Configuration (SES) [cf] [0x1]
  Enclosure Status/Control (SES) [ec,es] [0x2]
  Element Descriptor (SES) [ed] [0x7]
  Additional Element Status (SES-2) [aes] [0xa]
  Download Microcode (SES-2) [dm] [0xe]
So I'd recommend you update the firmware in both regions to 55.14.11.00 via the GUI utility (not xflash directly), reboot, check that the expander responds fully to sg_ses as above, and then see if you're still getting those warnings in the MegaRaid management log.
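
A minimal sketch of that check, for anyone who wants to automate it after each flash: it parses saved `sg_ses` output and reports which of the six pages listed in the working outputs above are missing. The expected set is taken from those outputs; the function names are made up for illustration.

```python
import re

# Page codes listed by the working expanders in this thread
EXPECTED_PAGES = {0x0, 0x1, 0x2, 0x7, 0xA, 0xE}


def supported_pages(sg_ses_output):
    """Collect the page codes listed under 'Supported diagnostic pages:'."""
    pages = set()
    in_list = False
    for line in sg_ses_output.splitlines():
        if line.strip().lower().startswith("supported diagnostic pages:"):
            in_list = True
            continue
        if in_list:
            match = re.search(r"\[0x([0-9a-fA-F]+)\]\s*$", line)
            if match:
                pages.add(int(match.group(1), 16))
    return pages


def missing_pages(sg_ses_output, expected=EXPECTED_PAGES):
    """Return the expected page codes that the expander did not list."""
    return sorted(expected - supported_pages(sg_ses_output))


bad = """\
  LSI CORP  SAS2X36           0e12
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
"""

good = """\
  LSI CORP  SAS2X36           0e0b
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
  Configuration (SES) [cf] [0x1]
  Enclosure Status/Control (SES) [ec,es] [0x2]
  Element Descriptor (SES) [ed] [0x7]
  Additional Element Status (SES-2) [aes] [0xa]
  Download Microcode (SES-2) [dm] [0xe]
"""

print([hex(p) for p in missing_pages(bad)])
print([hex(p) for p in missing_pages(good)])
```

An empty list means the expander advertised every page the working firmware does; anything else points at the truncated response seen with the problem firmware.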

Of course, since we didn't back up the original firmware, the ideal solution is to ask Supermicro support for the appropriate firmware AND MFGs for our specific backplanes (and provide the revision number on the PCB for good measure). Check via xflash -i <SAS_address> get exp that it shows your SAS2x36 chip as hardware revision B3 -- I believe that's the latest.

 
Thank you, great stuff!

Yep, all 3 of my backplanes are B3 hardware rev.

Code:
Expander: SAS2X28 (SAS2x28) B3
SAS Address: 50030480:015852FF

Expander: SAS2X36 (SAS2x36) B3
SAS Address: 50030480:01A2137F

Expander: SAS2X36 (SAS2x36) B3
SAS Address: 50030480:00EC717F
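
As a side note, the colon-separated addresses printed here are the same 64-bit values that were passed to `xflash -i` earlier as 16 plain hex digits. A small sketch for converting between the two and splitting out the fields, assuming the usual NAA-5 layout (4-bit NAA, 24-bit OUI, 36-bit vendor-specific part); the function name is illustrative only.

```python
def parse_sas_address(text):
    """Normalise a SAS address and split it into assumed NAA-5 fields."""
    digits = text.replace(":", "").strip().lower()
    if len(digits) != 16:
        raise ValueError("expected 16 hex digits")
    value = int(digits, 16)
    return {
        "plain": digits,                        # form used with `xflash -i` above
        "naa": value >> 60,                     # top 4 bits
        "oui": (value >> 36) & 0xFFFFFF,        # next 24 bits
        "vendor_specific": value & 0xFFFFFFFFF, # low 36 bits
    }


for address in ("50030480:015852FF", "50030480:01A2137F", "50030480:00EC717F"):
    info = parse_sas_address(address)
    print(info["plain"], hex(info["oui"]))
```

All three addresses share the OUI 0x003048, which I believe is registered to Supermicro, so the WWNs at least look like factory values rather than zeroed ones.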

Btw, when I originally used xflash, it was version 7.0 from the infamous ftp site. I have since switched to version 12, but it didn't seem to make a difference.

I did flash to .11 at one point in both regions, but for me that didn't work either.

When you say to flash using the GUI, are you referring to using MegaRAID Storage Manager with the LSI 2008 HBAs connected to the backplane?

I'll hit Supermicro up for the .18 firmware specific to the B3 hardware rev. for both the 826 and 846 backplanes.

The thing I find so odd about this whole ordeal is that I have another 846 Chassis flashed to .18 using xflash that appears to work just fine, at least with the Areca 1882 controller that I have.

areca1882enclosures.PNG
 
I went ahead and downgraded all 3 backplanes to 55.07.23.0 from the link you provided. I did use xflash, since I could do that without tearing the server down and swapping in the HBAs again.

I flashed regions 0, 1 and 2 to the older version. Rebooted, and still the same errors and no enclosures seen by the MegaRAID Storage Manager.

If I get a chance later tonight, I'll yank the 9280 and go back to the HBAs and see if that will allow the MegaRAID Storage Manager to see the enclosures.
 
I meant the SAC GUI wrapper around xflash. No inherent harm in using xflash or g3xflash. The SM manual cautions not to do that because it may zero out the WWN, but as long as you have them noted, that's not a problem.

Is your MegaRAID somehow thinking it's talking to SAS drives (which will respond to the receive diagnostic results command) instead of SATA drives (which will say HUH WTF)?

Can you pastebin the Mega log?
 
Here's a link to the Mega log:

http://www.cstone.net/~dk/Argon_2_15_2016.log

Partial pastebin log:

http://pastebin.com/S2m4HdrR

Note that I have disabled logging the sense error messages.

I'm not familiar with the SAC GUI wrapper. Is that a Supermicro utility? Google seems sparse on any info about it.

I wouldn't think MegaRAID thinks I have SAS drives, since it sees all the drives fine and I was able to create a RAID array and copy data to it.
 
Just for grins, I swapped out the LSI 9280 for an Areca 1882LP I had lying around, and lo and behold, it sees the 3 enclosures just fine.

1882LP.PNG


Btw, I also ran the smc.exe (as admin) and it did not pick up the enclosures either.

Friend of mine with the LSI 9266 controller and 826 running the .18 firmware does not have this issue. He extracted his .18 firmware and compared it to my ftp derived .18 firmware, and they are bit for bit identical.

Really not getting what the heck is going on.

I'm going to switch to the LSI HBAs now and see what smc.exe and MegaRAID tell me with that config.
 
I swapped the Areca 1882 for the LSI SAS9200 and the onboard LSI2008.

MegaRaid is seeing all 3 enclosures now:

megadash1.PNG


megadash2.PNG


megadash3.PNG


So the issue with not seeing the enclosures appears to be isolated to the LSI 9280 controller under Windows and the LSI2008 HBA chip under CentOS. How strange...

Furthermore, the sense errors are gone now using the LSI2008 controllers.

My friend with the LSI 9266 controller and the .18 firmware has no issues with the sg_ses command:

Code:
[root@tiny ~]# lsscsi -g | grep encl
[0:0:29:0]   enclosu LSI      SAS2X28          0e12  -          /dev/sg0
[root@tiny ~]# sudo sg_ses --page=0x0 /dev/sg0 --verbose
    inquiry cdb: 12 00 00 00 24 00
  LSI       SAS2X28           0e12
    enclosure services device
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
  Configuration (SES) [cf] [0x1]
  Enclosure Status/Control (SES) [ec,es] [0x2]
  Element Descriptor (SES) [ed] [0x7]
  Additional Element Status (SES-2) [aes] [0xa]
  Download Microcode (SES-2) [dm] [0xe]

I'm very tempted to put the blame on the 9280 controller at this point, since that is the only variable between my setup and my friend's.
 
I'm not familiar with the SAC GUI wrapper. Is that a Supermicro utility? Google seems sparse on any info about it.

So sorry about that -- autocorrect. I meant SMC.exe or smc, of course.

Friend of mine with the LSI 9266 controller and 826 running the .18 firmware does not have this issue. He extracted his .18 firmware and compared it to my ftp derived .18 firmware, and they are bit for bit identical.

That's good to know, at least it'll work for all 826s.

I swapped the Areca 1882 for the LSI SAS9200 and the onboard LSI2008.

MegaRaid is seeing all 3 enclosures now

<image snip>

So the issue with not seeing the enclosures appears to be isolated to the LSI 9280 controller under Windows and the LSI2008 HBA chip under CentOS. How strange...

I'm very tempted to put the blame on the 9280 controller at this point, since that is the only variable between my setup and my friend's.

Very odd. Good to know the SAS2008 is working. After updating the fw via xflash in CentOS, did you reset the expander(s) via xflash? That appears to be necessary to get the expander to load the new FW without actually rebooting the host.

It's unlikely, but maybe built-in drivers with your CentOS version are problematic? Could you try to boot Ubuntu 15.10 off of a USB and see if that works? With the SAS2008 and the 9280 if possible?

Sorry about all the swapping around. Wish you had a server mobo with like six PCIe x8 slots... :)
 
Also, slightly off-topic, but if you're driving 60 drives in RAID 60, I think you'd be much better off with a SAS2208/SAS3108-based controller. They have dual processors compared to the single processor of the SAS2108 on the 9280.

The ARC1882-LP is SAS2208-based. For an array this large, I think the larger caches onboard the Arecas (compared to similar LSIs) may be helpful too.

The Arecas will hide the expanders from the system though -- no sg_ses, etc. They show up fine in the interface of course.
 
So sorry about that -- autocorrect. I meant SMC.exe or smc, of course.
I figured that out. :) However, running SMC.exe (as admin under Win10) still does not see the expanders.

Very odd. Good to know the SAS2008 is working. After updating the fw via xflash in CentOS, did you reset the expander(s) via xflash? That appears to be necessary to get the expander to load the new FW without actually rebooting the host.
I stuck my Win10 SSD in after running the sg_ses commands. I used xflash 12 under Windows to switch between firmware versions. I did reboot after doing all 3 expanders to get them to load up the new firmware versions.

It's unlikely, but maybe built-in drivers with your CentOS version are problematic? Could you try to boot Ubuntu 15.10 off of a USB and see if that works? With the SAS2008 and the 9280 if possible?

Sorry about all the swapping around. Wish you had a server mobo with like six PCIe x8 slots... :)
The 9280 is getting returned. I really think mine just has an issue. I got it used off ebay.

My friend put his SAS2008 in his 826 and got the following (CentOS):

Code:
[root@tiny ~]# lsscsi -g | grep encl
[6:0:12:0]   enclosu LSI      SAS2X28          0e12  -          /dev/sg13
[root@tiny ~]# sudo sg_ses --page=0x0 /dev/sg13 --verbose
    inquiry cdb: 12 00 00 00 24 00
  LSI       SAS2X28           0e12
    enclosure services device
    Receive diagnostic results cmd: 1c 01 00 ff ff 00
    receive diagnostic results: pass-through requested 65535 bytes but got 10 bytes
Supported diagnostic pages:
  Supported Diagnostic Pages [sdp] [0x0]
  Configuration (SES) [cf] [0x1]
  Enclosure Status/Control (SES) [ec,es] [0x2]
  Element Descriptor (SES) [ed] [0x7]
  Additional Element Status (SES-2) [aes] [0xa]
  Download Microcode (SES-2) [dm] [0xe]

So he did a little better than me, but still didn't get all the bytes returned.
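
One thing worth checking before reading too much into the "got N bytes" lines: the counts line up with the number of pages listed. The sketch below assumes the page 0x00 response is a 4-byte header (page code, reserved byte, 2-byte length) followed by one byte per supported page code, with no pad bytes. That is my reading of the format, not something verified against this firmware.

```python
def build_supported_pages_response(page_codes):
    """Build a page 0x00 response: 4-byte header plus one byte per page code."""
    body = bytes(sorted(page_codes))
    header = bytes([0x00, 0x00]) + len(body).to_bytes(2, "big")
    return header + body


def parse_supported_pages_response(raw):
    """Return the list of page codes carried in a page 0x00 response."""
    if len(raw) < 4 or raw[0] != 0x00:
        raise ValueError("not a Supported Diagnostic Pages response")
    length = int.from_bytes(raw[2:4], "big")
    return list(raw[4:4 + length])


only_page_zero = build_supported_pages_response([0x0])
full_list = build_supported_pages_response([0x0, 0x1, 0x2, 0x7, 0xA, 0xE])

print(len(only_page_zero), parse_supported_pages_response(only_page_zero))
print(len(full_list), parse_supported_pages_response(full_list))
```

Under that assumption, a 5-byte reply carries exactly one page code and a 10-byte reply carries all six, so "requested 65535 bytes but got 10 bytes" may just be a short but complete answer to a maximum-size request. The 5-byte reply is still the problem case, because it only lists page 0x0.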
 
Also, slightly off-topic, but if you're driving 60 drives in RAID 60, I think you'd be much better off with a SAS2208/SAS3108-based controller. They have dual processors compared to the single processor of the SAS2108 on the 9280.
Yeah I realized that after getting it. I thought the 928x generation was newer than the 926x one, but found that not to be the case. For some reason LSI doesn't make a 2208 based controller with both internal and external ports.

The ARC1882-LP is SAS2208-based. For an array this large, I think the larger caches onboard the Arecas (compared to similar LSIs) may be helpful too.

The Arecas will hide the expanders from the system though -- no sg_ses, etc. They show up fine in the interface of course.

I got a 9266-8i on order. I'll use a SFF-8087 to SFF-8088 bracket to convert one of the internal ports to external so that I can hook up to my external chassis.

My 1882LP has an issue: it locks up when writing data to an array. I'll be sending that one off to Tekram for repair. Had it not been for that, I would have been running a 1882 in both my primary server and my backup server.
 
There's always the SAS3108-based 9380-4i4e, although you'd need to get SFF8643/44 to 8087/88 cables...but MOAR MHz with 50% faster dual cores compared to SAS2208!
 
Very true, but they run around $700 right now. I got the 9266 for just over $200 factory sealed.

As long as it will allow me to saturate my 10Gbps link on reads and writes (large files of course), I'll be happy. :)

Eventually, I plan to upgrade my 826 and 846 chassis with the 217 and 417 when I go all SSD. That's when I'll go to SAS3108 based controllers. :D
 
It's a bit complicated. I have 3 Intel X520 NICs (PCIe 2.0 x8): a dual-port card in each of my servers and a single-port card in my workstation. The workstation and prod server are connected to my core switch, consuming both of its SFP+ ports. So I run a direct 10G link between the prod server's 2nd 10G port and the backup server's 1st 10G port. The backup server is also connected to my core switch via a regular 1G port on my primary subnet. The direct 10G link between the servers lives on a separate subnet and is strictly used for backing up and restoring data to/from the primary server.
 
I was running FreeBSD for a period, and they have always recommended Intel based NICs. Driver support is also excellent in Windows for those. And the price is right. Single port cards go for around $65 and dual port cards for about twice that. And they are SFP+, which is what I need to connect to my switch.
 
I finally got everything worked out! I got the LSI 9266 controller in and after installing it, still had the same issue.

As I think I have mentioned before, I was never able to get SMC.EXE to run. Turns out it was because I was running Win10. Once I booted from a Win7 SSD, I was able to run SMC. I flashed the MFG region first and reset the controller, then flashed FIRM regions 0 and 2 and reset the controller again, and presto, all was well. This is with the .18 firmware I got from SuperMicro a few days ago. I did the 2 external 846 enclosures as well with no issues.

megaraiddashboard.PNG


megaraidphysical.PNG


megaraidlogical.PNG


Once the array has initialized, I look forward to comparing its performance to the 9280 controller. Hopefully I'll be able to saturate my 10Gb link now.
 
Getting pretty decent xfer rates over my 10G network now with the LSI 9266 RAID controller.

jumboporn2.PNG


I wonder if this is "as good as it gets" with 10G? I'm running latest Intel X520 drivers on both servers and have jumbo frames enabled.
 
I wonder if this is "as good as it gets" with 10G? I'm running latest Intel X520 drivers on both servers and have jumbo frames enabled.

That's pretty good (nearly 90% of line rate for SMB copy). Here are my suggestions:
  • Check you are not being CPU-limited on either side
  • "Tune" the X520 NIC driver options (see that section, here)
  • Now, use a tool like iperf3 to benchmark raw TCP/IP throughput between the server and client, both single stream and multiple parallel streams to establish the maximum
  • Ensure SMB 3.02 is being used, enable SMB Direct and overall, tune SMB too (see here, SMB Tuning sections and links in it)
  • Now try copying multiple large files and see if the performance gets any better.
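
For the iperf3 step, a minimal sketch of the two client runs (single stream, then several parallel streams). It only builds and prints the command lines; `192.0.2.10` is a placeholder for the far end of the 10G link, where `iperf3 -s` would need to be running, and the helper name, stream count and duration are arbitrary choices.

```python
import shlex


def iperf3_client_args(server, streams=1, seconds=30):
    """Build an iperf3 client command line (nothing is executed here)."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    args = ["iperf3", "-c", server, "-t", str(seconds)]
    if streams > 1:
        # -P runs several parallel TCP streams to the same server
        args += ["-P", str(streams)]
    return args


# Placeholder address: substitute the server's address on the direct 10G subnet
for streams in (1, 4):
    print(shlex.join(iperf3_client_args("192.0.2.10", streams=streams)))
```

Comparing the single-stream and multi-stream totals shows whether one TCP connection is the bottleneck before any SMB tuning is attempted.
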
Other than that, the "faster" option is to simply use Mellanox/OEM'd 40/56 GbE NICs instead of the X520s (Intel 40GbE prices are in the stratosphere). Dual-ported are approx. $150 on eBay; they use QSFP direct attach cables instead of SFP+.
 
Excellent feedback. Thanks!

I "tuned" the ports on both servers that are directly connected, upping queues from 8 to 16 (16 is the max even though one server has 28 logical cores), maxing out buffers, etc.

Performance seemed about the same (CPU on both servers never went above about 25%).

I set up a RAM disk on both servers and copied a 20G file between them and got this:

RAMDisktoBrama.PNG


So that's 91.2% of 10Gbps, which isn't bad.
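
For reference, the conversion behind a figure like that can be sketched as follows. It assumes decimal units (1 MB = 1,000,000 bytes, 10 Gbps = 10,000,000,000 bits per second); if the copy dialog actually reports binary megabytes, the percentage comes out a little higher. The function names are just for illustration.

```python
def percent_of_line_rate(megabytes_per_second, link_gbps=10.0):
    """Convert a transfer rate in decimal MB/s to a percentage of the link's raw rate."""
    bits_per_second = megabytes_per_second * 1_000_000 * 8
    return 100.0 * bits_per_second / (link_gbps * 1_000_000_000)


def megabytes_per_second_at(percent, link_gbps=10.0):
    """Inverse: the decimal MB/s that corresponds to a given share of line rate."""
    return percent / 100.0 * link_gbps * 1_000_000_000 / 8 / 1_000_000


print(round(megabytes_per_second_at(91.2), 1))
print(round(percent_of_line_rate(1140), 1))
```

So 91.2% of a 10 Gbps link works out to roughly 1140 MB/s of payload under those unit assumptions.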

I'll check out iperf3 to see what that gives me.

I checked out the SMB tuning link, but it seems to be specific to Windows Server. Perhaps SMB isn't available in Windows 10?

Those Mellanox NICs look very interesting. Looks like they can be had for around 50 bucks with shipping! Like this one. And a cable like this one. So for less than $150, I can connect my 2 servers at 40Gbps!
 
As an eBay Associate, HardForum may earn from qualifying purchases.
Windows 10 supports SMB 3.02, so all steps should apply.

Be very careful with the cheaper cards, because they are often InfiniBand-only, not 40 Gb Ethernet. InfiniBand is often slower and can be a pain to use, since it doesn't natively carry IP for general-purpose TCP/IP.

I'll add more on 40/56 later today.
 
Ah I see. So I'd probably need to step up to something like this.

I'll carry out the SMB 3.02 steps then and see if that helps.
 
Ah I see. So I'd probably need to step up to something like this.

I'll carry out the SMB 3.02 steps then and see if that helps.

Yeah, that's the ticket. You can save some $$$ by getting the ones branded by HP or IBM; the Mellanox tools will flash them and their drivers will also work. I believe it's also possible to cross-flash them so they even identify as Mellanox.

If your X520s are dual ported and you have a spare SFP+ cable, connect all four ports and see if you can set up link aggregation (Intel calls it "teaming" IIRC) to get up to 20 Gbps raw when transferring multiple files simultaneously.
 