Why does some storage have multiple RAID Groups?

KapsZ28
2[H]4U
Joined: May 29, 2009
Messages: 2,114
I believe I know part of the answer, but I'm kind of curious what the storage experts say. Two brands I have worked with, NetApp and EqualLogic (24+ disks), both use RAID groups, so the usable capacity is much lower, especially with dual parity. But it makes sense that the data would be better protected, though with less performance, compared to a single large RAID group.

Then I look at Nimble, which has triple parity and one large RAID group. I don't know the number off the top of my head, but I believe depending on the system you can have up to 80 disks and it would still be just one large RAID group. On top of that, they are all SATA disks, for which most other storage vendors use smaller RAID groups than they would with SAS.

So why is this? Is Nimble simply gambling on additional performance and usable capacity versus overall data protection, or should storage not be broken up into several RAID groups?
 
I'm not sure I understand your question entirely. A storage group is just a group of one or more RAID arrays that work together as a single logical storage unit. For instance:

On the EqualLogic, you can have something like this:

One storage group consisting of two members:
Member A: 24-disk RAID 10, 15k SAS drives @ 7 TB
Member B: 16-disk RAID 6, 7.2k SAS drives @ 36 TB

The group would have a total size of 43 TB (36 + 7) in its storage pool. I then create volumes (or partitions) on this pool that spans my multiple group members. These volumes would be my iSCSI targets (assuming I'm using iSCSI; since I have no familiarity with Fibre Channel, that's what I'll stick with).

Obviously, my 15k RAID 10 array is going to be faster for reads and writes, but the RAID 6 array has much more capacity to work with. I could create my volumes on the storage pool manually, or let the EqualLogic do some of the work for me, and put my high-performance servers on the 7 TB portion of the pool. So I could create (5) 1 TB volumes on the RAID 10 member, or just create a volume and let the EqualLogic handle the performance tuning. I would want to make sure any SQL, Exchange, or other DB-driven servers use the RAID 10 array first and foremost. Some SANs even have logic in them to manage the pools to a certain degree: your flat files, which are mainly read-heavy, will primarily reside on the RAID 6 member, but maybe Windows files and your AD database will reside on that 15k RAID 10. It's one volume that resides on both members, see?
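Just to make the pool math concrete, here's a rough Python sketch of that made-up group (the member sizes are the invented numbers above, and the volume names are hypothetical, not anything the EqualLogic actually produces):

Code:
# Capacity of the made-up two-member group and a few example volumes.
members = {
    "Member A (24-disk RAID 10, 15k SAS)": 7,    # TB usable (invented)
    "Member B (16-disk RAID 6, 7.2k SAS)": 36,   # TB usable (invented)
}

pool_tb = sum(members.values())
print(f"Storage pool size: {pool_tb} TB")        # 43 TB

# Carve iSCSI volumes out of the pool; which member the blocks actually
# land on is what the array's tiering/placement logic decides.
volumes = {"sql01": 1, "exchange01": 1, "fileshare01": 10}   # TB each (hypothetical names)
free_tb = pool_tb - sum(volumes.values())
print(f"Free space left in the pool: {free_tb} TB")          # 31 TB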

I'm not sure if that answers your question, and my experience with different storage arrays is somewhat limited, but hopefully that helps to some degree.
 
Also of note, I made up those numbers and don't even know if you can get those exact capacities with those RAID levels; they're just for demonstration.
 
If you read or write to an array built from one large RAID, every disk in the array must be positioned for every single read or write, which means the IOPS of the whole RAID is similar to that of a single disk.

If your workload is purely sequential without much head repositioning, your sequential performance can be up to n x the number of data disks in the array. But as most workloads are IOPS-limited, a multi-RAID config is usually faster, because your IOPS then scales with the number of RAID groups (each group delivering roughly the IOPS of a single disk).

This may be different with SSD-only pools, where IOPS is not the limiting factor, or when you use massive read or write caching (e.g. ZFS).
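To put rough numbers on that, here's a quick back-of-the-envelope Python sketch; the per-disk IOPS and MB/s figures are assumptions picked for illustration, not measurements:

Code:
# Rough model: random IOPS scales with the number of RAID groups,
# sequential throughput scales with the number of data disks.
disk_iops = 150          # assumed random IOPS for one 7.2k spindle
disk_mbps = 120          # assumed sequential MB/s for one spindle
total_disks = 48
parity_per_group = 2     # dual parity (RAID 6 / RAID-DP style)

def estimate(num_groups):
    data_disks = total_disks - num_groups * parity_per_group
    # one big parity stripe behaves roughly like one disk for random IO,
    # so random IOPS scales with the number of groups, not the disk count
    random_iops = num_groups * disk_iops
    sequential_mbps = data_disks * disk_mbps
    return random_iops, sequential_mbps

for groups in (1, 3, 6):
    iops, seq = estimate(groups)
    print(f"{groups} group(s): ~{iops} random IOPS, ~{seq} MB/s sequential")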
 
That is not exactly what I am referring to. EMC, for example, uses RAID group (RG) sizes of 16 disks, I believe. I'll do this example without spare disks, since I don't know EMC best practices and definitely not their exact RG sizes. So if you have 48 disks, it would create 3 RGs with 16 disks each. Let's go with dual parity, as that seems most common with enterprise storage. Each RG will have 14 disks of usable capacity and 2 for parity. So with 48 disks and 3 RGs, you lose 6 disks to parity.

With NetApp you can go up to 28 disks in an RG when using RAID-DP. We have two shelves of 24 disks each for a total of 48 disks, so I made our RG size 23 disks. This way there are two RGs, 4 disks for parity, and 2 spares.
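Just to spell out the overhead math from those two examples, here's a quick Python sketch (the RG sizes are the ones I mentioned above, not vendor gospel):

Code:
# Count RAID groups, parity disks, and data disks for a given RG size.
def overhead(total_disks, rg_size, parity_per_rg, spares=0):
    usable_disks = total_disks - spares
    num_rgs = -(-usable_disks // rg_size)        # ceiling division
    parity_disks = num_rgs * parity_per_rg
    data_disks = usable_disks - parity_disks
    return num_rgs, parity_disks, data_disks

# "EMC-style" example: 48 disks, 16-disk RGs, dual parity, no spares here
print(overhead(48, 16, 2))            # (3, 6, 42)  -> 6 disks lost to parity

# Our NetApp: 48 disks, RG size 23, RAID-DP, 2 spares
print(overhead(48, 23, 2, spares=2))  # (2, 4, 42)  -> 4 parity + 2 spares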

EqualLogic, I believe, picks the RG size on its own, unless maybe you can change it through the CLI. I only noticed recently, on an EqualLogic with 24 disks set up in RAID 6, that the usable capacity didn't make sense, which is when I realized there were 4 disks dedicated to parity.

This kind of setup has been used for years, but is it old technology? Should companies still use RAID groups? Why does Nimble use one huge storage pool instead, with a single RAID array and triple parity? Is their way better?
 
I'm not as familiar with NetApp or EMC, so I can't comment on those. With EqualLogic, it's slightly different. In any chassis, including your 24-drive unit, 2 drives are ALWAYS dedicated as global hot spares. Beyond that you have your RAID overhead. In the case of your RAID 6 array, it's actually 24 drives - 2 drives for hot spares - 2 drives for RAID 6 double parity = a 20-drive array. The physical RAID array then has multiple LUNs (volumes) within it that are exposed via iSCSI to the hosts. Is your question just asking about the pros/cons of the physical RAID architecture, or the logical architecture as well?
 
Sorta close, and it depends. Specifically speaking, your example with the EMC stuff is way off, though. From what I gather, you are talking RAID 6. If you have 48 disks, you should have at least 2 hot spares. I.e., you would do 5 6+2 RAID groups and 2 hot spares, and you would have 6 drives left over from the 48.

16 disks is if you are doing RAID 6 on a VNX. Normally EMC will not do 16-disk RAID 6 groups on a VMAX 20K/40K. You can, but they don't push it. I personally am not a fan of 16-disk RAID groups.

Otherwise it's RAID 5 as 4+1/7+1, or RAID 1 as 4+4.

RAID group sizing is entirely up to the best practices put out by the array vendor. Modern tier 1 storage arrays have introduced the pool concept: you let the array create the RAID groups, they get added to a pool, and then you carve space out of the pool. Data should be spread equally across the whole pool to help eliminate hotspots.

@OP:

Your questions are the kind that should be answered by the tech guy the sales guy brings with him. Nimble may use a modified RAID type to give you the protection they are talking about; it's hard to say. When they say triple parity, one would hope you still have hot spares in the frame that will auto-swap in when a drive fails.

You have to be wary of a storage array that does not have hot spares, or an array that does not let you easily remove failed drives and replace them. I'm looking at you, Xiotech Corp.
 
For RAID 6 on a PS61x0 it's:
(10+2) + (9+2) + 1 (spare)
Which means you effectively have two RAID 6 sets with one spare disk.


Like I said, two groups with 2 parity disks each and 1 spare. The EqualLogic we support is set up this way.
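For what it's worth, here's a tiny Python sketch that just counts disks for that layout versus the earlier "24 - 2 spares - 2 parity = 20" reading:

Code:
# PS61x0 RAID 6 layout as quoted above: (10+2) + (9+2) + 1 spare.
sets = [(10, 2), (9, 2)]   # (data, parity) disks per RAID 6 set
spares = 1

total = sum(d + p for d, p in sets) + spares
data = sum(d for d, _ in sets)
parity = sum(p for _, p in sets)
print(total, data, parity, spares)   # 24 total: 19 data, 4 parity, 1 spare

# vs. the "24 - 2 spares - 2 parity = 20 data" reading earlier in the thread,
# the two-set layout gives up one more disk of capacity overall.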
 
Yes, I don't know much about EMC, but I did say no spares to make the formula easier for others to understand, since it seems many people are unfamiliar with RAID groups. :p

I understand that doing what the vendor recommends is best, but I am also trying to understand whether using RAID groups is an old philosophy versus one large storage pool. I am mostly curious whether companies like Nimble are avoiding RAID groups in order to increase their usable capacity.

On one of our NetApps the tier 2 storage has 72x 2 TB disks (effective size 1.62 TB per disk). We have it set up as 4 RGs of 15 disks, and the fifth RG has 10 disks. This leaves 2 spares for the entire aggregate. Since there are 5 RAID groups, there are 10 disks dedicated to parity. So essentially we have 60 disks times 1.62 TB for a total of 97.2 TB, minus 10% for WAFL, and we are left with 87.5 TB usable.

Since a company like Nimble uses just one large RG with triple parity and, I believe, 2 spare disks, it would be 72 disks minus 2 spares minus 3 parity, leaving 67 disks times 1.62 TB, which comes out to roughly 108.5 TB. I have no clue how the rest of their numbers work; they brag so much about their compression and claim such high numbers for usable capacity.
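Here's the same capacity math spelled out as a quick Python sketch; the 1.62 TB effective size, the ~10% WAFL reserve, and my guess at the Nimble layout are the assumptions from this post, not vendor figures:

Code:
effective_tb = 1.62
total_disks = 72

# NetApp aggregate: 5 RGs (4x15 + 1x10), RAID-DP (2 parity each), 2 spares
data_disks = total_disks - 5 * 2 - 2                # 60 data disks
netapp_tb = data_disks * effective_tb * 0.90        # minus ~10% for WAFL
print(f"NetApp usable: ~{netapp_tb:.1f} TB")        # ~87.5 TB

# Nimble, as I understand it: one RG, triple parity, 2 spares (assumption)
nimble_data = total_disks - 3 - 2                   # 67 data disks
nimble_tb = nimble_data * effective_tb
print(f"Nimble before their own overhead: ~{nimble_tb:.1f} TB")   # ~108.5 TB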

But my question still remains unanswered. What are the pros/cons of having 5 RAID groups with 10 parity disks versus one large RAID group with 3 parity disks? I am sure if I asked NetApp and Nimble why it is this way they would have their reasons and arguments, but I am trying to get an outside perspective.
 
The answer to your question, as with all things IT... it depends. Is money a factor? Is capacity vs. cost more important? Is physical footprint a concern? Is power consumption an obstacle?

There's never a single silver bullet; that's why there are competing vendors. If one shop built the perfect widget, the game would be over and they'd have all the money.

If it's the 'all eggs in one basket' aspect of the single RAID group you're exploring, that's going to depend on your own personal comfort level. It's been my experience that, for some reason, many people believe the storage piece is pretty much indestructible. I've even had a coworker say, and I quote: "there's just so much robustesness built in to the SAN, it's just going to work." I kid you not... fucking robustesness.

I tend not to look at the storage piece as a single RAID group vs. multiples in order to sleep well at night. I compare it to something as rudimentary as a two-node cluster: you have two because one might fail, right? Why in the hell would you have your layers of redundancy at the client, switch, server, and fabric all drill down into a single storage piece? Yet most architectural drawings do exactly that. It's starting to get a little more traction (SAN-to-SAN replication), but it's quite neglected in a lot of people's minds.
 
One thing I've noticed between vendors is how they use the term "pool". I know in the EMC world, pools are built on RAID groups, but the end user has no idea how many RAID groups are under the pool because there is some abstraction from the end user.

I wish I could easily attach a screenshot to this post to show you what I see in both my VNX and VMAX pools. In the EMC world at least, pools are also the gateway to thin provisioning. You can still do RAID groups without the pools, but they will be thick LUNs.

I highly doubt Nimble is avoiding RAID groups. They have to be using some sort of data protection, but it might be a slightly modified version of one of the known standards out there and they just call it something else. Cue marketing spin.

Ideally, you'll want to kick the sales guy out of the way and ask the tech guy, and hope he's not as slimy as the sales guy.
 
I agree with a lot of the above points.

You mentioned robustesness... I've got one for you. I had a non-technical manager once say, "I've never seen a problem that couldn't be solved by just throwing more hardware at it." The eye roll was so hard in that room, everyone got vertigo all at once.
 