Building an expandable NAS

selsr

I'm planning to build an expandable NAS to back up large media and image files.

I do not have all the drives yet, and I won't be able to acquire all of them for the initial build/setup. After much searching, it seems Linux mdadm software RAID5 is the best way to achieve this. ZFS sounds great, but it's not easily expandable: you can't grow a raidz vdev one disk at a time.

So the plan is basically (rough mdadm commands sketched after the list):
- Start with 2 drives in RAID1
- 2 months later, add a 3rd drive and migrate to RAID5
- 2 months later, add a 4th drive to expand the RAID5
- 2 months later, add a 5th drive to expand the RAID5
- and so on... until I run out of ports
* The OS will live on a separate storage device, outside the array
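
Roughly, in mdadm terms, I imagine the steps would look something like this (just a sketch; the /dev/sdX device names are placeholders, and each reshape would have to finish before the next step):

Code:
# Step 1: two disks in RAID1
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mkfs.ext4 /dev/md0

# Step 2: convert the RAID1 to a 2-disk RAID5, then reshape onto the new 3rd disk
mdadm --grow /dev/md0 --level=5
mdadm /dev/md0 --add /dev/sdd1
mdadm --grow /dev/md0 --raid-devices=3
resize2fs /dev/md0

# Step 3 onwards: add each new disk, grow the array by one, grow the filesystem
mdadm /dev/md0 --add /dev/sde1
mdadm --grow /dev/md0 --raid-devices=4
resize2fs /dev/md0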

The array drives will all be identical Seagate 2TB 7200rpm SATA units. From what I understand, expanding a large RAID5 array could take anywhere from several hours to a few days. I have not decided on the rest of the hardware yet; it could be anything from an embedded mini-ITX board to a quad-core Xeon. I'm figuring out how to build the storage first.

Is there a more efficient way of implementing a redundant + expandable NAS?
 
Yeah, this is what I'm about to do. But the nice thing about mdadm is that you can create a RAID5 with just 2 disks: you tell it there are 3 drives in the array, but only give it the 2 you have installed. Then later, when you add your 3rd drive, you don't have to worry about converting from RAID1 to RAID5.
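
Something like this (device names are just placeholders):

Code:
# a 3-device RAID5 created with only 2 real disks; "missing" reserves the 3rd slot
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 missing

# later, the real 3rd disk simply rebuilds into the missing slot
mdadm /dev/md0 --add /dev/sdd1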
 
I have a Linux mdadm RAID5 NAS server that has been running since May 2009 without any problems.

I'm building ZFS storage that can be expanded with an external JBOD case.

Expanding means more hardware, more capacity, more data, and more points of failure, so more redundancy is needed.
At some point you need a filesystem that can handle many TB of data; the answer is ZFS.

With 2TB drives, RAID5 is OK up to about 8 drives (14 TB usable); beyond that, RAID6 is recommended up to 16 drives (28 TB usable).
More than 16 drives in the same filesystem, say 18 drives? RAID61? (32 TB usable would take 36 drives, so a big case, redundant PSU, SAS expander, power control board...)
Or you can use ZFS in raidz with vdevs of 5x 2TB (8 TB usable each).
Capacity then grows 5 drives at a time, but for 32 TB you only need 20 drives.

Let's say that ZFS is an investment rather than an expense :)

Cheers.

St3F
 
I would personally start with 3 drives in a raidz1, then save up enough to get another 3, create another raidz1 vdev, and add that to the original pool.
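
Something along these lines (pool and disk names are placeholders):

Code:
# start with one 3-disk raidz1 vdev
zpool create tank raidz1 sdb sdc sdd

# later, grow the pool by adding a second 3-disk raidz1 vdev
zpool add tank raidz1 sde sdf sdg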
 
Yeah, this is what I'm about to do. But the nice thing about mdadm is that you can create a RAID5 with just 2 disks: you tell it there are 3 drives in the array, but only give it the 2 you have installed. Then later, when you add your 3rd drive, you don't have to worry about converting from RAID1 to RAID5.

When you create a RAID5 with two disks and one "missing" you do not have redundancy: a degraded array cannot survive losing either of the remaining disks. The OP probably wants redundancy even with two disks, hence his choice of RAID1 to start.
 
Thanks for the feedback.

The OP probably wants redundancy even with two disks, hence his choice of RAID1 to start.

Yes, that's right. It will be a while before I get the 3rd drive and I don't want to risk losing anything before that. It might be easier to start with 3 drives in RAID5 rather than deal with migrating from RAID1 to RAID5 later.

I would personally start with 3 drives in a raidz1, then save up enough to get another 3, create another raidz1 vdev, and add that to the original pool.

With 3 drives, raidz1 gives just 66% usable capacity, which isn't very efficient. This would work much better with 5 drives, though (80% usable).

Let's say that ZFS is an investment rather than an expense :)

Yes, that just gave me an idea (rough commands sketched after the list):
- Start with 3 drives in RAID5
- Slowly expand to 8 drives, then migrate to RAID6
- Continue obtaining drives, but don't expand the array; use them for temporary storage that doesn't require redundancy
- When this temporary set hits ~10 drives, create a RAIDZ2 pool and transfer the data over from the RAID6
- Then slowly expand the ZFS pool in sets of ~10 drives, one RAIDZ2 vdev at a time
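
The trickiest part is probably the RAID5-to-RAID6 step; from what I've read, it would go roughly like this (device and pool names are placeholders, and mdadm wants a backup file during the level change):

Code:
# at 8 drives: add a 9th, then reshape RAID5 -> RAID6 (usable capacity stays at 7 disks)
mdadm /dev/md0 --add /dev/sdj1
mdadm --grow /dev/md0 --level=6 --raid-devices=9 --backup-file=/root/md0-reshape.bak

# much later: the ~10-drive raidz2 pool that the RAID6 data gets copied onto
zpool create tank raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj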

That seems to be the best way of going about it. The reason I'm waiting 2 (maybe 1) months between drives is so that I'll have ample time to make sure the previously added drive is working without issues. It also helps avoid a scenario where multiple drives fail simultaneously because they were part of a bad batch. And no doubt the price of 2TB drives will be lower a few months from now.
 
You say you are backing up large media and image files. If these are indeed large files that are rarely modified or deleted (mostly just adding additional files), then it sounds like a perfect candidate for snapshot RAID. I run SnapRAID on my Linux box, and it works well.

You could start with an mdadm RAID1 (or use SnapRAID with 1 data drive and 1 parity drive, though that seems like a silly thing to do). Then, when you get your third drive, you can break the RAID1, erase one of the mirror pair, and start using SnapRAID with 2 data drives and 1 parity drive. (Technically, you should first run snapraid sync with 1 data drive and 1 parity drive, then erase the other mirror copy and add it as a 2nd data drive.) Later, when you have many more disks, you could move to 2 parity drives.

On second thought, maybe it would not be silly to use SnapRAID with 1 data drive and 1 parity drive. In addition to providing redundancy (no data loss if either drive fails), it would also store checksums on your data, which you wouldn't get with mdadm RAID1.

http://snapraid.sourceforge.net/
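
For the 2-data + 1-parity layout described above, the config is only a few lines. A minimal sketch (all paths here are placeholders):

Code:
# /etc/snapraid.conf
parity /mnt/parity1/snapraid.parity

# content files hold the checksums/metadata; keep more than one copy
content /var/snapraid/content
content /mnt/d1/snapraid.content

# data disks, each just an ordinary mounted filesystem
data d1 /mnt/d1
data d2 /mnt/d2

After that, run snapraid sync whenever you've added files. A second parity line can be added to the config later when you move to dual parity.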
 
You say you are backing up large media and image files. If these are indeed large files that are rarely modified or deleted (mostly just adding additional files), then it sounds like a perfect candidate for snapshot RAID. I run SnapRAID on my Linux box, and it works well.
http://snapraid.sourceforge.net/

SnapRAID looks interesting. Is it basically higher-level RAID4 (dedicated parity) without data striping?
 
SnapRAID looks interesting. Is it basically higher-level RAID4 (dedicated parity) without data striping?

Yes, it is similar to RAID 4, except SnapRAID supports single- or dual-parity drives. Also, it is not "real time" but snapshot-based: you need to run snapraid sync to update the parity after you make changes to the data drive(s). Additionally, SnapRAID maintains checksums on all your data, so it is possible to detect (and correct) data that has become corrupt, which RAID 4 by itself cannot do.
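
Day-to-day use boils down to a few commands:

Code:
snapraid sync    # recompute parity and checksums after files are added or changed
snapraid check   # verify data and parity against the stored checksums
snapraid fix     # reconstruct missing or corrupt files from parity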
 