Budget Conscious & Safe build out for 50 Users / 100TB

rdnkjdi_1
Currently I have three HP D2600 enclosures filled with 600GB SAS drives in RAID 6, used by an imaging services bureau (18 million images / 18TB / accessed by 8 people). I'm in the process of adding a fourth D2600 and 12 more 600GB SAS drives.


My primary role is as a developer, but I also handle all of the networking / infrastructure, so I'm hoping someone with more knowledge can help me with a few questions.


Question 1 - These D2600s are rated for daisy-chaining up to 4 enclosures per external SAS port, and the P822 controller has 4 external SAS ports, so 4 x 4 x 12 bays = 192 drives. I'm getting nervous having 48 drives on one array.


At what point do I hit "too many drives" for a RAID 6 (or RAID 60) array? I already have the array controller, 3 enclosures, and 6 external SAS connections as failure points, and continuing to grow extends them. Can someone help me gauge whether these things were really intended to have 100+ drives on a single array?
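
For a rough sense of where "too many drives" starts to bite, here's a back-of-envelope sketch (the 3% annual failure rate and 24-hour rebuild window are illustrative assumptions, not vendor figures): it estimates the chance of losing two more drives while a RAID 6 set is already rebuilding, which is the point where the array is gone.

```python
# Rough model: while one failed drive rebuilds, a RAID 6 set survives one
# more failure; two MORE failures during the rebuild window = array lost.
# AFR and rebuild time below are assumed round numbers, not vendor specs.

def p_loss_during_rebuild(n_drives, afr=0.03, rebuild_hours=24):
    """P(2 or more additional failures among surviving drives during a rebuild)."""
    survivors = n_drives - 1                         # drives still online while rebuilding
    p = afr * rebuild_hours / (365 * 24)             # chance one drive dies in the window
    p0 = (1 - p) ** survivors                        # no additional failures
    p1 = survivors * p * (1 - p) ** (survivors - 1)  # exactly one additional failure
    return 1 - p0 - p1

for n in (12, 24, 48, 96):
    print(f"{n:3d}-drive RAID 6 set: ~{p_loss_during_rebuild(n):.1e} "
          "chance of losing the array per rebuild event")
```

The absolute numbers depend entirely on the assumptions, but the trend is the point: the per-rebuild loss risk grows roughly with the square of the drive count, which is why big installs break 100+ drives into multiple RAID 6 legs (RAID 60) or multiple vdevs instead of one wide set.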


Question 2 - Tentatively, my next build-out needs to support up to 100TB, 100 million files, and 50 employees. The build-out has to start small (immediate needs) and grow over time. I see three options (keeping things in the HP fold):

A - Keep adding to the existing array with 600GB 3.5" SAS drives.


B - Use D2700 enclosures with 1.2TB 2.5" 10K SAS drives, which cuts down on my enclosures / external SAS connectors (less likely to fail?).


C - Get two MDS600 chassis to put everything in (max 74TB of high-speed SAS with RAID 60).


But for whatever reason NONE of those options feels safe to me. Maybe it's because I don't have experience running 100+ drives in a single array? Or is it actually a horrible idea?
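
For a rough numbers comparison of the three options, here's a back-of-envelope sketch; the bay counts (D2600 ~12 LFF, D2700 ~25 SFF, MDS600 ~70 LFF) and the 12-drive RAID 6 leg size are assumptions, and the figures ignore hot spares and filesystem overhead.

```python
# Back-of-envelope: drives and enclosures each option needs to reach the
# 100TB target, assuming RAID 60 built from 12-drive RAID 6 legs.
# Bay counts are assumptions from typical specs (D2600 ~12 LFF bays,
# D2700 ~25 SFF bays, MDS600 ~70 LFF bays); no hot spares or filesystem
# overhead are included.
import math

TARGET_TB = 100
LEG = 12                                      # drives per RAID 6 leg

def shelves_needed(drive_tb, bays_per_shelf):
    usable_per_leg = (LEG - 2) * drive_tb     # each leg loses 2 drives to parity
    legs = math.ceil(TARGET_TB / usable_per_leg)
    drives = legs * LEG
    return drives, math.ceil(drives / bays_per_shelf)

for label, drive_tb, bays in [
    ("A: D2600 shelves, 600GB LFF", 0.6, 12),
    ("B: D2700 shelves, 1.2TB SFF", 1.2, 25),
    ("C: MDS600 chassis, 600GB LFF", 0.6, 70),
]:
    drives, shelves = shelves_needed(drive_tb, bays)
    print(f"{label}: ~{drives} drives in ~{shelves} enclosure(s) for {TARGET_TB}TB usable")
```

Whatever the exact bay counts, the arithmetic shows the real issue: at 600GB per drive you end up managing roughly 200 spindles to reach 100TB no matter which enclosure you pick; only the shelf and cable count changes. Option B halves the spindle count simply because the drives are twice the size.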

Question 3 - Prior to an upgrade, the imaging department ran a RAID 6 array using 25 x 7,200 RPM hard drives in an EMD Netstore (the company no longer exists). It supported up to 15 users doing imaging work.


Can I go back to 7,200 RPM HP SATA drives and support 50 people hitting the array?
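
Whether 7,200 RPM drives can carry 50 users is mostly an IOPS question. Here's a rough sketch, where the per-drive IOPS, the read/write mix, and the per-user demand are all assumed "typical" values rather than measurements of this workload:

```python
# Rough IOPS budget for a RAID 6 array of 7,200 RPM drives.
# All inputs are assumed "typical" numbers; measure the real workload
# (perfmon / iostat) before trusting the result.

DRIVE_IOPS       = 75    # a 7,200 RPM drive delivers roughly 75-80 random IOPS
RAID6_WR_PENALTY = 6     # each random write costs ~6 back-end I/Os on RAID 6
READ_FRACTION    = 0.7   # imaging work assumed mostly reads
IOPS_PER_USER    = 30    # assumed peak random IOPS per active user

def users_supported(n_drives):
    backend = n_drives * DRIVE_IOPS
    # Reads pass through 1:1, writes are amplified by the RAID 6 penalty.
    frontend = backend / (READ_FRACTION + (1 - READ_FRACTION) * RAID6_WR_PENALTY)
    return frontend / IOPS_PER_USER

for n in (24, 48, 96):
    print(f"{n} x 7.2K drives: ~{users_supported(n):.0f} concurrent heavy users")
```

If the big images are read sequentially, throughput (MB/s) matters more than IOPS and 7.2K drives look much better; it's the metadata and small-file side where they hurt.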

P.S. Budget is important.
 
My main concerns with 100TB+ and RAID 6 on a conventional filesystem would be:
- After a crash or power outage you should run an offline fsck. Can you imagine how long that would take?

- On a crash during a write, you may end up with a damaged array due to the write-hole problem of RAID 6.

- You talk about 100 million files. How many of them will become corrupted over time due to silent errors, with no chance to detect or fix them?

You should really think about using ZFS: its crash-resistant copy-on-write filesystem, its software RAID without the write-hole problem, and checksums to detect and repair silent errors.

And:
- Versioning (read-only snapshots, ransomware-safe)
- No offline fsck needed, and prepared for petabyte storage and many, many disks in a pool built from multiple vdevs / RAID arrays
- Free / open source with BSD, ZoL, or OmniOS (a free Solaris fork; Solaris is where ZFS comes from)
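
As a toy illustration of the checksum point (this is not how ZFS is actually implemented, just the idea in a few lines of Python): every block is stored with a checksum kept apart from the data, so a silently flipped bit is detected on read and a good redundant copy is served instead.

```python
# Toy illustration of end-to-end checksumming (the idea behind ZFS's
# self-healing reads), not ZFS's actual on-disk format.
import hashlib

def store(block: bytes):
    # Keep the checksum alongside the block pointer, separate from the data itself.
    return {"data": bytearray(block), "sha256": hashlib.sha256(block).hexdigest()}

def read(copies):
    """Return the first copy whose data still matches its checksum."""
    for c in copies:
        if hashlib.sha256(bytes(c["data"])).hexdigest() == c["sha256"]:
            return bytes(c["data"])          # good copy; ZFS would also rewrite the bad one
    raise IOError("all copies corrupt")

# Two redundant copies of the same block (think: mirror or RAIDZ reconstruction).
copies = [store(b"scanned-image-block"), store(b"scanned-image-block")]
copies[0]["data"][3] ^= 0x01                 # simulate a silent bit flip on one disk
print(read(copies))                          # checksum catches it, the good copy is served
```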
 
I think Gea is right.

24 x 8TB disks should be your target setup:
RAIDZ1 pairs (effectively mirrors) in a ZFS pool (fastest option, 96TB) or RAIDZ2 (3 vdevs of 8 disks, 112TB) in a ZFS pool.

Either option can be grown by adding another vdev to the pool (e.g. 8 drives for the latter, 2 for the former).

Add some SSDs for read and write cache, and a good whack of memory.
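
For the raw arithmetic behind those two layouts, here's a quick sketch (24 x 8TB as above; "usable" only subtracts parity, so real numbers come out lower once TiB conversion, RAIDZ padding, and free-space headroom are taken off):

```python
# Raw usable capacity of the two suggested 24 x 8TB layouts and the growth
# step for each. Figures only subtract parity; actual usable space is
# noticeably lower after TiB conversion, RAIDZ padding and free-space
# headroom, which is likely why the raw RAIDZ2 number here comes out
# above the 112TB quoted in the post.

DRIVES, DRIVE_TB = 24, 8

def mirror_pairs(n, size):
    vdevs = n // 2
    return vdevs, vdevs * size                # each 2-way mirror yields one drive of space

def raidz2(n, size, width=8):
    vdevs = n // width
    return vdevs, vdevs * (width - 2) * size  # each RAIDZ2 vdev loses 2 drives to parity

for name, (vdevs, tb), step in [
    ("Mirrored pairs", mirror_pairs(DRIVES, DRIVE_TB), 2),
    ("RAIDZ2, 8-wide", raidz2(DRIVES, DRIVE_TB), 8),
]:
    print(f"{name}: {vdevs} vdevs, ~{tb} TB raw usable, grows in {step}-drive steps")
```

Either way the pool grows by adding whole vdevs; mirrors give the cheapest 2-drive increments at the cost of capacity efficiency.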
 