Best way to reach high-performance 5TB?

Alright, my boss has tasked me and the other tech with building a high-performance storage solution.

The only specifications for the system are:
-full redundancy
-hot swap
-high-bandwidth interconnect desirable (my assumption is he wants 10-20 Gbit capability in case we ever go InfiniBand on any of our clusters).

So far I've done a very basic comparison of Apple versus homebrew, with the homebrew build having:
Areca 1260/1270
WD RE2 500GB or Seagate 500GB drives
Some generic hotswap cages

I'm estimating the base cost at around $5000 for homebrew, not including the case/CPU/etc. My questions for homebrewing are:
Where can we get a 4U case with a ton of 5.25" bays that I can pack full of SATA hot-swap cages? What sort of power supply should I be looking at for something that will run 16-24 drives? Should I run multiple power supplies?
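
As a rough way to frame the power supply question, here's a back-of-envelope Python sketch; the ~30 W spin-up, ~10 W active, and 250 W system-overhead figures are generic assumptions for 3.5" 7200 RPM drives, not specs for any particular model.

[code]
# Back-of-envelope PSU sizing for a multi-drive SATA build.
# Per-drive wattages and system overhead are assumptions, not measured values.

def psu_estimate(num_drives, spinup_w=30.0, active_w=10.0, system_w=250.0, headroom=1.3):
    """Return (worst-case spin-up watts, steady-state watts), headroom included."""
    startup = (num_drives * spinup_w + system_w) * headroom  # all drives spinning up at once
    steady = (num_drives * active_w + system_w) * headroom   # normal operation
    return startup, steady

for n in (16, 24):
    start, steady = psu_estimate(n)
    print(f"{n} drives: ~{start:.0f} W at spin-up, ~{steady:.0f} W steady state")
[/code]

Most RAID controllers and backplanes support staggered spin-up, which cuts the worst-case startup figure considerably; redundant supplies matter more for the uptime requirement than for raw wattage.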

Now on to the Apple Xserve RAID, or whatever it is they call it.
$13,000 for 7TB, plus a Fibre Channel card, which is probably what, $500? Has anybody used these before? Any experience? Is it worth the money, or are there better options for non-homebrew?

Thanks a lot for any input.
 
Non-homebrew = only 99% of your ass is on the line when shit hits the fan.

But really, support contracts and the like, if this is for a business, are what win it. Are you a small business or what?
 
I'm going to be building a server based on this Supermicro chassis in the near future. It's got room for 15 drives, or 11.2 TB if you're willing to use 750GB SATA drives. I'd recommend an LSI card over the Areca; that gives you SAS capability (for a boot disk, or fast swap space, or whatnot) and a card designed for SAS-style connections. There's an expander (well, two, actually) in that chassis, so you only plug four connectors' worth into the backplane. I don't think Areca deals with that well. That's my suggestion, anyway.

The case is around a grand, the LSI cards are the same, and drives are whatever you find them for. Suppose you want RAID 5 plus a hot spare and are using 750GB disks. I've found the 750s for around $500, and you'd need 9 of them to get 5TB within those constraints. That's $4500 in drives; add another (probably too generous) $2k for the rest of the system, and the whole shebang comes to around $8500.
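
If it helps to re-run that capacity/cost math with different drive sizes or prices, here's a minimal Python sketch of it; the $1000 case, $1000 controller, $500-per-drive, and $2k-for-the-rest figures are the rough numbers from this post, not actual quotes.

[code]
# RAID 5 + hot spare capacity and rough build cost, per the figures above.

def raid5_usable_tb(num_drives, drive_tb, hot_spares=1):
    """Usable space: RAID 5 gives up one drive to parity, plus any hot spares."""
    return (num_drives - hot_spares - 1) * drive_tb

def build_cost(num_drives, drive_price=500, case=1000, controller=1000, misc=2000):
    return num_drives * drive_price + case + controller + misc

drives, size_tb = 9, 0.75
print(f"usable: {raid5_usable_tb(drives, size_tb):.2f} TB")  # 7 x 0.75 = 5.25 TB
print(f"cost:   ${build_cost(drives):,}")                    # $8,500
[/code]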

 
How does the LSI compare in terms of performance and features? Ideally we need as much performance as we can get, and afaik nothing compares to the Areca.
 
hokatichenci said:
How does the LSI compare in terms of performance and features? Ideally we need as much performance as we can get, and afaik nothing compares to the Areca.

Considering they both use Intel processors onboard, it's mostly a matter of who has the better firmware, plus the added abilities of SAS over SATA. LSI has been around for decades and probably has better engineers, but I have yet to see a review of a SAS RAID card.

==>Lazn
 
Here's what I would get.

http://www.circotech.com/rm-5583-5u-24-hot-swappable-hard-drive-tray-rack-mount-case-with-1050w-redundant-power-supply-and-he.html

Now, granted, I don't know anything about this company, but hopefully someone here does.

I would fill it with 24 250GB hard drives. Since there are so many drives, the chance of one failing is higher, so I would go with RAID 6, which allows two drives to fail without data being lost. With RAID 6 you would have 2 hot spares and an unformatted capacity of 5000GB (5500GB without parity, plus 500GB tied up in the hot spares). Assuming $80 per drive, that's about $2000 for hard drives, and the case is $1700, so roughly $3700 goes to the case and drives. That leaves $1300 for the rest of the hardware, and I'm not sure whether that's enough or not.
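
The "more drives means a failure is more likely" point can be made concrete with a quick sketch. The 5% annual failure rate below is purely illustrative, and assuming independent failures is optimistic for drives from the same batch, but it shows why RAID 6 plus hot spares looks attractive at 24 drives.

[code]
# Chance of at least one drive failure per year as the array grows.
# The 5% annual failure rate is an illustrative assumption, not a vendor spec.

def p_at_least_one_failure(num_drives, annual_failure_rate=0.05):
    """Assumes failures are independent (optimistic for same-batch drives)."""
    return 1 - (1 - annual_failure_rate) ** num_drives

for n in (8, 16, 24):
    print(f"{n} drives: {p_at_least_one_failure(n):.0%} chance of at least one failure per year")
[/code]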
 
The LSI hasn't been reviewed in any reasonable conditions that I can find.

Why do you need so much storage and still need it performing highly? You might want a smaller subset (say 600GB on two SAS disks) for high-activity stuff, and the rest for bulk storage instead. What kind of access pattern will this be used for? Can you describe the application or workload a little?

 
I'm not really sure where this fits into the master plan of all things, but basically we've got a bunch of computing clusters of varying sizes (3x3 tiled screens, 10x5 tiled screens, curved screens, etc.), and we're transferring significant amounts of visual data around. My guess is that this system will be a primary backup point as well as a data store for all the captured HD video and 100+ megapixel imagery. InfiniBand capability would be really useful for this if we ever make that move, since we'd get a huge amount of bandwidth to play with.
 
You really should consider a unified solution from a provider. I'm not sure how much more it would cost over a roll-your-own solution, but if you want native InfiniBand, you've pretty much taken yourself out of the roll-your-own market anyway.

==>Lazn
 
hokatichenci said:
I'm not really sure where this fits into the master plan of all things, but basically we've got a bunch of computing clusters of varying sizes (3x3 tiled screens, 10x5 tiled screens, curved screens, etc.), and we're transferring significant amounts of visual data around. My guess is that this system will be a primary backup point as well as a data store for all the captured HD video and 100+ megapixel imagery. InfiniBand capability would be really useful for this if we ever make that move, since we'd get a huge amount of bandwidth to play with.
Let's see. 10x5 screens at (presumably at least) 1024x768 in full color is... 117MB per frame. Capturing at (say) 30fps gives 3.375 GB/s, or 27 gigabits. If you intend to capture that, you'd need two double-speed 12x links; 4x adapters seem to be around $600. Even the 3x3 array mentioned is still 5 Gbit, or 600MB/s. I think you need to buy this from a provider, and be explicit about what your sustained-transfer-rate (STR) requirements are, or you're not going to achieve them. When you need more than 4 Gbit Fibre Channel (!) to deal with your incoming data, you're pretty much screwed.

PS: 5 TB will last about 1481 seconds, or 24 minutes, at the 10x5 rate. Ouch.
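
For anyone who wants to re-run those numbers with different tile counts or frame rates, here's a small Python sketch using the same assumptions as above (1024x768 tiles, 24-bit color, 30 fps uncompressed); it lands close to the figures quoted, with small differences down to MB-vs-MiB rounding.

[code]
# Display-wall capture bandwidth, decimal units throughout.

def wall_bandwidth(tiles_x, tiles_y, width=1024, height=768, bytes_per_px=3, fps=30):
    """Return per-frame size (MB), capture rate (MB/s), and line rate (Gbit/s)."""
    frame_mb = tiles_x * tiles_y * width * height * bytes_per_px / 1e6
    mb_per_s = frame_mb * fps
    return frame_mb, mb_per_s, mb_per_s * 8 / 1000

for tiles in ((3, 3), (10, 5)):
    frame, rate, gbit = wall_bandwidth(*tiles)
    print(f"{tiles[0]}x{tiles[1]}: {frame:.0f} MB/frame, {rate:.0f} MB/s, {gbit:.1f} Gbit/s")

# How long 5 TB lasts at the 10x5 capture rate:
_, rate, _ = wall_bandwidth(10, 5)
print(f"5 TB fills in about {5e6 / rate:.0f} seconds")
[/code]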

 
The screens are 30" Apple Cinema Displays, and we're displaying HD (1080p, or whatever the highest-quality format is) in a scaled manner. Total resolution, btw, is 200 megapixels ;) I think the best we've done so far for video is a single 1080p HD stream, though a lot of this code is experimental and still a work in progress. As for imagery, the worst case I've seen is a few hundred 5-megapixel images flipped between different screens, so the images are really only loaded a small number of times before they're all in the distributed memory cache (AFAIK, from what I've heard about the software). So say 500 images at 1MB each; that's really only 500MB in a single burst, which should be easily achievable.

What I'm worried about is when they get the distributed HD content algorithms working well and they want to start pulling 5, 10, 50, even 100 HD videos at once.
 
I guess you're using compressed streams as test material? These seem to be low enough bitrate not to be an issue; the "featured track", for example, in 720p, is 93MB for 113 seconds. That's about 820KB/s. This one in 1080 (p or i, not sure) is 203MB for 254 seconds. That's about 800KB/s. You could stream a dozen of those from a single disk, I'd be willing to wager. If you're streaming uncompressed video... may God have mercy on your soul :p 6MB a frame, times 24 or 30 fps, means 150 or 186 MB/s. You won't have any problem reading or writing that with any decent controller and disks, but you might have problems finding a transport medium for more than 8 of those - 150 MB/s is 1.2 gigabit, so 8 streams is 9.6. Most things are 10 Gbit or slower these days.
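
Here's the same arithmetic in a re-runnable form; the trailer sizes are the ones quoted above, and the uncompressed case assumes 1920x1080 at 24-bit color.

[code]
# Compressed-trailer vs. uncompressed-HD stream rates, decimal units.

def compressed_rate_kb_s(file_mb, seconds):
    """Average bitrate of a compressed clip, in KB/s."""
    return file_mb * 1000 / seconds

def uncompressed_rate_mb_s(width=1920, height=1080, bytes_per_px=3, fps=30):
    """Uncompressed 24-bit video, in MB/s."""
    return width * height * bytes_per_px * fps / 1e6

print(f"720p trailer  (93 MB / 113 s): {compressed_rate_kb_s(93, 113):.0f} KB/s")
print(f"1080 trailer (203 MB / 254 s): {compressed_rate_kb_s(203, 254):.0f} KB/s")
for fps in (24, 30):
    rate = uncompressed_rate_mb_s(fps=fps)
    print(f"uncompressed 1080p @ {fps} fps: {rate:.0f} MB/s ({rate * 8 / 1000:.2f} Gbit/s)")
[/code]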

In conclusion, compress your video (with what? I dunno, you're the researcher ;)) or die. And a vendor solution will likely be better for this.

 
Define high performance. Your definition, by industry standards, is likely nothing near high performance; the industry standard for that would be 15k RPM Fibre Channel drives. If you want a device that uses those and does 5TB, contact EMC, HP, IBM, etc.

If you want the cheapest device that will do 5TB, you'll use high-capacity SATA drives.

In either case, you really should have an OEM do this. As much fun as it is to play geek, there is a really good reason to go with a vendor here. They will walk you through many questions you haven't thought of before, such as what type of content it is, whether it will change, what you have planned for disaster recovery, etc. At 5TB, stick with the pros. The old saying that nobody ever got fired for buying IBM is very much true. If you build it, no matter how valid the reason, people will still point the finger at you if something goes wrong.
 
While I understand that money is an issue in a university environment, I think the value of your research will easily exceed the cost of buying from a vendor. If someone has a problem with that, maybe you should lay out the expected cost if something bad goes wrong:

Example: chance the system fails and you lose everything if you or I build it: 3%
Chance the system fails and you lose everything if the vendor builds it: 0.5% (approx.)

Cost of your research: $2 million.

0.03 * 2 million = $60k
0.005 * 2 million = $10k

While these numbers are pulled out of my arse, you can see the $50k difference, which says that unless you can build the thing at least $50k cheaper than the vendor, it makes sense to buy from the vendor.

All you need to do is get the numbers right (as I said, mine are complete guesses) and you're ready to convince anyone that doing it "the right way" will likely be cheaper in the end.
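
The whole argument fits in a few lines of Python if you want to redo it with your own estimates; the failure probabilities below are the guesses from this post, nothing more.

[code]
# Expected-loss comparison: homebrew vs. vendor, using the guessed probabilities above.

def expected_loss(p_total_failure, data_value):
    """Expected cost of a total loss: probability times value of the data."""
    return p_total_failure * data_value

data_value = 2_000_000                         # stated value of the research
homebrew = expected_loss(0.03, data_value)     # $60,000
vendor = expected_loss(0.005, data_value)      # $10,000

print(f"homebrew expected loss: ${homebrew:,.0f}")
print(f"vendor expected loss:   ${vendor:,.0f}")
print(f"break-even price gap:   ${homebrew - vendor:,.0f}")  # $50,000
[/code]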
 
You can also get competitive bids from different vendors on this; they generally have educational pricing. AT THE VERY LEAST, I would put those bids together alongside the price of building it yourself, recommend that HP/IBM/EMC/etc. build it, and let the higher-ups decide. Then if you do build it and things go bad, it wasn't your decision and your career will be a lot safer. It sounds very much like you have the skills to do it, but we're talking about more than just your ability to do it.
 
Thanks for the advice so far guys. Currently pricing seems to break out as follows:
HP: 53k
Sun: 56k
IBM: ???
Apple: ~12-14k
Homebrew: ~7-8k

Estimated costs from a 2nd tier vendor are still coming in, but the other tech was guesstimating 15k.

Primary vendors are really not looking so hot atm...
 
Hit up EMC as well. They should be able to put together an AX150 for cheap (under 20K anyways). Be sure you're having a cheap NAS quoted. $53k sounds very much like they configured a small SAN to me.
 
Why are you not looking at a Dell PowerVault 220S?
I see ones loaded with 300GB drives (4.1TB) that also come with warranties from Dell for about $4000-5000.

Dell has also just released a new PowerVault series called the MD1000, which is a SAS solution.
 