Seeking Advice - Managing 130TB of Video Data

Hello everyone,

I've been tasked with coming up with a backup solution for our video archive. Let me describe what we have in place currently:

We have a Thecus N16000V NAS unit with 3 D16000s attached to it, for a total of 64 drives. The first unit has 16x4TB drives, and the other 2 units have 16x6TB drives (one unit is populated with old HDDs that I don't trust for use). Each unit is its own RAID60 array. We have a total of 3 mount points on the main N16000V, with a total storage space of 174TB.
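
For anyone checking the capacity figures, here is a rough sketch of the arithmetic. It assumes each 16-drive RAID60 array is built from two 8-drive RAID6 spans (the span layout isn't stated above, so that part is a guess) and that the NAS reports binary units. The function name and constants are only illustrative.

```python
TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, as many NAS interfaces report

def raid60_usable_bytes(drives, drive_tb, spans=2):
    """Usable bytes of a RAID60 set: each RAID6 span gives up two drives to parity."""
    if drives % spans != 0:
        raise ValueError("drives must divide evenly into spans")
    per_span = drives // spans
    if per_span < 4:
        raise ValueError("a RAID6 span needs at least 4 drives")
    data_drives = spans * (per_span - 2)
    return data_drives * drive_tb * TB

# The three arrays in use: (drive count, drive size in TB).
# Assumption: two RAID6 spans per array.
units = [(16, 4), (16, 6), (16, 6)]
total = sum(raid60_usable_bytes(n, tb) for n, tb in units)
print(f"{total / TB:.0f} TB decimal = {total / TIB:.1f} TiB")
```

Under those assumptions the three arrays come to 192 TB decimal, or about 174.6 TiB, which would line up with the 174TB figure if the NAS is reporting in binary units.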


All the files on this NAS unit are video files (Adobe Premiere and RAW video).

I'm looking for any suggestions on where I should take this storage so that we can properly back it up. I started looking at Dell EMC solutions, but I don't see how dedup technology is going to do much to reduce the amount of data needing to be backed up.

Thank you for taking the time to read this,

- Rob
 
Are you looking to whitebox or go with a vendor solution? What's the budget? What kind of connection to the storage do you need: 1Gb, 10Gb, or 40Gb Ethernet?
 
Unfortunately, the budget hasn't been set yet. I'm looking to see what my options are before proposing what it'll cost.
As far as connection to the storage, we've got a 10Gb backbone for our network (Zyxel XS-3700), so 10Gb or bonded 10Gb links will work.
I'd like to go with a vendor-supported solution over whiteboxing it. I don't want to pigeonhole myself into being the only source of support for it (I've got enough on my plate, I really don't need it).

Might want to look at 45Drives.com.

Getting another NAS setup is a possible solution. I never heard of these guys before, but I do know of N3rdFusion. Didn't know that they did a sponsored install for them. Interesting.

Thank you all for the responses / questions so far. Appreciate the help :)

- Rob
 
What is your budget for this endeavor, and what backup window do you have available for uninterrupted backup? Do you plan on shipping data offsite (physically or logically)? Are you looking for just an archival backup (tape or cold storage disk) or for something that can act as a mirror of your existing data for disaster recovery?
 
I would do Data Domain disk-based backups. You will get your expandability, and once the initial replication takes place they offer great site-to-site mirroring in the background to maintain your backups across multiple sites for things like UL requirements and such. Plus you have the support behind you and a breadth of people experienced with this solution.

For stuff you need to take offline, I second an LTO-7 library that you schedule your backups to. And if it is just disaster recovery, that would be a solution.

Beyond that, talk to some people who use Data Domain and check on the deduplication operation with raw image files. I would imagine the amount of identical data in different positions is higher than you think... based on how deduplication works, of course. I don't use it on our systems because of the additional overhead.
 
LTO-7 would do it for archival backup
I'm looking to do this for pure archival purposes. Might be part of the implementation as an LTFS setup where they can take video projects and just offload them to LTO tape for storage purposes. This implementation would happen AFTER I get the day-to-day replication taken care of.
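
To put rough numbers on the LTO-7 idea, here is a back-of-the-envelope sketch. It uses LTO-7's native figures (6 TB per cartridge, 300 MB/s per drive) and assumes the video won't compress on the drive, so the advertised compressed capacity is ignored. The 130TB figure comes from the thread title; the `fill` and `drives` parameters are only illustrative.

```python
import math

LTO7_NATIVE_TB = 6.0      # LTO-7 native (uncompressed) capacity per cartridge
LTO7_NATIVE_MBPS = 300.0  # LTO-7 native drive throughput in MB/s

def tapes_needed(data_tb, fill=1.0):
    """Cartridges required if each tape is filled to `fill` of native capacity."""
    return math.ceil(data_tb / (LTO7_NATIVE_TB * fill))

def write_hours(data_tb, drives=1):
    """Best-case hours to stream the data, with every drive kept at native speed."""
    seconds = data_tb * 1e12 / (LTO7_NATIVE_MBPS * 1e6 * drives)
    return seconds / 3600

print(tapes_needed(130), "tapes,", round(write_hours(130), 1), "hours on one drive")
```

These are best-case numbers: real jobs rarely keep a drive streaming at full speed, and a second copy for off-site storage doubles the tape count.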

What is your budget for this endeavor, and what backup window do you have available for uninterrupted backup? Do you plan on shipping data offsite (physically or logically)? Are you looking for just an archival backup (tape or cold storage disk) or for something that can act as a mirror of your existing data for disaster recovery?
Budget is unknown so far, as I've been tasked with just investigating all possible avenues first.
Off-site storage of archived data might happen, but I'm inclined to handle it with the LTO solution. We don't have the money to do an Amazon AWS implementation for off-site protection.
I'm kinda looking at doing both day-to-day backups and archival handling of the video data.

I would do Data Domain disk-based backups. You will get your expandability, and once the initial replication takes place they offer great site-to-site mirroring in the background to maintain your backups across multiple sites for things like UL requirements and such. Plus you have the support behind you and a breadth of people experienced with this solution.

For stuff you need to take offline, I second an LTO-7 library that you schedule your backups to. And if it is just disaster recovery, that would be a solution.

Beyond that, talk to some people who use Data Domain and check on the deduplication operation with raw image files. I would imagine the amount of identical data in different positions is higher than you think... based on how deduplication works, of course. I don't use it on our systems because of the additional overhead.
This is where I'm leaning more and more as the solution. I've been reading up on EMC's Data Domain products and we might have to go this route. The initial imaging of the data might take a long while to occur, but once that is done it shouldn't be that bad to replicate the data between the 2 systems. We'd have to set up an LTO library for archival purposes...
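
As a rough sense of how long that initial copy could take over the 10Gb backbone, here is a quick sketch. The 130TB figure is from the thread title, and the 0.7 efficiency factor is purely an assumption standing in for protocol overhead; in practice the NAS's read speed may well be the real bottleneck.

```python
def transfer_days(data_tb, link_gbps, efficiency=0.7):
    """Days to move `data_tb` decimal terabytes over a link of `link_gbps` Gbit/s.

    `efficiency` is an assumed fraction of line rate actually achieved.
    """
    usable_bytes_per_s = link_gbps * 1e9 / 8 * efficiency
    return data_tb * 1e12 / usable_bytes_per_s / 86400

for gbps in (1, 10, 20):
    print(f"{gbps:>2} Gb/s: {transfer_days(130, gbps):.1f} days")
```

At the assumed 70% efficiency, a single 10Gb link moves 130TB in a bit under two days, while a 1Gb link would need over two weeks.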

Thank you all for the input. I do truly appreciate it. :)

- Rob
 
We were looking at Data Domains originally. Now we have a pair of Rubriks and like them very much.
 
CrashPlan used to have a very reasonably priced enterprise server offering that did offsite data replication to as many servers as you wanted - it was well under $1000. You could just do a Windows server with Storage Spaces at a remote site with the CrashPlan software handling everything. Even with their licensing pricing changes I'd wager it would still beat the pants off an EMC solution. Not much of a risky bet. You can pre-stage the storage too - do the initial backup with the server on site, then ship it to the remote location - it picks up where it left off.

There are other vendors like Cloudberry - yes, they are targeted at providing front ends to bulk storage from vendors like Amazon or Backblaze B2, but you can point front ends like Cloudberry to your own internal storage easily enough.

And this would all be based on bog standard off the shelf software if you use Windows Server as your base for the remote storage, so support shouldn't be anything special. Probably more reliable than something complicated like the Data Domain stuff I've been involved with in the past, especially for your use case.

Heck - Backblaze B2 is cheap enough that you might want to just price it out (they have a pretty good cost estimator) and see if it's worth hosting the backup internally, especially if support is a concern. Backblaze has a nice page that lists several front ends like Cloudberry, so if you don't like Cloudberry you can test out others. Encryption should be able to quell any uneasiness about putting your bits in the cloud - all the solutions that Backblaze lists as compatible with their storage allow you to set encryption keys that make your bits meaningless without them (so don't lose those keys or your backups will be worthless!)

I would be loath to start with older solutions like Data Domain today - they were good for their time, but they are way too expensive and complex compared with other stuff that's out there - IMNSHO.
 
I can't comment on actual backup solutions, but the network admins where I work did some testing and found de-duplication had a negligible impact on the total backup size of my video and audio files.
 
That is my fear as well. Thanks for the input!
 
It is highly unlikely that compressed files will have data blocks with the exact same bit pattern as any data block on some other compressed file.
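
One way to check this on your own files is a quick block-hash count. The sketch below is only a rough stand-in: it uses fixed 8 KiB blocks and SHA-256, whereas commercial dedup appliances typically use their own (often variable-length) segmenting, so treat the result as an indication rather than a prediction. The function name and block size are arbitrary choices.

```python
import hashlib

def dedup_ratio(paths, block_size=8192):
    """Ratio of total blocks to unique blocks across the given files.

    1.0 means no duplicate blocks were found; higher means more potential savings.
    Fixed-size blocks are a simplification of what real appliances do.
    """
    seen = set()
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            while block := f.read(block_size):
                total += 1
                seen.add(hashlib.sha256(block).digest())
    return total / len(seen) if seen else 1.0
```

Run it over a sample of a few projects rather than the whole archive, since the set of hashes is held in memory.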
 
That was their initial impression as well, but the sheer size of my nightly backup meant they had to at least try. Thankfully all of my content *expires* very quickly, so we ended up having legit backups for a "rolling" year. Anything older than a year is archived on my local machine (though the odds of me needing anything older than 6 months even is beyond slim). I haven't had to touch anything older than a month or two in the past seven years.
 
I have a lot of experience with almost all the major EMC platforms, and I personally do a lot of video streaming. I can tell you right off the bat that no product out there will dedupe and compress your video data that well. Why? Video and audio data has already been compressed. There is very little white space and very little commonality in media formats. It's like zipping a zip file. You'd be lucky to see 1.5:1 compression. If you see more, awesome, but don't bank on it.

Want a valid test? Take one of your media files and compress it with LZ compression. The reduction you see there is roughly how much your data will compress. Why LZ? Because in the Data Domain world they use either GZ or LZ, and LZ is the default.

I've done a lot of EMC Isilon; I just finished setting up two 450+TB clusters a few months ago. They do have a dedupe and compression process, but again, with media you can forget any kind of space savings. They also have "Archive" nodes. Isilon requires a minimum of a 3-node system.

If you are looking for long-term storage, don't use Data Domain; go with Isilon, because you can easily expand your Isilon cluster. Data Domain will hit limits very early on, and the support and license model is very expensive. I'll be honest: if you are looking at EMC, you need to have deep pockets for it. I use Data Domain to stage all my backups for 30 days, then I spin them off to Isilon for long-term cheap and deep storage.
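
The compression test described above can be sketched in a few lines of Python. Note the substitution: this uses the standard library's zlib (DEFLATE, an LZ77-based scheme) as a stand-in, not Data Domain's own LZ implementation, so the number is only a ballpark. The function name and chunk size are arbitrary.

```python
import zlib

def compression_ratio(path, level=6, chunk_size=1 << 20):
    """Original size divided by zlib-compressed size, streamed in chunks.

    About 1.0 means the file is effectively incompressible.
    """
    comp = zlib.compressobj(level)
    raw = packed = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            raw += len(chunk)
            packed += len(comp.compress(chunk))
    packed += len(comp.flush())
    if raw == 0:
        return 1.0  # empty file: nothing to compress
    return raw / packed
```

Try it on a handful of representative files (camera RAW, exported masters, project files), since different formats may behave quite differently.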
 