I have roughly 25 TB of mostly incompressible data (videos and pictures) that is currently growing at about 8 TB per year, and the growth rate itself rises each year. All of this data is personal data of which I possess the only copy, so I need a system where data integrity (i.e. being able to read the data back years later with no unrecoverable read errors) is paramount. Internet storage solutions are not great for me since my upload speed is terrible, only about 3 Mbps. I do not need to access this data quickly, so offline/cold storage is acceptable, but I absolutely do need to be able to read it back with no unrecoverable errors for decades to come.
I have a few ideas so far, but all of them have some pretty hefty drawbacks:
1. optical media: all accounts say this isn't great for archiving since optical disks decay in about 5 years.
2. large RAID with ZFS: expanding the RAID would require a rather hefty investment each time I wanted to grow capacity. It also requires electricity to keep running, and I'd have to routinely replace disks over the years to avoid data loss. I'd also have to buy additional hardware for this setup, and adding internal SATA ports is actually quite expensive.
3. tape media: although tapes have the cheapest price per GB of all the methods I've come across, newer tape drives for LTO-5/6 are crazy expensive. I've also read concerns about drives failing and compatible replacement units being hard to find. The tape medium itself also appears to be quite fragile and would require a controlled storage environment to best ensure the tapes remain readable in the future.
Each also has some unique benefits:
1. optical media: the smaller unit size lets me apply extra redundancy more selectively (e.g. more parity for the most important files) if I so choose.
2. RAID: more or less real time status of the integrity of my data.
3. tape media: absolutely the cheapest option once my data pool gets above a certain size.
I'd also probably use par2 files to help guard against things like bitrot.
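On the par2 point: par2 can repair damage, but I'd also want a way to notice bitrot in the first place, independent of the medium I pick. A minimal checksum-manifest sketch in Python (the function names and layout here are my own illustration, not from any particular tool):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large videos don't load into RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {str(p.relative_to(root)): sha256_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths whose current checksum no longer matches."""
    return [rel for rel, digest in manifest.items()
            if sha256_of(root / rel) != digest]
```

The idea would be to store the manifest alongside the par2 files on each copy of the archive: the manifest tells you *which* files rotted, and par2's parity data lets you actually repair them.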
I seem to be in a weird spot between typical home users, who have at most a handful of TBs to worry about, and enterprises, who have far more data than I do but also far more money to spend.
In short, I'm looking for the cheapest solution that can guarantee I can recover all of my data 10, 20, 30+ years down the road without any unrecoverable corruption. I'd also favor higher upfront costs in exchange for lower recurring costs, provided that trade-off is reasonable and actually more economical.
If you have some alternative solutions in mind, please post a rough estimate of upfront and recurring costs.