How do you back up with an integrity check?

pinoy

Limp Gawd
Joined
Dec 8, 2010
Messages
447
I currently back up my data by dragging and dropping it onto my hard drives and flash drives. When one of my flash drives started goofing off and corrupting individual files, I realized my backup routine sucks. Of course the drive never gave me a hint that something had gone awry. I randomly checked a few of the thousands of photos I had: some were corrupt and others didn't open at all. There is no way I would do this manual checking on a regular basis, as it's too time consuming. So I ask: what is the best way to back up my data with integrity checking? Is archiving to RAR or using MD5 checksums any good?
 
Using checksums like MD5 and SHA1 is fine, but all they do is compare the current state of a file's data to its state when the checksum was created. If you want some level of redundancy and the ability to recover files that are corrupt, I suggest learning about PAR technology and tools such as MultiPar. You take your files, preferably as a set or even in an archive, and it does a whole bunch of mathematical work on the data and then creates parity files (hence the PAR name), which can then be used to rebuild damaged or missing files in the set or archive without any loss. It does cost some extra space for the redundancy (typically 15%), but that's adjustable, and the level of redundancy determines how much of the content can be restored.
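To make the "checksums only compare" point concrete, here is a minimal sketch of a checksum manifest in Python. It is an illustration, not what MultiPar or any specific tool does: the function names (`file_sha256`, `build_manifest`, `verify`) are made up for this example, and it uses SHA-256 from the standard library rather than the MD5 or SHA1 mentioned above. Note that it can only report which files changed; it cannot repair them.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos/videos need not fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_sha256(target) != expected:
            bad.append(rel)
    return bad
```

You would build the manifest once right after copying, store it somewhere other than the backup drive, and re-run `verify` periodically instead of opening photos by hand.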

I typically use PAR technology when I save data to M-Discs, and yeah, I know, optical discs, not everybody cares or wants 'em, that's fine. It works best for me, it has never failed, and I've never lost a single byte of data in 15 years of using my strategy, though I only started using M-Disc long-term archival media about 8 years ago. Regardless, I use a 22% recovery record, which means a sizable chunk of a set could be damaged and I'd still be completely safe after the rebuild from the parity files. I store data in sets of 10 DVD5-sized M-Disc media at about 4.3GB total storage per disc. Yes, DVD5 has a cap of about 4.7GB, but I prefer not to write data to the extreme outer edge of optical media, since that's the place most likely to develop issues during the burning process. M-Disc media isn't cheap either: a 50 pack of Verbatim M-Disc usually runs me about $48, but as I said, I've never had issues, never lost a disc, never had a bad burn, etc.
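As a rough sizing illustration of those numbers, the arithmetic below splits a fixed per-disc budget into payload and recovery data. It assumes (the post does not say this) that the recovery files sit on the same disc as the data they protect and that the percentage is measured against the payload size; `split_capacity` is a hypothetical helper name.

```python
def split_capacity(capacity_gb: float, redundancy: float) -> tuple[float, float]:
    """Split a fixed capacity into (payload, recovery) with recovery = redundancy * payload."""
    payload = capacity_gb / (1 + redundancy)
    return payload, capacity_gb - payload


# Assumed inputs taken from the post: 4.3 GB written per disc, 22% recovery data.
payload, recovery = split_capacity(4.3, 0.22)
print(f"{payload:.2f} GB payload + {recovery:.2f} GB recovery per disc")
# prints "3.52 GB payload + 0.78 GB recovery per disc"
```

If the recovery files are kept on separate discs instead, the split is different, so treat this only as a way to reason about the trade-off.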

The only way to actually get a damaged file back is redundancy, so you can literally rebuild a corrupt copy of a file to its original state. Checksums can't and never will do that; they're just hashes of the original file. Once the original file or the archived version is damaged, the checksum can tell you it's damaged but can't help you fix it, so that's where redundancy comes in, using parity files with PAR technology.
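The difference between detecting and repairing can be shown with a toy example. This sketch uses plain XOR parity over equal-sized blocks, which is much simpler than the math real PAR tools use and can only recover a single lost block. It is here just to show the idea of rebuilding from redundancy, and all the names in it are made up for illustration.

```python
from functools import reduce
from typing import Optional


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks: list[bytes]) -> bytes:
    """The parity block is simply the XOR of all data blocks."""
    return xor_blocks(blocks)


def rebuild(blocks: list[Optional[bytes]], parity: bytes) -> list[Optional[bytes]]:
    """Restore at most one lost block (marked None) from the survivors plus parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can only recover one lost block")
    restored = list(blocks)
    if missing:
        survivors = [b for b in blocks if b is not None]
        restored[missing[0]] = xor_blocks(survivors + [parity])
    return restored


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data)
damaged = [data[0], None, data[2]]  # a checksum would tell you block 1 is bad
assert rebuild(damaged, parity) == data
```

A checksum is what flags block 1 as bad in the first place; the parity is what brings it back. Higher redundancy settings in PAR tools extend the same idea to many damaged blocks at once.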

No, it's not the only method to do this. As I said recently in another post discussing this type of backup strategy, I don't save every single thing I ever get my hands on or create. The M-Disc backups are content that I really do not want to lose and consider irreplaceable, so that's why that strategy is put to work. I don't trust hard drives or SSDs, because I've had both types of storage just up and die on me, and I simply don't like the idea that I could lose terabytes of data in a split second just because some mechanical or electronic issue happens and the media becomes useless.

With optical backups, I know that if all Hell breaks loose and I actually do lose some data, I won't lose more than a few gigs of it at a time. Over the years of doing this I've archived a bit under 240GB of long-term must-be-safe-at-all-costs data on M-Disc media; like I said, I ain't a hoarder. Since some of the stuff truly is one of a kind and can never be replaced (family pics and videos, truly important data like that), I put my faith in proven optical media over anything else. I thought about DAT or other tape backups years ago, and I know they have a serious level of reliability, but it's just not something I care about now that I have a method of my own that has been rock solid and 100% reliable for me.

Anyway... do some research, you'll find what you're looking for. ;)
 