SnapRAID question

Valnar

I'm about to start using SnapRAID but I'm having a difficult time wrapping my head around how it works.

I understand regular RAID 5 & 6. All drives are pooled together and the parity is striped across all of them, so they share the task of holding enough info to recreate the contents of a single failed drive. But I have some kind of mental block about SnapRAID's snapshot method.

If I have 3, 4 or 5 data drives and one parity drive, how can a single parity drive contain enough info to recreate ALL the files of a failed drive? I mean, wouldn't it need simply an entire copy of the file? Even more confused since it can "back up" several other hard drives with a combined size bigger than the parity drive.

Is hidden data snapshot and put on my other drives somewhere? Otherwise, why wouldn't a single parity drive be able to recover ALL my hard drives at once?
 
If I have 3, 4 or 5 data drives and one parity drive, how can a single parity drive contain enough info to recreate ALL the files of a failed drive? I mean, wouldn't it need simply an entire copy of the file?

With a single parity drive it works similarly to RAID 3, but with a different layout: instead of sitting underneath the filesystem, it sits on top of it. There is a content file that maps the individual files on each disk to blocks and contains checksums for those blocks. Parity for these blocks (computed across the drives, like RAID 3) is stored in the parity file on the parity drive. The parity file is as big as the data on the drive holding the most data (actually a little bigger, due to fragmentation).
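To make the "how can one parity drive cover several data drives" part concrete, here is a minimal Python sketch of single (XOR) parity. It is an illustration only, not SnapRAID's actual code or on-disk format: each "disk" is just a byte string, and the names `compute_parity` and `rebuild` are made up for this example.

```python
from functools import reduce

def pad(data: bytes, length: int) -> bytes:
    """Zero-pad a disk image so all disks line up block for block."""
    return data.ljust(length, b"\x00")

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def compute_parity(disks: list[bytes]) -> bytes:
    """Parity is the XOR of all disks; it is as long as the largest disk."""
    size = max(len(d) for d in disks)
    return reduce(xor_bytes, (pad(d, size) for d in disks))

def rebuild(lost_len: int, survivors: list[bytes], parity: bytes) -> bytes:
    """XOR the parity with every surviving disk to get the lost one back."""
    size = len(parity)
    full = reduce(xor_bytes, (pad(d, size) for d in survivors), parity)
    return full[:lost_len]

# Three toy "data disks" of different sizes (contents are invented).
disks = [b"alpha-data", b"bravo", b"charlie-files!!"]
parity = compute_parity(disks)

# Pretend disk 0 died: rebuild it from the other two disks plus parity.
recovered = rebuild(len(disks[0]), disks[1:], parity)
```

Note that the parity is only as long as the largest disk, not the sum of all of them, and that the rebuild needs every surviving data disk, not just the parity.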

More on this later if no one answers.

In either case I do not recommend a single parity drive. Use 2 at minimum. Make them external drives if you want. You only need the parity drives connected when you sync, scrub ...

Is hidden data snapshot and put on my other drives somewhere?

No.
 
OK, RAID 3 I get. I had to brush up my knowledge of it from the 90's.

Thanks!
 
If I have 3, 4 or 5 data drives and one parity drive, how can a single parity drive contain enough info to recreate ALL the files of a failed drive?

A single drive, parity or data, does not contain enough information to recreate all the files of a failed drive.

All of the remaining data drives and parity drives together are used to recreate the lost files.
 
All of the remaining data drives and parity drives together are used to recreate the lost files.

This is very important, and it is also part of the reason why I never recommend a single parity drive with SnapRAID. If you have changed or removed files on one or more disks since your last sync, the parity no longer matches that data, so it cannot be used to rebuild the blocks that were computed together with the changed or removed data.
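A small Python sketch of that failure mode, using plain XOR parity on one 4-byte block per disk. The disk contents and the helper name `xor_blocks` are invented for the example; real SnapRAID also keeps block checksums, which this toy leaves out.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

disk_a = b"AAAA"
disk_b = b"BBBB"
disk_c = b"CCCC"

# Parity as of the last sync.
parity = xor_blocks(disk_a, disk_b, disk_c)

# A file on disk B is changed afterwards and no new sync is run.
disk_b_modified = b"bbbb"

# Now disk A fails. Rebuilding with the data as it was at sync time works:
good = xor_blocks(parity, disk_b, disk_c)

# Rebuilding with the modified disk B gives garbage for that block:
stale = xor_blocks(parity, disk_b_modified, disk_c)
```

Here `good` equals the original disk A block, while `stale` does not, because the parity still describes the old contents of disk B.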
 
I usually tell people to think of it conceptually like the way PAR2 files work, assuming you're familiar with QuickPAR. But of course SnapRAID is working more at the drive level.

I absolutely love SnapRAID for parity in concert with DrivePool for pooling. The two handiest features are probably deleted file restoration (which I seem to use quite often in cases where I am too lazy to fire up my backup software and cold drives on the shelf for a restore), and dupe detection.

I suppose the only real downside is that some people might be put off by it being command-line only, and you have to roll your own scheduling. I just use Windows Task Scheduler and have it do a sync at about 3am, with some logic built into the batch file so that if any disks are missing it will email me a notification and not proceed with the sync.
 
I just use Windows Task Scheduler and have it do a sync at about 3am, with some logic built into the batch file so that if any disks are missing it will email me a notification and not proceed with the sync.

I'd love to see the logic for how it figures out that disks are missing. Thanks.
 
It uses the content file for this. In my setup I have a content file on all 6 of my data disks but not my 2 external parity drives.
 
I'd love to see the logic for how it figures out that disks are missing. Thanks.

Pretty simple, I can post full batch file tomorrow but it's basically an IF NOT EXIST line for every drive

IF NOT EXIST T: goto:diskcheckfail
IF NOT EXIST U: goto:diskcheckfail
Etc.

And then under :diskcheckfail I invoke blat.exe, which is a command-line SMTP mailer. It sends the log of the failed operation to my Gmail account and does a broadcast message to any PC on the home LAN that I'm logged into. Note that SnapRAID will refuse to sync if any disks are missing anyway, but the point here is to generate an error level for notification purposes.
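For anyone not on batch, the same pre-flight idea can be sketched in Python. This is not the batch file described above, just a rough equivalent: the function names are made up, `notify` stands in for whatever mailer you use (blat or otherwise), and the default command assumes `snapraid` is on the PATH.

```python
import os
import subprocess

def missing_paths(paths):
    """Return the drive roots / mount points that the OS cannot see."""
    return [p for p in paths if not os.path.exists(p)]

def sync_if_all_present(paths, sync_cmd=("snapraid", "sync"), notify=print):
    """Run the sync only when every expected path exists.

    `notify` is a placeholder hook for a mailer; `sync_cmd` is an
    assumption about how SnapRAID is invoked on your machine.
    """
    missing = missing_paths(paths)
    if missing:
        notify("Sync skipped, missing: " + ", ".join(missing))
        return False
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(list(sync_cmd), check=True)
    return True
```

Usage would be something like `sync_if_all_present(["T:\\", "U:\\"])` from a scheduled task, with `notify` swapped for a function that sends the email.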

Again I can post or PM full batch, which also includes full logging, checking first if there were any changes to data before it bothers to attempt syncing. And then another batch file scheduled for once a month that does scrubbing / integrity checking.
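The "only sync when something changed" step could look roughly like this in Python. Treat the exit codes as an assumption to verify against the manual for your SnapRAID version: as I understand it, `snapraid diff` exits with 0 when nothing changed and 2 when a sync is required. The function names are invented for the sketch.

```python
import subprocess

def needs_sync(diff_returncode: int) -> bool:
    """Interpret the exit code of `snapraid diff`.

    Assumption (check your version's manual): 0 = no differences,
    2 = a sync is required, anything else = error.
    """
    if diff_returncode == 0:
        return False
    if diff_returncode == 2:
        return True
    raise RuntimeError(f"snapraid diff failed with exit code {diff_returncode}")

def sync_when_changed(run=subprocess.run) -> bool:
    """Run `snapraid diff`, then `snapraid sync` only if it reports changes.

    `run` is injectable so the logic can be tried without SnapRAID installed.
    """
    result = run(["snapraid", "diff"])
    if not needs_sync(result.returncode):
        return False
    run(["snapraid", "sync"], check=True)
    return True
```

Keeping the exit-code interpretation in one small function means there is a single place to fix if a later version changes the codes.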
 
Again I can post or PM full batch, which also includes full logging, checking first if there were any changes to data before it bothers to attempt syncing. And then another batch file scheduled for once a month that does scrubbing / integrity checking.

While I created my own batch file for SnapRAID, I would love the opportunity to see how someone else wrote theirs.
 
While I created my own batch file for SnapRAID, I would love the opportunity to see how someone else wrote theirs.

This is solved by moving/copying deleted/changed files to a hidden folder on the disk and then excluding that folder from the SnapRAID sync and then only deleting these files once a successful sync is completed.
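A rough Python sketch of that "soft delete" workflow, for illustration only. The folder name `.snapraid-trash` and both function names are hypothetical, and the folder would still need a matching `exclude` rule in your SnapRAID config for the idea to work as described.

```python
import shutil
from pathlib import Path

TRASH_NAME = ".snapraid-trash"  # hypothetical folder name, excluded from sync

def soft_delete(disk_root: Path, file_path: Path) -> Path:
    """Move a file into the trash folder on the same disk, keeping its
    relative path, instead of deleting it outright."""
    rel = file_path.relative_to(disk_root)
    dest = disk_root / TRASH_NAME / rel
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(file_path), str(dest))
    return dest

def empty_trash(disk_root: Path) -> None:
    """Call this only after a sync has completed successfully."""
    shutil.rmtree(disk_root / TRASH_NAME, ignore_errors=True)
```

The order matters: `soft_delete` during normal use, then a successful sync, and only then `empty_trash`.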

I just use Windows Task Scheduler and have it do a sync at about 3am, with some logic built into the batch file so that if any disks are missing it will email me a notification and not proceed with the sync.

SnapRAID sync already won't go through with the sync if a disk specified in its config is missing, and it prints that to stderr when it tries to run.
 
SnapRAID sync already won't go through with the sync if a disk specified in its config is missing, and it prints that to stderr when it tries to run.

Yep. As I said in the post, SnapRAID won't proceed if it encounters missing disks; my point was that I prefer the more cut-and-dried route of asking the OS if the drive letter exists, rather than trying to parse error output verbiage or an error code - which is subject to change - and can vary depending on what SnapRAID can't find. Multiple ways to skin it. At the time I first wrote the batch, SnapRAID was a lot more limited than it is now. Now that it's got hdparm/smartctl support there are probably more powerful options to explore, so you could, for example, have it not begin a sync and instead send a notification if any SMART errors are detected on any drives.
 
Does anyone run the Elucidate GUI with SnapRAID? I see it has a scheduler, but I can't tell if my nightly jobs are successful or not. Is there a log?
 
Is anyone willing to post their favorite SnapRAID scripts? Something that checks SMART and emails when drives are failing would be cool too.
 
I believe SMART checking is in development. I use Nagios for that, however, which is the same thing I use at work on my Linux RAID / ZFS servers. With that said, Nagios is probably overkill for a single server.
 