Help me understand incremental and differential backups

Peat Moss

Tell me if I have this correct:

1. Incremental backup...is any changes at the file or folder level since the last incremental backup. A file is created on Monday and changed Tuesday and again Wednesday and again Thursday. When I save the file on Thursday, it will also save Wednesday's version, but not any version before then.

2. Differential backup....is any change at the file or folder level since the last full backup (is a full backup the same thing as a system/ disk image?). A full backup is created on Sunday. On Thursday, when I save the same file as above, it will save all the versions of the file back to Sunday when I made the full backup.
 
They're not all that different. I'm not an expert, but I had a stint where it was my responsibility to set up the backups for a small org. IIRC, a differential saves everything that changed since the last full backup, so each one is bigger than the last, which leads to longer backup times and shorter restore times. An incremental saves only what changed since the previous backup, so smaller chunks = quicker backups, but it takes longer to restore because all those smaller chunks have to be applied on top of your last full backup. If you're looking for a backup solution, it's all relative. For home use it's likely not too important which you choose. For enterprise it all depends on your needs. I was a fan of Veeam's reverse incremental, which updates the full backup every time and leaves the incrementals behind it to roll back to. Essentially your incrementals hold the old versions of whatever changed, and your most recent backup on the system is always a FULL backup that you can restore very quickly if necessary. If you had to restore further back, it would essentially undo the changes in your current full backup, instead of the forward incremental process of adding the changes to your last full backup. I'm sure someone else can give a much better description, but since nobody had responded yet I took a stab.
 
A Full backup takes all the files and backs them up. This is a full "image" of the state of the folder at the moment the backup is done.
Let's say this FULL is done on Monday.
Incremental is as the name says - every new "incr" is only the changes/new_files since the last INCREMENTAL (or since the FULL, for the first incr). Simple as that. You need all incrementals (and the last FULL) to restore the state of the source folder. Of course you can restore only up to a specified incremental to get the state as it was when that incremental was created.

Differential - every DIFF takes all the new_files/changes since the last FULL.
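A rough way to see the two definitions side by side is a small Python sketch. It assumes changes are detected by comparing modification times against the time of a reference backup, which is only one possible method; the file names, dates and the `select_for_backup` helper are invented for the example.

```python
from datetime import datetime

def select_for_backup(mtimes, since):
    """Return the sorted names of files modified after `since`."""
    return sorted(name for name, mtime in mtimes.items() if mtime > since)

# Hypothetical folder: file name -> last modification time.
mtimes = {
    "a.txt": datetime(2024, 1, 1, 9),    # untouched since before Monday's FULL
    "b.txt": datetime(2024, 1, 2, 10),   # changed on Tuesday
    "c.txt": datetime(2024, 1, 3, 11),   # changed on Wednesday
}
last_full = datetime(2024, 1, 1, 20)          # Monday evening FULL
last_incremental = datetime(2024, 1, 2, 20)   # Tuesday evening INCR

# Wednesday evening: same folder, two different reference points.
incr_files = select_for_backup(mtimes, last_incremental)  # since the last INCR
diff_files = select_for_backup(mtimes, last_full)         # since the last FULL

print(incr_files)
print(diff_files)
```

The only thing that differs between the two is the reference point: the previous backup for INCR, the last FULL for DIFF.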

Technically determining the changed or new files can be done in several ways - this will further clarify the advantages of both.
Under Windows the simplest way is the Archive bit (attribute) that every file has in the NTFS file system. Any time a file is created or changed, its Archive bit is set to "ON". This indicates the file is "ready for archiving". It's up to the archiving software whether to use this bit. After a file is archived, the software *can* turn the Archive bit "OFF" so it knows the file has been archived and does not archive it again (this is what incr/diff archives rely on). It's a flag that tells you whether the file has been archived. It is a cooperative flag, so if one archiving program relies on it, that program should be the only one used for archiving; otherwise another program could be misled about the archived state of a file.

Let's explain INCR archive creation in this context. You make a FULL on Monday. Archive bit is not taken into account, we archive all files. When the FULL is created, ALL archive bits of files are cleared to "OFF" so we know all files are archived during our next check.
On Tuesday we create the first INCR. It iterates through all files and any file with Archive bit set is archived and its Archive bit cleared (OFF). Remember, when a file gets changed or created, its Archive bit gets set to "ON" automatically by Windows/NTFS. This is the very nice (essential) feature we use A.bit for.
On Wednesday another file gets changed, and the second INCR would contain just this file and not the Tuesday file, because Tuesday's file already has its A.bit OFF. So when you need to restore, you would restore the FULL first, then Tuesday's INCR, then Wednesday's INCR to get to the Wednesday state of the source folder.
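The Monday/Tuesday/Wednesday walk-through above can be simulated in a few lines of Python. This is only a model: the `Folder` class stands in for NTFS setting the Archive bit on every write, and nothing here touches real file attributes. All names are made up for the example.

```python
class Folder:
    """In-memory stand-in for a folder whose files carry an Archive bit."""

    def __init__(self, files):
        self.content = dict(files)
        self.archive_bit = {name: True for name in files}  # new files start ON

    def write(self, name, data):
        self.content[name] = data
        self.archive_bit[name] = True   # the file system sets the bit on change

def full_backup(folder):
    """Archive every file regardless of its bit, then clear all bits."""
    archive = dict(folder.content)
    for name in folder.archive_bit:
        folder.archive_bit[name] = False
    return archive

def incremental_backup(folder):
    """Archive only files whose bit is ON, then clear those bits."""
    archive = {n: folder.content[n] for n, bit in folder.archive_bit.items() if bit}
    for name in archive:
        folder.archive_bit[name] = False
    return archive

def restore(full, incrementals):
    """Apply the FULL first, then every INCR in order, oldest to newest."""
    state = dict(full)
    for incr in incrementals:
        state.update(incr)
    return state

folder = Folder({"report.txt": "v1", "notes.txt": "v1", "photo.jpg": "v1"})
monday_full = full_backup(folder)

folder.write("report.txt", "v2")            # changed on Tuesday
tuesday_incr = incremental_backup(folder)

folder.write("notes.txt", "v2")             # changed on Wednesday
wednesday_incr = incremental_backup(folder)

print(tuesday_incr)
print(wednesday_incr)
print(restore(monday_full, [tuesday_incr, wednesday_incr]) == folder.content)
```

Leaving Tuesday's INCR out of the `restore` call gives back the old `report.txt`, which is why the whole chain is needed.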

Differential now. The only difference is that when creating the Tuesday/Wednesday etc. archives after the FULL, the Archive bits of changed/new files (since the last FULL) are not cleared! They stay "ON". So Wednesday's DIFF archive would contain both Tuesday's and Wednesday's files, because their Archive bits were not cleared when they were archived (Monday's FULL archive still cleared the Archive bits of all files!!). So if you want to restore, you restore the FULL first, then restore only the latest DIFF (Wednesday's), and that's it. The previous DIFFs are not needed, because every new DIFF contains all the changes since the last FULL. You can also restore to any older DIFF, and you would need only that one in addition to the last FULL.
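Here is the same idea as a small Python model, again with plain dictionaries standing in for files and their Archive bits (nothing here reads real attributes, and the helper names are made up). The one thing to notice is that `differential_backup` never clears a bit.

```python
def full_backup(files, bits):
    """Archive everything and clear every Archive bit."""
    for name in files:
        bits[name] = False
    return dict(files)

def differential_backup(files, bits):
    """Archive files whose bit is ON and leave the bits untouched."""
    return {name: data for name, data in files.items() if bits[name]}

files = {"a.txt": "a1", "b.txt": "b1", "c.txt": "c1"}
bits = {name: True for name in files}

monday_full = full_backup(files, bits)

files["b.txt"] = "b2"; bits["b.txt"] = True     # Tuesday's change
tuesday_diff = differential_backup(files, bits)

files["c.txt"] = "c2"; bits["c.txt"] = True     # Wednesday's change
wednesday_diff = differential_backup(files, bits)

# Restore = FULL plus the latest DIFF only; Tuesday's DIFF is not needed.
restored = {**monday_full, **wednesday_diff}

print(sorted(tuesday_diff))
print(sorted(wednesday_diff))
print(restored == files)
```

Wednesday's DIFF carries `b.txt` again even though it did not change on Wednesday, which is the space cost of this approach.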

Both approaches (INCR and DIFF) have their advantages and disadvantages depending on factors such as desired archive sizes, how often the source files get changed, what file sizes are changing, etc. But that is another story.
Just to mention one disadvantage of DIFFs - every DIFF contains all changes since the last FULL. This is a potential waste of space, because every DIFF repeats the changes already "registered" by all older DIFFs. So most often you will use INCREMENTALS.
One advantage of DIFFs, however, is when files get deleted frequently - in this case the next DIFF would not contain the deleted files (and you could just delete all previous DIFFs), whereas an INCR in this case also wouldn't contain the deleted files, but you could not delete previous INCRs, because you need all previous INCRs to be able to restore the states.
It's easy to muse upon all the possible scenarios and use-cases with this background in mind.
 
Thanks. Is 'versioning' a separate process from incremental and differential backup? Or is it part of it?
 
Both (INCR and DIFF) can be used for "versioning". Versioning is just a "layer" of functionality over the archiving paradigm. It depends on how it's implemented either by Windows or by any software "on top" of incr/diff.
Both INCR and DIFF contain (register) the needed changes on a regular schedule (daily, for example). The only difference is that DIFFs also include all previous changes (since the last FULL), whereas INCRs only include changes since the last INCR (and not since the FULL, as DIFFs do).
Every DIFF contains versions of all changed files for its day. Thus it's easier to get "versions".
To get a version of some file in the INCR paradigm, for instance Wednesday's, you would need to "check" Wednesday's INCR archive, and if the file is not there, check Tuesday's archive, and if it's not there either, Monday's FULL. Not finding the file in some INCR means it was not changed that day, and you will find it in some older INCR, eventually going down to the FULL (if the file has not been changed at all).
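That lookup is easy to sketch in Python. The archives are modelled as dictionaries and `find_version` is a made-up helper, so treat this as an illustration of the search order rather than any particular product's behaviour.

```python
def find_version(name, chain):
    """Find the newest archived copy of `name` in a backup chain.

    `chain` is a list of (label, archive) pairs, oldest first, starting with
    the FULL. The search walks from the newest archive back to the FULL.
    Returns (label, content), or None if no archive in the chain has the file.
    """
    for label, archive in reversed(chain):
        if name in archive:
            return label, archive[name]
    return None

chain = [
    ("Monday FULL",    {"a.txt": "a1", "b.txt": "b1", "c.txt": "c1"}),
    ("Tuesday INCR",   {"b.txt": "b2"}),
    ("Wednesday INCR", {"c.txt": "c2"}),
]

print(find_version("c.txt", chain))       # found in Wednesday's INCR
print(find_version("b.txt", chain))       # not in Wednesday's, so Tuesday's
print(find_version("a.txt", chain))       # never changed, so the FULL
print(find_version("c.txt", chain[:2]))   # the version as of Tuesday
```

Cutting the chain short (`chain[:2]`) is how you ask for the file as it was on an earlier day.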
 
If incremental only includes the last change, how is it possible to get a version before the last one?
 
Because some older incremental would have it. Remember - to get a full state or restore of the source directory for a given day (Wednesday) you will need the last FULL archive plus all incrementals after it. Every incremental contains some change (may not contain anything if no changes took place between it and the previous incr) between it and the previous incremental.
The problem is with deleted files, if you're not using some additional logic like mirroring. If you delete some file on Tuesday, it will not be "recorded" by standard means - incr/diff archiving only registers changes and new files. Because this could cause havoc in some situations, I extended my scripts to keep track of deleted files as well, creating text list files of the deleted files for each incr/diff archive.
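One possible way to track deletions, sketched in Python: compare the folder listing at the previous backup with the listing now, and store the difference next to the archive. This is not the script mentioned above, just an illustration; the helper names and the idea of keeping a listing per backup are assumptions.

```python
def deleted_since(previous_listing, current_listing):
    """Names present at the previous backup but missing from the folder now."""
    return sorted(set(previous_listing) - set(current_listing))

def restore(full, steps):
    """Apply each (incr_archive, deleted_names) step on top of the FULL."""
    state = dict(full)
    for archive, deleted in steps:
        state.update(archive)
        for name in deleted:
            state.pop(name, None)   # replay the deletion recorded for this step
    return state

monday_full = {"a.txt": "a1", "b.txt": "b1", "c.txt": "c1"}

# Tuesday: b.txt was changed and c.txt was deleted.
tuesday_folder = {"a.txt": "a1", "b.txt": "b2"}
tuesday_incr = {"b.txt": "b2"}
tuesday_deleted = deleted_since(monday_full, tuesday_folder)

print(tuesday_deleted)
print(restore(monday_full, [(tuesday_incr, [])]))               # c.txt comes back
print(restore(monday_full, [(tuesday_incr, tuesday_deleted)]))  # matches Tuesday
```

Without the deletion list, the restore resurrects `c.txt`, which is the havoc described above.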
 
Tell me if I have this correct:

1. Incremental backup...is any changes at the file or folder level since the last incremental backup. A file is created on Monday and changed Tuesday and again Wednesday and again Thursday. When I save the file on Thursday, it will also save Wednesday's version, but not any version before then.

2. Differential backup....is any change at the file or folder level since the last full backup (is a full backup the same thing as a system/ disk image?). A full backup is created on Sunday. On Thursday, when I save the same file as above, it will save all the versions of the file back to Sunday when I made the full backup.

This is correct, mostly. An incremental will only back up the files that changed since the last backup. Assuming that you run a full on Sunday and daily incremental jobs, Thursday's job will ONLY contain the files that changed since Wednesday's job, and a full restore on Friday would require all media from Sunday's full to the last completed incremental.
On the other hand, differential jobs, as you stated, are the changes since the last full. This type of job has the benefit of a faster restore process because it only requires the full and the last successful diff. Depending on the product used, the full could also be an image/bare-metal recovery option. However, this is typically an option in the job settings and shouldn't be assumed: a full file-level backup without system state, etc. would only be a file-level backup and wouldn't allow for the full recovery of a system. Deleted files are a different subject altogether. With a Windows host that has snapshots enabled, you would need sysvol backups, where the metadata for snapshots is stored. Not sure how this would be handled on a *nix system.

This can be a fairly lengthy conversation depending on what you are trying to accomplish. I'm curious why you ask as a simple google can provide definitions, but not always a similar use case.
 
If incremental only includes the last change, how is it possible to get a version before the last one?
Depends on the retention settings of the media you are backing up to. If you rotated your media every 30 days, then you would be limited to that timeframe. Also, if enabled, OS-level snapshots might provide additional coverage. Oddly enough, most people are not aware that newer versions of Windows allow for scheduled snapshots of any NTFS volume, which essentially provides some level of versioning.
 
Some software also allows for reverse incremental backups. More info to fry your noodle
 