Binary file compare utility for Windows

JJ Johnson

I need a file comparison utility that simply tells me whether or not two files are the same. Ideally, when it finds that the files differ it would just quit and display its results, rather than continuing through the entire file.

Normally, I use the built-in Windows file compare (fc) in binary mode. That's fine for doing one file at a time in the console, since I can Ctrl-C out of it when it starts spitting out byte differences. But you can't really script it to compare several files, since the output grows into the megabytes or gigabytes.

Ideas?
 
This will output 'identical' if they are the same.

Thanks. Yeah, that might work, but it doesn't look like it's practical. I'm trying it right now on a single large file (1.5 GB) and it's insanely slow. I think it's because fc is still generating all that output on mismatched files. Binary files that are mismatched tend to be mismatched at every byte starting at some point in the file.

File comparisons between comparably sized files that are the same are many times faster. I'm pretty certain now that I need a utility that bails out at the first mismatch rather than going through the whole file.
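
If a scripting language is an option, here is a rough sketch in Python (assuming Python 3 is installed, which nothing in this thread confirms). The standard library's filecmp.cmp with shallow=False checks the file sizes first, then reads both files block by block and stops at the first block that differs, so mismatched files return quickly and nothing is printed per byte. The helper name compare_pairs is made up for illustration.

```python
import filecmp


def compare_pairs(pairs):
    """Return a list of (file_a, file_b, same) tuples, one per pair of paths."""
    results = []
    for file_a, file_b in pairs:
        # shallow=False forces a content comparison. Sizes are checked first,
        # and reading stops at the first block that differs.
        same = filecmp.cmp(file_a, file_b, shallow=False)
        results.append((file_a, file_b, same))
    return results
```

Calling compare_pairs([("a.iso", "b.iso"), ("c.bin", "d.bin")]) (hypothetical file names) gives one True/False per pair, which is easy to loop over or log.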
 
diff from diffutils can do this no problem.

The standard invocation is smart enough to detect binary files and give a simple yes/no answer. Otherwise, you can force a yes/no-only answer with the '-q' option.

Code:
diff [file1] [file2]

Pretty sure there are Windows ports available.
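
For scripting, the exit status is the useful part: GNU diff exits with 0 when the files are the same, 1 when they differ, and 2 on trouble. A minimal Python sketch that relies on this, assuming a diff executable is on the PATH (for example from a Windows port); the function name diff_says_same is just for illustration.

```python
import subprocess


def diff_says_same(path_a, path_b):
    """Run 'diff -q' and translate its exit status into True/False."""
    result = subprocess.run(
        ["diff", "-q", path_a, path_b],
        stdout=subprocess.DEVNULL,  # only the exit status matters here
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return True   # files are the same
    if result.returncode == 1:
        return False  # files differ
    # Exit status 2 means diff hit a problem (missing file, unreadable file, ...)
    raise RuntimeError(f"diff failed with exit status {result.returncode}")
```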
 
Have you tried WinMerge?

http://winmerge.org/?lang=en

I've used this tool to compare files between different versions of games for uh, research. It doesn't have an option to stop while comparing two files, but it does have one when comparing two folders - it's weird, I know, but it may work for you depending on what you're trying to do.

There's also WinDiff. I haven't used this much though. http://www.grigsoft.com/download-windiff.htm
 
With big or binary files, you should calculate a hash value because doing so is faster than comparing the file contents.

PowerShell offers the Get-FileHash cmdlet for this task. All you have to do is check whether the results are the same or not:

((Get-FileHash ".\file1.xml").hash) -eq ((Get-FileHash ".\file2.xml").hash)
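
For anyone not on PowerShell, roughly the same check can be sketched in Python with hashlib. SHA-256 is used here because it is the Get-FileHash default; the function names and the 1 MiB chunk size are arbitrary choices for this sketch. Note that each file is read from start to end before the two digests can be compared.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_hash(path_a, path_b):
    """True when both files produce the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```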
 

Hashing is the slowest way to compare files.

Comparing hashes requires you to read both files in their entirety before making a comparison.

Contrast with the process below.

Check the file size; if the sizes differ, the files are different and you're done.
Check the modification time; if it differs, treat the files as different and you're done.
Check the file contents by comparing the data streams, stopping at the first difference.

The above process is how diffutils, rsync, et al. detect differences (when all you want to know is whether the files differ, not what the differences are).
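
A hand-rolled version of that process, as a minimal Python sketch rather than the actual code of any of those tools. The function name, the 1 MiB chunk size, and the check_mtime switch are assumptions for illustration. The timestamp check is off by default because byte-identical copies often carry different modification times, and with it on they would be reported as different.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def files_identical(path_a, path_b, check_mtime=False):
    """Return True if both files have the same contents, stopping at the first mismatch."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)

    # Size check: different sizes can never have identical contents.
    if stat_a.st_size != stat_b.st_size:
        return False

    # Optional timestamp check: treat differing modification times as "different".
    if check_mtime and stat_a.st_mtime != stat_b.st_mtime:
        return False

    # Content check: read both files in step and bail out on the first differing chunk.
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(CHUNK_SIZE)
            chunk_b = file_b.read(CHUNK_SIZE)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both reads hit end of file with no mismatch
                return True
```

Two large files that diverge in their first megabyte return False after a single read of each, instead of being read in full.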
 