delsum is a cli application for finding out checksums used in a file.
There are currently three subcommands:
check: given a specification of the checksum algorithm and a list of files, this simply outputs the checksums of these files.
part: given a specification of the checksum algorithm and a list of files with corresponding checksums, this finds parts of the files that have the given checksum
reverse: given a list of files with corresponding checksums, this finds the checksum parameters used
This subcommand calculates a checksum with a given algorithm.
$ delsum check -m 'crc width=32 poly=0x4c11db7 init=0xffffffff xorout=0xffffffff refin=true refout=true' file_a file_b file_c 700b14f5,e1207917,79741cb2
An algorithm can be specified directly as an argument with
-m or a list of algorithms can be provided in a file.
For the available algorithms and how to specify them, see here.
This subcommand finds all parts of a list of files where all given checksums match. The parts of the file that match has to be the same accross the files.
$ delsum check -m 'modsum width=16 module=ffff' -c 1234,5678,abcd file_a file_b file_c modsum width=16 module=ffff: 8:-2
In this case, the checksum matches in the part
-2 is relative from the end. If the files were of sizes 15, 16 and 18 bytes respectively, then this output would mean that
1234from byte 8 to 13
5678from byte 8 to 14
abcdfrom byte 8 to 16
One can also have the end of parts be relative from the start of the file (and not the end) by using the
There's a small chance that it will output something like
1,6:5,10 is output. This just means that each combination is possible. In this case, one would have
6:5 would theoretically also be a choice, it is not a valid one since it is backwards.
By exploiting the linearity of the checksums, this whole process can be done in roughly loglinear time, but just keep in mind that it has a big (linear) space overhead and you might run out of memory if you run it on a bunch of 500MB files.
One can also give a list of algorithms in a file as an input to
-M. This can be useful, as it allows to simply put the most common few checksum algorithm in there and look if any algorithms in any part of the files has the desired checksum. For the available algorithms and how to specify them, see here.
This subcommand finds parameters of a checksum algorithm.
With given files and checksums, it searches those algorithm parameters for which the entire files have the given checksums.
Note that at least the fundamental algorithm and the width of the checksum must be specified (for example,
crc width=32). For the available algorithms and how to specify them, see here.
$ delsum check -m 'crc width=32' -c 700b14f5,e1207917,79741cb2 file_a file_b file_c crc width=32 poly=0x4c11db7 init=0xffffffff xorout=0xffffffff refin=true refout=true
You generally need 3 files, and for algorithms other than
modsum at least one of the files needs to have a different length. It is also possible to specify some parameters of the algorithm (using for example
-m 'crc width=32 init=0'), which needs fewer files or yields fewer false positives.
If you have only files of a given length, but also only care about checksums of that length, for an algorithm not
modsum you can simply set
It is normally quite fast; for example the runtime for the CRC reversing algorithm is in most cases around
n is the filesize, which is thanks to the fast gcd algorithm implemented within the NTL and gf2x libraries.
There are currently three families of algorithms:
crc. They are specified like this:
algofamiliy width=123 para1=ff para2=true para3=10 name="algoname". Note that all numerical parameters except width are in hexadecimal.
A simple modular sum with parameters
sum = init for byte in file: sum = (sum + byte) % module return sum
Note that for a
module of 0, it is equivalent to
The default values for
init are both 0.
A fletcher-like sum with parameters
sum1 = init sum2 = 0 for byte in file: sum1 = (sum1 + byte) % module sum2 = (sum2 + sum1) % module sum1 = (sum1 + addout.sum1) % module sum2 = (sum2 + addout.sum2) % module if not swap: returm (sum2 << (width/2)) | sum1 else: returm (sum1 << (width/2)) | sum2
It is output in a "packed" form where the sum1 is stored in the lower width/2 bits and sum2 in the higher width/2 (or the opposite if
swap is enabled). The parameters are:
width: The width of the whole packed checksum. Mandatory.
module: The value by which to reduce.
module = 0means
2^(width/2)and is the default value.
init: The value to initialize the regular checksum with. Defaults to 0.
addout: The packed value which is added at the end of the sum. The high part is always added to the high part of the checksum at the end, regardless of
swap. Defaults to 0.
swap: The boolean flag which indicates that the regular sum should be in the higher half of the packed checksum. Defaults to
Note that currently, only sums over bytes are supported so sums fletcher-32 can't be specified yet.
A CRC algorithm with parameters in accordance to the Rocksoft^TM model, as documented in "A Painless Guide to Crc Error Detection".
It has the following parameters:
width: The width in bits of the checksum (and degree of poly). Mandatory.
poly: The generator polynomial, in normal notation. Mandatory (except for
init: The initial value of the crc state. Defaults to 0.
xorout: The final value to xor to the sum. Defaults to 0.
refin: The boolean flag indicating whether to reflect the bits of the input bytes. Defaults to
refout: The boolean flag indicating whether to reflect the bits of the final checksum, before adding
xorout. Defaults to
There is a linux build which has the NTL library compiled in here, but keep in mind that it is compiled without most modern x86 extensions and therefore can't take advantage of some optimized routines in
gf2x which makes CRC reversing a lot faster. I'm also too dumb for doing a Windows build, so sorry for that.
If you're on a Debian-based system, you can install them with
apt-get install libgmp-dev libgf2x-dev libntl-dev
You can also compile them yourselves, see here. This will generally make the fastest binary, as instruction set extensions can be used and there is also the possible of tuning the algorithm parameters.
If you have
cargo installed, it should then be possible to compile this in the project directory root with
cargo install --path .
or, without downloading the repository, with
cargo install delsum
If you want to link the NTL library statically, you can set the environment variable
DELSUM_STATIC_LIBS=1 when running
The code of this project is licensed under the MIT license.