File containing its own checksum

Solution 1

I wrote a piece of code in C, ran a brute-force search for less than 2 minutes, and got this wonder:

The CRC32 of this string is 4A1C449B

Note that there must be no characters (end of line, etc.) after the sentence.

You can check it here: http://www.crc-online.com.ar/index.php?d=The+CRC32+of+this+string+is+4A1C449B&en=Calcular+CRC32
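
The claim can also be checked locally. The following is only a minimal sketch: it assumes that Java's standard `java.util.zip.CRC32` computes the same CRC-32 variant as the online checker, and the class and method names (`SelfCrcCheck`, `crc32`, `matches`) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class SelfCrcCheck {

    // CRC-32 of the ASCII bytes of s, as an unsigned 32-bit value in a long.
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.US_ASCII));
        return crc.getValue();
    }

    // True if the CRC-32 of the message equals the claimed hex value (any case).
    static boolean matches(String message, String claimedHex) {
        return crc32(message) == Long.parseLong(claimedHex, 16);
    }

    public static void main(String[] args) {
        // The string literal has no trailing newline, as the answer requires.
        String message = "The CRC32 of this string is 4A1C449B";
        System.out.println(matches(message, "4A1C449B")
                ? "The claim holds."
                : "The claim does not hold.");
    }
}
```

Comparing numeric values rather than hex strings sidesteps differences in letter case and leading zeros.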

This one is also fun:

I killed 56e9dee4 cows and all I got was...

Source code (sorry it's a little messy) here: http://www.latinsud.com/pub/crc32/

Solution 2

Yes. It's possible, and it's common with simple checksums. Getting a file to include its own md5sum would be quite challenging.

In the most basic case, choose a checksum value that makes the sum of all the values, taken modulo some base, equal zero. The check then becomes something like

(n1 + n2 ... + CRC) % 256 == 0

The checksum then becomes a part of the file and is itself included in the check. A very common example of this is the Luhn algorithm used in credit card numbers: the last digit is a check digit, and is itself part of the 16-digit number.
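
Both ideas can be sketched in a few lines of Java. This is an illustration under assumptions, not anyone's reference implementation: the class and method names (`SimpleChecksums`, `appendCheckByte`, `sumIsZero`, `luhnValid`) are hypothetical, and `luhnValid` expects a string of plain digits with no spaces.

```java
import java.util.Arrays;

public class SimpleChecksums {

    // Returns a copy of data with one byte appended, chosen so that the sum
    // of all bytes (including the appended one) is 0 modulo 256.
    static byte[] appendCheckByte(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum = (sum + (b & 0xFF)) % 256;
        }
        byte[] out = Arrays.copyOf(data, data.length + 1);
        out[data.length] = (byte) ((256 - sum) % 256);
        return out;
    }

    // The check from the text: (n1 + n2 ... + check) % 256 == 0.
    static boolean sumIsZero(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum = (sum + (b & 0xFF)) % 256;
        }
        return sum == 0;
    }

    // Luhn check: starting from the right, double every second digit,
    // subtract 9 from results above 9, and require the total to be 0 mod 10.
    // The check digit (the last one) takes part in the sum itself.
    static boolean luhnValid(String digits) {
        int sum = 0;
        boolean doubleIt = false;
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) {
                    d -= 9;
                }
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        byte[] withCheck = appendCheckByte(new byte[] {1, 2, 3});
        System.out.println("sum is zero: " + sumIsZero(withCheck));
        // 4111111111111111 is a widely used test card number.
        System.out.println("Luhn valid: " + luhnValid("4111111111111111"));
    }
}
```

In both cases the check value can be computed directly from the rest of the data, which is why no search is needed for these simple schemes.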

Solution 3

Check this:

echo -e '#!/bin/bash\necho My cksum is 918329835' > magic

Solution 4

"I wish my crc32 was 802892ef..."

Well, I thought this was interesting, so today I coded a little Java program to find such self-describing strings. Thought I'd leave it here in case someone finds it useful:

import java.util.zip.CRC32;

public class Crc32_recurse2 {

    public static void main(String[] args) throws InterruptedException {

        long endval = Long.parseLong("ffffffff", 16);

        long startval = 0L;
//      startval = Long.parseLong("802892ef",16); //uncomment to save yourself some time

        float percent = 0;
        long time = System.currentTimeMillis();
        long updates = 10000000L; // how often to print some status info

        for (long i=startval;i<=endval;i++) { // <= so that ffffffff itself is also tested

            String testval = Long.toHexString(i);

            String cmpval = getCRC("I wish my crc32 was " + testval + "...");
            if (testval.equals(cmpval)) {
                System.out.println("Match found!!! Message is:");
                System.out.println("I wish my crc32 was " + testval + "...");
                System.out.println("crc32 of message is " + testval);
                System.exit(0);
            }

            if (i%updates==0) {
                if (i==0) {
                    continue; // kludge to avoid divide by zero at the start
                }
                long timetaken = System.currentTimeMillis() - time;
                long speed = updates/timetaken*1000;
                percent =  (i*100.0f)/endval;
                long timeleft = (endval-i)/speed; // in seconds
                System.out.println(percent+"% through - "+ "done "+i/1000000+"M so far"
                        + " - " + speed+" tested per second - "+timeleft+
                        "s till the last value.");
                time = System.currentTimeMillis();
            }       
        }       
    }

    public static String getCRC(String input) {
        CRC32 crc = new CRC32();
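        // explicit charset so the result does not depend on the platform default
        crc.update(input.getBytes(java.nio.charset.StandardCharsets.US_ASCII));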
        return Long.toHexString(crc.getValue());
    }

}

The output:

49.825756% through - done 2140M so far - 1731000 tested per second - 1244s till the last value.
50.05859% through - done 2150M so far - 1770000 tested per second - 1211s till the last value.
Match found!!! Message is:
I wish my crc32 was 802892ef...
crc32 of message is 802892ef

Note that the dots at the end of the message are actually part of the message.

On my i5-2500 it was going to take ~40 minutes to search the whole crc32 space from 00000000 to ffffffff, doing about 1.8 million tests/second. It was maxing out one core.
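
One possible way to cut the work per candidate, sketched here as an assumption rather than as something measured: the prefix "I wish my crc32 was " never changes, so its CRC state can be computed once and reused. `java.util.zip.CRC32` does not expose its internal state, so this sketch uses a hand-written table-driven CRC-32 with the usual reflected polynomial 0xEDB88320. The class and method names (`PrefixCrcSearch`, `update`, `search`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class PrefixCrcSearch {

    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Feeds bytes into a running CRC state (kept in inverted form) and
    // returns the new state; it can be called repeatedly to continue a CRC.
    static int update(int state, byte[] data) {
        for (byte b : data) {
            state = TABLE[(state ^ b) & 0xFF] ^ (state >>> 8);
        }
        return state;
    }

    // Complete CRC-32 of a string, as an unsigned value in a long.
    static long crc32(String s) {
        return (~update(~0, ascii(s))) & 0xFFFFFFFFL;
    }

    // Searches [from, to] for a value whose hex form, placed between prefix
    // and suffix, equals the CRC-32 of the whole message. Returns -1 if the
    // range contains no such value.
    static long search(String prefix, String suffix, long from, long to) {
        int prefixState = update(~0, ascii(prefix)); // computed only once
        byte[] tail = ascii(suffix);
        for (long i = from; i <= to; i++) {
            int state = update(prefixState, ascii(Long.toHexString(i)));
            state = update(state, tail);
            if (((~state) & 0xFFFFFFFFL) == i) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A small window around the value reported above, to keep the run short.
        long hit = search("I wish my crc32 was ", "...", 0x80280000L, 0x80290000L);
        System.out.println(hit < 0 ? "no match in range" : "match: " + Long.toHexString(hit));
    }
}
```

If the output shown above is accurate, the window used in `main` should contain the match 802892ef. Passing 0 and 0xffffffffL instead would cover the whole space.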

I'm fairly new to Java, so any constructive comments on my code would be appreciated.

"My crc32 was c8cb204, and all I got was this lousy T-Shirt!"

Solution 5

Certainly, it is possible. But one of the uses of checksums is to detect tampering with a file: how would you know that a file has been modified if the modifier can also replace the checksum?

Author: zakovyrya

Updated on July 09, 2022

Comments

  • zakovyrya almost 2 years
    Is it possible to create a file that will contain its own checksum (MD5, SHA1, whatever)? And to upset jokers: I mean the checksum in plain text, not a function calculating it.
  • zakovyrya almost 15 years
    Although it's a perfectly valid practical approach, I meant a checksum that also includes itself.
  • Philippe Leybaert almost 15 years
    I'm not a mathematician, but I think this is simply impossible.
  • Lasse V. Karlsen almost 15 years
    It isn't impossible, but it is very, very difficult.
  • Steven Sudit almost 15 years
    Right, that's what I said. :-) Since it's only 32 bits, it's entirely feasible to just brute-force the solution.
  • Steven Sudit almost 15 years
    For CRC-32, it's actually quite simple. For a crypto hash, you'd be quite correct.
  • andrewrk about 14 years
    This does not show how to include the md5sum of a file within the file, which is what the question asked.
  • Eli about 14 years
    If you embed that data within the file, wouldn't that change the md5 checksum?
  • ChrisBD about 14 years
    It would if you ran the checksum routine on it again, but that is the point of removing it before use. The simplest way would be to just add the checksum onto the end of the file. When the file is received, you remove the checksum data and rerun the checksum routine on the remaining data. Any data corruption to either the checksum or the original data will show up here.
  • tloflin about 14 years
    I am fairly certain zakovyrya was asking for the checksum to be included in its own calculation.
  • Synox over 12 years
    Hey, how did you make this precomputed table? I want to do exactly the same... :)
  • Admin almost 12 years
    Just incremented the number and checked with a bash script at around 350 checks per second for 3 months or so. I think this is not the only valid cksum for this file.
  • LatinSuD almost 12 years
    I think I found the code. It is dirty and there is no precomputed table. latinsud.com/pub/crc32
  • Mark Ransom about 11 years
    @AmigableClarkKant, my point being that going down this path is harmful - it defeats the purpose of having a checksum in the first place. The question specifically mentioned cryptographic algorithms, so I presume the intent was to detect deliberate tampering rather than accidental corruption.
  • localhost about 11 years
    @LatinSuD I'm a Java person and not great with C. Can you explain how the code works? I don't understand how you can use a precomputed table when the crc is part of the string you're calculating.
  • flarn2006 over 6 years
    @MarkRansom I wouldn't trust any cryptographic algorithm that derives its "security" from a lack of public discussion of how to break it. In cases like that, there should be public discussion. It wouldn't ruin the security because any security would have been fake anyway, and that way people will know the algorithm isn't actually secure and that they should use something else instead.
  • Mark Ransom over 6 years
    @flarn2006 my point is that putting the checksum on the file would not provide any security at all. If you want to detect accidental corruption of a file then it might be useful, but it is worthless against an intentional attack.