File containing its own checksum
Solution 1
I wrote a small program in C, ran a brute-force search for less than 2 minutes, and got this wonder:
The CRC32 of this string is 4A1C449B
Note there must be no characters (end of line, etc.) after the sentence.
You can check it here: http://www.crc-online.com.ar/index.php?d=The+CRC32+of+this+string+is+4A1C449B&en=Calcular+CRC32
This one is also fun:
I killed 56e9dee4 cows and all I got was...
Source code (sorry it's a little messy) here: http://www.latinsud.com/pub/crc32/
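Both sentences can also be checked locally instead of through a web form. The following is only a minimal sketch, not the linked C program: the class and method names are made up for illustration, it assumes the sentences are plain ASCII, and it uses the standard java.util.zip.CRC32 class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.CRC32;

// Hypothetical helper: checks whether a sentence contains its own CRC32.
class SelfCrcCheck {
    // CRC32 of the ASCII bytes of s, as 8 lowercase hex digits (zero-padded).
    static String crc32Hex(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.US_ASCII));
        return String.format("%08x", crc.getValue());
    }

    // True when the sentence contains its own CRC32 in hex, ignoring case.
    static boolean containsOwnCrc32(String s) {
        return s.toLowerCase(Locale.ROOT).contains(crc32Hex(s));
    }

    public static void main(String[] args) {
        // No trailing newline: the strings must match the sentences exactly.
        String[] candidates = {
            "The CRC32 of this string is 4A1C449B",
            "I killed 56e9dee4 cows and all I got was..."
        };
        for (String c : candidates) {
            System.out.println(crc32Hex(c) + "  " + containsOwnCrc32(c) + "  " + c);
        }
    }
}
```

Each output line shows the computed CRC32, whether the sentence contains it, and the sentence itself.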
Solution 2
Yes. It's possible, and it's common with simple checksums. Getting a file to include its own md5sum would be quite challenging.
In the most basic case, choose a checksum value that makes the sum of all the values, the checksum included, equal zero modulo some base. The checksum function then becomes something like
(n1 + n2 ... + CRC) % 256 == 0
The checksum then becomes part of the file and is itself included in the check. A very common example of this is the Luhn algorithm used in credit card numbers: the last digit is a check digit, and it is itself part of the 16-digit number.
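As a rough illustration of both ideas, here is a small sketch. The class and method names are hypothetical, the modulus of 256 is taken from the formula above, and the Luhn part follows the usual description of the algorithm (double every second digit counting from the right, subtract 9 from results above 9).

```java
// Hypothetical helper illustrating self-included checksums.
class SelfCheckingSum {
    // Byte value that makes (sum of data + check byte) % 256 == 0.
    static int checkByte(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum = (sum + (b & 0xFF)) % 256;
        }
        return (256 - sum) % 256;
    }

    // True when all bytes, check byte included, sum to 0 modulo 256.
    static boolean sumsToZero(byte[] dataWithCheck) {
        int sum = 0;
        for (byte b : dataWithCheck) {
            sum = (sum + (b & 0xFF)) % 256;
        }
        return sum == 0;
    }

    // Luhn digit sum, walking right to left and doubling every second digit.
    static int luhnSum(String digits, boolean doubleRightmost) {
        int sum = 0;
        boolean dbl = doubleRightmost;
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (dbl) {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            dbl = !dbl;
        }
        return sum;
    }

    // Check digit to append to a payload that does not yet have one.
    static int luhnCheckDigit(String payload) {
        return (10 - luhnSum(payload, true) % 10) % 10;
    }

    // Validates a full number; the check digit takes part in the sum.
    static boolean luhnValid(String number) {
        return luhnSum(number, false) % 10 == 0;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println("check byte: " + checkByte(data));
        System.out.println("Luhn digit: " + luhnCheckDigit("7992739871"));
        System.out.println("valid: " + luhnValid("79927398713"));
    }
}
```

In both cases the check value is verified together with the data it protects, which is exactly what makes the scheme easy: the check value can be solved for directly instead of searched for.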
Solution 3
Check this:
echo -e '#!/bin/bash\necho My cksum is 918329835' > magic
Solution 4
"I wish my crc32 was 802892ef..."
Well, I thought this was interesting, so today I coded a little Java program to find collisions. Thought I'd leave it here in case someone finds it useful:
import java.util.zip.CRC32;

public class Crc32_recurse2 {

    public static void main(String[] args) throws InterruptedException {
        long endval = Long.parseLong("ffffffff", 16);
        long startval = 0L;
        // startval = Long.parseLong("802892ef",16); //uncomment to save yourself some time
        float percent = 0;
        long time = System.currentTimeMillis();
        long updates = 10000000L; // how often to print some status info

        // <= so that ffffffff itself is tested as well
        for (long i = startval; i <= endval; i++) {
            // Note: toHexString does not zero-pad, so values below 10000000
            // are embedded with fewer than 8 hex digits.
            String testval = Long.toHexString(i);
            String cmpval = getCRC("I wish my crc32 was " + testval + "...");
            if (testval.equals(cmpval)) {
                System.out.println("Match found!!! Message is:");
                System.out.println("I wish my crc32 was " + testval + "...");
                System.out.println("crc32 of message is " + testval);
                System.exit(0);
            }
            if (i % updates == 0) {
                if (i == 0) {
                    continue; // kludge to avoid divide by zero at the start
                }
                long timetaken = System.currentTimeMillis() - time;
                long speed = updates / timetaken * 1000; // tests per second
                percent = (i * 100.0f) / endval;
                long timeleft = (endval - i) / speed; // in seconds
                System.out.println(percent + "% through - " + "done " + i / 1000000 + "M so far"
                        + " - " + speed + " tested per second - " + timeleft
                        + "s till the last value.");
                time = System.currentTimeMillis();
            }
        }
    }

    // Returns the CRC32 of the input as lowercase hex, without leading zeros.
    public static String getCRC(String input) {
        CRC32 crc = new CRC32();
        crc.update(input.getBytes());
        return Long.toHexString(crc.getValue());
    }
}
The output:
49.825756% through - done 2140M so far - 1731000 tested per second - 1244s till the last value.
50.05859% through - done 2150M so far - 1770000 tested per second - 1211s till the last value.
Match found!!! Message is:
I wish my crc32 was 802892ef...
crc32 of message is 802892ef
Note the dots at the end of the message are actually part of the message.
On my i5-2500 it was going to take ~40 minutes to search the whole crc32 space from 00000000 to ffffffff, doing about 1.8 million tests/second. It was maxing out one core.
I'm fairly new to Java, so any constructive comments on my code would be appreciated.
"My crc32 was c8cb204, and all I got was this lousy T-Shirt!"
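The seven-digit value c8cb204 is a reminder that Long.toHexString drops leading zeros. As one possible variation, and only a sketch under stated assumptions: the version below always embeds 8 hex digits via String.format("%08x", ...) and spreads the search over all cores with a parallel LongStream. The class name, method names, and the small demo range in main are invented for illustration; String.format is slow, so no speed-up over the single-core numbers is claimed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.OptionalLong;
import java.util.stream.LongStream;
import java.util.zip.CRC32;

// Hypothetical parallel variant of the brute-force search.
class Crc32ParallelSearch {
    // CRC32 of the ASCII bytes of s as an unsigned 32-bit value in a long.
    static long crc32(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.US_ASCII));
        return crc.getValue();
    }

    // Looks in [from, to] for a value v such that the CRC32 of
    // prefix + (v as 8 zero-padded hex digits) + suffix equals v.
    static OptionalLong search(String prefix, String suffix, long from, long to) {
        return LongStream.rangeClosed(from, to)
                .parallel()
                .filter(v -> crc32(prefix + String.format("%08x", v) + suffix) == v)
                .findAny();
    }

    public static void main(String[] args) {
        // Small demo range; use 0xffffffffL as the upper bound for the full space.
        OptionalLong hit = search("I wish my crc32 was ", "...", 0L, 0xfffffL);
        if (hit.isPresent()) {
            System.out.println("I wish my crc32 was "
                    + String.format("%08x", hit.getAsLong()) + "...");
        } else {
            System.out.println("No match in the range searched.");
        }
    }
}
```

findAny lets the stream stop as soon as any worker finds a match, so the reported value is not necessarily the smallest one in the range.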
Solution 5
Certainly, it is possible. But one of the uses of checksums is to detect tampering with a file: how would you know that a file has been modified if the modifier can also replace the checksum?
zakovyrya
Updated on July 09, 2022
Comments
-
zakovyrya almost 2 years
Is it possible to create a file that will contain its own checksum (MD5, SHA1, whatever)? And to upset jokers I mean checksum in plain, not function calculating it.
-
zakovyrya almost 15 years
Although it's perfectly valid practical approach, I meant checksum that will include itself also
-
Philippe Leybaert almost 15 years
I'm not a mathematician, but I think this is simply impossible
-
Lasse V. Karlsen almost 15 years
It isn't impossible, but it is very very difficult.
-
Steven Sudit almost 15 years
Right, that's what I said. :-) Since it's only 32 bits, it's entirely feasible to just brute-force the solution.
-
Steven Sudit almost 15 years
For CRC-32, it's actually quite simple. For a crypto hash, you'd be quite correct.
-
andrewrk about 14 years
This does not show how to include the md5sum of a file within the file, which is what the question asked.
-
Eli about 14 years
If you embed that data within the file, wouldn't that change the md5 checksum?
-
ChrisBD about 14 years
It would if you ran the checksum routine on it again, but that is the point of removing it before use. Simplest way would be to just add the checksum onto the end of the file. When the file is received you remove the checksum data and rerun the checksum routine on the remaining data. Any data corruption to either the checksum or the original data will show up here.
-
tloflin about 14 years
I am fairly certain zakovyrya was asking for the checksum to be included in its own calculation.
-
Synox over 12 years
hey, how did you make this precomputed table? i want to do exactly the same... :)
-
Admin almost 12 years
Just incremented the number and checked by a bash script at around 350 checks per second for 3 months or so. I think this is not the only valid cksum for this file
-
LatinSuD almost 12 years
I think I found the code. It is dirty and there is no precomputed table. latinsud.com/pub/crc32
-
Mark Ransom about 11 years
@AmigableClarkKant, my point being that going down this path is harmful - it defeats the purpose of having a checksum in the first place. The question specifically mentioned cryptographic algorithms so I presume the intent was to detect deliberate tampering rather than accidental corruption.
-
localhost about 11 years
@LatinSuD I'm a java person and not great with c. Can you explain how the code works? I don't understand how you can use a precomputed table when the crc is part of the string you're calculating.
-
flarn2006 over 6 years
@MarkRansom I wouldn't trust any cryptographic algorithm that derives its "security" from a lack of public discussion of how to break it. In cases like that, there should be public discussion. It wouldn't ruin the security because any security would have been fake anyway, and that way people will know the algorithm isn't actually secure and that they should use something else instead.
-
Mark Ransom over 6 years
@flarn2006 my point is that putting the checksum on the file would not provide any security at all. If you want to detect accidental corruption of a file then it might be useful, but it is worthless against an intentional attack.