Same crc32 for Python and C

12,684

Solution 1

I'm using right now zlib.crc32, but for C there is no such library

Um, yes, there is. It's called zlib. zlib is written in C, and it's what Python is using! Hence the name of the class.

You can use the crc32() function in zlib. That implementation is a fair bit faster than others you might find. Read zlib.h for the interface information.

You can compile zlib yourself, or it may already be installed on your system.

Update:

I now see your comment (which should be edited into the question since it is critical to getting the right answer) that you have extremely limited memory. Then you can use this:

static uint32_t crc32(uint32_t crc, unsigned char *buf, size_t len)
{
    int k;

    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        for (k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
    }
    return ~crc;
}

The crc is initially set to zero.

The use of ~ will give the correct result, since the uint32_t type in stdint.h is assured to be 32 bits.

If you can afford a little more code space, then unrolling the loop will likely speed it up (if the compiler doesn't already do this):

static uint32_t crc32(uint32_t crc, unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len--) {
        crc ^= *buf++;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
    }
    return ~crc;
}

You said that you only have 4 KBytes of "memory". Is that just working memory for the program, or does the program have to live there as well? If you have more space in flash for example for the code, then the table can be precomputed and stored with the code. A table-driven CRC will be much faster. The zlib code provides table-driven CRCs that do one byte at a time and four-bytes at a time, requiring respectively a 1Kbyte or 4Kbyte table.

Update 2:

Since you answered in a comment that the 4KBytes are just working memory, then you should use a table-driven CRC. You can simply use the crc32() function in zlib's crc32.c and the table in crc32.h with BYFOUR undefined.

Solution 2

C:

UInt32
crc32(UInt32 crc, UInt8 *p, SInt len)
{
  crc = ~crc;
  while (--len >= 0) {
    crc = crc ^ *p++;
    for (SInt i = 8; --i >= 0;) {
      crc = (crc >> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return ~crc;
}

void
crc_unitTest(void)
{
  UInt8 b1[] = { 0, 0, 0, 0, 0, 0, 0, 0 };
  UInt8 b2[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  UInt8 b3[] = { 0xff, 0, 0xff, 0, 0xff, 0, 0xff, 0 };
  assert(crc32(0, b1,  8) == 0x6522df69);
  assert(crc32(0, b2, 10) == 0x456cd746);
  assert(crc32(0, b3,  8) == 0xea8c89c0);
}

Python:

def crc32(crc, p, len):
  crc = 0xffffffff & ~crc
  for i in range(len):
    crc = crc ^ p[i]
    for j in range(8):
      crc = (crc >> 1) ^ (0xedb88320 & -(crc & 1))
  return 0xffffffff & ~crc

def unitTest():
  b1 = [ 0, 0, 0, 0, 0, 0, 0, 0 ]
  b2 = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
  b3 = [ 0xff, 0, 0xff, 0, 0xff, 0, 0xff, 0 ]
  assert(crc32(0, b1,  8) == 0x6522df69)
  assert(crc32(0, b2, 10) == 0x456cd746)
  assert(crc32(0, b3,  8) == 0xea8c89c0)

Solution 3

As you need a C implementation that doesn't use a lookup table (which most implementations do) with a matching Python equivalent you could use ZIP's CRC32 as suggested by Mark Ransom ( binascii.crc32 ) and the matching, tableless implementation I borrowed here

/* Calculating ZIP CRC-32 in 'C'
   =============================
   Reference model for the translated code */

#define poly 0xEDB88320
/* Some compilers need
   #define poly 0xEDB88320uL
 */

/* On entry, addr=>start of data
             num = length of data
             crc = incoming CRC     */
int crc32(char *addr, int num, int crc)
{
int i;

for (; num>0; num--)              /* Step through bytes in memory */
  {
  crc = crc ^ *addr++;            /* Fetch byte from memory, XOR into CRC */
  for (i=0; i<8; i++)             /* Prepare to rotate 8 bits */
  {
    if (crc & 1)                  /* b0 is set... */
      crc = (crc >> 1) ^ poly;    /* rotate and XOR with ZIP polynomic */
    else                          /* b0 is clear... */
      crc >>= 1;                  /* just rotate */
  /* Some compilers need:
    crc &= 0xFFFFFFFF;
   */
    }                             /* Loop for 8 bits */
  }                               /* Loop until num=0 */
  return(crc);                    /* Return updated CRC */
}

EDIT: as several people pointed out there are issues with the above code, the one below matches Wikipedia's (see http://ideone.com/pWLVSo ) and Python's ( http://ideone.com/SvYuyE - 1277644989==0x4c2750bd). This code comes from this page where other implementations and possible improvements over the basic version I copied

const uint32_t Polynomial = 0xEDB88320;

uint32_t crc32_bitwise(const void* data, size_t length, uint32_t previousCrc32 = 0) { 
     uint32_t crc = ~previousCrc32; // same as previousCrc32 ^ 0xFFFFFFFF 
     unsigned char* current = (unsigned char*) data; 
     while (length--) { 
         crc ^= *current++; 
         for (unsigned int j = 0; j < 8; j++) {
             if (crc & 1) 
                 crc = (crc >> 1) ^ Polynomial; 
             else 
                 crc = crc >> 1; 
         }

     } 
     return ~crc; // same as crc ^ 0xFFFFFFFF 
} 
Share:
12,684
Rafał Łużyński
Author by

Rafał Łużyński

My mind rebels at stagnation.

Updated on June 14, 2022

Comments

  • Rafał Łużyński
    Rafał Łużyński almost 2 years

    I need script that will calculate crc32 with the same output for both Python and C.

    I'm using right now zlib.crc32, but for C there is no such library and we are writing it on our own basing on Wikipedia. But it doesn't return the same value.

    This is our code of C script (copied from wikipedia, based on RFC):

    unsigned int crc32( unsigned char *message, unsigned int n )
    {
       //int i, crc;
       unsigned int crc;
       unsigned int i;
       unsigned int byte, c;
       const unsigned int g0 = 0xEDB88320,    g1 = g0>>1,
          g2 = g0>>2, g3 = g0>>3, g4 = g0>>4, g5 = g0>>5,
          g6 = (g0>>6)^g0, g7 = ((g0>>6)^g0)>>1;
    
       i = 0;
       crc = 0xFFFFFFFF;
       //while ((byte = message[i]) != 0)
       while( i != n)
       {
                   byte = message[i];                   // Get next byte.
                   // byte = FrmReadByte( i );  // Get next byte.
    
                   crc = crc ^ byte;
                   c = ((crc<<31>>31) & g7) ^ ((crc<<30>>31) & g6) ^
                   ((crc<<29>>31) & g5) ^ ((crc<<28>>31) & g4) ^
                   ((crc<<27>>31) & g3) ^ ((crc<<26>>31) & g2) ^
                   ((crc<<25>>31) & g1) ^ ((crc<<24>>31) & g0);
    
                   crc = ((unsigned)crc >> 8) ^ c;
                   i = i + 1;
       }
       return ~crc;
    }
    

    EDIT:

    We've got only 4KB of ram memory, program itself doesn't live there. Taking 1KB of memory by crc32 script is probably too much and wouldn't fit there. Thanks for pointing out that ZLIB library exists for C as well.

  • autistic
    autistic about 11 years
    Neither unsigned int nor int are guaranteed to be 32 bits. They may very well be 16 bits. Shifting a 16 bit integer by 31 places is undefined behaviour. Best to use a uint_least32_t if you want something that's guaranteed to exist and is guaranteed to contain at least 32 bits.
  • fvu
    fvu about 11 years
    @MarkRansom see edit - I modified your ideone with the new code and added a python one to show they give the same result - as I don't do python I don't know how to make Python show the result in hex but the results are identical.
  • Mark Ransom
    Mark Ransom about 11 years
    Yup, saw it already - good job. The Python function hex converts a number to a hex string: hex(binascii.crc32('abcdefghijklmnopqrstuvwxyz'))