Effective solution: Base32 encoding in PHP

Solution 1

For Base32 in PHP, you can try my implementation here:

https://github.com/ademarre/binary-to-text-php

Copied from the Base32 example in the README file:

// RFC 4648 base32 alphabet; case-insensitive
$base32 = new Base2n(5, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567', FALSE, TRUE, TRUE);
$encoded = $base32->encode('encode this');
// MVXGG33EMUQHI2DJOM======

It's not slow, and it may or may not be faster than the class you benchmarked, but it will not be as fast as a built-in PHP function like base64_encode(). If that is very important to you, and you don't really care about Base32 encoding, then you should just use hexadecimal. You can encode hexadecimal with native PHP functions, and it is case-insensitive.

$encoded = bin2hex('encode this'); // 656e636f64652074686973
$decoded = pack('H*', $encoded);   // encode this

// Alternatively, as of PHP 5.4...
$decoded = hex2bin($encoded);      // encode this

The downside to hexadecimal is that it inflates the data more than Base32 does. Hexadecimal needs two characters per byte, a 100% increase, while Base32 needs eight characters per five bytes, an increase of about 60% before padding.
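
The size difference can be checked directly. This is a minimal sketch: the Base32 length is derived from the 5-bits-per-character rule rather than by running an encoder, padding is ignored, and the variable names are only illustrative.

```php
<?php
$data = random_bytes(1000);                      // arbitrary sample payload

$hexLen    = strlen(bin2hex($data));             // 2 characters per byte
$base32Len = (int) ceil(strlen($data) * 8 / 5);  // 1 character per 5 bits, unpadded

printf("hex:    %d chars (+%.0f%%)\n", $hexLen, ($hexLen / strlen($data) - 1) * 100);
printf("base32: %d chars (+%.0f%%)\n", $base32Len, ($base32Len / strlen($data) - 1) * 100);

// Hexadecimal also survives a change of case, which was the original requirement.
$roundTrip = hex2bin(strtoupper(bin2hex('encode this')));
var_dump($roundTrip === 'encode this');
```

The last line upper-cases the encoded string before decoding it, to mimic a system that rewrites the case of URL parameters.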

Solution 2

Try this: https://github.com/bbars/utils/blob/master/php-base32-encode-decode/Base32.php

It uses a case-insensitive alphabet [0-9, a-v] and runs faster than Base2n(5):

size   | Base32::encode  | Base32::decode  | $base2n->encode  | $base2n->decode
-------------------------------------------------------------------------------
1      | 0.0000331401825 | 0.0000088214874 | 0.0002369880676  | 0.0001671314240
2      | 0.0000050067902 | 0.0000040531158 | 0.0000100135803  | 0.0000081062317
4      | 0.0000050067902 | 0.0000059604645 | 0.0000097751617  | 0.0000100135803
8      | 0.0000078678131 | 0.0000100135803 | 0.0000131130219  | 0.0000140666962
16     | 0.0000128746033 | 0.0000178813934 | 0.0000250339508  | 0.0000250339508
32     | 0.0000238418579 | 0.0000319480896 | 0.0000441074371  | 0.0000472068787
64     | 0.0001170635223 | 0.0000629425049 | 0.0000870227814  | 0.0000259876251
128    | 0.0000879764557 | 0.0001208782196 | 0.0001959800720  | 0.0001759529114
256    | 0.0001969337463 | 0.0002408027649 | 0.0004429817200  | 0.0003459453583
512    | 0.0003631114960 | 0.0004880428314 | 0.0021460056305  | 0.0006039142609
1024   | 0.0014970302582 | 0.0009729862213 | 0.0108621120453  | 0.0015850067139
2048   | 0.0013530254364 | 0.0018491744995 | 0.0312080383301  | 0.0027630329132
4096   | 0.0027470588684 | 0.0038080215454 | 0.1312029361725  | 0.0064430236816
8192   | 0.0064270496368 | 0.0086290836334 | 0.5233020782471  | 0.0121779441833
16384  | 0.0112588405609 | 0.0167109966278 | 2.0316259860992  | 0.0277659893036
32768  | 0.0235319137573 | 0.0335960388184 | 11.6220989227295 | 0.0498571395874
65536  | 0.0478749275208 | 0.0648550987244 |                  |                
131072 | 0.1030550003052 | 0.1504058837891 |                  |                
262144 | 0.1995100975037 | 0.2654621601105 |                  |                
524288 | 0.3903131484985 | 0.5326008796692 |                  |                
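
The answer does not say how these timings were collected. As a rough sketch of how such a table could be produced, the loop below times a callable on inputs of doubling size using microtime(true). Here time_call() is a hypothetical helper, and bin2hex() only stands in for whichever encode or decode function is being measured.

```php
<?php
// Hypothetical helper: wall-clock seconds taken by a single call of $fn($input).
function time_call(callable $fn, string $input): float
{
    $start = microtime(true);
    $fn($input);
    return microtime(true) - $start;
}

$timings = [];
for ($size = 1; $size <= 4096; $size *= 2) {
    $input = random_bytes($size);
    // Swap 'bin2hex' for the encoder or decoder under test.
    $timings[$size] = time_call('bin2hex', $input);
    printf("%-6d | %.13f\n", $size, $timings[$size]);
}
```

A single run per size is noisy, and the first call may include one-off warm-up costs, so repeating each measurement and keeping the minimum or median would give steadier numbers.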

Solution 3

You can try these functions I adapted from bbars and crockford:

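function crockford32_encode($data) {
    // Crockford's alphabet omits i, l, o and u to avoid look-alike characters
    $chars = '0123456789abcdefghjkmnpqrstvwxyz';
    $mask = 0b11111; // lowest 5 bits

    $dataSize = strlen($data);
    $res = '';
    $remainder = 0;     // bits read but not yet emitted
    $remainderSize = 0; // number of valid bits in $remainder

    for($i = 0; $i < $dataSize; $i++) {
        $b = ord($data[$i]);
        $remainder = ($remainder << 8) | $b;
        $remainderSize += 8;
        // emit one character for every complete group of 5 bits
        while($remainderSize > 4) {
            $remainderSize -= 5;
            $c = $remainder & ($mask << $remainderSize);
            $c >>= $remainderSize;
            $res .= $chars[$c];
        }
        // drop the bits already emitted so $remainder stays small
        $remainder &= (1 << $remainderSize) - 1;
    }
    // left-align the trailing bits in a final 5-bit group (zero-filled)
    if($remainderSize > 0) {
        $remainder <<= (5 - $remainderSize);
        $c = $remainder & $mask;
        $res .= $chars[$c];
    }

    return $res;
}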

function crockford32_decode($data) {
    $map = [
        '0' => 0,
        'O' => 0,
        'o' => 0,
        '1' => 1,
        'I' => 1,
        'i' => 1,
        'L' => 1,
        'l' => 1,
        '2' => 2,
        '3' => 3,
        '4' => 4,
        '5' => 5,
        '6' => 6,
        '7' => 7,
        '8' => 8,
        '9' => 9,
        'A' => 10,
        'a' => 10,
        'B' => 11,
        'b' => 11,
        'C' => 12,
        'c' => 12,
        'D' => 13,
        'd' => 13,
        'E' => 14,
        'e' => 14,
        'F' => 15,
        'f' => 15,
        'G' => 16,
        'g' => 16,
        'H' => 17,
        'h' => 17,
        'J' => 18,
        'j' => 18,
        'K' => 19,
        'k' => 19,
        'M' => 20,
        'm' => 20,
        'N' => 21,
        'n' => 21,
        'P' => 22,
        'p' => 22,
        'Q' => 23,
        'q' => 23,
        'R' => 24,
        'r' => 24,
        'S' => 25,
        's' => 25,
        'T' => 26,
        't' => 26,
        'V' => 27,
        'v' => 27,
        'W' => 28,
        'w' => 28,
        'X' => 29,
        'x' => 29,
        'Y' => 30,
        'y' => 30,
        'Z' => 31,
        'z' => 31,
    ];

    $data = strtolower($data);
    $dataSize = strlen($data);
    $buf = 0;
    $bufSize = 0;
    $res = '';

    for($i = 0; $i < $dataSize; $i++) {
        $c = $data[$i];
        if(!isset($map[$c])) {
            throw new \Exception("Unsupported character $c (0x".bin2hex($c).") at position $i");
        }
        $b = $map[$c];
        $buf = ($buf << 5) | $b;
        $bufSize += 5;
        // emit a byte as soon as 8 bits are buffered
        if($bufSize > 7) {
            $bufSize -= 8;
            $b = ($buf & (0xff << $bufSize)) >> $bufSize;
            $res .= chr($b);
            // keep only the bits not yet emitted so $buf stays small
            $buf &= (1 << $bufSize) - 1;
        }
    }

    return $res;
}
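
To see what the shift-and-mask loop in the encoder is doing, the same regrouping of 8-bit bytes into 5-bit symbols can be written with plain string operations. This is an illustrative sketch, far slower than the functions above; base32_sketch() is a hypothetical helper, and the alphabet is passed in so the result can be compared with the RFC 4648 example from Solution 1.

```php
<?php
// Illustrative only: regroup 8-bit bytes into 5-bit symbols using bit strings.
// base32_sketch() is a hypothetical helper, not part of any library mentioned here.
function base32_sketch(string $data, string $alphabet): string
{
    if ($data === '') {
        return '';
    }
    // 1. Write every byte as 8 binary digits.
    $bits = '';
    foreach (str_split($data) as $byte) {
        $bits .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }
    // 2. Cut the bit string into groups of 5, zero-filling the last group,
    //    and look each group up in the 32-character alphabet.
    $out = '';
    foreach (str_split($bits, 5) as $group) {
        $out .= $alphabet[bindec(str_pad($group, 5, '0', STR_PAD_RIGHT))];
    }
    return $out;
}

$rfc4648 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';
echo base32_sketch('encode this', $rfc4648), "\n"; // MVXGG33EMUQHI2DJOM (no "=" padding)
```

Passing '0123456789abcdefghjkmnpqrstvwxyz' as the alphabet instead selects the Crockford symbols used by the functions above.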

Author: Freddy

Updated on December 25, 2020

Comments

  • Freddy, over 3 years ago

    I am looking for a Base32 function or class for PHP. The classes and functions I have found are all very inefficient. I ran a benchmark and got the following result:

    10000 decodings:

    base32: 2.3273 seconds

    base64: 0.0062 seconds

    The base32 class which I have used is:

    http://www.php.net/manual/en/function.base-convert.php#102232

    Is there a simpler way?

    The reason I want to use Base32 is that it is not case-sensitive, so I no longer have problems with URL parameters that some systems (e.g. email systems) always convert to lowercase.

    If you know a better alternative for lowercase-safe encoding, I would love to hear it.

  • My1, over 6 years ago
    This is not base32, it is base32hex. Standard base32 starts with the letters A to Z and then uses only the digits 2 to 7, dropping 0 and 1.
  • Bars, about 6 years ago
    Iscariot, I've updated my implementation following My1's note.
  • Joel Harkes, almost 5 years ago
    For those looking for a Crockford Base32 version: gist.github.com/joelharkes/82ac4eb0fd8226649a6118562e3817c4
  • Bars, over 4 years ago
    If you're talking about an alternative alphabet, my implementation also contains a child Base32hex class for that, as My1 and Case have mentioned.
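
On the alphabet question raised in the comments: RFC 4648 base32 and base32hex carry the same 5-bit values and differ only in the symbols used, so an encoded string can be translated from one to the other character by character. A minimal sketch with strtr(); the variable names are illustrative.

```php
<?php
$base32hex = '0123456789ABCDEFGHIJKLMNOPQRSTUV'; // RFC 4648 "extended hex" alphabet
$base32    = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'; // RFC 4648 standard alphabet

// "CO======" is base32hex for "f"; translating the symbols gives standard base32.
// strtoupper() normalises lowercase input first, and "=" padding is left untouched.
$standard = strtr(strtoupper('co======'), $base32hex, $base32);
echo $standard, "\n"; // MY======
```

Swapping the last two arguments of strtr() converts in the opposite direction.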