How to efficiently read bits out of bytes?

12,759

Solution 1

The binary AND and bit shifting operations are the fastest way of doing this. They translate well into machine code instructions. The only way to further speed this up is by sacrificing bandwidth for speed by for example simply not using more than 3 bits per byte, but judging from your question you've probably already considered and rejected that tradeoff.

Solution 2

function byte2bits(a)
{
    var tmp = "";
    for(var i = 128; i >= 1; i /= 2)
        tmp += a&i?'1':'0';
    return tmp;
}
function split2Bits(a, n)
{
    var buff = "";
    var b = [];
    for(var i = 0; i < a.length; i++)
    {
        buff += byte2bits(a[i]);
        while(buff.length >= n)
        {
            b.push(buff.substr(0, n));
            buff = buff.substr(n);
        }
    }
    return [b, buff];
}
var a, b, r;
a = [227, 142];
[b, r] = split2Bits(a, 3);
//b = ["111", "000", "111", "000", "111"];
//r = '0'; //rest of bits

Solution 3

We can get value we need by getting appropriate 16-bits integer and then bitshift it.

It is clear, that to get i-th value we should get 16-bits integer with offset in bytes that fits (bits * (i + 1) - 16)/8 <= offset <= (bits * i)/8.

Lets take M=bits*i/8, so we have M + bits/8 - 2<= offset <= M. Then we get minimal offset as ceil(M + bits/8 - 2) and calculate position of i-th value in the 16-bit integer by offsets. I have just wrote the following function

function getDataFromStream(buffer, bitsPerValue, endianness) {
    var valuesCount = Math.floor(buffer.length * 8 / bitsPerValue);
    var ret = new Buffer(valuesCount);

    if (valuesCount > 0) {
        for (var i = 0; i < valuesCount; i++) {
            var offsetMin = Math.ceil(bitsPerValue * i / 8. + bitsPerValue / 8. - 2);
            if (offsetMin < 0) {
                offsetMin = 0;
            }
            if(endianness == 'BE')
                var wordWithValue = buffer.readUInt16BE(offsetMin, true);
            else
                var wordWithValue = buffer.readUInt16LE(offsetMin, true); 
            var offsetInWord = bitsPerValue * i - offsetMin * 8;
            var leftInWord = 16 - bitsPerValue - offsetInWord;

            // then get value in the word by shifting and then remove other bits by "%"
            ret[i] = (wordWithValue >> (endianness == 'BE' ? leftInWord : offsetInWord ))  % Math.pow(2, bitsPerValue);
        }
    }
    return ret;
}

And the following example to read 8 5-bit values off the Buffer with 5 bytes length.

// buffer with 5 bytes
var xx = new Buffer(5);
xx[0] = 255;
xx[1] = 255;
xx[2] = 255;
xx[3] = 255;
xx[4] = 250;

// get data, 5bits per value.
var yy = getDataFromStream(xx, 5, 'BE');
console.log('got buffer with length='+ yy.length);
for(i = 0; i < yy.length; i++){
    console.log(i+'-'+yy[i]);
}

When I launch node test.js I got

got buffer with length=8
0-31
1-31
2-31
3-31
4-31
5-31
6-31
7-26

Solution 4

if endian-ness is taken care, you can access it as an int or a long int array. There is another possibily of not using bit 3 and bit 7

Share:
12,759
pimvdb
Author by

pimvdb

Updated on June 05, 2022

Comments

  • pimvdb
    pimvdb about 2 years

    I'm working on a project that includes WebSockets, and data between the server (Node.js) and the client (Chrome) is sent using a custom (very simple) format for data exchange I set up.

    I'm sending data in pieces of 3 bits because I'm sending items which all have 8 possibilities. The data format looks like this:

                0          1
    bit index   01234567 8901...
    item        aaabbbcc cddd...
    

    Currently, I'm parsing the items out of the bytes like this:

    var itemA = bytes[0] >> 5;
    var itemB = (bytes[0] >> 2) & 7;
    var itemC = (bytes[0] & 3) << 1 | bytes[1] >> 7;
    var itemD = (bytes[1] >> 4) & 7;
    

    Personally, this feels as being too sophisticated. The problem is that it's only complex because I'm getting data in bytes, which are a multiple of 8. To parse out items of 3 bits I have to bit-shift, doing AND operations, and because 8 is not divisible by 3 I sometimes even have to combine parts of two bytes like for itemC.

    It would be much more effective to read this data as groups of 3 bits instead of groups of 8 bits.

    What I've come up with is converting all bytes into bits to a string using .toString(2), then using .substring to get a substring with length 3, and converting back to a number with parseInt(bitString, 2), but I guess that's not the way to do it, since string manipulation is slow and I'm actually not doing anything string-related.

    Is it possible to read bits in groups of e.g. 3 instead of parsing them from bytes? Or is there a more efficient way to read bits out of bytes?

  • pimvdb
    pimvdb over 12 years
    Do you imply there is no real means of getting rid of the multiple of 8 that one always obtains with bytes?
  • Marcel Offermans
    Marcel Offermans over 12 years
    That's how computers work, with bytes (8 bits) and multiples of that (16, 32, 64 bits) so even if a higher level language would allow you to work with anything else, in the end that's what these instructions need to be translated to.
  • Mathias Dolidon
    Mathias Dolidon almost 9 years
    I'd always believed that bitwise operations were lame in JS, because all JS was supposed to know about is 64bit float, and all else is software emulated. Have the passing years made this old truth wrong ?
  • Geert-Jan
    Geert-Jan over 4 years
    I have problems getting this to work with values larger than 255. Could you possibly give an update?
  • shukshin.ivan
    shukshin.ivan over 4 years
    @Geert-Jan but one byte is 0-255 only. What do you want from my code? =)
  • Geert-Jan
    Geert-Jan over 4 years
    I was looking for some generic encode/decode functions to bitpack values of arbitrary length. E.g.: sometimes I would have an array of values of which the max value would be 220. Other times it would be 1045. To encode the latter array I would need 11 bits per item (and thus > 8) . Not really in scope of the original question though ;)
  • shukshin.ivan
    shukshin.ivan over 4 years
    It was six years ago, I look at this piece of my code just as you do :) You should just add converting into bytes.