How to read bits from a file?
Solution 1
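Python can only read whole bytes at a time. You'd need to read in a full byte, then extract the value you want from that byte. Note that read() returns a bytes/str object rather than an integer, so convert it with ord() before shifting, e.g.
b = x.read(1)
firstfivebits = ord(b) >> 3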
Or if you wanted the 5 least significant bits, rather than the 5 most significant bits:
b = x.read(1)
lastfivebits = ord(b) & 0b11111
Some other useful bit manipulation info can be found here: http://wiki.python.org/moin/BitManipulation
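A minimal Python 3 sketch of both extractions, using an in-memory stream as a stand-in for the opened file. The sample byte value and the variable names are illustrative assumptions, not part of the original answer.

```python
import io

# Stand-in for a file opened with open(path, 'rb'); the byte value is arbitrary.
x = io.BytesIO(bytes([0b10110101]))

b = x.read(1)        # a bytes object of length 1 (empty at end-of-file)
value = b[0]         # integer in 0..255 on Python 3; ord(b) gives the same result

# Shifting right by 3 drops the 3 least significant bits, leaving the top 5.
first_five_bits = value >> 3
# Masking with 0b11111 keeps only the 5 least significant bits.
last_five_bits = value & 0b11111

print(bin(first_five_bits), bin(last_five_bits))  # 0b10110 0b10101
```

Checking that b is non-empty before indexing it is advisable when the file may be at end-of-file.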
Solution 2
As the accepted answer states, standard Python I/O can only read and write whole bytes at a time. However, you can simulate a stream of bits using this recipe for Bitwise I/O.
Updates
After modifying Rosetta Code's Python version to work unchanged in both Python 2 and 3, I incorporated those changes into this answer.
Later, inspired by a comment made by @mhernandez, I further modified the Rosetta Code version so it supports the context manager protocol, which allows instances of both of its classes to be used in Python with statements. The latest version is shown below:
class BitWriter(object):
    def __init__(self, f):
        self.accumulator = 0
        self.bcount = 0
        self.out = f

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.flush()

    def __del__(self):
        try:
            self.flush()
        except ValueError:  # I/O operation on closed file.
            pass

    def _writebit(self, bit):
        if self.bcount == 8:
            self.flush()
        if bit > 0:
            self.accumulator |= 1 << 7-self.bcount
        self.bcount += 1

    def writebits(self, bits, n):
        while n > 0:
            self._writebit(bits & 1 << n-1)
            n -= 1

    def flush(self):
        self.out.write(bytearray([self.accumulator]))
        self.accumulator = 0
        self.bcount = 0

class BitReader(object):
    def __init__(self, f):
        self.input = f
        self.accumulator = 0
        self.bcount = 0
        self.read = 0

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

    def _readbit(self):
        if not self.bcount:
            a = self.input.read(1)
            if a:
                self.accumulator = ord(a)
            self.bcount = 8
            self.read = len(a)
        rv = (self.accumulator & (1 << self.bcount-1)) >> self.bcount-1
        self.bcount -= 1
        return rv

    def readbits(self, n):
        v = 0
        while n > 0:
            v = (v << 1) | self._readbit()
            n -= 1
        return v

if __name__ == '__main__':
    import os
    import sys
    # Determine this module's name from its file name and import it.
    module_name = os.path.splitext(os.path.basename(__file__))[0]
    bitio = __import__(module_name)

    with open('bitio_test.dat', 'wb') as outfile:
        with bitio.BitWriter(outfile) as writer:
            chars = '12345abcde'
            for ch in chars:
                writer.writebits(ord(ch), 7)

    with open('bitio_test.dat', 'rb') as infile:
        with bitio.BitReader(infile) as reader:
            chars = []
            while True:
                x = reader.readbits(7)
                if not reader.read:  # End-of-file?
                    break
                chars.append(chr(x))
            print(''.join(chars))
Another usage example showing how to "crunch" an 8-bit byte ASCII stream by discarding the most significant "unused" bit, and then read it back (neither example uses the classes as context managers).
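import sys
import bitio

# On Python 3, use the underlying binary streams; Python 2 streams have no .buffer.
stdin = getattr(sys.stdin, 'buffer', sys.stdin)
stdout = getattr(sys.stdout, 'buffer', sys.stdout)

o = bitio.BitWriter(stdout)
c = stdin.read(1)
while len(c) > 0:
    o.writebits(ord(c), 7)
    c = stdin.read(1)
o.flush()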
...and to "decrunch" the same stream:
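import sys
import bitio

# On Python 3, read from the underlying binary stream; Python 2 has no .buffer.
stdin = getattr(sys.stdin, 'buffer', sys.stdin)

r = bitio.BitReader(stdin)
while True:
    x = r.readbits(7)
    if not r.read:  # nothing read
        break
    sys.stdout.write(chr(x))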
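The 7-bit packing itself can also be sketched without the bitio module, which may help when checking what the bytes on disk should look like. This is only an illustration for Python 3: the function names crunch and decrunch are hypothetical, and the layout (most significant bit first, last byte padded with zero bits) is intended to mirror the classes above rather than being taken from them.

```python
def crunch(text):
    """Pack the low 7 bits of each character, MSB first, into bytes."""
    acc = 0      # all bits written so far, held as one big integer
    nbits = 0    # number of bits held in acc
    for ch in text:
        acc = (acc << 7) | (ord(ch) & 0x7F)
        nbits += 7
    pad = -nbits % 8          # zero bits needed to fill the last byte
    acc <<= pad
    return acc.to_bytes((nbits + pad) // 8, "big")


def decrunch(data):
    """Unpack consecutive 7-bit values, MSB first, back into characters."""
    nbits = len(data) * 8
    acc = int.from_bytes(data, "big")
    chars = []
    for i in range(nbits // 7):
        shift = nbits - 7 * (i + 1)   # distance of this group from the low end
        chars.append(chr((acc >> shift) & 0x7F))
    return "".join(chars)


packed = crunch("12345abcde")
print(len(packed))        # 9 bytes instead of 10
print(decrunch(packed))   # 12345abcde
```

As with any scheme that pads the final byte, the reader cannot tell padding from data: when the padding is a full 7 bits long (for example, after 7 characters), decoding yields one extra NUL character, so a real format would usually store the length as well.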
Solution 3
This appears at the top of a Google search for reading bits using Python.
I found bitstring to be a good package for reading bits and also an improvement over the native capability (which isn't bad for Python 3.6), e.g.
# import module
from bitstring import ConstBitStream
# read file
b = ConstBitStream(filename='file.bin')
# read 5 bits
output = b.read(5)
# convert to unsigned int
integer_value = output.uint
More documentation and details here: https://pythonhosted.org/bitstring/index.html
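For comparison, when the data fits in memory, a read of an arbitrary bit width can be sketched with the standard library alone (Python 3). The helper name read_bits and the sample bytes are illustrative assumptions and are not part of bitstring; bits are counted from the most significant bit of the first byte.

```python
def read_bits(data, bit_offset, n):
    """Return n bits of data, starting at bit_offset, as an unsigned int."""
    total_bits = len(data) * 8
    if bit_offset < 0 or n < 0 or bit_offset + n > total_bits:
        raise ValueError("requested bits fall outside the data")
    as_int = int.from_bytes(data, "big")      # whole buffer as one integer
    shift = total_bits - bit_offset - n       # bits to the right of the field
    return (as_int >> shift) & ((1 << n) - 1)


data = bytes([0b10110101, 0b01100000])
print(read_bits(data, 0, 5))   # 22  (0b10110)
print(read_bits(data, 5, 7))   # 86  (0b1010110, spans both bytes)
```

Converting the whole buffer to one integer keeps the sketch short but is wasteful for large files, where a streaming reader is the better fit.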
Hugo Medina
Updated on July 09, 2022
Comments
-
Hugo Medina almost 2 years
I know how to read bytes with x.read(number_of_bytes), but how can I read bits in Python? I have to read only 5 bits (not 8 bits [1 byte]) from a binary file.
Any ideas or approach?
-
Hugo Medina almost 12 years
When my reputation grows to 15, I'll give you a thumbs up! (I'm new here.) So, if I do b = x.read(1) and then firstfivebits = b >> 3, I'll get the first 5 bits... but why not firstfivebits = b >> 5? I mean, why b >> 3?
-
John Gaines Jr. almost 12 years
@HugoMedina if you don't know why firstfivebits = b >> 3, are you sure you should be fiddlin' with bits? (You might go blind or something ;).
-
Hugo Medina almost 12 years
Now I get it: since 1 byte = 8 bits, we apply the right-shift operator by 3 (like deleting the 3 least significant bits), so we get the remaining 5 bits of the byte.
-
mhernandez about 6 years
+1 for the self-contained snippet. Note that the main may not read what it's meant to, because the writer may not have been deleted when the reader attempts reading. A call to writer.flush() solves it.
-
martineau about 6 years
@mhernandez: Extending the bitio classes so they support the context manager protocol, like the built-in file class does, would probably be a very worthwhile endeavor, and an even better way to take care of the issue.
-
mhernandez about 6 years
Agreed, in fact that's exactly what I did. Thank you sir
-
martineau almost 6 years
mhernandez: Glad to hear it helped. BTW I recently modified the Rosetta Code's Python version so it also supports the context manager protocol, and then updated my answer here accordingly. (It was done in that order because Rosetta Code's license only allows verbatim copies in a context like this.)
-
Dobedani over 3 years
I agree that bitstring is helpful. When you need to read in more than 8 bits at once, you need to understand how the bits are "scattered" over the bytes. E.g. I needed to read in a 14-bit integer. This is how I succeeded: buf1 = b.read(8); buf2 = b.read(2); buf3 = b.read(6); str_with_bits = str(buf3.bin) + str(buf1.bin); int_value = int(str_with_bits, 2)