Why does base64.b64encode() return a bytes object?

python python-3.x unicode encoding base64

18,651

Solution 1

The purpose of the base64.b64encode() function is to convert binary data into ASCII-safe "text".

Python disagrees with that - base64 has been intentionally classified as a binary transform.

It was a design decision in Python 3 to force the separation of bytes and text and prohibit implicit transformations. Python is now so strict about this that bytes.encode doesn't even exist, and so b'abc'.encode('base64') would raise an AttributeError.
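A minimal sketch of that strictness, assuming any Python 3 interpreter (the variable names are illustrative only):

```python
import base64

data = b"abc"

# bytes objects offer decode() but not encode(): they count as already encoded.
has_encode = hasattr(data, "encode")
has_decode = hasattr(data, "decode")

try:
    data.encode("base64")  # no such method on bytes in Python 3
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# The supported route is the base64 module, which maps bytes to bytes.
encoded = base64.b64encode(data)
print(has_encode, has_decode, error_name, encoded)
```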

The view the language takes is that a bytes object is already encoded. A codec that encodes bytes into text does not fit this paradigm, because going from the bytes domain to the text domain is a decode. Note that the rot13 codec was likewise removed from the encodings usable through str.encode() for the same reason: as a str-to-str transform, it did not fit properly into the Python 3 paradigm.
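As a rough illustration, and assuming a reasonably recent CPython (3.4 or later), these same-type transforms are still reachable through the generic codecs.encode() and codecs.decode() functions, while the str.encode() method rejects them:

```python
import codecs

# base64 as a codec is a bytes-to-bytes transform: the type does not change.
b64 = codecs.encode(b"abc", "base64")
assert isinstance(b64, bytes)
assert codecs.decode(b64, "base64") == b"abc"

# rot13 is a str-to-str transform, also only available via the codecs module.
rotated = codecs.encode("abc", "rot13")
assert isinstance(rotated, str)

# str.encode() only accepts real text encodings (str -> bytes).
try:
    "abc".encode("rot13")
    rejected = False
except LookupError:
    rejected = True

print(type(b64).__name__, type(rotated).__name__, rejected)
```

Neither transform crosses the text/bytes boundary, which is why neither is offered as a str.encode() or bytes.decode() encoding.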

There is also a performance argument to be made: suppose Python automatically decoded the base64 output, which is ASCII-encoded bytes produced by C code in the binascii module, into a str object in the text domain. If you actually wanted the bytes, you would have to undo that decoding by encoding to ASCII again. It would be a wasteful round trip, an unnecessary double negation. Better to opt in to the decode-to-text step.
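The round trip described above can be sketched as follows. This only shows the shape of the extra work, not a benchmark, and the sample data is arbitrary:

```python
import base64
import binascii

data = bytes(range(8))

# What b64encode() returns: ASCII-only bytes.
raw = base64.b64encode(data)

# The lower-level binascii function yields the same bytes plus a trailing newline.
low_level = binascii.b2a_base64(data).rstrip(b"\n")

# If str were returned implicitly, callers who need bytes would have to
# re-encode, undoing the decode step again.
text = raw.decode("ascii")
round_trip = text.encode("ascii")

print(raw == low_level, round_trip == raw)
```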

Solution 2

It's impossible for b64encode() to know what you want to do with its output.

While in many cases you may want to treat the encoded value as text, in many others – for example, sending it over a network – you may instead want to treat it as bytes.

Since b64encode() can't know, it refuses to guess. And since the input is bytes, the output remains the same type, rather than being implicitly coerced to str.
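A small sketch of the two kinds of consumer. Here io.BytesIO stands in for a socket or binary file, and the "DATA" line format is made up purely for illustration:

```python
import base64
import io
import json

payload = b"\x00\xffbinary"
encoded = base64.b64encode(payload)

# A bytes consumer: binary streams accept the output as-is.
wire = io.BytesIO()
wire.write(b"DATA " + encoded + b"\r\n")

# A text consumer: JSON strings must be str, so decode explicitly.
document = json.dumps({"data": encoded.decode("ascii")})

# Either form can be turned back into the original payload.
restored = base64.b64decode(json.loads(document)["data"])
print(restored == payload)
```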

As you point out, decoding the output to str is straightforward:

base64.b64encode(b'abc').decode('ascii')

... as well as being explicit about the result.

As an aside, it's worth noting that although base64.b64decode() (note: decode, not encode) has accepted str since version 3.3, the change was somewhat controversial.
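The asymmetry can be seen in a short sketch, assuming Python 3.3 or later:

```python
import base64

# Decoding accepts both bytes and ASCII-only str input.
from_bytes = base64.b64decode(b"YWJj")
from_str = base64.b64decode("YWJj")

# Encoding still insists on a bytes-like object.
try:
    base64.b64encode("abc")
    encode_accepts_str = True
except TypeError:
    encode_accepts_str = False

print(from_bytes, from_str, encode_accepts_str)
```

In both cases the decoded result is bytes, so only the accepted input type differs.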


gardarh

I work on Android development. Also python, C# and on occasion flirt with Objective-C.

Updated on June 06, 2022

Comments

gardarh almost 2 years
The purpose of base64.b64encode() is to convert binary data into ASCII-safe "text". However, the method returns an object of type bytes:
```
>>> import base64
>>> base64.b64encode(b'abc')
b'YWJj'
```
It's easy to simply take that output and decode() it, but my question is: what is the significance of base64.b64encode() returning bytes rather than a str?
gardarh about 7 years

Thanks for answering. I have a bit of a problem with this explanation, though: the potential output can always be represented as an ASCII string, which in a sense is a subset of a bytes object. I would think that you should return the result in a narrower type if possible, since a bytes object can be anything. Generally, if you have a function, you will not know what is done with the output, but you still want to return it in a descriptive manner that makes sense. Otherwise all functions should just return bytes and we should do away with the str type.
gardarh about 7 years

In other words, b64encode() always knows that the output can be represented as a str, why not return a str then?
Zero Piraeus about 7 years

It can be represented as a list of integers, or in many other ways. Since b64encode() doesn't know what you'll want to do with it, it chooses not to coerce it from the input type of bytes.
Zero Piraeus about 7 years

Note that there's no difference between "why not return a str then?" and "why not return a bytes object then?" ... it has to choose something, and bytes was considered most consistent with the principle that implicit coercion should be avoided.
Zero Piraeus about 7 years

Note also that str is definitely not a subset of, or narrower than, bytes: the former consists of up to 1,114,112 different code points, whereas the latter can represent only 256 different states (which may be integers, characters, or something else). ASCII happens to be representable in a subset of both, as does the base64 alphabet, but there's no inherent reason to suppose that one is a more natural fit than the other.
Code-Apprentice about 7 years

@gardarh "I would think that you should rather return the result in a more narrow type if possible" In Object Oriented Programming, it is common to code with the widest type possible as this allows the most flexibility. I think the same principle applies here.
gardarh about 7 years

@Code-Apprentice My line of thinking was that "if you have additional information about the returned data, then provide it", and I see the fact that the output of the method will always be in the ASCII-safe range as that kind of information. Calling it "narrow" may have been a poor choice of words. Otherwise we could just always return bytes objects for everything, since all data can be represented as raw bytes. However, that might not be very useful.
gardarh about 7 years

I think "A codec which encodes bytes into text does not fit into this paradigm, because when you want to go from the bytes domain to the text domain it's a decode" explains it for me. So in isolation it might not make perfect sense, but in the spirit of making all encode()/decode() methods have uniform inputs and outputs it makes sense. I still think it is a bit weird :)
Code-Apprentice about 7 years

@gardarh Ultimately it comes down to a design choice. I think that the answers and comments here explain the reasoning behind the decision. Clearly you would have decided differently.
syvex about 6 years

I'd say that 99.99% of the time you want it as a string, and that should be the default. In the case that you'd care about performance or the other nuances you could call another function.
Anthony over 4 years

In essence, base64 encoding is purely textual, ASCII-only by definition, meaning that its purpose is to transform binary data into a text representation. I can't see any reason why the Python implementation produces bytes. Separation of bytes and text is very useful on its own, but personally I think that if a codec doesn't fit into this paradigm, then the paradigm shouldn't be applied to it at all.