io.StringIO encoding in python3

Solution 1

The class io.StringIO works with str objects in Python 3. That is, you can only read and write strings from a StringIO instance. There is no encoding -- you have to choose one if you want to encode the strings you got from StringIO in a bytes object, but strings themselves don't have an encoding.
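A minimal sketch of the point above, using only the standard library. The sample text and variable names are illustrative and do not come from the original answer. It shows that a StringIO accepts and returns str only, and that an encoding appears only when you explicitly call encode():

```python
import io

buf = io.StringIO()
buf.write("naïve café")          # str in: accepted

try:
    buf.write(b"raw bytes")      # bytes in: rejected with TypeError
except TypeError as exc:
    error = exc

text = buf.getvalue()            # str out, with no encoding attached

# The encoding is chosen only at the moment bytes are needed.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(type(text).__name__, len(utf8_bytes), len(latin1_bytes))
```

The same string produces byte sequences of different lengths depending on the encoding you pick, which is why the choice belongs to the encode() call and not to the StringIO.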

(Of course, strings need some internal representation. Before Python 3.3 this was UCS-2 or UCS-4, depending on how the interpreter was built; since Python 3.3 (PEP 393), CPython chooses a compact representation per string. Either way, it is an implementation detail that you don't see when working with Python.)

Solution 2

As already mentioned in another answer, StringIO saves (unicode) strings in memory and therefore doesn't have an encoding. If you do need a similar object with encoding you might want to have a look at BytesIO.
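One possible way to get a file-like object that accepts text but stores encoded bytes is to wrap a BytesIO in an io.TextIOWrapper. This is a sketch under that assumption, not something the answer above prescribes; the names raw and writer are illustrative:

```python
import io

raw = io.BytesIO()  # holds bytes in memory

# TextIOWrapper encodes str into the underlying binary buffer.
# newline="\n" disables newline translation on write.
writer = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

writer.write("café\n")
writer.flush()                   # push pending text through the encoder

data = raw.getvalue()            # the encoded bytes
print(writer.encoding, data)
```

Read the bytes from raw before writer is closed or discarded, since closing the wrapper also closes the underlying BytesIO.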

If you want to set the encoding of stdout: you can't, at least not directly, because sys.stdout.encoding is read-only and is usually determined automatically by Python (the automatic detection may not work when output goes through a pipe). If you want to write byte strings with a certain encoding to stdout, either encode the strings yourself before printing them (Python 2) or use sys.stdout.buffer.write() (Python 3) to send already-encoded byte strings to stdout.
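The Python 3 approach can be sketched as a small helper. The function name write_encoded and its parameters are hypothetical, introduced here for illustration only. It encodes the text itself and writes the bytes to a binary stream, which defaults to sys.stdout.buffer:

```python
import io
import sys


def write_encoded(text, encoding="utf-8", stream=None):
    """Encode text and write the bytes to a binary stream.

    stream defaults to sys.stdout.buffer, the binary layer beneath
    sys.stdout. (Hypothetical helper, not part of the standard library.)
    """
    if stream is None:
        stream = sys.stdout.buffer
    data = text.encode(encoding)
    stream.write(data)
    stream.flush()
    return len(data)


# Demonstration against an in-memory binary stream.
sink = io.BytesIO()
written = write_encoded("café\n", encoding="latin-1", stream=sink)
```

Calling write_encoded("café\n") without a stream sends the UTF-8 bytes to the real stdout, bypassing whatever sys.stdout.encoding happens to be.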

Author by

Giovanni Funchal

Updated on June 27, 2022

Comments

  • Giovanni Funchal
    Giovanni Funchal 11 months

    I can't seem to find what the default encoding for io.StringIO is in Python 3. Is it the locale, as with stdio?

    How can I change it?

    With stdio, it seems that just reopening with the correct encoding works, but there's no such thing as reopening a StringIO.

  • Giovanni Funchal
    Giovanni Funchal over 11 years
    Quoting "There is no encoding": what if I do stdout = io.StringIO()? Which encoding will a standard print() use then?
  • Sven Marnach
    Sven Marnach over 11 years
    @GiovanniFunchal: Still none, since all strings you write there are still strings. They are not encoded.
  • Giovanni Funchal
    Giovanni Funchal over 11 years
    So if I use a # coding:utf-8 in the beginning, encoding of print("foo") in the buffer will be utf-8?
  • Sven Marnach
    Sven Marnach over 11 years
    @GiovanniFunchal: You are mixing concepts here. # coding:utf-8 describes the encoding of your source file. String literals in your source file are encoded in UTF-8, but will be decoded to string objects while the program is loaded. If you write such a string object to a StringIO instance, it will still be a string. Again, strings themselves don't have an encoding in Python 3.
  • Giovanni Funchal
    Giovanni Funchal over 11 years
    So if I copy the value of that StringIO object to stdout, I should call decode(stdout.encoding)?
  • Sven Marnach
    Sven Marnach over 11 years
    @GiovanniFunchal: No, when you write a string to stdout, it will be encoded with stdout's encoding anyway, regardless of whether the string comes out of a StringIO or from any other source. (I suggest reading the Python 3 Unicode HOWTO to clear up your misconceptions.)
  • Giovanni Funchal
    Giovanni Funchal over 11 years
    Thanks for the answers. If I understand correctly, in that case, if I want a particular encoding, I should set the encoding of stdout.
  • JonnyJD
    JonnyJD over 9 years
    The other answer was already valid, but I wanted not only to say that there is no such thing, but also to show "how to fix it".
  • Cas
    Cas almost 6 years
    Thanks! This was what I was looking for to replace StringIO.StringIO while maintaining bytes output.
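The exchange above about pointing stdout at an io.StringIO() can be sketched as follows. This uses contextlib.redirect_stdout rather than assigning to stdout directly, which is a choice made for this illustration and not something the comments specify. Whatever print() writes into the buffer stays a str, and bytes exist only once you encode explicitly:

```python
import contextlib
import io

buf = io.StringIO()

# While the context is active, print() writes to buf instead of the real stdout.
with contextlib.redirect_stdout(buf):
    print("héllo")

captured = buf.getvalue()            # still a str; no encoding was involved
encoded = captured.encode("utf-8")   # encoding happens only here, explicitly
```

If captured is later written to the real stdout, it is encoded at that point with stdout's own encoding, as the comments describe.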
