Python: what does .encode('ascii', 'ignore') do?

    encode(...)
        S.encode([encoding[,errors]]) -> object

        Encodes S using the codec registered for encoding. encoding defaults
        to the default encoding. errors may be given to set a different error
        handling scheme. Default is 'strict' meaning that encoding errors raise
        a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
        'xmlcharrefreplace' as well as any other name registered with
        codecs.register_error that is able to handle UnicodeEncodeErrors.

So it encodes a unicode string to ASCII bytes and, instead of raising an error, silently drops any character that cannot be represented in ASCII.

>>> "hello\xffworld".encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\xff' in position 5: ordinal not in range(128)

vs

>>> "hello\xffworld".encode("ascii", "ignore")
b'helloworld'
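
For comparison, here is a small sketch of how the other error handlers named in the docstring treat the same input. It assumes Python 3, as in the session above; the variable names are illustrative only.

```python
# Compare the built-in error handlers on the same input string.
text = "hello\xffworld"  # '\xff' is U+00FF, which is outside the ASCII range

ignored = text.encode("ascii", "ignore")              # drops the character
replaced = text.encode("ascii", "replace")            # substitutes b'?'
xml_refs = text.encode("ascii", "xmlcharrefreplace")  # substitutes b'&#255;'

print(ignored)   # b'helloworld'
print(replaced)  # b'hello?world'
print(xml_refs)  # b'hello&#255;world'
```

Note that 'ignore' is lossy: nothing in the result indicates that a character was removed, whereas 'replace' and 'xmlcharrefreplace' at least leave a visible marker.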
Author: SeekingAlpha

I am passionate about all things programming, mathematics, machine learning and data science.

Updated on June 04, 2022

Comments

  • SeekingAlpha almost 2 years

    I am new to Python and am going over some code from work.

    I noticed there are a lot of lines that contain row[0].encode('ascii', 'ignore').

    I did some reading and it seems like it is converting from unicode to bytes.

    Is it just a way to convert a string from u'string' to just string?

  • SeekingAlpha about 10 years
    So if I have a string S of the form u'string', will S.encode('ascii', 'ignore') convert S to just "string"? And if S is unicode as before and I want to check its contents, can I test S == u'string'? Or how do I check for equality?
  • John La Rooy about 10 years
    @SeekingAlpha, if the unicode string consists entirely of ASCII characters, it will give you those characters as bytes. Any characters that aren't ASCII will be filtered out.
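
On the equality question raised in the comments, the following is a minimal sketch, assuming Python 3 (where str and bytes are distinct types that never compare equal). The variable names are hypothetical and chosen only for illustration.

```python
s = "string"
encoded = s.encode("ascii", "ignore")  # a bytes object, not a str

same_as_bytes = encoded == b"string"       # True: bytes compared with bytes
same_as_str = encoded == "string"          # False: bytes never equal str in Python 3
text_matches = s == "string"               # True: compare the text before encoding
round_trip = encoded.decode("ascii") == s  # True here, since s is pure ASCII

# With non-ASCII input the encoding is lossy, so the round trip no longer matches.
lossy = "caf\xe9".encode("ascii", "ignore")
lossy_round_trip = lossy.decode("ascii") == "caf\xe9"

print(same_as_bytes, same_as_str, text_matches, round_trip)
print(lossy, lossy_round_trip)
```

In practice it is usually simplest to compare the original text (S == u'string') before encoding, and to encode only at the point where bytes are actually required.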