Python decoding Unicode is not supported

Looks like google.searchGoogle(param) already returns unicode:

>>> unicode(u'foo', 'utf-8')

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    unicode(u'foo', 'utf-8')
TypeError: decoding Unicode is not supported

So what you want is:

result = google.searchGoogle(param).encode("utf-8")

As a side note, your code assumes searchGoogle() returns a UTF-8 encoded byte string. If it did, what would be the point of decoding it with unicode() and then encoding it straight back with .encode() using the same encoding?
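A minimal sketch of the fix, using a hypothetical stand-in for searchGoogle() (the real function is only assumed to return unicode, as the error suggests). It is written so that it also runs on Python 3, where the text type is called str:

```python
def fake_search(param):
    # Hypothetical stand-in for google.searchGoogle(); assumed to return
    # text (unicode on Python 2, str on Python 3), not a byte string.
    return u"caf\u00e9 for " + param


# The value is already text, so there is nothing to decode.
# Encode it once, at the point where bytes are actually needed.
result = fake_search(u"x").encode("utf-8")
```

Here result is a UTF-8 byte string. Wrapping the call in unicode(..., "utf-8") first would raise the TypeError shown above, because the value is not bytes to begin with.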

Author by

simonbs

Updated on July 05, 2022

Comments

  • simonbs
    simonbs almost 2 years

    I am having a problem with my encoding in Python. I have tried different methods but I can't seem to find the best way to encode my output to UTF-8.

    This is what I am trying to do:

    result = unicode(google.searchGoogle(param), "utf-8").encode("utf-8")
    

    searchGoogle returns the first Google result for param.

    This is the error I get:

    exceptions.TypeError: decoding Unicode is not supported
    

    Does anyone know how I can make Python encode my output in UTF-8 to avoid this error?

  • simonbs
    simonbs over 12 years
    Honestly, the unicode() was just fooling around trying to understand what was happening. Thank you very much :-)
  • simonbs
    simonbs over 12 years
    Now I sometimes get: 'ascii' codec can't decode byte 0xc3 in position. Do you know why that is?
  • yak
    yak over 12 years
    In the line I suggested? Then searchGoogle() must have returned a byte string containing a 0xC3 byte. Calling .encode() on a byte string makes Python decode it to unicode first (using the ascii codec), and that fails on any non-ASCII byte. I don't know why searchGoogle() would sometimes return unicode and sometimes a string. Maybe it depends on what you give it in param? Try to stick to one type.
  • Eric Walker
    Eric Walker over 9 years
    I wish there was a safe, simple way to cast to unicode.
  • Leonid
    Leonid over 6 years
    @EricWalker You could write an awkward helper function like def uors2u(object, encoding=..., errors=...) which returns its object argument unchanged if it is already unicode, or decodes it if it is a str. However, this is a code smell. You should convert all input to unicode as soon as you receive it from the outside (like a file system) and convert it back, if needed, just before sending it out again. There should be only one place where you convert str to unicode, so a helper function like the one I described should not be needed.
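A sketch of the kind of helper Leonid describes, together with the failure yak explains. The name to_text and its defaults are illustrative assumptions, not part of any library, and the code is written to run on both Python 2 and Python 3:

```python
import sys

# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str


def to_text(value, encoding="utf-8", errors="strict"):
    """Return value unchanged if it is already text; decode it if it is bytes."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes):  # `bytes` is an alias of `str` on Python 2
        return value.decode(encoding, errors)
    raise TypeError("expected text or bytes, got %r" % type(value))


raw = b"caf\xc3\xa9"  # UTF-8 bytes; 0xC3 is the first byte of the encoded e-acute

# Decoding as ASCII fails on the 0xC3 byte. Python 2 performs this same
# decode implicitly when .encode() is called on a byte string.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Normalize to text once at the boundary, then encode once on the way out.
encoded_from_bytes = to_text(raw).encode("utf-8")
encoded_from_text = to_text(u"caf\xe9").encode("utf-8")
```

Both calls produce the same UTF-8 bytes whether the input arrived as text or as bytes. As the comment above says, needing such a helper in many places usually means the decoding should happen once, where the data enters the program.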