Unescaping Characters in a JSON response string


Solution 1

If the data came from JSON, the json module should already have decoded these escapes for you:

>>> import json
>>> json.loads('"\u003Cp\u003E"')
u'<p>'
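In Python 3 the same idea can be sketched with a raw string, so that the backslash escapes reach the JSON parser intact instead of being interpreted by the Python string literal first. The variable names below are illustrative assumptions, not part of the original answer.

```python
import json

# Raw string: the six characters \u003C reach the JSON parser untouched.
raw = r'"\u003Cp\u003E"'
decoded = json.loads(raw)
print(decoded)  # <p>

# A JSON surrogate pair is combined into a single code point by json.loads.
emoji = json.loads(r'"\ud83d\ude00"')
print(len(emoji))  # 1
```

Note that the input must be a complete JSON value, which is why the escaped text is wrapped in double quotes inside the string.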

Solution 2

>>> "\\u003Cp\\u003E".decode('unicode-escape')
u'<p>'
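The call above relies on the Python 2 `str.decode` method, which Python 3 strings do not have. A rough Python 3 equivalent, assuming the escaped text is pure ASCII, is to encode to bytes first and then decode with the `unicode_escape` codec. The variable names are illustrative.

```python
# Literal backslashes followed by u003C etc., as received in the raw response text.
escaped = "\\u003Cp\\u003E"

# Assumption: the input is ASCII only; encode("ascii") raises otherwise.
unescaped = escaped.encode("ascii").decode("unicode_escape")
print(unescaped)  # <p>

# Unlike json.loads, this codec does not join a JSON surrogate pair:
# the result is two separate surrogate code points, not one character.
pair = "\\ud83d\\ude00".encode("ascii").decode("unicode_escape")
print(len(pair))  # 2
```

The second half shows why the `json` module is the safer choice when the input really is JSON.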

Solution 3

EDIT: The original question "Unescaping Characters in a String with Python" did not clarify if the string was to be written or to be read (later on, the "JSON response" words were added, to clarify the intention was to read).

So I answered the opposite question: how to write JSON-serialized data, dumping it to an unescaped string (rather than loading data from the string).

My use case was producing a JSON file from my own data dictionary, but the file contained escaped non-ASCII characters. So I did it like this:

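import json

with open(filename, 'w', encoding='utf-8') as jsonfile:
    # ensure_ascii=False writes non-ASCII characters as-is instead of \uXXXX escapes
    jsonstr = json.dumps(myDictionary, ensure_ascii=False)
    print(jsonstr)           # to screen
    jsonfile.write(jsonstr)  # to file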

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.

Taken from here: https://docs.python.org/3/library/json.html
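The effect of `ensure_ascii` can be seen by dumping the same dictionary both ways. This is a minimal sketch; the sample dictionary is a hypothetical stand-in for `myDictionary`.

```python
import json

data = {"city": "München"}  # hypothetical sample data

escaped_out = json.dumps(data)                    # default: ensure_ascii=True
plain_out = json.dumps(data, ensure_ascii=False)  # keep non-ASCII characters

print(escaped_out)  # {"city": "M\u00fcnchen"}
print(plain_out)    # {"city": "München"}

# Both forms load back to the same data.
assert json.loads(escaped_out) == json.loads(plain_out) == data
```

Either output is valid JSON; the choice only affects how readable the file is and whether it must be stored in an encoding such as UTF-8.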


Author by

Spike


Updated on June 04, 2022

Comments

  • Spike
    Spike almost 2 years

    I made a JSON request that gives me a string that uses Unicode character codes that looks like:

    s = "\u003Cp\u003E"
    

    And I want to convert it to:

    s = "<p>"
    

    What's the best way to do this in Python?

    Note: this is the same question as this one, only in Python instead of Ruby. I am also using the Posterous API.

  • Spike
    Spike about 13 years
    I wish I could mark both these answers as correct. Thanks for your help!
  • Amit Patil
    Amit Patil almost 13 years
    @Spike: if it is JSON input, this is the real right answer. Python string literals are not completely compatible with JSON string literals and unicode-escape can return the wrong results for characters outside the Basic Multilingual Plane (which JS/JSON store as surrogates but Python might not).
  • ti7
    ti7 about 3 years
    heads-up: I edited the (wow almost a decade-old) Question with some hindsight and your Answer to remove some of the extra intro - feel free to revert if you don't think this was appropriate!
  • abu
    abu about 3 years
    thanks @ti7 I prefer to clarify this answer was written before your change of the original title
