json.dump - UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
The exception is caused by the contents of your data dictionary: at least one of the keys or values is not UTF-8 encoded. You'll have to replace that value, either by substituting a value that is UTF-8 encoded, or by decoding just that value to a unicode object with whatever encoding is actually correct for it:

    data['142'] = data['142'].decode('latin-1')

decodes that string as a Latin-1-encoded value instead.
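Since the traceback points at a single byte (0xbf, an inverted question mark in Latin-1), the fix can be shown end to end. A minimal sketch in Python 3 terms, with a made-up dictionary mirroring the value from the question:

```python
import json

# Hypothetical data mirroring the question: the value carries the
# Latin-1 byte 0xbf, which is not valid UTF-8 on its own.
data = {'142': b'\xbf/ANCT25'}

# data['142'].decode('utf-8') would raise UnicodeDecodeError here,
# so decode with the encoding the bytes actually use.
data['142'] = data['142'].decode('latin-1')

print(json.dumps(data))  # the value is now a proper text string
```

After the decode, every value in the dictionary is text rather than raw bytes, so the JSON encoder no longer has to guess at an encoding.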
Belphegor
Updated on July 09, 2022

Comments
-
Belphegor, almost 2 years ago
I have a dictionary data where I have stored:

    key   - the ID of an event
    value - the name of this event, where the value is a UTF-8 string
Now, I want to write down this map into a json file. I tried with this:
    with open('events_map.json', 'w') as out_file:
        json.dump(data, out_file, indent=4)
but this gives me the error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
Now, I also tried with:
    with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
        out_file.write(unicode(json.dumps(data, encoding="utf-8")))
but this raises the same error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbf in position 0: invalid start byte
I also tried with:
    with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
        out_file.write(unicode(json.dumps(data, encoding="utf-8", ensure_ascii=False)))
but this raises the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xbf in position 3114: ordinal not in range(128)
Any suggestions about how can I solve this problem?
EDIT: I believe this is the line that is causing the problem:

    >>> data['142']
    '\xbf/ANCT25'
EDIT 2: The data variable is read from a file. So, after reading it from the file:

    data_file_lines = io.open(file_name, 'r', encoding='utf8').readlines()
I then do:
    with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
        json.dump(data, json_file, ensure_ascii=False)
Which gives me the error:
TypeError: must be unicode, not str
Then, I try to do this with the data dictionary:
(the `data` variable is initialized from a list of tuples):

    for tuple in sorted_tuples:
        data[str(tuple[1])] = json.dumps(tuple[0], ensure_ascii=False, encoding='utf8')
which is, again, followed by:
    with io.open('data/events_map.json', 'w', encoding='utf8') as json_file:
        json.dump(data, json_file, ensure_ascii=False)
but again, the same error:
TypeError: must be unicode, not str
I get the same error when I use the plain open function for reading from the file:

    data_file_lines = open(file_name, "r").readlines()
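For context on the TypeError above: in Python 2, a file opened with io.open in text mode accepts only unicode, while json.dump with ensure_ascii=False can emit plain str chunks for the ASCII parts, so the write fails. Serializing to one string first and writing that in a single call sidesteps the mixed-type writes; a minimal sketch of that pattern (shown in Python 3 syntax, where str is already unicode and the same code runs cleanly):

```python
import io
import json

data = {'142': '\xbf/ANCT25'}  # sample value from the question

# Build the whole JSON document as one string, then write it in a
# single call, instead of letting json.dump stream mixed chunks.
payload = json.dumps(data, ensure_ascii=False)
with io.open('events_map.json', 'w', encoding='utf-8') as out_file:
    out_file.write(payload)
```

Under Python 2, wrapping the result as unicode(payload) before the write makes the same pattern explicit.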
-
Belphegor, over 9 years ago: I read these values from a file. You were correct about the inverted question mark, so I replaced that value with another UTF-8 character (the letter "é"). With your solution, data['142'].decode('latin-1'), it doesn't raise any errors, but in the final JSON file I get "142": "\u00e9ANCT25" instead of the expected "142": "éANCT25". I tried reading the file with codecs.open(file_name, "r", "utf-8"), but then I get: UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 2526468: invalid continuation byte. How do I solve this so the real characters are written to the JSON? -
Martijn Pieters, over 9 years ago: \u00e9 is a valid JSON escape sequence; do you absolutely have to have the literal Unicode character instead of the JSON \uxxxx escape sequence? -
Martijn Pieters, over 9 years ago: @Belphegor: see "Saving utf-8 texts in json.dumps as UTF8, not as \u escape sequence" for how to produce such data.
-
Belphegor, over 9 years ago: Thanks for the help, but that didn't work for me. It still doesn't work. I edited my question to describe what else I've tried (see "EDIT 2"). Any other suggestions?
-
Belphegor, over 9 years ago: Never mind, I've finally solved it! I got the answer from here: stackoverflow.com/questions/12309269/… (the code for Python 2.x). Anyway, @Martijn Pieters, I wouldn't have done it without you, so I am accepting your answer. But please add the answer from the link I've provided to your answer, so it would be clearer if someone else bumps into the same problem. Cheers!
-
Belphegor, over 9 years ago: FYI: I already edited your answer with the final version of my code, but I don't know if it will be approved by the moderators. Anyway, thanks for the help!
-
Blairg23, over 8 years ago: Thanks, that answer at stackoverflow.com/questions/12309269/… worked for me too!
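For readers who land here: the linked fix comes down to ensure_ascii=False, which makes json.dumps emit the characters themselves instead of \uxxxx escapes. A minimal sketch in Python 3 terms:

```python
import json

data = {'142': 'éANCT25'}

# Default behavior escapes every non-ASCII character.
print(json.dumps(data))                      # {"142": "\u00e9ANCT25"}

# ensure_ascii=False keeps the literal characters in the output.
print(json.dumps(data, ensure_ascii=False))  # {"142": "éANCT25"}
```

When the result is written to a file, the file itself must then be opened with a UTF-8 encoding, since the output now contains non-ASCII bytes.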