Parsing JSON with Python: blank fields
Solution 1
Use dict.get instead of []:
entrie['extensions'].get('telephone', '')
Or, simply:
entrie['extensions'].get('telephone')
get returns its second argument (None by default) instead of raising a KeyError when the key is not found.
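Applied to the loop from the question, this might look like the sketch below. The sample entries list is made up for illustration, and its second entry deliberately lacks a telephone.

```python
# Hypothetical sample data: the second entry has no 'telephone' key.
entries = [
    {"extensions": {"name": "first", "telephone": "123123"}},
    {"extensions": {"name": "second"}},
]

rows = []
for entrie in entries:
    ext = entrie['extensions']
    name = ext['name']              # still raises KeyError if 'name' is absent
    tel = ext.get('telephone', '')  # falls back to '' when the key is missing
    rows.append((name, tel))

print(rows)  # [('first', '123123'), ('second', '')]
```

Only the optional field goes through get here, so a missing required field such as name still fails loudly.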
Solution 2
If the data is missing in only one place, then dict.get can be used to fill in the missing value:
tel = d['entries'][0]['extensions'].get('telephone', '')
If the problem is more widespread, you can have the JSON parser use a defaultdict or custom dictionary instead of a regular dictionary. For example, given the JSON string:
json_txt = '''{
    "entries": [
        {
            "extensions": {
                "telephone": "123123",
                "url": "www.blablablah",
                "name": "name",
                "coordinates": "coords",
                "address": "address"
            },
            "summary": "here is the summary"
        }
    ]
}'''
Parse it with:
>>> import json
>>> class BlankDict(dict):
...     def __missing__(self, key):
...         return ''
...
>>> d = json.loads(json_txt, object_hook=BlankDict)
>>> d['entries'][0]['summary']
u'here is the summary'
>>> d['entries'][0]['extensions']['color']
''
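The defaultdict variant mentioned above could look roughly like this. It is a sketch, not part of the original session: the shortened JSON string and the lambda passed as object_hook are assumptions made for illustration.

```python
import json
from collections import defaultdict

# Shortened, made-up version of the JSON string used above.
json_txt = '{"entries": [{"extensions": {"telephone": "123123"}, "summary": "here is the summary"}]}'

# object_hook receives each decoded JSON object as a plain dict;
# wrapping it in defaultdict(str, ...) makes missing keys yield ''.
d = json.loads(json_txt, object_hook=lambda obj: defaultdict(str, obj))

ext = d['entries'][0]['extensions']
print(ext['telephone'])    # 123123
print(repr(ext['color']))  # ''
print('color' in ext)      # True: defaultdict stores the default on first access
```

One difference worth knowing: defaultdict inserts the missing key when it is first looked up, whereas BlankDict returns '' without modifying the dictionary.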
As a side note, if you want to clean up your datasets and enforce consistency, there is a fine tool called Kwalify that does schema validation on JSON (and on YAML).
Pablo Pardo
Updated on June 19, 2022
Comments
-
Pablo Pardo almost 2 years
I'm having problems while parsing a JSON with python, and now I'm stuck.
The problem is that the entities of my JSON are not always the same. The JSON is something like:
"entries": [
    {
        "summary": "here is the summary",
        "extensions": {
            "coordinates": "coords",
            "address": "address",
            "name": "name",
            "telephone": "123123",
            "url": "www.blablablah"
        }
    }
]
I can move through the JSON, for example:
for entrie in entries:
    name = entrie['extensions']['name']
    tel = entrie['extensions']['telephone']
The problem comes because sometimes the JSON does not have all the "fields". For example, the telephone field is sometimes missing, so the script fails with KeyError because the key telephone is missing in this entry.
So, my question: how could I run this script, leaving a blank space where telephone is missing? I've tried with:
if entrie['extensions']['telephone']:
    tel = entrie['extensions']['telephone']
but I don't think that is right.
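For reference, that check fails because entrie['extensions']['telephone'] is evaluated before the if can test anything, so the KeyError is raised anyway. A minimal sketch of two checks that avoid this, using a made-up entry with no telephone:

```python
# Hypothetical entry without a 'telephone' key.
entrie = {"extensions": {"name": "name"}}

# Membership test: the missing key is never looked up.
if 'telephone' in entrie['extensions']:
    tel = entrie['extensions']['telephone']
else:
    tel = ''

# Equivalent try/except form: attempt the lookup and recover from the error.
try:
    tel2 = entrie['extensions']['telephone']
except KeyError:
    tel2 = ''

print(repr(tel), repr(tel2))  # '' ''
```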
-
Derek Litz almost 11 years
Nice, I like this better than defaultdict because inside the __missing__ method one would be able to add some logic to catch a potential bug. With defaultdict I always cringe because I won't get a KeyError when I make a typo.
-
Marcin over 5 years
entries['extensions'].get('telephone', {}).get('anothermissingkey', {}) is almost 3x as fast (on Deb9's py3.5) as the object_hook=BlankDict approach, and it works for multiple levels
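A small sketch of that chained-get pattern, using made-up data (the contact key is hypothetical and does not appear in the question):

```python
# Hypothetical entries: one without 'telephone', one with a nested 'contact' object.
entrie = {"extensions": {"name": "name"}}
nested = {"extensions": {"contact": {"telephone": "123123"}}}

# Each .get falls back to an empty dict, so the next .get is always safe to call.
value = entrie['extensions'].get('telephone', {}).get('anothermissingkey', {})
print(value)  # {}

# The same pattern reaching a value that does exist two levels down.
tel = nested['extensions'].get('contact', {}).get('telephone', '')
print(tel)  # 123123
```

Note that chaining only works while every intermediate value is a dictionary: if telephone were present as a string, the second .get would raise AttributeError.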