Recursive iteration through nested JSON for a specific key in Python

Solution 1

def id_generator(dict_var):
    for k, v in dict_var.items():
        if k == "id":
            yield v
        elif isinstance(v, dict):
            for id_val in id_generator(v):
                yield id_val

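This creates a generator that yields the value stored under every "id" key in nested dictionaries. It does not descend into lists, and it does not search inside the value of a matched "id" key. Example usage (printing all of those values):

for id_val in id_generator(some_json_dict):
    print(id_val)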
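
To make the dict-only behaviour concrete, here is a minimal sketch that runs the same logic on a small document. The `sample` data is made up for this illustration and is not the JSON from the question; `yield from` assumes Python 3.3 or later.

```python
def id_generator(dict_var):
    # Same logic as Solution 1: only dict values are searched recursively.
    for k, v in dict_var.items():
        if k == "id":
            yield v
        elif isinstance(v, dict):
            yield from id_generator(v)


# Hypothetical sample data, invented for this example.
sample = {
    "id": 1,
    "owner": {"id": 2, "name": "a"},
    "posts": [{"id": 3}, {"id": 4}],  # these ids sit inside a list
}

found = list(id_generator(sample))
print(found)  # [1, 2]; the ids 3 and 4 inside the list are never reached
```

If the ids you are missing live inside JSON arrays, this is likely the reason, and a version that also walks lists is needed.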

Solution 2

The JSON might contain a list of objects, which needs to be searched:

Python 2.7 version:

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for k, v in json_input.iteritems():
            if k == lookup_key:
                yield v
            else:
                for child_val in item_generator(v, lookup_key):
                    yield child_val
    elif isinstance(json_input, list):
        for item in json_input:
            for item_val in item_generator(item, lookup_key):
                yield item_val

Python 3.x version:

def item_generator(json_input, lookup_key):
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            else:
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator(item, lookup_key)
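
Both versions above stop at a matched key and do not look inside its value, so a key nested under itself is reported only once. The following is a sketch of one possible variant that keeps descending after a match. The name `item_generator_deep` is hypothetical and not part of the original answer. Wrapping the generator in `list()` also gives you a value you can return or store instead of printing.

```python
def item_generator_deep(json_input, lookup_key):
    """Yield every value stored under lookup_key, at any depth."""
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                yield v
            # Recurse even when the key matched, so nested matches are found too.
            yield from item_generator_deep(v, lookup_key)
    elif isinstance(json_input, list):
        for item in json_input:
            yield from item_generator_deep(item, lookup_key)


deepjson = {"status": {"status": "success"}}
results = list(item_generator_deep(deepjson, "status"))
print(results)  # [{'status': 'success'}, 'success']

# Hypothetical sample with ids inside a list and inside a nested dict.
nested = {"items": [{"id": 3}, {"id": 4, "child": {"id": 5}}]}
ids = list(item_generator_deep(nested, "id"))
print(ids)  # [3, 4, 5]
```

Whether to descend into matched values depends on the data: if a matched value can itself contain the key, this variant reports both the outer and the inner value.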

Solution 3

Slightly cleaner code (Python 3.x). Note that it prints each match instead of returning it.

def parse_json_recursively(json_object, target_key):
    if type(json_object) is dict and json_object:
        for key in json_object:
            if key == target_key:
                print("{}: {}".format(target_key, json_object[key]))
            parse_json_recursively(json_object[key], target_key)

    elif type(json_object) is list and json_object:
        for item in json_object:
            parse_json_recursively(item, target_key)


json_object = {"key1": "val1", "key2": [{"key3":"val3", "key4": "val4"}, 123, "abc"]}
target_key = "key3"
parse_json_recursively(json_object, target_key)  # Output: key3: val3

Solution 4

Here is a simple recursive function that collects all values for a given key from a JSON document. The values can themselves be JSON documents. Matching values are appended to search_result.

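def json_full_search(lookup_key, json_dict, search_result=None):
    # Create a fresh list per top-level call; a mutable default ([]) would be
    # shared between calls and keep accumulating results.
    if search_result is None:
        search_result = []
    if isinstance(json_dict, dict):
        for key, value in json_dict.items():
            if key == lookup_key:
                search_result.append(value)
            json_full_search(lookup_key, value, search_result)
    elif isinstance(json_dict, list):
        for element in json_dict:
            json_full_search(lookup_key, element, search_result)
    return search_result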
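
For very deeply nested documents, a recursive search can run into Python's recursion limit. As an alternative technique (not the approach used in the answer above), the same collection can be sketched with an explicit stack. The function name `json_full_search_iterative` and the sample document are assumptions made for this example. Because the stack is last-in, first-out, the results may come back in a different order than in the recursive version.

```python
import json


def json_full_search_iterative(lookup_key, json_doc):
    """Collect every value stored under lookup_key without recursion."""
    search_result = []
    stack = [json_doc]  # nodes still to be visited
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                if key == lookup_key:
                    search_result.append(value)
                # Visit the value later; it may contain further matches.
                stack.append(value)
        elif isinstance(node, list):
            stack.extend(node)
    return search_result


# Hypothetical sample document, invented for this example.
doc = json.loads('{"id": 1, "a": {"id": 2}, "b": [{"id": 3}]}')
result = json_full_search_iterative("id", doc)
print(sorted(result))  # [1, 2, 3]
```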

adam
Author by

adam

Artificial Intelligence Engineer // Computational Linguist

Updated on February 06, 2022

Comments

  • adam
    adam over 2 years

    I'm trying to pull nested values from a json file. I want to print out each of the values for every "id" key. I think I'm close but can't figure out why the obj type changes from a dict to a list, and then why I'm unable to parse that list. Here is a link to the json I'm working with: http://hastebin.com/ratevimixa.tex

    and here is my current code:

    #!/usr/bin/env python
    #-*- coding: utf-8 -*-
    
    import json
    
    json_data = open('JubJubProductions.json', 'r+')
    jdata = json.loads(json_data.read().decode("utf-8"))
    
    def recursion(dict):
    
        for key, value in dict.items():
    
            if type(value) == type(dict):
                if key != "paging":
                    for key, value in value.items():
                        if isinstance (value,list):
                            print key
                            # place where I need to enter list comprehension?
                    if type(value) == type(dict):
                        if key == "id":
                            print " id found " + value
                        if key != "id":
                            print key + " 1st level"
                    if key == "id":
                        print key
            else:
                if key == "id":
                    print "id found " + value       
    if __name__ == '__main__':
        recursion(jdata)
    

    Update:

    This is now what I'm working with and it'll return a single id value, but not all of them:

    #!/usr/bin/env python
    #-*- coding: utf-8 -*-
    
    import json
    
    json_data = open('jubjubProductions', 'r+')
    jdata = json.loads(json_data.read().decode("utf-8"))
    
    def id_generator(d):
        for k, v in d.items():
            if k == "id":
                yield v
            elif isinstance(v, dict):
                for id_val in id_generator(v):
                    yield id_val
    
    if __name__ == '__main__':
        for _ in id_generator(jdata):
            print (_)
    
    • Farhad
      Farhad over 6 years
      When I use this, I get an error "Too many values to unpack". I didn't use d.items() though since otherwise I get "AttributeError: 'unicode' object has no attribute 'items'"
  • adam
    adam over 10 years
    add the for loop to print after the if name == main statement at the end?
  • Filip Malczak
    Filip Malczak over 10 years
    Depends on what you need. This piece of code will extract all the values you need. The easiest way to do what you tried is to replace your "recursion" function with id_generator and put that loop in place of your "recursion(jdata)" call, with "jdata" instead of "some_json_dict".
  • adam
    adam over 10 years
    That's what I thought. I did that and am now getting "TypeError: arg 2 must be a class, tuple or tuple of classes and types"
  • Filip Malczak
    Filip Malczak over 10 years
    Yeah... because I made a mistake and called the argument "dict" :P Change all occurrences of "dict" besides the one in isinstance to anything else ("d", "dict_", whatever). Then I'd say it will work. I wrote it without running it.
  • adam
    adam over 10 years
    Cool! it sort of works. it's now printing a single id value, but not all of them. I'll post the updated code to make sure I'm doing it right.
  • adam
    adam over 10 years
    any ideas why it's not getting all of the values and only printing one?
  • CKM
    CKM almost 7 years
    It fails to return all values if the key is present at several levels, e.g. for deepjson={"status":{"status":"success"}} it returns only {'status': 'success'}, not the second "success".
  • Ilya Rusin
    Ilya Rusin about 6 years
    The answer of Bo Sunesen seems to be more appropriate because of lists of objects.
  • Ilya Rusin
    Ilya Rusin about 6 years
    As you are on Python 3, use dict.items() instead of dict.iteritems()
  • Bo Sunesen
    Bo Sunesen about 6 years
    I updated the answer to reflect comment from @IlyaRusin
  • lorenzo
    lorenzo about 6 years
    This is the correct answer as it handles lists too. You can use yield from to make it a little nicer.
  • Bo Sunesen
    Bo Sunesen about 6 years
    I updated the answer to include a python 3.x version with the yield from expression.
  • Ram Ghadiyaram
    Ram Ghadiyaram over 2 years
    Nice... I am stuck, though: what if I want the method/function to return a value, not just print?