Python ijson parse file (ijson from softwaremaniacs.org)


Solution 1

Try ijson.parse(open('sample.json')). It yields (prefix, event, value) tuples; wrapping the call in list() shows them all:

list(ijson.parse(open('sample.json')))

[('', u'start_map', None),
 ('', u'map_key', u'Server'),
 (u'Server', u'string', u'xy'),
 ('', u'map_key', u'Assets'),
 (u'Assets', u'start_array', None),
 (u'Assets.item', u'start_map', None),
 (u'Assets.item', u'map_key', u'Identifier'),
 (u'Assets.item.Identifier', u'string', u'21979c09fc4e6574'),
 (u'Assets.item', u'end_map', None),
 (u'Assets.item', u'start_map', None),
 (u'Assets.item', u'map_key', u'Identifier'),
 (u'Assets.item.Identifier', u'string', u'e6235cce58ec8b9c'),
 (u'Assets.item', u'end_map', None),
 (u'Assets', u'end_array', None),
 ('', u'map_key', u'AssetCount'),
 (u'AssetCount', u'number', 2),
 ('', u'end_map', None)]

ijson is also available from PyPI.

Solution 2

This is probably too late, but anyway…

ijson.parse is a lower-level function. You want ijson.items instead, which does all the prefix filtering and object construction for you:

import ijson

with open('sample.json', 'rb') as f:
    for identifier in ijson.items(f, 'Assets.item.Identifier'):
        print(identifier)  # do something with each identifier

Note: ijson wants your file in binary form, hence the 'rb' mode in open.

Solution 3

It is better to iterate over the parser and filter on the prefix, like this:

import ijson

file_name = "sample.json"

# Prefixes of the values to print.
wanted = ("AssetCount", "Server", "Assets.item.Identifier")

with open(file_name, "rb") as file:
    # parse() yields (prefix, event, value) tuples as the file is read
    for prefix, event, value in ijson.parse(file):
        if prefix in wanted:
            print(value)

Output:

2
xy
21979c09fc4e6574
e6235cce58ec8b9c
Author by

Admin

Updated on June 27, 2022

Comments

  • Admin
    Admin almost 2 years

    I need a little help to parse a large JSON file. Here I have just a sample of the data (only 2 items).

    I need to use the parse method. open() does not work, because the file is too large.

    parser=ijson.parse("sample.json")
    

    I need to loop and print out the Identifier from all the Assets.

    It cannot be so hard, but I cannot get the correct code.

    Thank you for any helpful tips.

    Peter

    json data:

    {
      "AssetCount": 2,
      "Server": "xy",
      "Assets": [
        {
          "Identifier": "21979c09fc4e6574"
        },
        {
          "Identifier": "e6235cce58ec8b9c"
        }
      ]
    }
    
  • Nawaz
    Nawaz about 8 years
    Is there any documentation that explains the details of the API and how it works?
  • shivisuper
    shivisuper about 7 years
    I don't know why I didn't think of this way to investigate ijson. Thanks, it really helped.