ValueError: Extra Data error when importing json file using python

Solution 1

Figured it out. Looks like breaking it up into lines was the mistake. Here's what the final code looks like.

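counter = 0
for jsonFile in jsonFiles:
    # Read each file in full and parse it as one JSON document
    with open(jsonFile) as f:
        data = f.read()
        jsondata = json.loads(data)
        try:
            db[args.collection].insert(jsondata)
            counter += 1
        except pymongo.errors.DuplicateKeyError as dke:
            # Same handler as in the question's original loop
            if args.verbose:
                print "Duplicate Key Error: ", dke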
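
To illustrate why reading the whole file helps, here is a minimal sketch. It assumes the files are pretty-printed, so that one JSON document spans several lines; the sample text is made up for the demonstration and is not taken from the actual data.

```python
import json

# Hypothetical pretty-printed document standing in for one of the JSON files.
text = '{\n  "name": "example",\n  "tags": ["a", "b"]\n}\n'

# Parsing line by line: no single line is a complete JSON document,
# so every call to json.loads fails.
line_errors = []
for line in text.splitlines():
    try:
        json.loads(line)
    except ValueError as e:  # json.JSONDecodeError subclasses ValueError in Python 3
        line_errors.append(str(e))

# Parsing the whole text in one call succeeds.
whole = json.loads(text)

print(len(line_errors), "lines failed to parse")
print("whole document:", whole)
```

A line such as `"name": "example",` parses as the string `"name"` followed by leftover text, which is what the parser reports as extra data.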

Solution 2

Look at this example:

import json

s = """{ "data": { "one":1 } },{ "1": { "two":2 } }"""
json.loads(s)

It will produce the "Extra data" error like in your json file:

ValueError: Extra data: line 1 column 24 - line 1 column 45 (char 23 - 44)

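This happens because the string is not a single valid JSON document. It contains two independent objects ("dict"s in Python) separated by a comma, so the parser stops after the first one and reports the rest as extra data. Perhaps this helps you find the error in your JSON file.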

You can find more information in this post.
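
If the input really does contain several objects one after another, there are two possible workarounds, sketched below under that assumption. The helper name `iter_objects` is made up for this example; `json.JSONDecoder.raw_decode` is part of the standard library.

```python
import json

s = '{ "data": { "one":1 } },{ "1": { "two":2 } }'

# Option 1: wrap the comma-separated objects in brackets to form one JSON array.
as_list = json.loads("[" + s + "]")

# Option 2: decode one object at a time with raw_decode.
def iter_objects(text):
    """Yield each top-level JSON value found in text (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace and commas between objects.
        if text[pos] in " \t\r\n,":
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

objects = list(iter_objects(s))
print(as_list)
print(objects)
```

Option 1 only works when the objects are separated by commas. Option 2 also copes with objects separated by whitespace or newlines.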

Author: Johnny Metz

Updated on June 13, 2022

Comments

  • Johnny Metz
    Johnny Metz almost 2 years

    I'm trying to build a Python script that imports JSON files into MongoDB. This part of my script keeps jumping to the except ValueError branch for larger JSON files. I think it has something to do with parsing the JSON file line by line, because very small JSON files seem to work.

    def read(jsonFiles):
        from pymongo import MongoClient

        client = MongoClient('mongodb://localhost:27017/')
        db = client[args.db]

        counter = 0
        for jsonFile in jsonFiles:
            with open(jsonFile, 'r') as f:
                for line in f:
                    # load valid lines (should probably use rstrip)
                    if len(line) < 10: continue
                    try:
                        db[args.collection].insert(json.loads(line))
                        counter += 1
                    except pymongo.errors.DuplicateKeyError as dke:
                        if args.verbose:
                            print "Duplicate Key Error: ", dke
                    except ValueError as e:
                        if args.verbose:
                            print "Value Error: ", e

                    # friendly log message
                    if 0 == counter % 100 and 0 != counter and args.verbose: print "loaded line:", counter
                    if counter >= args.max:
                        break
    

    I'm getting the following error message:

    Value Error:  Extra data: line 1 column 10 - line 2 column 1 (char 9 - 20)
    Value Error:  Extra data: line 1 column 8 - line 2 column 1 (char 7 - 18)
    
  • Johnny Metz
    Johnny Metz almost 8 years
    OK, so it looks like I need to define multiple dicts (my JSON file is pretty large and has five levels of indentation at some points), dump the dicts, wrap them in a list, and dump the list. How would this look in my code?
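
One way the "wrap the dicts in a list and dump the list" idea could look is sketched below. The records and the file name are placeholders made up for the example, not the real data.

```python
import json
import os
import tempfile

# Hypothetical records; in practice these would be the nested dicts from the data.
records = [{"data": {"one": 1}}, {"1": {"two": 2}}]

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "records.json")

# Dump one top-level list, so the file holds a single valid JSON document.
with open(path, "w") as f:
    json.dump(records, f, indent=2)

# Reading it back with one json.load call returns the list of dicts.
with open(path) as f:
    loaded = json.load(f)

print(loaded)
```

Each element of `loaded` is then an ordinary dict, so it could be passed to the insert call from the question inside a loop.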