ValueError: Extra Data error when importing json file using python
Solution 1
Figured it out. Looks like breaking it up into lines was the mistake. Here's what the final code looks like.
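import json
import pymongo

counter = 0
for jsonFile in jsonFiles:
    # read the whole file and parse it as a single JSON document
    with open(jsonFile) as f:
        data = f.read()
    jsondata = json.loads(data)
    try:
        db[args.collection].insert(jsondata)
        counter += 1
    except pymongo.errors.DuplicateKeyError as dke:
        if args.verbose:
            print("Duplicate Key Error: %s" % dke)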
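To see why reading line by line breaks on a pretty-printed file, here is a small sketch. The sample document and the helper names `parse_whole` and `parse_by_line` are made up for illustration and are not from the original script. Parsing the full text works, while several individual lines fail, some of them with the same "Extra data" message.

```python
import json

# Hypothetical sample: a pretty-printed document that spans several lines.
document = json.dumps({"name": "example", "tags": ["a", "b"]}, indent=2)


def parse_whole(text):
    """Parse the full text as one JSON document."""
    return json.loads(text)


def parse_by_line(text):
    """Try to parse every line on its own and collect the error messages."""
    errors = []
    for line in text.splitlines():
        try:
            json.loads(line)
        except ValueError as e:  # json.JSONDecodeError is a ValueError subclass
            errors.append(str(e))
    return errors


whole = parse_whole(document)
line_errors = parse_by_line(document)

print(whole)
for message in line_errors:
    print(message)
```

A line such as `"name": "example",` starts with a complete JSON string, so the parser accepts `"name"` and then reports everything after it as extra data.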
Solution 2
Look at this example:
import json
s = """{ "data": { "one":1 } },{ "1": { "two":2 } }"""
json.loads(s)
It will produce the "Extra data" error like in your json file:
ValueError: Extra data: line 1 column 24 - line 1 column 45 (char 23 - 44)
This happens because the string is not a single valid JSON document. It contains two independent "dict"s separated by a comma, so the parser stops after the first one and reports the rest as extra data. Perhaps this helps you find the error in your JSON file.
You can find more information in this post.
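As a rough sketch of one possible fix for this particular string (assuming the content really is a comma-separated run of objects), wrapping it in square brackets turns it into a JSON array, which parses without complaint. The variable names `error` and `as_list` are only for illustration.

```python
import json

s = '{ "data": { "one":1 } },{ "1": { "two":2 } }'

# Parsing the string as-is fails after the first object.
try:
    json.loads(s)
    error = None
except ValueError as e:  # json.JSONDecodeError is a ValueError subclass
    error = str(e)

# Wrapped in brackets, the same text is a valid JSON array of two objects.
as_list = json.loads("[" + s + "]")

print(error)
print(as_list)
```

This only helps when the file really has this shape. A file that holds one large pretty-printed document just needs to be parsed in one piece.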
Johnny Metz
Updated on June 13, 2022
Comments
-
Johnny Metz almost 2 years
I'm trying to build a python script that imports json files into a MongoDB. This part of my script keeps jumping to the except ValueError for larger json files. I think it has something to do with parsing the json file line by line because very small json files seem to work.
def read(jsonFiles):
    from pymongo import MongoClient
    client = MongoClient('mongodb://localhost:27017/')
    db = client[args.db]
    counter = 0
    for jsonFile in jsonFiles:
        with open(jsonFile, 'r') as f:
            for line in f:
                # load valid lines (should probably use rstrip)
                if len(line) < 10: continue
                try:
                    db[args.collection].insert(json.loads(line))
                    counter += 1
                except pymongo.errors.DuplicateKeyError as dke:
                    if args.verbose:
                        print "Duplicate Key Error: ", dke
                except ValueError as e:
                    if args.verbose:
                        print "Value Error: ", e
                # friendly log message
                if 0 == counter % 100 and 0 != counter and args.verbose:
                    print "loaded line:", counter
                if counter >= args.max:
                    break
I'm getting the following error message:
Value Error: Extra data: line 1 column 10 - line 2 column 1 (char 9 - 20)
Value Error: Extra data: line 1 column 8 - line 2 column 1 (char 7 - 18)
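The character offsets in these messages point at where the parser stopped. As a hedged illustration, the standard library's `json.JSONDecoder.raw_decode` returns the first value it finds together with the index where that value ends. The sample line below is invented to resemble one line of a pretty-printed file, not taken from the actual data.

```python
import json

# Hypothetical line from a pretty-printed file (leading indentation removed,
# since raw_decode does not skip leading whitespace).
line = '"title": "Example",\n'

decoder = json.JSONDecoder()
value, end = decoder.raw_decode(line)

# Everything from `end` onwards is what json.loads reports as extra data.
leftover = line[end:]

print(value)     # the key on its own, parsed as a JSON string
print(end)       # index just past the closing quote of the key
print(repr(leftover))
```

Under this reading, an offset like "char 7" suggests that a five-letter key in quotes was parsed as a complete value and the rest of the line was left over.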
-
Johnny Metz almost 8 years
Ok so it looks like I need to define multiple dicts (my json file is pretty large and has five levels of indentation at some points), dump the dicts, wrap them in a list, and dump the list. How will this look in my code?
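A minimal sketch of what that could look like, using made-up dicts (`first`, `second`) in place of the real data: put the dicts in a list, dump the list once, and the result loads back as a single JSON document.

```python
import json

# Hypothetical records standing in for the real, deeply nested dicts.
first = {"data": {"one": 1}}
second = {"1": {"two": 2}}

# Wrap the dicts in a list and dump the list as one JSON document.
records = [first, second]
text = json.dumps(records)

# The dumped text parses back in a single json.loads call.
loaded = json.loads(text)

print(text)
print(loaded == records)
```

Dumping each dict separately and concatenating the results would reproduce the "Extra data" problem, which is why the list is dumped as a whole.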