Pandas and JSON ValueError: arrays must all be same length
The lists in your JSON file have different lengths, so pd.read_json cannot build a DataFrame from the whole file and your original code fails.
Try this:
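import json
import pandas as pd

# Load the raw JSON with the standard library instead of pd.read_json
with open('Lyrics_SteelyDan.json') as json_data:
    data = json.load(json_data)

# Build the frame from the list of song records only
df = pd.DataFrame(data['songs'])
df['lyrics']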
See also: https://hackersandslackers.com/json-into-pandas-dataframes/
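To see why building the frame from the whole file fails while building it from the songs list works, here is a minimal sketch with an in-memory dictionary. The dictionary is a hypothetical stand-in for the parsed file: apart from 'songs' and 'lyrics', the key names and all values are assumptions, not the real lyricsgenius output.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON file; real key names may differ.
data = {
    "name": "Steely Dan",
    "alternate_names": ["alias 1", "alias 2"],   # list of length 2
    "songs": [                                    # list of length 3
        {"title": "Song A", "lyrics": "first lyrics"},
        {"title": "Song B", "lyrics": "second lyrics"},
        {"title": "Song C", "lyrics": "third lyrics"},
    ],
}

# Building a frame from the whole dictionary fails: each list-valued key
# would become a column, and the lists do not have the same length.
try:
    pd.DataFrame(data)
    whole_dict_error = None
except ValueError as exc:
    whole_dict_error = str(exc)

# Building it from the list of song records works: one row per song.
df = pd.DataFrame(data["songs"])
lyrics = df["lyrics"].tolist()

print(whole_dict_error)
print(lyrics)
```

The exact wording of the ValueError depends on the pandas version, so the sketch only checks that one is raised.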
Author: RWalling21
Updated on June 04, 2022
Comments
-
RWalling21 almost 2 years
I'm trying to make a simple application that takes the lyrics of a song and saves them. I'm using lyricsgenius to create a JSON file with the lyrics of the songs I request, but I can't figure out how to parse the data from that file. I've tried following this tutorial, but I get an error when I start working with Pandas.
Code to create the JSON File
import lyricsgenius as genius
import os

os.getcwd()
geniusCreds = "qlDFcHWqCRpSfq0pVTctt1ZhDc4wHF6lpP5WGODh4iVQB7yTPn7Hw6SjWAFiCdxa"
artist_name = "Steely Dan"
api = genius.Genius(geniusCreds)
artist = api.search_artist(artist_name, max_songs=3)
artist.save_lyrics()
Code to read the Data from the JSON File
import pandas as pd
import os

Artist = pd.read_json("Lyrics_SteelyDan.json")
df = pd.DataFrame.from_dict(Artist['songs'])
df.head
Whenever I run the code above I get the error below. Any help on how to fix it, or a better way to parse the data, would be much appreciated. Thank you.
"c:/Users/Admin/Desktop/Steely Dan/Data.py"
Traceback (most recent call last):
  File "c:/Users/Admin/Desktop/Steely Dan/Data.py", line 5, in <module>
    Artist = pd.read_json("Lyrics_SteelyDan.json")
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 592, in read_json
    result = json_reader.read()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 717, in read
    obj = self._get_object_parser(self.data)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 739, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 849, in parse
    self._parse_no_numpy()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\io\json\_json.py", line 1093, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\frame.py", line 411, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 257, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 77, in arrays_to_mgr
    index = extract_index(arrays)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\construction.py", line 368, in extract_index
    raise ValueError("arrays must all be same length")
ValueError: arrays must all be same length
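The last frames of the traceback show the DataFrame constructor rejecting arrays of unequal length. One way to find out which top-level keys are responsible is to list the length of every list-valued key, using only the standard library. This is a rough sketch: the JSON string is a hypothetical stand-in for the file (the real contents are not shown in the post), and list_lengths is a helper name made up for illustration.

```python
import json

# Hypothetical stand-in for the file contents; real keys and values may differ.
raw = (
    '{"name": "Steely Dan", '
    '"alternate_names": ["alias 1", "alias 2"], '
    '"songs": [{"title": "Song A"}, {"title": "Song B"}, {"title": "Song C"}]}'
)
data = json.loads(raw)


def list_lengths(obj):
    """Return {key: length} for every top-level key whose value is a list."""
    return {key: len(value) for key, value in obj.items() if isinstance(value, list)}


lengths = list_lengths(data)
print(lengths)  # {'alternate_names': 2, 'songs': 3}
```

If more than one distinct length shows up, passing the whole object to pandas at once will most likely raise the same ValueError, and selecting a single list (such as 'songs') is the safer route.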
-
It_is_Chris over 4 years
Please paste the full traceback.
-
It_is_Chris over 4 years
Can you also post the JSON?
-
It_is_Chris over 4 years
If you have a GitHub account, can you post it there and link to it, or provide a sample or portion of the JSON file?
-
It_is_Chris over 4 years
Sorry, that repo (JSON-Snip) is returning a 404.
-
It_is_Chris over 4 years
Same 404; is it a public repo?
-
It_is_Chris over 4 years
Change
df = json_normalize(data)
to
df = pd.DataFrame(data['songs'])
then call the lyrics column:
df['lyrics']
-
brainstorm over 2 years
json_normalize() is deprecated...
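For context, it is the import from pandas.io.json that was deprecated (starting with pandas 1.0); the function is still available as pd.json_normalize. Below is a small sketch of using it on a list of song records, assuming pandas 1.0 or later. The records and the nested 'album' key are made up for illustration and are not taken from the real file.

```python
import pandas as pd

# Hypothetical song records; the nested "album" key is an assumption.
songs = [
    {"title": "Song A", "lyrics": "first lyrics", "album": {"name": "Album X"}},
    {"title": "Song B", "lyrics": "second lyrics", "album": {"name": "Album Y"}},
]

# Top-level function, no import from pandas.io.json needed.
# Nested dictionaries are flattened into dotted column names.
flat = pd.json_normalize(songs)
columns = sorted(flat.columns)

print(columns)  # ['album.name', 'lyrics', 'title']
```

For flat records, pd.DataFrame(songs) gives the same result; json_normalize mainly helps when the records contain nested objects.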
-
Gary Carlyle Cook almost 2 years
Also they are spelling it with an S instead of a Z.