Nested dictionary to pandas DataFrame

Solution 1

Using the sample data from Jpp (the same dict d that Solution 2 defines):

pd.Series(d).apply(lambda x: pd.Series({k: v for y in x for k, v in y.items()}))
Out[1166]: 
    K1  K2  K3
O1   1   2   3
O2   4   5   6

Update, for the data with the extra O3 entry whose values are None:

pd.Series(d).apply(lambda x: pd.Series({k: v for y in x for k, v in y.items()}))
Out[1179]: 
     K1   K2   K3
O1  1.0  2.0  3.0
O2  4.0  5.0  6.0
O3  NaN  NaN  NaN
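
As the comments note, the comprehension above raises AttributeError: 'NoneType' object has no attribute 'items' when an inner list holds None in place of the dictionaries. Below is a minimal sketch of one way around that, assuming such entries should simply become an all-NaN row. The helper name flatten is hypothetical, and the frame is built from a list of row dicts rather than with Series.apply:

```python
import pandas as pd

d = {
    "O1": [{"K1": 1}, {"K2": 2}, {"K3": 3}],
    "O2": [{"K1": 4}, {"K2": 5}, {"K3": 6}],
    "O3": [None, None, None],  # the problematic shape from the question's edit
}


def flatten(items):
    """Merge a list of small dicts into one dict, skipping non-dict entries."""
    merged = {}
    for item in items:
        if isinstance(item, dict):
            merged.update(item)
    return merged


# One row dict per outer key; an empty dict becomes a row of NaN.
rows = [flatten(items) for items in d.values()]
df = pd.DataFrame(rows, index=list(d))
print(df)
```

Skipping the non-dict entries keeps the O3 label in the index, so the missing data stays visible instead of disappearing silently.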

Solution 2

Here's one way:

import pandas as pd

d = { 'O1' : [ {'K1': 1},
               {'K2': 2},
               {'K3': 3} ],
      'O2' : [ {'K1': 4},
               {'K2': 5},
               {'K3': 6} ] }

# Merge each list of single-key dicts into one dict per outer key.
d = {outer: {k: v for item in items for k, v in item.items()} for outer, items in d.items()}

df = pd.DataFrame.from_dict(d, orient='index')

#     K1  K2  K3
# O1   1   2   3
# O2   4   5   6

Alternatively, build the frame from the flattened d and transpose it:

df = pd.DataFrame(d).T
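
To make the relationship between the two constructions concrete, here is a small sketch that starts from an already flattened dict of dicts (the name flat is only illustrative) and builds the frame both ways. For this all-integer sample the two results hold the same values. As a caveat, transposing a frame with mixed column types can leave object-dtype columns, so orient='index' may be the safer choice for heterogeneous data.

```python
import pandas as pd

# Already flattened: outer key -> {column: value}
flat = {
    "O1": {"K1": 1, "K2": 2, "K3": 3},
    "O2": {"K1": 4, "K2": 5, "K3": 6},
}

# Outer keys become the index directly.
by_index = pd.DataFrame.from_dict(flat, orient="index")

# Outer keys first become columns, then the transpose moves them to the index.
transposed = pd.DataFrame(flat).T

print(by_index)
print(transposed)
```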

A more cumbersome method that also handles None data:

d = { 'O1' : [ {'K1': 1},
               {'K2': 2},
               {'K3': 3} ],
      'O2' : [ {'K1': 4},
               {'K2': 5},
               {'K3': 6} ],
      'O3' : [ {'K1': None},
               {'K2': None},
               {'K3': None} ] }

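# Replace any inner list whose entries are not dicts (e.g. [None, None, None]) with explicit None-valued dicts.
d = {outer: items if isinstance(items[0], dict) else [{col: None} for col in ('K1', 'K2', 'K3')] for outer, items in d.items()}
# Merge each list of single-key dicts into one dict per outer key.
d = {outer: {k: v for item in items for k, v in item.items()} for outer, items in d.items()}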

df = pd.DataFrame.from_dict(d, orient='index')

#      K1   K2   K3
# O1  1.0  2.0  3.0
# O2  4.0  5.0  6.0
# O3  NaN  NaN  NaN
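
If the all-NaN row is not wanted, BENY's comment suggests dropna. Here is a minimal sketch, assuming a frame shaped like the output above (rebuilt by hand so the snippet runs on its own):

```python
import pandas as pd

nan = float("nan")
df = pd.DataFrame(
    {"K1": [1.0, 4.0, nan], "K2": [2.0, 5.0, nan], "K3": [3.0, 6.0, nan]},
    index=["O1", "O2", "O3"],
)

# how="all" drops only the rows in which every column is missing.
cleaned = df.dropna(how="all")
print(cleaned)
```

Using how="all" rather than the default how="any" keeps rows that are only partially missing.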
Author by

ba_ul

Updated on June 05, 2022

Comments

  • ba_ul
    ba_ul almost 2 years

    My data looks like this:

    { outer_key1 : [ {key1: some_value},
                    {key2: some_value},
                    {key3: some_value} ],
      outer_key2 : [ {key1: some_value},
                    {key2: some_value},
                    {key3: some_value} ] }
    

    The inner arrays always have the same length, and key1, key2, key3 are always the same as well.

    I want to convert this to a pandas DataFrame, where outer_key1, outer_key2, ... are the index and key1, key2, key3 are the columns.

    Edit:

    There's an issue in the data, which I believe is the reason the given solutions are not working. In a few cases, in the inner array there are three Nones instead of the three dictionaries. Like this:

    outer_key3: [ None, None, None ]

  • ba_ul
    ba_ul about 6 years
    I've added some new info. Could you please show how to take care of the None problem?
  • ba_ul
    ba_ul about 6 years
    None is actually in place of the whole dictionary (the inner ones), like this: 'O3' : [ None, None, None ]. This leads to the error: AttributeError: 'NoneType' object has no attribute 'items'
  • BENY
    BENY about 6 years
    @ba_ul It will show NaN in the DataFrame. If you want to drop it, use dropna.