Pandas: create dataframe from list of namedtuple

python pandas dataframe

13,672

Solution 1

The function you want is from_records.

For namedtuple instances, you must pass the namedtuple type's _fields attribute as the columns parameter of from_records, along with the list of namedtuples:

df = pd.DataFrame.from_records(
   [namedtuple_instance1, namedtuple_instance2],
   columns=namedtuple_type._fields
)
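As a minimal runnable sketch of the call above: `Point` and `records` are hypothetical names chosen for illustration, standing in for `namedtuple_type` and the list of instances.

```python
from collections import namedtuple

import pandas as pd

# Hypothetical record type, used only for illustration.
Point = namedtuple("Point", ["x", "y"])
records = [Point(1, 2), Point(3, 4)]

# Each namedtuple becomes one row; _fields supplies the column labels.
df = pd.DataFrame.from_records(records, columns=Point._fields)
print(df)
```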

If you have dictionaries, you can pass them directly:

df = pd.DataFrame.from_records([dict(a=1, b=2), dict(a=2, b=3)])

Solution 2

In a similar vein to creating a Series from a namedtuple, you can use the _fields attribute:

In [11]: Point = namedtuple('Point', ['x', 'y'])

In [12]: points = [Point(1, 2), Point(3, 4)]

In [13]: pd.DataFrame(points, columns=Point._fields)
Out[13]: 
   x  y
0  1  2
1  3  4

This assumes all the tuples are of the same type; in this example, they are all Point instances.
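To see why the same-type assumption matters, here is a small sketch with a hypothetical second type, `Size`, mixed into the list. With an explicit `columns` argument the values are placed by position, so the field names of `Size` are ignored. Converting each row with `_asdict()` is one possible way to align by field name instead.

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", ["x", "y"])
Size = namedtuple("Size", ["w", "h"])  # hypothetical second type, for illustration

rows = [Point(1, 2), Size(3, 4)]

# Values are filled in positionally: Size(3, 4) lands under x and y.
positional = pd.DataFrame(rows, columns=Point._fields)
print(positional)

# Dicts are aligned by key, so each field gets its own column
# and fields missing from a row show up as NaN.
by_name = pd.DataFrame([row._asdict() for row in rows])
print(by_name)
```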

Solution 3

To simplify the prior answers: there is no need to specify ._fields when passing the list to the DataFrame constructor, since the column names are taken from the namedtuple's fields. This holds at least when all input tuples are of the same type. Tested with pandas==1.3.4.

> import collections
> import pandas as pd

> Point = collections.namedtuple('Point', ['x', 'y'])
> points = [Point(1, 2), Point(3, 4)]
> pd.DataFrame(points)
   x  y
0  1  2
1  3  4
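A quick way to check this on your own pandas version is to compare the inferred result with the explicit one. This is only a sketch; the names `inferred` and `explicit` are chosen here for illustration.

```python
from collections import namedtuple

import pandas as pd

Point = namedtuple("Point", ["x", "y"])
points = [Point(1, 2), Point(3, 4)]

# Column names inferred from the namedtuple's fields.
inferred = pd.DataFrame(points)

# Column names passed explicitly, as in the earlier answers.
explicit = pd.DataFrame(points, columns=Point._fields)

print(list(inferred.columns))
print(inferred.equals(explicit))
```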

Author by

Patrick the Cat

The one who can code.

Updated on June 02, 2022

Comments

Patrick the Cat almost 2 years

I'm new to pandas, so perhaps this is a very basic question. Normally a data frame in pandas is initialized column-wise: I pass in a dict whose keys are the column names and whose values are list-like objects of the same length.

But I would love to initialize it row-wise without dynamically concatenating rows. Say I have a list of namedtuples: is there an optimized operation that will give me a pandas data frame directly from it?
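To make the two styles in the question concrete, the sketch below builds the same frame column-wise from a dict and row-wise from a list of namedtuples. `Point` is a hypothetical type used only for illustration.

```python
from collections import namedtuple

import pandas as pd

# Column-wise: one list-like per column, all of the same length.
column_wise = pd.DataFrame({"x": [1, 3], "y": [2, 4]})

# Row-wise: one namedtuple per row, passed in a single call
# rather than appended or concatenated one at a time.
Point = namedtuple("Point", ["x", "y"])
row_wise = pd.DataFrame([Point(1, 2), Point(3, 4)])

print(column_wise.equals(row_wise))
```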