Include empty series when creating a pandas dataframe with .concat

Passing an argument for levels will do the trick. Here's an example. First, the wrong way:

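import pandas as pd
# Empty series; the explicit dtype avoids the empty-Series dtype warning on some pandas versions
ser1 = pd.Series(dtype="float64")
ser2 = pd.Series([1, 2, 3])
list_of_series = [ser1, ser2, ser1]
df = pd.concat(list_of_series, axis=1)  # no labels passed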

Which produces this:

>>> df
   0
0  1
1  2
2  3

But if we pass one label per series through the levels argument, the empty series are included too:

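import pandas as pd
# Empty series; the explicit dtype avoids the empty-Series dtype warning on some pandas versions
ser1 = pd.Series(dtype="float64")
ser2 = pd.Series([1, 2, 3])
list_of_series = [ser1, ser2, ser1]
labels = range(len(list_of_series))  # one label per series: 0, 1, 2
df = pd.concat(list_of_series, levels=labels, axis=1)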

Which produces the desired dataframe:

>>> df
    0  1   2
0 NaN  1 NaN
1 NaN  2 NaN
2 NaN  3 NaN
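
For comparison, here is a minimal sketch of the same concatenation using the keys argument of pd.concat, which labels each input series as a column. It assumes a pandas release recent enough that empty series are kept (see the asker's update about 0.18.1), and the column names "first", "second" and "third" are made up for illustration:

```python
import pandas as pd

# Explicit dtype so the empty series is unambiguous across pandas versions
ser1 = pd.Series(dtype="float64")
ser2 = pd.Series([1, 2, 3])
list_of_series = [ser1, ser2, ser1]

# Hypothetical column labels, one per input series
column_names = ["first", "second", "third"]

# keys= labels each series; with axis=1 the labels become the column names
df = pd.concat(list_of_series, axis=1, keys=column_names)
print(df)
```

The empty inputs should show up as all-NaN columns, so the frame has one column per series in the list.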

Author: Alex

Updated on June 18, 2022

Comments

  • Alex (almost 2 years)

    UPDATE: This is no longer an issue since at least pandas version 0.18.1. Concatenating empty series no longer drops them, so this question is out of date.

    I want to create a pandas dataframe from a list of series using .concat. The problem is that when one of the series is empty, it is left out of the resulting dataframe. The dataframe then has the wrong dimensions when I try to rename its columns with a multi-index. UPDATE: Here's an example...

    import pandas as pd
    
    sers1 = pd.Series()
    sers2 = pd.Series(['a', 'b', 'c'])
    df1 = pd.concat([sers1, sers2], axis=1)
    

    This produces the following dataframe:

    >>> df1
    0    a
    1    b
    2    c
    dtype: object
    

    But I want it to produce something like this:

    >>> df2
        0  1
    0 NaN  a
    1 NaN  b
    2 NaN  c
    

    It does this if I put a single NaN value anywhere in sers1, but it seems like this should be possible automatically even if some of my series are totally empty.
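
One version-independent way to get the desired shape is to reindex every series to a shared index before concatenating, so an empty series becomes a column of NaN instead of disappearing. This is a rough sketch rather than the approach from the answer: it assumes all non-empty series share the index of sers2, and the multi-index labels ("group", "empty") and ("group", "letters") are invented for illustration.

```python
import pandas as pd

sers1 = pd.Series(dtype=object)       # totally empty series
sers2 = pd.Series(['a', 'b', 'c'])
series_list = [sers1, sers2]

# Assumption: sers2 carries the index every column should have
common_index = sers2.index

# Reindexing fills the missing positions of the empty series with NaN
aligned = [s.reindex(common_index) for s in series_list]
df2 = pd.concat(aligned, axis=1)

# With one column per series, a multi-index of matching length can be assigned
df2.columns = pd.MultiIndex.from_tuples(
    [("group", "empty"), ("group", "letters")]  # hypothetical labels
)
print(df2)
```

Because every series has the full length before concatenation, the column count always matches the length of the list, which is what the multi-index rename needs.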