Access a list within an element of a Pandas DataFrame


Solution 1

A more straightforward option is to apply a function to column B directly and pick out the element you need:

df['C'] = df['A'] + df['B'].apply(lambda x:x[1])
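For reference, here is the same line as a self-contained sketch, using the example frame from the question (with the stray trailing comma in the second list dropped). The index 1 picks the middle element of each three-item list.

```python
import pandas as pd

# Example frame from the question: column B holds a list in every row.
df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})

# Series.apply calls the lambda once per list; x[1] is the middle element.
df['C'] = df['A'] + df['B'].apply(lambda x: x[1])

print(df)
```

Note that this raises an IndexError if any list has fewer than two elements.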

Solution 2

One option is to use apply row-wise, which should be faster than building a new DataFrame from the lists:

df['C'] = df['A'] + df.apply(lambda row: row['B'][1], axis=1)

Some speed test:

%timeit df['C'] = df['A'] + pd.DataFrame(df['B'].tolist())[1]
# 1000 loops, best of 3: 567 µs per loop
%timeit df['C'] = df['A'] + df.apply(lambda row: row['B'][1], axis = 1) 
# 1000 loops, best of 3: 406 µs per loop
%timeit df['C'] = df['A'] + df['B'].apply(lambda x:x[1])
# 1000 loops, best of 3: 250 µs per loop

OK, the row-wise apply is slightly faster than the tolist() approach, but @breucopter's answer is the fastest.
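The %timeit lines above only work in IPython/Jupyter. Outside of it, a rough equivalent can be sketched with the standard timeit module. The candidates dictionary and its labels are just names chosen for this illustration, and the numbers you get will depend on your machine, pandas version, and frame size (a three-row frame mostly measures overhead).

```python
import timeit
import pandas as pd

df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})

# The three expressions timed above, wrapped so they can be called repeatedly.
candidates = {
    'tolist + DataFrame': lambda: df['A'] + pd.DataFrame(df['B'].tolist())[1],
    'DataFrame.apply (axis=1)': lambda: df['A'] + df.apply(lambda row: row['B'][1], axis=1),
    'Series.apply': lambda: df['A'] + df['B'].apply(lambda x: x[1]),
}

# Sanity check first: every approach should give the same values.
results = {name: fn().tolist() for name, fn in candidates.items()}

runs = 200
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=runs) / runs
    print(f'{name}: {seconds * 1e6:.0f} µs per call')
```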

Solution 3

You can also simply try the following:

df['C'] = df['A'] + df['B'].str[1]

Performance of this method:

%timeit df['C'] = df['A'] + df['B'].str[1]
# 1000 loops, best of 3: 445 µs per loop
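The .str accessor is not limited to strings: on an object column, .str[1] indexes into each element, which is why it also works on lists. A small sketch on the question's frame follows. The second series, named short here purely for illustration, shows one behavioural difference worth knowing: as far as I can tell, .str[1] yields a missing value for a list that is too short, whereas the lambda-based versions would raise an IndexError.

```python
import pandas as pd

df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})

# .str[1] takes element 1 of every list in column B.
df['C'] = df['A'] + df['B'].str[1]
print(df)

# Hypothetical extra example: one list has no element at position 1.
short = pd.Series([[0, 1, 2], [3]])
middle = short.str[1]
print(middle)  # the second entry is missing rather than an error
```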
Author by

Michael

Updated on June 07, 2022

Comments

  • Michael
    Michael almost 2 years

    I have a Pandas DataFrame which has a list of integers inside one of the columns. I'd like to access the individual elements within this list. I've found a way to do it by using tolist() and turning it back into a DataFrame, but I am wondering if there is a simpler/better way. In this example, I add Column A to the middle element of the list in Column B.

    import pandas as pd
    df = pd.DataFrame({'A': (1, 2, 3), 'B': ([0, 1, 2], [3, 4, 5], [6, 7, 8])})
    df['C'] = df['A'] + pd.DataFrame(df['B'].tolist())[1]
    df
    

    Is there a better way to do this?