Select columns from dataframe on condition they exist

Solution 1

Use isin with loc to build a boolean mask over the columns; labels in the list that do not exist in the DataFrame are simply ignored:

In [97]:
import pandas as pd
df = pd.DataFrame(columns=[1, 2, 4])
df.loc[:, df.columns.isin([1, 2, 3, 4])]

Out[97]:
Empty DataFrame
Columns: [1, 2, 4]
Index: []
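The same idea on a DataFrame that actually holds data may be easier to follow. This is only a minimal sketch: the values and the names wanted, mask and subset are made up for illustration. Note that the mask keeps the columns in the DataFrame's own order, not the order of the requested list.

```python
import pandas as pd

# Illustrative data: columns 1, 2 and 4 exist, column 3 does not.
df = pd.DataFrame({1: [10, 11], 2: [20, 21], 4: [40, 41]})

wanted = [1, 2, 3, 4]             # labels we would like to have
mask = df.columns.isin(wanted)    # boolean array, one entry per existing column
subset = df.loc[:, mask]          # keeps only the columns where mask is True

print(list(subset.columns))
```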

Solution 2

It is simpler to directly calculate the set of common columns and ask for them:

df[df.columns & [1, 2, 3, 4]]

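(Here & is applied to the column Index as a set-intersection operator. This use of & is deprecated in later pandas releases, which recommend Index.intersection instead, so the line may emit a FutureWarning or behave differently depending on the installed version.)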
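If relying on an operator between an Index and a list feels fragile, a plain list comprehension is one possible substitute. This is a different technique from the one-liner above, not the answer's own method, and the names wanted and present are assumptions for the sketch. It returns the columns in the order they were requested.

```python
import pandas as pd

# Illustrative data: columns 1, 2 and 4 exist, column 3 does not.
df = pd.DataFrame({1: [10, 11], 2: [20, 21], 4: [40, 41]})

wanted = [4, 3, 1]
# Keep only the requested labels that are present, in the requested order.
present = [col for col in wanted if col in df.columns]
subset = df[present]

print(present)
```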

Solution 3

One possible way:

df[df.columns.intersection(set(['list', 'of', 'cols']))]

For example:

$ ipython
Python 3.8.5 (default, Sep  3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.20.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:
import pandas as pd
df = pd.DataFrame(columns=[1,2,3,4])
df
Out[1]:
Empty DataFrame
Columns: [1, 2, 3, 4]
Index: []

In [2]:
df[df.columns.intersection(set([1, 2, 2, 5]))]
Out[2]:
Empty DataFrame
Columns: [1, 2]
Index: []

In [3]:
pd.__version__
Out[3]:
'1.2.1'
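Since the question asks for a copy of the columns, the intersection approach could be wrapped in a small helper. The function name select_existing and the sample values are hypothetical and not part of pandas; this is a sketch under those assumptions.

```python
import pandas as pd

def select_existing(df, wanted):
    """Hypothetical helper: copy the columns of df whose labels appear in wanted."""
    unique_wanted = list(dict.fromkeys(wanted))    # drop duplicate labels, keep order
    present = df.columns.intersection(unique_wanted)
    return df[present].copy()                      # independent copy, not a view

# Illustrative data with columns 1 to 4.
df = pd.DataFrame({1: [1.0, 2.0], 2: [3.0, 4.0], 3: [5.0, 6.0], 4: [7.0, 8.0]})

df1 = select_existing(df, [1, 2, 2, 5])
df1.loc[0, 1] = -1.0    # modify the copy only

print(sorted(df1.columns))
print(df.loc[0, 1])     # the original value is unchanged
```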
Author: astudentofmaths

Updated on September 16, 2022

Comments

  • astudentofmaths over 1 year

    I have a pandas DataFrame with multiple columns (the column names are numbers: 1, 2, ...) and I want to copy some of them if they exist.

    For example, df1 = df[[1,2,3,4]]. But some of those columns might not exist in df; e.g., df might only have columns 1, 2, and 4, or only columns 1 and 2.

  • Daniel Ortega over 3 years
    I prefer this solution over the original accepted one, as it is simpler to implement in one line.
  • zmike almost 3 years
    I get a FutureWarning stating that this method is deprecated as of 1.2.4, and that index.intersection(other) should be used instead.