Select columns from dataframe on condition they exist
Solution 1
Use isin with loc to filter. This will handle non-existent columns:
In [97]:
import pandas as pd
df = pd.DataFrame(columns=[1, 2, 4])
df.loc[:, df.columns.isin([1, 2, 3, 4])]
Out[97]:
Empty DataFrame
Columns: [1, 2, 4]
Index: []
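As a minimal sketch of the same idea on a frame that actually holds data (the column values and the names wanted and mask are made up for illustration), the boolean mask from isin keeps only the requested columns that are present:

```python
import pandas as pd

# Illustrative data: columns 1, 2 and 4 exist, column 3 does not.
df = pd.DataFrame({1: [10, 11], 2: [20, 21], 4: [40, 41]})
wanted = [1, 2, 3, 4]

# One boolean per existing column: True if it is in the wanted list.
mask = df.columns.isin(wanted)

# Select all rows and only the columns where the mask is True.
subset = df.loc[:, mask]
print(subset.columns.tolist())
```

The result follows the column order of df, not the order of the requested list, because the mask is aligned to df.columns.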
Solution 2
It is simpler to directly calculate the set of common columns and ask for them:
df[df.columns & [1, 2, 3, 4]]
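(The & operator acts here as the set intersection operator. As a comment below reports, newer pandas versions emit a FutureWarning for this usage and recommend index.intersection(other) instead.)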
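A sketch that does not rely on the & operator, assuming only that membership tests on df.columns work as usual: build the list of common columns with a plain comprehension. The names wanted and present are illustrative and not part of the original answer.

```python
import pandas as pd

# Illustrative data: columns 1, 2 and 4 exist, column 3 does not.
df = pd.DataFrame({1: [1.0], 2: [2.0], 4: [4.0]})
wanted = [4, 3, 1]

# Keep each requested label only if it is an existing column.
present = [c for c in wanted if c in df.columns]

subset = df[present]
print(list(subset.columns))
```

Unlike a set intersection, this keeps the columns in the order they were requested.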
Solution 3
One possible way:
df[df.columns.intersection(set(['list', 'of', 'cols']))]
For example:
$ ipython
Python 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.20.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]:
import pandas as pd
df = pd.DataFrame(columns=[1,2,3,4])
df
Out[1]:
Empty DataFrame
Columns: [1, 2, 3, 4]
Index: []
In [2]:
df[df.columns.intersection(set([1, 2, 2, 5]))]
Out[2]:
Empty DataFrame
Columns: [1, 2]
Index: []
In [3]:
pd.__version__
Out[3]:
'1.2.1'
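If this selection is needed in several places, it could be wrapped in a small function. The sketch below is one possible shape, not part of the original answer: select_existing is a hypothetical name, and it passes a list rather than a set to Index.intersection with sort=False.

```python
import pandas as pd

def select_existing(df, wanted):
    """Hypothetical helper: return only the columns of df named in wanted."""
    # Labels in wanted that df does not have are silently dropped.
    common = df.columns.intersection(list(wanted), sort=False)
    return df[common]

df = pd.DataFrame({1: [0], 2: [0], 3: [0], 4: [0]})
subset = select_existing(df, [1, 2, 5])
print(sorted(subset.columns))
```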
Comments
-
astudentofmaths over 1 year
I have a pandas DataFrame with multiple columns (the column names are numbers: 1, 2, ...) and I want to copy some of them if they exist.
For example:
df1 = df[[1,2,3,4]]
But some of these columns might not exist in df; e.g., df might only have columns 1, 2, and 4, or only columns 1 and 2, etc.
-
Daniel Ortega over 3 years
I prefer this solution over the original accepted one, as it is simpler to implement in one line.
-
zmike almost 3 years
I get a FutureWarning stating that this method is deprecated as of 1.2.4, and that index.intersection(other) should be used instead.