Delete data frame column if column name ends with some string, Python 3.6
You can filter by inverting (~) the boolean mask of the columns you do not want to delete, using loc with str.endswith; str.contains with $ (which matches the end of the string) also works:
cols = ['SectorName', 'Sector', 'ItemName', 'Item', 'Counterpart SectorName']
df = pd.DataFrame([range(5)], columns=cols)
print (df)
   SectorName  Sector  ItemName  Item  Counterpart SectorName
0           0       1         2     3                       4

print (~df.columns.str.endswith('Name'))
[False  True False  True False]

df1 = df.loc[:, ~df.columns.str.endswith('Name')]
df1 = df.loc[:, ~df.columns.str.contains('Name$')]
Or filter the column names first:
print (df.columns[~df.columns.str.endswith('Name')])
Index(['Sector', 'Item'], dtype='object')

df1 = df[df.columns[~df.columns.str.endswith('Name')]]
print (df1)
   Sector  Item
0       1     3
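The same result can also be reached without building the mask inline, by passing the matching column names to DataFrame.drop; a minimal sketch using the example frame above:

```python
import pandas as pd

cols = ['SectorName', 'Sector', 'ItemName', 'Item', 'Counterpart SectorName']
df = pd.DataFrame([range(5)], columns=cols)

# Select the names that end with 'Name' and drop those columns
df1 = df.drop(columns=df.columns[df.columns.str.endswith('Name')])
print(df1.columns.tolist())  # ['Sector', 'Item']
```

This reads as "drop these columns" rather than "keep everything else", which some find clearer.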
Author: Learnings, updated on June 04, 2022

Comments
- Learnings over 1 year: I have a dataframe with the below columns:
  'SectorName', 'Sector', 'ItemName', 'Item', 'Counterpart SectorName', 'Counterpart Sector', 'Stocks and TransactionsName', 'Stocks and Transactions', 'Units', 'Scale', 'Frequency', 'Date', 'Value'
  How can I delete columns from df where the column name ends with 'Name'?
- Shihe Zhang about 6 years: Possible duplicate of How to remove multiple columns that end with same text in Pandas?
- Learnings about 6 years: thanks, I go for the 2nd one.
- Learnings about 6 years: Can we give multiple, like ends with 'Name' or 'Code': df1 = df[df.columns[~df.columns.str.endswith('Name', 'Code')]]?
- jezrael about 6 years: You need to chain conditions, like df1 = df[df.columns[~(df.columns.str.endswith('Name') | df.columns.str.endswith('Code'))]]
- jezrael about 6 years: Or df1 = df.loc[:, ~df.columns.str.contains('Name$|Code$')]
- Learnings about 6 years: yes, working fine... thanks again...
- 3kstc over 5 years: the ~ is a ! (not), so df[df.columns[~df.columns.str.endswith('Name')]] will return all the columns that do not (!) end with 'Name'
- bernando_vialli almost 4 years: @jezrael, do you know if the same code should work if I am working with a PySpark dataframe instead of a pandas one?
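On the multiple-suffix question in the comments above: pandas' str.endswith also accepts a tuple of suffixes (mirroring Python's built-in str.endswith), which avoids chaining | conditions; a small sketch, assuming both 'Name' and 'Code' suffixes should be dropped:

```python
import pandas as pd

cols = ['SectorName', 'Sector', 'ItemCode', 'Item']
df = pd.DataFrame([range(4)], columns=cols)

# A tuple matches any of the suffixes, so no chained conditions are needed
df1 = df.loc[:, ~df.columns.str.endswith(('Name', 'Code'))]
print(df1.columns.tolist())  # ['Sector', 'Item']
```

Note that str.endswith('Name', 'Code') as written in the comment does not do this; the tuple form is what handles multiple suffixes.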