How to drop multiple column names given in a list from Spark DataFrame?

apache-spark dataframe pyspark apache-spark-sql pyspark-sql

40,091

Solution 1

You can use the * operator to pass the contents of your list as arguments to drop():

df.drop(*drop_lst)
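
To show why the unpacking matters, here is a minimal pure-Python sketch. The `drop` function below is a hypothetical stand-in with a variadic signature, not Spark's implementation, and its rejection of non-string arguments is an assumption made for illustration. The column names are made up.

```python
def drop(columns, *names):
    """Hypothetical stand-in for a variadic drop(): returns the columns not named."""
    for name in names:
        # Assumption for illustration: each argument must be a single column name.
        if not isinstance(name, str):
            raise TypeError("each argument must be a column name string")
    to_drop = set(names)
    return [c for c in columns if c not in to_drop]


n = 3
drop_lst = ['a' + str(i) for i in range(n)]   # ['a0', 'a1', 'a2']
columns = ['id', 'a0', 'a1', 'a2', 'b']       # made-up column names

# *drop_lst expands the list into separate positional arguments,
# so this call is equivalent to drop(columns, 'a0', 'a1', 'a2').
remaining = drop(columns, *drop_lst)
```

Without the `*`, the whole list arrives as one argument, which this stand-in rejects.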

Solution 2

You can pass the column names as separate, comma-separated arguments, e.g.:

df.drop("col1","col11","col21")

Solution 3

This is how to drop a specified number of consecutive columns in Scala:

val ll = dfwide.schema.names.slice(1,5)
dfwide.drop(ll:_*).show

slice takes two parameters: a start index (inclusive) and an end index (exclusive).
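
For readers on the Python side, list slicing follows the same convention: the start index is included and the end index is not. A small sketch, assuming a made-up list of column names standing in for a DataFrame's columns:

```python
# Made-up column names standing in for a DataFrame's column list.
names = ['id', 'c1', 'c2', 'c3', 'c4', 'c5']

# Indices 1, 2, 3 and 4 are selected; index 5 is excluded.
to_drop = names[1:5]

# The columns that would remain after dropping the slice.
remaining = [c for c in names if c not in to_drop]
```

In PySpark, `df.columns` is a plain list, so something along the lines of `df.drop(*df.columns[1:5])` should be the analogous call.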

Author

GeorgeOfTheRF

Data Scientist

Updated on October 21, 2021

Comments

GeorgeOfTheRF over 2 years
I have a dynamic list that is created based on the value of n.
```
n = 3
drop_lst = ['a' + str(i) for i in range(n)]
df.drop(drop_lst)
```
But the above is not working.

Note:

My use case requires a dynamic list.

If I pass the names directly, without a list, it works:
```
df.drop('a0','a1','a2')
```
How do I make the drop function work with a list?

Spark 2.2 doesn't seem to have this capability. Is there a way to make it work without using select()?
