How to delete a column from a data frame with pandas?

Solution 1

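To actually delete the column:

del df['id'] or df.drop('id', axis=1) should have worked, provided the name you pass matches the column label exactly. Pass the axis by keyword, since recent pandas versions no longer accept it as a second positional argument.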
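
As a minimal sketch of the two approaches, assuming a small hypothetical frame with the same id and text columns as in the question (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical frame mirroring the question's two columns.
df = pd.DataFrame({"id": [361.273, 374.350], "text": ["text1", "text2"]})

# drop returns a new frame and leaves the original untouched.
without_id = df.drop("id", axis=1)

# del removes the column from the frame it is applied to.
df_copy = df.copy()
del df_copy["id"]

print(without_id.columns.tolist())  # ['text']
print(df_copy.columns.tolist())     # ['text']
print(df.columns.tolist())          # ['id', 'text']
```

The main practical difference is that drop gives you a new object to work with, while del modifies the frame directly.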

However, if you don't need to delete the column then you can just select the column of interest like so:

In [54]:

df['text']
Out[54]:
0    text1
1    text2
2    textn
Name: text, dtype: object

If you never wanted the column in the first place, pass the list of columns you do want to read_csv through its usecols parameter:

In [53]:
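import io
import pandas as pd

# Sample data standing in for the file; columns are separated by whitespace
temp="""id    text
363.327    text1
366.356    text2
37782    textn"""
# A raw string avoids an invalid-escape warning for the regex separator
df = pd.read_csv(io.StringIO(temp), delimiter=r'\s+', usecols=['text'])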
df
Out[53]:
    text
0  text1
1  text2
2  textn

As for your error, it occurs because 'id' is not among your columns: the label is either spelt differently or contains whitespace. To check, look at the output of print(df.columns.tolist()), which lists the column labels and reveals any leading or trailing whitespace.
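
A rough sketch of that check, assuming a hypothetical frame whose id label is padded with spaces (the padding is invented here to illustrate the whitespace case):

```python
import pandas as pd

# Hypothetical frame whose first label carries stray whitespace.
df = pd.DataFrame({" id ": [1, 2], "text": ["a", "b"]})

# The list repr keeps the quotes, so the padding is visible.
before = df.columns.tolist()
print(before)  # [' id ', 'text']

# Strip leading/trailing whitespace from every label, then drop as usual.
df.columns = df.columns.str.strip()
df = df.drop("id", axis=1)
print(df.columns.tolist())  # ['text']
```

Stripping only helps with padded labels. If the file was split on the wrong delimiter, as the 'id opinion' label reported in the comments suggests, the fix belongs in the read_csv call instead.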

Solution 2

df.drop(colname, axis=1) (or del df[colname]) is the correct method to use to delete a column.

If a ValueError is raised (a KeyError in more recent pandas versions), it means the column name is not exactly what you think it is.

Check df.columns to see what Pandas thinks are the names of the columns.
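
The failure can be reproduced with a sketch like the following. It assumes a hypothetical one-column frame labelled 'id opinion', loosely modelled on the df.columns output reported in the comments. It catches both exception types because the one raised depends on the pandas version.

```python
import pandas as pd

# Hypothetical frame imitating a header that was parsed as a single column.
df = pd.DataFrame({"id opinion": ["361.273 text1", "374.350 text2"]})

raised = False
try:
    df.drop("id", axis=1)
except (KeyError, ValueError) as exc:
    # Older pandas raised ValueError here; newer versions raise KeyError.
    raised = True
    print(type(exc).__name__, exc)

# Inspect the labels pandas actually sees.
print(df.columns.tolist())  # ['id opinion']

# errors="ignore" skips labels that are missing instead of raising.
unchanged = df.drop("id", axis=1, errors="ignore")
print(unchanged.columns.tolist())  # ['id opinion']
```

Note that errors="ignore" only silences the error. It can hide a genuine naming problem, so it is probably best reserved for cases where a missing column is expected.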

Solution 3

The best way to delete a column in pandas is to use drop:

df = df.drop('column_name', axis=1)

where 1 is the axis number (0 for rows, 1 for columns).

To delete the column without having to reassign df you can do:

df.drop('column_name', axis=1, inplace=True)
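
To illustrate the difference between the two forms, here is a small sketch with a hypothetical two-column frame. It also uses the columns= keyword, which should behave the same as axis=1 in pandas versions that support it.

```python
import pandas as pd

df = pd.DataFrame({"column_name": [1, 2], "other": [3, 4]})

# Without inplace, drop returns a new frame; df itself is unchanged.
reassigned = df.drop("column_name", axis=1)

# columns= is an alternative spelling of the same operation.
same = df.drop(columns="column_name")

# With inplace=True the frame is modified and the call returns None.
result = df.drop("column_name", axis=1, inplace=True)

print(result)               # None
print(df.columns.tolist())  # ['other']
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace the frame with None, which is a common slip.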

Finally, to drop by column number instead of by column label, index into df.columns. For example, to delete the 1st, 2nd and 4th columns:

df.drop(df.columns[[0, 1, 3]], axis=1)  # df.columns is zero-based pd.Index 


Exceptions:

If a wrong column number or label is requested, an error is raised. To check the number of columns, use df.shape[1] or len(df.columns.values); to check the column labels, use df.columns.values.
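
The positional drop and the checks above can be combined along these lines. This is a sketch on a hypothetical five-column frame, and the filtering of out-of-range positions is just one possible way to guard the call:

```python
import pandas as pd

# Hypothetical frame with five columns labelled a..e.
df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4], "e": [5]})

positions = [0, 1, 3]   # 1st, 2nd and 4th columns (zero-based)
n_cols = df.shape[1]    # number of columns

# Keep only positions that exist, so df.columns[...] cannot raise IndexError.
valid = [p for p in positions if 0 <= p < n_cols]

# Translate positions to labels, then drop by label.
trimmed = df.drop(df.columns[valid], axis=1)
print(trimmed.columns.tolist())  # ['c', 'e']
```

Since the positions are translated to labels before dropping, a frame with duplicate column labels would lose every column sharing a dropped label, so this approach assumes the labels are unique.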

This answer was based on @LondonRob's answer and is left here to help future visitors of this page.

Author: newWithPython

Updated on November 17, 2020

Comments

  • newWithPython
    newWithPython over 3 years

    I read my data

    import pandas as pd
    df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
    print df
    

    and get:

              id    text
    0    361.273    text1...
    1    374.350    text2...
    2    374.350    text3...
    

    How can I delete the id column from the above data frame? I tried the following:

    import pandas as pd
    df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
    print df.drop('id', 1)
    

    But it raises this exception:

    ValueError: labels ['id'] not contained in axis
    
    • unutbu
      unutbu over 9 years
      What does df.columns report as the column names? Perhaps there is a space in the column name?
    • newWithPython
      newWithPython over 9 years
      Index([u'id opinion'], dtype='object') Thanks for the response
    • EdChum
      EdChum over 9 years
      One thing to note, do you really need to delete the column? You can select just the columns of interest from the df by doing df['text'] or more generally df[some_list], additionally if you never wanted it in the first place then don't load it df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t', usecols=[0])
    • xavier
      xavier about 8 years
      I want to delete it, too. But it is a matter of presentation, for when you actually make the report. Is it better to pivot the frame first or just delete the column?
    • Gaurav Taneja
      Gaurav Taneja over 7 years
      Just for completeness df.drop(['id'],1) works
  • Tim D
    Tim D over 6 years
    The question was how to delete a column. It is a valid question which is not addressed in this answer. I was not the downvoter.
  • EdChum
    EdChum over 6 years
    @TimD the context of the question is that OP wanted to remove a column they were not interested in, my answer shows that firstly this isn't necessary if you just want to use a specific column or that you could in fact just not read that column or only read the columns of interest and the OP accepted the answer
  • Tim D
    Tim D over 6 years
    you have indeed solved the problem that the OP had, which is evident from context. I landed on this question from a Google search looking for a way to remove column. In my context, this answer does not help me since I don't know a priori which columns I will need until after I have read them. You may have solved the OP problem, but I bet subsequent visitors to the page will be looking for DataFrame.drop() and upvoting answers that present that.
  • EdChum
    EdChum over 6 years
    @TimD I've added the additional information now plus how to debug this issue