Pandas drop first columns after csv read

Solution 1

One way is to use pd.DataFrame.iloc:

import pandas as pd
from io import StringIO

mystr = StringIO("""col1,col2,col3
a,b,c
d,e,f
g,h,i
""")

df = pd.read_csv(mystr).iloc[:, 1:]

#   col2 col3
# 0    b    c
# 1    e    f
# 2    h    i
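
If the unwanted first column is the index that DataFrame.to_csv writes by default, it may be simpler to avoid it at the source or to read it back as the index. The sketch below uses a made-up three-row frame and an in-memory CSV string rather than a real file; both are assumptions for illustration.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, not taken from the original question.
original = pd.DataFrame({"col2": ["b", "e", "h"], "col3": ["c", "f", "i"]})

# to_csv() with no path returns the CSV text; the index is written
# as an unnamed first column by default.
with_index = original.to_csv()

# Option A: read the first column back as the index instead of as data.
restored = pd.read_csv(StringIO(with_index), index_col=0)

# Option B: do not write the index in the first place.
without_index = original.to_csv(index=False)
reread = pd.read_csv(StringIO(without_index))
```

Both restored and reread end up with only col2 and col3 as data columns, so no drop is needed afterwards.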

Solution 2

Assuming you know the total number of columns in the dataset and the positions of the columns you want to remove:

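a = list(range(3))  # column positions 0, 1, 2 (range has no .remove in Python 3)
a.remove(1)         # drop position 1, i.e. the second column
df = pd.read_csv('test.csv', usecols=a)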

Here 3 is the total number of columns, and position 1 (the second column) is the one removed. You can also pass the positions of the columns to keep directly, for example usecols=[0, 2].
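
When the total number of columns is not known in advance, usecols can also take a callable that is evaluated against each column name and keeps the names for which it returns True. A minimal sketch, assuming the same col1,col2,col3 sample data as in Solution 1 and that the column to skip is known by name:

```python
import pandas as pd
from io import StringIO

# Same sample layout as in Solution 1; an in-memory stand-in for test.csv.
csv_text = """col1,col2,col3
a,b,c
d,e,f
g,h,i
"""

# Keep every column whose name is not "col2"; no column count needed.
df = pd.read_csv(StringIO(csv_text), usecols=lambda name: name != "col2")
```

The result contains col1 and col3 only. The excluded column is still scanned by the parser but is not stored in the DataFrame.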

Author by

Ti me

Updated on June 21, 2022

Comments

  • Ti me
    Ti me almost 2 years

    Is there a way to reference an object on the same line where it is instantiated?

    See the following example: I want to drop the first column (by index) of a CSV file right after reading it (by default, DataFrame.to_csv writes the index as the first column):

    df = pd.read_csv(csvfile).drop(self.columns[[0]], axis=1)
    

    I understand that self only makes sense inside an object's own context, but here it describes what I intend to do.

    (Of course, doing this operation in two separate lines works perfectly.)

  • Ti me
    Ti me about 6 years
    Hi, this works fine and it's a simple single line. Thank you. Still, I wonder if there is a way to refer to an object on the line where it's instantiated...
  • jpp
    jpp about 6 years
    Don't think so. The problem is that the usecols argument of read_csv expects list-like input (or a callable), so [1, 2] is valid input to exclude the initial column, but a slice such as [1:] is not.
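
One way to approximate "referring to the object on the line it is created" is method chaining with DataFrame.pipe, which passes the freshly read frame to a function. This is a sketch rather than the approach either poster used; the sample data and the lambda parameter name d are assumptions for illustration.

```python
import pandas as pd
from io import StringIO

# Hypothetical in-memory CSV standing in for the file in the question.
csv_text = """col1,col2,col3
a,b,c
d,e,f
g,h,i
"""

# pipe() hands the new DataFrame to the lambda as `d`, so its own
# columns can be inspected within the same expression.
df = pd.read_csv(StringIO(csv_text)).pipe(lambda d: d.drop(columns=d.columns[0]))
```

Here d plays the role that self was meant to play in the question. For simply dropping the first column, the .iloc[:, 1:] form from Solution 1 is shorter; pipe is mainly useful when the operation needs to look at the frame's own labels.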