Select columns in pandas dataframe by index (column) number


Remember that Python is zero-indexed: with ten columns, the positions run from 0 to 9. To select the first two columns, the fourth, and the last two by position, pass a list of those positions to iloc:

df01.iloc[:, [0,1,3,8,9]]

   a  b  d  i  j
0  6  0  9  9  0
1  7  9  9  4  4
2  1  3  4  0  4
3  4  6  1  7  0
4  4  6  3  1  2
5  5  6  2  9  1
6  0  6  6  6  2
7  8  2  0  5  5
8  4  7  5  8  4
9  2  3  6  2  9
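
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes a frame filled with fixed values (0 to 99) rather than the random data from the question, so the values differ from the output above; only the column selection is the point. Negative positions count from the end, so -2 and -1 are an alternative way to name the last two columns.

```python
import numpy as np
import pandas as pd

# Assumption: fixed values instead of the question's random integers,
# so the result is the same on every run.
df01 = pd.DataFrame(np.arange(100).reshape(10, 10), columns=list("abcdefghij"))

# Zero-based positions: first two columns, the fourth, and the last two.
subset = df01.iloc[:, [0, 1, 3, 8, 9]]
print(subset.columns.tolist())  # ['a', 'b', 'd', 'i', 'j']

# Negative positions count from the end: -2 and -1 are the last two columns.
same = df01.iloc[:, [0, 1, 3, -2, -1]]
print(same.equals(subset))  # True
```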

Author: Mario Reyes

Updated on June 04, 2022

Comments

  • Mario Reyes
    Mario Reyes almost 2 years

    I have been mainly an R user up until now, and I am now trying to get better with Python, so please keep that in mind as I may not be thinking in a pythonic way...

    In any case, here it goes: I want to subset a pandas dataframe by column position, selecting, for instance, the first 2 columns, then the 4th column, and then the last two columns.

    The code I used for that is as follows:

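    import numpy as np
    import pandas as pd
    df01 = pd.DataFrame(np.random.randint(low=0, high=10, size=(10, 10)),
                        columns=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])
    # range(-2, 0) gives -2, -1 (the last two columns); range(-3, -1) would give -3, -2
    df01.iloc[:, list(range(0, 2)) + [3] + list(range(-2, 0))]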
    

    I am doing the subsetting by essentially creating 3 lists with the columns I want, but I think there must be a better way, as this seems too cumbersome. In R I could simply do:

    df01[c(1:2,4,9:10)]
    

    Again, this may be just the way it is, but given my status as a Python "newbie", I'm interested to know if there is a better, more concise way.

    Thanks,

    • Space Impact
      Space Impact over 5 years
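      Use df.iloc[:, np.r_[0:2, 3, 8:10]], where np comes from import numpy as np. Note that the positions are zero-based and slice ends are exclusive, unlike R's c(1:2, 4, 9:10).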
    • Mario Reyes
      Mario Reyes over 5 years
      Yeap!!! np.r_ did the trick and is what I was looking for... Thanks @SandeepKadapa
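
To illustrate the np.r_ suggestion, here is a short sketch. It assumes the same ten-column layout as the question but with fixed values instead of random ones. np.r_ concatenates slices and scalars into one integer array, which iloc accepts as a list of positions. Slices are zero-based with an exclusive end, so R's c(1:2, 4, 9:10) corresponds to np.r_[0:2, 3, 8:10].

```python
import numpy as np
import pandas as pd

# Assumption: fixed values stand in for the question's random integers.
df01 = pd.DataFrame(np.arange(100).reshape(10, 10), columns=list("abcdefghij"))

# np.r_ expands each slice and joins the pieces into a single array.
positions = np.r_[0:2, 3, 8:10]
print(positions)  # [0 1 3 8 9]

subset = df01.iloc[:, positions]
print(subset.columns.tolist())  # ['a', 'b', 'd', 'i', 'j']

# A negative slice also works when the number of columns is not known up front.
from_end = np.r_[0:2, 3, -2:0]
print(from_end)  # [ 0  1  3 -2 -1]
```

The slice form pays off mainly for longer runs of columns; for a handful of positions, a plain list such as [0, 1, 3, 8, 9] is just as readable.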