Remove rows of a dataframe based on the row number

Solution 1

Try:

df.drop(df.index[rm_indexes])

example:

import pandas as pd

df = pd.DataFrame({"A":[0,1,2,3,4,5,6,7,8],
                   "B":[0,1,2,3,4,5,6,7,8],
                   "C":[0,1,2,3,4,5,6,7,8]})

pos = [0,2,4]
df.drop(df.index[pos], inplace=True)

output

    A   B   C
1   1   1   1
3   3   3   3
5   5   5   5
6   6   6   6
7   7   7   7
8   8   8   8

EDIT, after further specification provided by OP: multiple rows with the same index

df = pd.DataFrame({"A":[0,1,2,3,4,5,6,7,8],
                   "B":[0,1,2,3,4,5,6,7,8],
                   "C":[0,1,2,3,4,5,6,7,8],},
                   index=["a","b","b","a","b","c","c","d","e"])
df['idx'] = df.index

pos = [1]
df.reset_index(drop=True, inplace=True)
df.drop(df.index[pos], inplace=True)
df.set_index('idx', inplace=True)

output

    A   B   C
idx         
a   0   0   0
b   2   2   2
a   3   3   3
b   4   4   4
c   5   5   5
c   6   6   6
d   7   7   7
e   8   8   8
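
Because the goal is to drop by position rather than by label, a boolean mask over row positions gives the same result without the temporary idx column. This is a minimal sketch using the same example frame. The names keep and result are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, 1, 2, 3, 4, 5, 6, 7, 8],
                   "B": [0, 1, 2, 3, 4, 5, 6, 7, 8],
                   "C": [0, 1, 2, 3, 4, 5, 6, 7, 8]},
                  index=["a", "b", "b", "a", "b", "c", "c", "d", "e"])

pos = [1]  # row numbers to remove (0-based)

# Start with "keep everything", then switch off the unwanted positions.
keep = np.ones(len(df), dtype=bool)
keep[pos] = False

# iloc selects purely by position, so duplicate index labels are untouched.
result = df.iloc[keep]
print(result)
```

This leaves df itself unchanged and returns a new frame, which may be preferable to the inplace=True calls when the original data is still needed.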

Solution 2

You can reset the index and then drop by row number. This removes the rows at positions 1, 2, 3, 4, ..., 199.

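# assumes the timestamp index is named 'myTimeStamp'; each call returns a new frame, so assign the result
df = df.reset_index()    # moves the timestamp into a column; the index becomes 0,1,2...n-1
df = df.drop([1, 2, 3, 4, 34, 100, 154, 155, 199])  # drops the rows at those positions
df = df.set_index('myTimeStamp')  # restores the timestamp as the index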
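
To see why resetting the index matters when timestamps repeat, the sketch below compares a label-based drop with a position-based drop on a small frame. The data, the column name value, and the index name myTimeStamp are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical frame with duplicated timestamps in the index.
ts = pd.DatetimeIndex(["2020-01-01", "2020-01-01", "2020-01-02",
                       "2020-01-02", "2020-01-03"], name="myTimeStamp")
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]}, index=ts)

# Label-based: df.index[[2]] is the label 2020-01-02, which matches two rows,
# so both of them are dropped.
by_label = df.drop(df.index[[2]])

# Position-based: switch to a 0..n-1 index, drop row number 2,
# then put the timestamp back as the index.
by_position = df.reset_index().drop([2]).set_index("myTimeStamp")

print(len(by_label), len(by_position))
```

Only the position-based version removes exactly one row, which is the behaviour the question asks for.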
Author by

Eghbal

Updated on June 15, 2022

Comments

  • Eghbal
    Eghbal almost 2 years

    Suppose that I have a data-frame (DF) and also I have an array like this:

    rm_indexes = np.array([1, 2, 3, 4, 34, 100, 154, 155, 199])
    

    I want to remove the rows whose row numbers are listed in rm_indexes from DF. Row numbers are 0-based: 1 in rm_indexes means row number one (the second row of DF), 3 means row number three (the fourth row), etc. The index column of this data-frame is a timestamp.

    PS. I have many identical timestamps as the index of data-frame.

  • Eghbal
    Eghbal about 5 years
    The index column of this data-frame is timestamp
  • jose_bacoy
    jose_bacoy about 5 years
    oh.. you changed the SO question. The index is timestamp. let me edit it.
  • Quang Hoang
    Quang Hoang about 5 years
    df.reset_index(inplace=True), and same for drop.
  • Eghbal
    Eghbal about 5 years
    I checked this. I think the problem here is that I have identical timestamps in my index. Suppose I remove row number 2 with your method: it removes not only row number two, but also every other row with the same timestamp.
  • Eghbal
    Eghbal about 5 years
    Should I add a new column with the name of myTimeStamp? I think the third line should be df.index = df['time']