Remove rows of a dataframe based on the row number
Solution 1
Try dropping the index labels found at the given row positions:
df.drop(df.index[rm_indexes])
Example:
import pandas as pd
df = pd.DataFrame({"A":[0,1,2,3,4,5,6,7,8],
"B":[0,1,2,3,4,5,6,7,8],
"C":[0,1,2,3,4,5,6,7,8]})
pos = [0,2,4]
df.drop(df.index[pos], inplace=True)
Output:
   A  B  C
1  1  1  1
3  3  3  3
5  5  5  5
6  6  6  6
7  7  7  7
8  8  8  8
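The same idea can be wrapped in a small helper that returns a new frame instead of modifying df in place. This is only a sketch: the name drop_positions is hypothetical, and it assumes the index labels are unique, because df.drop matches rows by label rather than by position.

```python
import pandas as pd


def drop_positions(df, positions):
    """Return a copy of df without the rows at the given 0-based positions.

    Hypothetical helper for illustration; assumes unique index labels.
    """
    # df.index[positions] translates row positions into index labels,
    # and df.drop then removes the rows carrying those labels.
    return df.drop(df.index[positions])


df = pd.DataFrame({"A": range(9), "B": range(9), "C": range(9)})
result = drop_positions(df, [0, 2, 4])
print(result)
```

Because no inplace=True is used here, the original df keeps all nine rows.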
EDIT, after further specification provided by OP: multiple rows with the same index
df = pd.DataFrame({"A":[0,1,2,3,4,5,6,7,8],
"B":[0,1,2,3,4,5,6,7,8],
"C":[0,1,2,3,4,5,6,7,8],},
index=["a","b","b","a","b","c","c","d","e"])
df['idx'] = df.index
pos = [1]
df.reset_index(drop=True, inplace=True)
df.drop(df.index[pos], inplace=True)
df.set_index('idx', inplace=True)
Output:
     A  B  C
idx
a    0  0  0
b    2  2  2
a    3  3  3
b    4  4  4
c    5  5  5
c    6  6  6
d    7  7  7
e    8  8  8
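To see why the index has to be reset when labels repeat, it may help to compare a label-based drop with a purely positional selection. The boolean-mask variant below is an alternative sketch, not the method used in this answer; it assumes NumPy is available and reuses the example frame from above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": range(9), "B": range(9), "C": range(9)},
    index=["a", "b", "b", "a", "b", "c", "c", "d", "e"],
)
pos = [1]

# Label-based drop: position 1 has label "b", so every row labelled "b" goes.
by_label = df.drop(df.index[pos])

# Positional selection: keep each row whose 0-based position is not in pos.
keep = ~np.isin(np.arange(len(df)), pos)
by_position = df[keep]

print(len(by_label), len(by_position))
```

The mask never consults the labels, so duplicate index values are left untouched and no temporary column is needed.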
Solution 2
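You can simply drop by index once the rows are numbered 0, 1, 2, ..., n-1. This removes the rows at positions 1, 2, 3, 4, 34, 100, 154, 155 and 199. Note that reset_index, drop and set_index all return a new DataFrame, so the result has to be assigned back.
df = df.reset_index() # moves the timestamp index into a column and renumbers the rows 0,1,2...n-1
df = df.drop([1, 2, 3, 4, 34, 100, 154, 155, 199]) # drops the rows at those positions
df = df.set_index('myTimeStamp') # restores the timestamp index; the column takes the old index name ('index' if it had none)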
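A minimal runnable sketch of these three steps, using a made-up frame with repeated timestamps. The data, the value column and the shortened rm_indexes list are assumptions for illustration; only the index name myTimeStamp comes from the answer.

```python
import pandas as pd

# Illustrative data: five rows, several of them sharing a timestamp.
stamps = pd.to_datetime(
    ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"]
)
df = pd.DataFrame(
    {"value": [10, 11, 12, 13, 14]},
    index=pd.Index(stamps, name="myTimeStamp"),
)
rm_indexes = [1, 3]  # 0-based row positions to remove

out = (
    df.reset_index()             # timestamp becomes a column; rows are numbered 0..n-1
      .drop(rm_indexes)          # labels now equal positions, so this drops by position
      .set_index("myTimeStamp")  # put the timestamp back as the index
)
print(out)
```

Only the two requested rows disappear, even though other rows share their timestamps.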
Author: Eghbal
Updated on June 15, 2022
Comments
-
Eghbal almost 2 years
Suppose that I have a data-frame (DF) and an array like this:
rm_indexes = np.array([1, 2, 3, 4, 34, 100, 154, 155, 199])
I want to remove the row numbers in rm_indexes from DF. A one in rm_indexes means row number one (the second row of DF), a three means row number three, and so on (the first row is row 0). The index column of this data-frame is a timestamp.
PS. I have many identical timestamps in the index of the data-frame.
-
Eghbal about 5 years
The index column of this data-frame is a timestamp.
-
jose_bacoy about 5 years
Oh, you changed the SO question. The index is a timestamp. Let me edit it.
-
Quang Hoang about 5 years
df.reset_index(inplace=True), and the same for drop.
-
Eghbal about 5 years
I checked this. I think the problem here is that I have identical timestamps as my index. So suppose that I remove row number 2 by your method. Not only will it remove row number two, it will also remove all other rows with the same timestamp.
-
Eghbal about 5 years
Should I add a new column with the name myTimeStamp? I think the third line should be df.index = df['time']