Slice Pandas DataFrame by Row
Solution 1
In [36]: df
Out[36]:
A B C D
a 0 2 6 0
b 6 1 5 2
c 0 2 6 0
d 9 3 2 2
In [37]: rows
Out[37]: ['a', 'c']
In [38]: df.drop(rows)
Out[38]:
A B C D
b 6 1 5 2
d 9 3 2 2
In [39]: df[~((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[39]:
A B C D
b 6 1 5 2
d 9 3 2 2
In [40]: df.loc[rows]
Out[40]:
A B C D
a 0 2 6 0
c 0 2 6 0
In [41]: df[((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[41]:
A B C D
a 0 2 6 0
c 0 2 6 0
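The session above can be reproduced as a standalone script. This is a minimal sketch: the frame is rebuilt from the values printed in Out[36], the variable names other than df and rows are made up for illustration, and label-based selection uses .loc.

```python
import pandas as pd

# Rebuild the frame shown in Out[36].
df = pd.DataFrame(
    {"A": [0, 6, 0, 9], "B": [2, 1, 2, 3], "C": [6, 5, 6, 2], "D": [0, 2, 0, 2]},
    index=["a", "b", "c", "d"],
)
rows = ["a", "c"]

# Remove the listed rows by index label.
dropped = df.drop(rows)

# Keep only the listed rows by index label.
kept = df.loc[rows]

# The same two results expressed as a boolean condition on the columns.
mask = (df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0)
kept_by_mask = df[mask]
dropped_by_mask = df[~mask]

print(dropped)
print(kept)
```

Neither drop nor boolean indexing modifies df in place; each returns a new DataFrame.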
Solution 2
If you already know the index labels, you can use .loc:
In [12]: df = pd.DataFrame({"a": [1,2,3,4,5], "b": [4,5,6,7,8]})
In [13]: df
Out[13]:
a b
0 1 4
1 2 5
2 3 6
3 4 7
4 5 8
In [14]: df.loc[[0,2,4]]
Out[14]:
a b
0 1 4
2 3 6
4 5 8
In [15]: df.loc[1:3]
Out[15]:
a b
1 2 5
2 3 6
3 4 7
ruben baetens
Updated on January 05, 2020
Comments
-
ruben baetens over 4 years
I am working with survey data loaded from an h5-file as hdf = pandas.HDFStore('Survey.h5') through the pandas package. Within this DataFrame, each row holds the results of a single survey, whereas the columns hold the answers to the questions within that survey.
I am aiming to reduce this dataset to a smaller DataFrame that includes only the rows with a certain answer to a certain question, i.e. the rows that all have the same value in this column. I am able to determine the index values of all rows that meet this condition, but I can't find how to delete these rows or make a new df with these rows only.
-
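For the case described in the question, a single-column condition is enough. The sketch below assumes a small in-memory frame instead of the HDF store; the names survey, q1 and q2 and all values are hypothetical.

```python
import pandas as pd

# Hypothetical survey data: one row per survey, one column per question.
survey = pd.DataFrame(
    {"q1": ["yes", "no", "yes", "maybe"], "q2": [3, 1, 4, 2]},
    index=[101, 102, 103, 104],
)

# Index labels of the rows whose answer to q1 is "yes".
matching_index = survey[survey["q1"] == "yes"].index

# New frame with only those rows, and another one without them.
only_matching = survey.loc[matching_index]
without_matching = survey.drop(matching_index)

print(only_matching)
print(without_matching)
```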
yoshiserry over 9 years
Is it possible to slice the DataFrame with a condition such as (C = 5 or C = 6), like this: df[((df.A == 0) & (df.B == 2) & (df.C == 5 or 6) & (df.D == 0))]
-
Wouter Overmeire over 9 years
df[((df.A == 0) & (df.B == 2) & df.C.isin([5, 6]) & (df.D == 0))] or df[((df.A == 0) & (df.B == 2) & ((df.C == 5) | (df.C == 6)) & (df.D == 0))]
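The two forms in this comment select the same rows. A short sketch on made-up data (the values below are assumptions, not the survey data from the question) shows both. The expression df.C == 5 or 6 from the earlier comment does not work because Python's or asks for the truth value of a whole Series, which pandas rejects with a ValueError.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {"A": [0, 0, 0, 9], "B": [2, 2, 2, 3], "C": [5, 6, 7, 5], "D": [0, 0, 0, 0]}
)

# Membership test with isin.
with_isin = df[(df.A == 0) & (df.B == 2) & df.C.isin([5, 6]) & (df.D == 0)]

# Element-wise "or" with the | operator; each comparison needs its own parentheses.
with_or = df[(df.A == 0) & (df.B == 2) & ((df.C == 5) | (df.C == 6)) & (df.D == 0)]

print(with_isin)
```

isin tends to read better once the list of accepted values grows beyond two or three.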
-
Phoenix Meadowlark about 4 years
It's worth a quick note that despite the notational similarity between df.loc[1:3] and some_list[1:3], the first uses an inclusive upper index while the second (and most of Python) uses an exclusive upper index.
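The difference is easy to check on the frame from Solution 2. In this sketch, some_list and its values are made up for illustration, and .iloc is included as the position-based counterpart that follows the usual exclusive convention.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [4, 5, 6, 7, 8]})

# Label-based slice: both endpoints are included (labels 1, 2 and 3).
by_label = df.loc[1:3]

# Position-based slice: the upper bound is excluded (positions 1 and 2).
by_position = df.iloc[1:3]

# Plain Python list slice: the upper bound is excluded as well.
some_list = [10, 20, 30, 40, 50]
list_slice = some_list[1:3]

print(len(by_label), len(by_position), len(list_slice))
```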