Slice Pandas DataFrame by Row

Solution 1

In [36]: df
Out[36]:
   A  B  C  D
a  0  2  6  0
b  6  1  5  2
c  0  2  6  0
d  9  3  2  2

In [37]: rows
Out[37]: ['a', 'c']

In [38]: df.drop(rows)
Out[38]:
   A  B  C  D
b  6  1  5  2
d  9  3  2  2

In [39]: df[~((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[39]:
   A  B  C  D
b  6  1  5  2
d  9  3  2  2

In [40]: df.loc[rows]
Out[40]:
   A  B  C  D
a  0  2  6  0
c  0  2  6  0

In [41]: df[((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[41]:
   A  B  C  D
a  0  2  6  0
c  0  2  6  0
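
The session above does not show how df and rows were created. As a minimal, self-contained sketch that reproduces it, assuming the frame is built from a plain dict (the construction is an assumption, not part of the original answer):

```python
import pandas as pd

# Assumed construction; the original session only displays the frame.
df = pd.DataFrame(
    {"A": [0, 6, 0, 9], "B": [2, 1, 2, 3], "C": [6, 5, 6, 2], "D": [0, 2, 0, 2]},
    index=["a", "b", "c", "d"],
)
rows = ["a", "c"]  # index labels of the rows that meet the condition

# Boolean mask that is True for the rows matching every condition.
mask = (df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0)

dropped = df.drop(rows)  # new frame without the listed rows
kept = df.loc[rows]      # new frame with only the listed rows

# The label-based and mask-based selections should agree.
print(dropped.equals(df[~mask]))
print(kept.equals(df[mask]))
```

Neither drop nor .loc changes df itself here; each returns a new DataFrame, so the result has to be assigned to a name to be kept.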

Solution 2

If you already know the index labels of the rows you want, you can select them with .loc:

In [12]: df = pd.DataFrame({"a": [1,2,3,4,5], "b": [4,5,6,7,8]})

In [13]: df
Out[13]:
   a  b
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8

In [14]: df.loc[[0,2,4]]
Out[14]:
   a  b
0  1  4
2  3  6
4  5  8

In [15]: df.loc[1:3]
Out[15]:
   a  b
1  2  5
2  3  6
3  4  7
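
Note that the label slice df.loc[1:3] returns three rows, because .loc includes the end label. A short sketch of the difference, using .iloc (positional indexing, which is not part of the original answer) for comparison:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [4, 5, 6, 7, 8]})

by_label = df.loc[1:3]      # labels 1, 2 and 3: the end label is included
by_position = df.iloc[1:3]  # positions 1 and 2: the end position is excluded

print(list(by_label.index))
print(list(by_position.index))
```

With the default integer index the labels and positions happen to coincide, which makes the two easy to confuse; on a frame with string labels only .loc accepts the labels.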
Author by

ruben baetens

Updated on January 05, 2020

Comments

  • ruben baetens over 4 years

    I am working with survey data loaded from an h5 file through the pandas package, as hdf = pandas.HDFStore('Survey.h5'). In this DataFrame, each row holds the results of a single survey, and the columns hold the answers to the questions in that survey.

    I want to reduce this dataset to a smaller DataFrame that includes only the rows with a particular answer to a particular question, i.e. rows that all share the same value in that column. I can determine the index values of all rows that meet this condition, but I can't find how to delete these rows or how to make a new df containing only these rows.

  • yoshiserry over 9 years
    Is it possible to slice the DataFrame on an either/or condition (C == 5 or C == 6), like this: df[((df.A == 0) & (df.B == 2) & (df.C == 5 or 6) & (df.D == 0))]?
  • Wouter Overmeire over 9 years
    df[((df.A == 0) & (df.B == 2) & df.C.isin([5, 6]) & (df.D == 0))] or df[((df.A == 0) & (df.B == 2) & ((df.C == 5) | (df.C == 6)) & (df.D == 0))]
  • Phoenix Meadowlark about 4 years
    It's worth noting that despite the notational similarity between df.loc[1:3] and some_list[1:3], the first includes the upper bound while the second (like most of Python) excludes it.
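
The either/or condition discussed in the comments can be sketched as follows. The frame is the one displayed in Solution 1, with its construction assumed. Python's `or` asks a whole Series for a single truth value, which pandas refuses to give, so the element-wise forms `isin` or `|` are needed instead:

```python
import pandas as pd

# Same frame as displayed in Solution 1 (construction assumed).
df = pd.DataFrame(
    {"A": [0, 6, 0, 9], "B": [2, 1, 2, 3], "C": [6, 5, 6, 2], "D": [0, 2, 0, 2]},
    index=["a", "b", "c", "d"],
)

# `or` needs one boolean for the whole Series, so pandas raises ValueError.
try:
    df[(df.C == 5 or 6)]
    failed = False
except ValueError:
    failed = True

# Element-wise alternatives: membership test, or two comparisons joined by |.
either = df[df.C.isin([5, 6])]
same = df[(df.C == 5) | (df.C == 6)]

print(failed)
print(either.equals(same))
```

When combining comparisons with & and |, each comparison needs its own parentheses, because those operators bind more tightly than ==.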