Filter Multiple Values using pandas


Solution 1

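You are missing a pair of parentheses around the first comparison. The | operator has higher precedence than == (see docs), so without them Python evaluates 'High' | (df['Col2'] == 'Medium') first instead of combining two boolean masks. Wrap each comparison in its own parentheses:

df = df.loc[(df['Col2'] == 'High') | (df['Col2'] == 'Medium')]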
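
To make the precedence point concrete, here is a minimal sketch. It rebuilds the three-row table from the question as a DataFrame, uses the standard-library ast module to show how Python groups the unparenthesized expression, and then applies the parenthesized filter. The variable names (tree, top, right, mask, filtered) are only illustrative.

```python
import ast

import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "Col1": ["Team 1", "Team 1", "Team 1"],
        "Col2": ["High", "Medium", "Low"],
        "Col3": ["Pizza", "Sauce", "Crust"],
    }
)

# Without parentheses, Python groups the expression as
#   a == ('High' | (a == 'Medium'))
# because | binds more tightly than ==. The parse tree confirms this.
tree = ast.parse("a == 'High' | (a == 'Medium')", mode="eval")
top = tree.body              # outermost node: the == comparison
right = top.comparators[0]   # its right-hand side: the | expression

# With each comparison parenthesized, | combines two boolean Series.
mask = (df["Col2"] == "High") | (df["Col2"] == "Medium")
filtered = df.loc[mask]
print(filtered)
```

The filtered frame keeps the High and Medium rows and drops the Low one.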

Solution 2

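This works as well and is more idiomatic: put the accepted values in a list and test membership with isin. The column name and values below are illustrative; for the data in the question the equivalent is df['Col2'].isin(['High', 'Medium']).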

country_list = ['brazil','poland','russia','countrydummy','usa']

filtered_df = df[df['Country Name'].isin(country_list)]
print(filtered_df)
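
Applied to the table from the question, the same idea might look like the sketch below. The list name wanted and the excluded_df variable are assumptions for illustration. The ~ operator inverts the boolean mask, which is handy when you want everything except the listed values.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "Col1": ["Team 1", "Team 1", "Team 1"],
        "Col2": ["High", "Medium", "Low"],
        "Col3": ["Pizza", "Sauce", "Crust"],
    }
)

# The list can hold any number of values and can be built at runtime.
wanted = ["High", "Medium"]

# Keep rows whose Col2 value appears in the list.
filtered_df = df[df["Col2"].isin(wanted)]

# Invert the mask with ~ to keep rows whose Col2 value is NOT in the list.
excluded_df = df[~df["Col2"].isin(wanted)]

print(filtered_df)
print(excluded_df)
```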

Solution 3

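You can also use DataFrame.query (available since pandas 0.13). A column name that contains a space must be wrapped in backticks, which requires pandas 0.25 or later; in double quotes it would be read as a plain string rather than a column:

filtered_df = df.query('`Country Name` == ["brazil", "poland", "russia", "countrydummy", "usa"]')

print(filtered_df)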
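
A short sketch of two query variants, assuming pandas 0.25 or later for the backtick syntax. The three sample rows are made up purely for illustration and are not from the question. The @ prefix lets the query string refer to a Python variable, so the list does not have to be written inline.

```python
import pandas as pd

# Hypothetical sample rows, invented for this example only.
df = pd.DataFrame(
    {
        "Country Name": ["brazil", "france", "usa"],
        "Col2": ["High", "Low", "Medium"],
    }
)

country_list = ["brazil", "poland", "usa"]

# Backticks quote a column name containing a space;
# @country_list refers to the local Python variable.
by_country = df.query("`Country Name` in @country_list")

# Inside query, comparing a column to a list with == behaves like `in`.
by_level = df.query("Col2 == ['High', 'Medium']")

print(by_country)
print(by_level)
```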
Author by

DataNoob

Updated on July 15, 2022

Comments

  • DataNoob
    DataNoob almost 2 years

    I am using Python and Pandas. I have a df that works similar to this:

     +--------+--------+-------+
     |  Col1  |  Col2  | Col3  |
     +--------+--------+-------+
     | Team 1 | High   | Pizza |
     | Team 1 | Medium | Sauce |
     | Team 1 | Low    | Crust |
     +--------+--------+-------+
    

    I would like to filter the df so that I only see High or Medium from Col2.

    This is what I have tried with no luck

     df = df.loc[df['Col2'] == 'High' | (df['Col2'] == 'Medium')]
    

    This is the error I am getting

     cannot compare a dtyped [bool] array with a scalar of type [bool]
    

    Any ideas how to make this work and what that error means?

  • KY Lu
    KY Lu over 3 years
    This solution is better because country_list can have any length.
  • Mattia
    Mattia over 3 years
    Thank you! Here are the docs for more details.