Select rows if string begins with certain characters in pandas

Solution 1

Use Series.str.startswith with the list converted to a tuple, then filter with DataFrame.loc and boolean indexing:

import pandas as pd

wdata = pd.DataFrame({'words':['what','and','how','good','yes']})

L = ['a','g']  # prefixes to match
# str.startswith accepts a tuple of prefixes, not a list
s = wdata.loc[wdata['words'].str.startswith(tuple(L)), 'words']
print(s)
1     and
3    good
Name: words, dtype: object
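
If the column can contain missing values, the boolean mask may itself contain NaN, and boolean indexing can then fail. A minimal sketch of one way to guard against that, assuming missing values should simply count as non-matches; the helper name rows_starting_with is hypothetical and not part of pandas:

```python
import pandas as pd

def rows_starting_with(df, column, prefixes):
    """Return the rows of df whose `column` value starts with any of `prefixes`."""
    # str.startswith needs a tuple of prefixes;
    # na=False makes missing values count as non-matches
    mask = df[column].str.startswith(tuple(prefixes), na=False)
    return df.loc[mask]

# Sample data with one missing value (illustrative only)
wdata = pd.DataFrame({'words': ['what', 'and', None, 'good', 'yes']})
result = rows_starting_with(wdata, 'words', ['a', 'g'])
print(result)
```

The na=False argument is the only change from the one-liner above; the rest is the same startswith plus loc pattern.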

Solution 2

To get relevant rows, extract the first letter, then use isin:

df
  words  frequency
0  what         10
1   and          8
2   how          8
3  good          5
4   yes          7

df[df['words'].str[0].isin(['a', 'g'])]
  words  frequency
1   and          8
3  good          5

If you want a specific column, use loc:

df.loc[df['words'].str[0].isin(['a', 'g']), 'words']
1     and
3    good
Name: words, dtype: object

df.loc[df['words'].str[0].isin(['a', 'g']), 'words'].tolist()
# ['and', 'good']
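
The str[0] approach only compares a single character. If the prefixes are longer but all the same length, the same idea should work with a slice instead of an index. A small sketch, assuming two-character prefixes chosen purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'words': ['what', 'and', 'how', 'good', 'yes'],
                   'frequency': [10, 8, 8, 5, 7]})

prefixes = ['wh', 'go']  # hypothetical prefixes, all of length n
n = 2

# Slice the first n characters of every word, then test membership
matches = df[df['words'].str[:n].isin(prefixes)]
print(matches)
```

For prefixes of mixed lengths, str.startswith with a tuple is probably the simpler choice.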

Solution 3

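This is easy and handy: you can use str.startswith directly. The match is case-sensitive, and the column name must match your data exactly (the column is assumed to be named Words here):

df[df.Words.str.startswith('G')]

df[df.Words.str.startswith('A')]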
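
Because str.startswith is case-sensitive, 'G' will not match 'good'. One possible way to ignore case is to lowercase the column before testing. This is a sketch with made-up sample data, assuming the column is named Words:

```python
import pandas as pd

# Illustrative data with mixed capitalization
df = pd.DataFrame({'Words': ['Apple', 'grape', 'Good', 'yes', 'and']})

prefixes = ('a', 'g')  # lowercase prefixes to look for

# Lowercase a temporary copy of the column, then test the prefixes;
# the original capitalization is kept in the result
mask = df['Words'].str.lower().str.startswith(prefixes)
matches = df[mask]
print(matches)
```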
Author by

programming freak

Beginner Python developer who loves to play with code and explore new areas to know what is happening

Updated on June 26, 2022

Comments

  • programming freak, almost 2 years

    I have a CSV file, as shown in the picture below.

    I'm trying to find every word that starts with the letter A or G, or with any letter from a list that I choose,

    but my code returns an error. Any ideas what I'm doing wrong? This is my code:

    if len(sys.argv) == 1:
        print("please provide a CSV file to analys")
    else:
        fileinput = sys.argv[1]
    
    wdata = pd.read_csv(fileinput)
    
    
    print( list(filter(startswith("a","g"), wdata)) )
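
For reference, the code in the question fails for a few reasons: startswith is a string method rather than a standalone function, iterating over a DataFrame yields its column names rather than its rows, and the CSV is read even when no file name was given. A hedged sketch of how the script might be restructured, assuming the CSV has a column named words (the real column name is not visible in the question); the function names are hypothetical:

```python
import sys
import pandas as pd

def words_starting_with(csv_source, prefixes, column="words"):
    """Read a CSV and return the values of `column` that start with any prefix."""
    wdata = pd.read_csv(csv_source)
    # A tuple of prefixes is required; na=False skips missing values
    mask = wdata[column].str.startswith(tuple(prefixes), na=False)
    return wdata.loc[mask, column].tolist()

def main(argv):
    # Only read the file when a path was actually supplied
    if len(argv) < 2:
        print("please provide a CSV file to analyze")
        return 1
    print(words_starting_with(argv[1], ["a", "g"]))
    return 0

if __name__ == "__main__":
    main(sys.argv)
```

Run as a script with the CSV path as the first argument, this should print the matching words as a list.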