Find matching words in a list and a string

Solution 1

You still have to check each keyword, at least until one is found in the text, but it can be written more concisely:

keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']

# any() stops at the first keyword that occurs in all_text
if any(word in all_text for word in keyword_list):
    print('found one of em')
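
If you also need to know which keyword matched, a small variation on the same idea might look like this. It is only a sketch: the helper name `first_match` is made up for illustration, and it does the same substring test as above.

```python
def first_match(keywords, text):
    """Return the first keyword found as a substring of text, or None."""
    # The generator is consumed lazily, so the scan stops at the first hit,
    # just as any() does.
    return next((word for word in keywords if word in text), None)


keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']
print(first_match(keyword_list, 'I ride my bike to work'))   # bike
print(first_match(keyword_list, 'some rather long string'))  # None
```

Keywords are tried in list order, so with overlapping keywords the earlier one in the list is reported.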

Solution 2

How about this? Note that it matches only whole whitespace-separated words, not substrings.

>>> keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike', "long"]
>>> all_text = 'some rather long string'
>>> if set(keyword_list).intersection(all_text.split()):
...     print("Found One")
Found One
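
Since `split()` only breaks on whitespace, a word followed by punctuation ("long,") would be missed. A possible workaround, sketched below, is to tokenize with a regular expression first. The function name `keywords_as_words` is hypothetical, and the sketch assumes the keywords are already lowercase.

```python
import re


def keywords_as_words(keywords, text):
    """Return the keywords that appear as whole words in text."""
    # \w+ extracts runs of letters, digits, and underscores, so
    # surrounding punctuation is dropped. Lowercasing the text makes
    # the match case-insensitive for lowercase keywords.
    tokens = set(re.findall(r"\w+", text.lower()))
    return set(keywords) & tokens


keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike', 'long']
print(keywords_as_words(keyword_list, 'Some rather long, winding string.'))  # {'long'}
print(keywords_as_words(keyword_list, 'I love motorcycles'))                 # set()
```

The second call finds nothing because "motorcycles" is a different token from "motorcycle"; that is the trade-off against the substring approach.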

Solution 3

One way would be to build a prefix tree (trie) out of the keyword list. Then you can iterate through the long string character by character. At each position, you try to match the text starting there against the prefix tree. Each such lookup takes at most O(m) steps, where m is the length of the longest keyword, independent of the number of keywords k. If the long string has length n, the overall complexity is O(n m), which is much better than the naive O(n k) when k is large.
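
A minimal sketch of the prefix tree idea, using nested dicts as the tree. The names `build_trie`, `find_keywords`, and `END` are made up for this example. Each starting position walks down the tree for at most as many steps as the longest keyword has characters.

```python
END = None  # sentinel key marking that a complete keyword ends at this node


def build_trie(keywords):
    """Build a prefix tree as nested dicts keyed by character."""
    root = {}
    for word in keywords:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word  # remember which keyword ends here
    return root


def find_keywords(trie, text):
    """Return the set of keywords that occur anywhere in text."""
    found = set()
    for start in range(len(text)):
        node = trie
        i = start
        # Follow the tree for as long as the text keeps matching.
        while i < len(text) and text[i] in node:
            node = node[text[i]]
            i += 1
            if END in node:
                found.add(node[END])
    return found


keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']
trie = build_trie(keyword_list)
print(sorted(find_keywords(trie, 'my dirtbike and motorcycle')))
# ['bike', 'cycle', 'dirtbike', 'motorcycle']
```

To get the yes/no answer from the question, test `bool(find_keywords(trie, all_text))`, or return as soon as the first keyword is found.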

Solution 4

Using a regular expression is probably the fastest way.

import re
re.findall(r'motorcycle|bike|cycle|dirtbike', text)

This returns a list of every occurrence of the listed words in text.
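
With a long keyword list you probably don't want to type the pattern by hand. One way to build it from the list is sketched here; the sample `text` is invented for the example.

```python
import re

keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']

# re.escape makes characters such as '.' or '+' match literally.
# Longer keywords go first so that, when two keywords could start at the
# same position, the alternation prefers the longer one.
ordered = sorted(keyword_list, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(word) for word in ordered))

text = 'I sold my dirtbike and bought a motorcycle'
print(pattern.findall(text))              # ['dirtbike', 'motorcycle']
print(bool(pattern.search('no match')))   # False
```

If only a yes/no answer is needed, `pattern.search(text)` stops at the first match, while `findall` scans the whole string. Note that `findall` reports non-overlapping matches, so the "bike" inside "dirtbike" is not listed separately.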

Solution 5

You need to define all_text as a variable, or it won't work:

keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']
all_text = input("what kind of bike do you like?")
for item in keyword_list:
    if item in all_text:
        print('found one of em')
Author by

clifgray

Updated on August 03, 2021

Comments

  • clifgray (over 2 years ago)

    I am writing some code in Python and I want to check whether any word from a list occurs in a long string. I know I could iterate through the string multiple times, and that may amount to the same thing, but I wanted to see if there is a faster way to do it. What I am currently doing is this:

        all_text = 'some rather long string'
        if "motorcycle" in all_text or 'bike' in all_text or 'cycle' in all_text or 'dirtbike' in all_text:
            print('found one of em')
    

    but what I want to do is this:

        keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']
        if item in keyword_list in all_text:
            print('found one of em')
    

    Is there any way to do this efficiently? I realize I could do:

        keyword_list = ['motorcycle', 'bike', 'cycle', 'dirtbike']
        for item in keyword_list:
            if item in all_text:
                print('found one of em')
    

    But it seems like there would be a better way once the keyword list becomes long.