Python regex for finding all words in a string


Use the word boundary \b on both sides of \w+:

import re

shop = "hello seattle what have you got"
regex = r'\b\w+\b'
list1 = re.findall(regex, shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']

Or simply \w+ is enough, because + is greedy and already matches each whole run of word characters:

import re

shop = "hello seattle what have you got"
regex = r'\w+'
list1 = re.findall(regex, shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']
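
As a rough sketch of why the two patterns from the question fall short, the snippet below runs all four patterns side by side. The variable names other than `shop` are illustrative only and do not come from the original post.

```python
import re

shop = "hello seattle what have you got"

# Both patterns from the answer return the same list here.
with_boundaries = re.findall(r'\b\w+\b', shop)
without_boundaries = re.findall(r'\w+', shop)

# First attempt from the question: the pattern demands a trailing space,
# so the last word (which has none) is never matched.
needs_space = re.findall(r'(\w*) ', shop)

# Second attempt: both \w* and \W* may match nothing, so the pattern also
# succeeds on the empty string at the very end of the input.
allows_empty = re.findall(r'(\w*)\W*', shop)

print(with_boundaries)
print(with_boundaries == without_boundaries)
print(needs_space)
print(allows_empty)
```

Switching from * to + is what removes the trailing empty string: \w+ needs at least one word character, so it cannot match at the end of the input.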
Author by

TNT

Just another SE at work

Updated on July 23, 2022

Comments

  • TNT almost 2 years

    Hello, I am new to regex and I'm starting out with Python. I'm stuck at extracting all words from an English sentence. So far I have:

    import re
    
    shop = "hello seattle what have you got"
    regex = r'(\w*) '
    list1 = re.findall(regex, shop)
    print(list1)
    

    This gives the output:

    ['hello', 'seattle', 'what', 'have', 'you']

    If I replace the regex with

    regex = r'(\w*)\W*'
    

    then the output is:

    ['hello', 'seattle', 'what', 'have', 'you', 'got', '']

    whereas I want this output:

    ['hello', 'seattle', 'what', 'have', 'you', 'got']

    Please point out where I am going wrong.