Better way to remove multiple words from a string?
Solution 1
Here's a solution with regex:
import re

def RemoveBannedWords(toPrint, database):
    # Build one case-insensitive alternation from the banned words.
    pattern = re.compile(r"\b(" + "|".join(database) + r")\W", re.I)
    return pattern.sub("", toPrint)

bannedWord = ["Good", "Bad", "Ugly"]
toPrint = "Hello Ugly Guy, Good To See You."
print(RemoveBannedWords(toPrint, bannedWord))
Solution 2
I use
bannedWord = ['Good','Bad','Ugly']
toPrint = 'Hello Ugly Guy, Good To See You.'
print(' '.join(i for i in toPrint.split() if i not in bannedWord))
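The one-liner above is case-sensitive and only drops tokens that match a banned word exactly, so 'ugly' or 'Good,' would be kept. A minimal sketch of a more tolerant variant follows; the function name remove_banned_tokens is made up for illustration, and it assumes it is acceptable to drop punctuation attached to a removed word.

```python
import string

def remove_banned_tokens(text, banned):
    # Compare tokens case-insensitively and ignore surrounding punctuation.
    banned_lower = {word.lower() for word in banned}
    kept = [
        token for token in text.split()
        if token.strip(string.punctuation).lower() not in banned_lower
    ]
    # Note: joining with a single space collapses any run of whitespace.
    return " ".join(kept)

print(remove_banned_tokens("Hello ugly Guy, Good, To See You.", ["Good", "Bad", "Ugly"]))
```

Using a set for the lowered words keeps each membership test cheap when the banned list grows.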
Solution 3
A slight variation on Ajay's code, for the case where one entry in the bannedWord list is a substring of another:
bannedWord = ['good', 'bad', 'good guy', 'ugly']
With toPrint = 'good winter good guy', the result would be
RemoveBannedWords(toPrint, database=bannedWord) = 'winter guy'
because the regex tries the alternatives in order and matches good before it ever tries good guy. The entries therefore need to be sorted by length, longest first.
import re

def RemoveBannedWords(toPrint, database):
    # Longest entries first, so 'good guy' is tried before 'good'.
    database_1 = sorted(database, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(database_1) + r")\W", re.I)
    # The appended space lets \W match after a banned last word; [:-1] drops one character again.
    return pattern.sub("", toPrint + ' ')[:-1]

bannedWord = ['good', 'bad', 'good guy', 'ugly']
toPrint = 'good winter good guy.'
print(RemoveBannedWords(toPrint, bannedWord))
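To see why the order of the alternatives matters, here is a small sketch that builds the same pattern twice, once with the shortest entries first and once with the longest first. The helper name build_pattern is only for illustration.

```python
import re

words = ["good", "bad", "good guy", "ugly"]
text = "good winter good guy."

def build_pattern(words_in_order):
    # Python's re tries alternatives left to right and takes the first that matches.
    return re.compile(r"\b(" + "|".join(words_in_order) + r")\W", re.I)

shortest_first = build_pattern(sorted(words, key=len))
longest_first = build_pattern(sorted(words, key=len, reverse=True))

print(repr(shortest_first.sub("", text)))  # 'winter guy.'
print(repr(longest_first.sub("", text)))   # 'winter '
```

With the shortest entries first, 'good' always wins and 'guy.' is left behind; with the longest first, 'good guy.' is removed as a unit.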
Solution 4
Yet another variation on a theme. If you are going to be calling this a lot, then it is best to compile the regex once to improve the speed:
import re

bannedWord = ['Good', 'Bad', 'Ugly']
# Compiled once at import time and reused by every call.
re_banned_words = re.compile(r"\b(" + "|".join(bannedWord) + r")\W", re.I)

def RemoveBannedWords(toPrint):
    # Reading a module-level name needs no global statement.
    return re_banned_words.sub("", toPrint)

toPrint = 'Hello Ugly Guy, Good To See You.'
print(RemoveBannedWords(toPrint))
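If a module-level pattern feels awkward, the same compile-once idea can be sketched with a small factory function. The name make_banned_word_remover is hypothetical, and the call to re.escape is an addition that is not in the answers above: it makes entries containing regex metacharacters match literally.

```python
import re

def make_banned_word_remover(banned):
    # Escape each entry and put longer entries first so they win over their prefixes.
    ordered = sorted(banned, key=len, reverse=True)
    alternatives = "|".join(re.escape(word) for word in ordered)
    # The pattern is compiled once here and captured by the inner function.
    pattern = re.compile(r"\b(?:" + alternatives + r")\W", re.I)

    def remove(text):
        return pattern.sub("", text)

    return remove

remove_banned = make_banned_word_remover(["Good", "Bad", "Ugly"])
print(remove_banned("Hello Ugly Guy, Good To See You."))  # Hello Guy, To See You.
```

Each remover carries its own compiled pattern, so several banned lists can be used side by side without sharing global state.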
Comments
Andy Wong almost 2 years
bannedWord = ["Good", "Bad", "Ugly"]
def RemoveBannedWords(toPrint, database):
    statement = toPrint
    for x in range(0, len(database)):
        if bannedWord[x] in statement:
            statement = statement.replace(bannedWord[x] + " ", "")
    return statement
toPrint = "Hello Ugly Guy, Good To See You."
print(RemoveBannedWords(toPrint, bannedWord))
The output is
Hello Guy, To See You.
Knowing Python, I feel like there is a better way to replace several words in a string. I looked up some similar solutions using dictionaries, but they didn't seem to fit this situation.
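One reason to look for a better way: str.replace works on raw substrings, so the loop above also cuts a banned word out of the middle of a longer word, and it misses a banned word that is followed by punctuation instead of a space. A minimal sketch of both effects, with a made-up function name and input:

```python
def remove_with_replace(text, banned):
    # Same idea as the loop in the question: replace "<word> " with nothing.
    for word in banned:
        text = text.replace(word + " ", "")
    return text

result = remove_with_replace("A NoGood plan, Good or Bad.", ["Good", "Bad"])
print(result)  # A Noplan, or Bad.
```

"NoGood " loses its tail because it contains "Good ", while "Bad." survives because no space follows it.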
questionto42standswithUkraine over 3 years
Best answer, strange that it has so few votes. Add a star "*" to the \\W if you need to find embedded words: re.compile(r"\b(" + "|".join(list_not_for_search) + ")\\W*", re.I). Like in 'Hello uglyyy guy, good do see you.', which will exclude the 'ugly' and give out the 'yy' as the rest. By the way: re.I stands for re.IGNORECASE.
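The difference between \W and \W* from this comment can be checked with a short sketch; the variable names are chosen only for illustration.

```python
import re

words = ["good", "bad", "ugly"]
text = "Hello uglyyy guy, good do see you."

# \W requires one non-word character after the banned word.
whole_words = re.compile(r"\b(" + "|".join(words) + r")\W", re.I)
# \W* also accepts zero such characters, so a banned prefix of a longer word matches.
embedded = re.compile(r"\b(" + "|".join(words) + r")\W*", re.I)

print(whole_words.sub("", text))  # Hello uglyyy guy, do see you.
print(embedded.sub("", text))     # Hello yy guy, do see you.
```

Because the leading \b is still in place, the \W* version only matches a banned word at the start of a longer word, not in its middle or at its end.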