How to replace words in a string using a dictionary mapping

14,903

Solution 1

Here is one way.

a = "you don't need a dog"

d =  {"don't": "do not" }

res = ' '.join([d.get(i, i) for i in a.split()])

# 'you do not need a dog'

Explanation

  • Never name a variable after a built-in class, since that shadows the built-in; e.g. use d instead of dict.
  • Use str.split to split by whitespace.
  • There is no need to wrap str around values which are already strings.
  • str.join runs marginally faster with a list comprehension than with a generator expression, since in CPython join materializes its argument into a sequence first anyway.
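The points above can be wrapped in a small helper. This is a minimal sketch, and the name replace_words is hypothetical. It also shows a limitation of splitting on whitespace: a word with punctuation attached (such as "don't,") is a different token and is left unchanged, and runs of whitespace collapse to single spaces.

```python
def replace_words(text, mapping):
    """Replace whole whitespace-separated tokens that appear as keys in mapping."""
    # dict.get(token, token) falls back to the token itself when it has no entry.
    return ' '.join([mapping.get(token, token) for token in text.split()])


d = {"don't": "do not"}

print(replace_words("you don't need a dog", d))
# you do not need a dog

# A token with punctuation attached is not a key in the mapping, so it is kept.
print(replace_words("no, you don't, really", d))
# no, you don't, really
```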

Solution 2

All answers are correct, but if your sentence is quite long and the mapping dictionary rather small, consider iterating over the items (key-value pairs) of the dictionary and applying str.replace to the original sentence.

Here is the code suggested in the other answers. It takes 6.35 µs per loop.

%%timeit

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = ' '.join([mapping.get(i, i) for i in search.split()])

Let's try using str.replace instead. It takes 633 ns per loop.

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

for key, value in mapping.items():
    search = search.replace(key, value)

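Now let's try a Python 3 list comprehension. It takes 1.09 µs per loop, which is slower than the plain loop above (633 ns) but still well ahead of the split-and-join version (6.35 µs). Note that it is only correct for a mapping with a single entry.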

%%timeit 

search = "you don't need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?"
mapping =  {"don't": "do not" }

search = [search.replace(key, value) for key, value in mapping.items()][0]
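The list-comprehension version only suits a single-entry mapping: each search.replace call starts from the original string, and [0] keeps just the first result. A minimal sketch of the difference, using a hypothetical second entry in the mapping; the plain loop from the previous snippet chains the replacements and handles this case:

```python
search = "you don't need a dog. Or a cat?"
mapping = {"don't": "do not", "cat": "kitten"}  # second entry is hypothetical

# Each replace() starts from the original string; [0] keeps only the first result.
first_only = [search.replace(key, value) for key, value in mapping.items()][0]
print(first_only)
# you do not need a dog. Or a cat?

# Reassigning inside a loop feeds each result into the next replace().
chained = search
for key, value in mapping.items():
    chained = chained.replace(key, value)
print(chained)
# you do not need a dog. Or a kitten?
```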

Do you see the difference? For your short sentence, the first and third snippets run at about the same speed, but the longer the sentence (search string) gets, the more pronounced the difference in performance becomes.
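To check how the gap changes with sentence length outside IPython, the standard timeit module can be used. This is a rough sketch with hypothetical helper names (split_join, replace_loop); absolute numbers depend on the machine and Python version, so none are shown here.

```python
import timeit

mapping = {"don't": "do not"}
sentence = "you don't need a dog. but if you like dogs, you should think of getting one. "


def split_join(text):
    # Split into words, look each one up, and join again.
    return ' '.join([mapping.get(word, word) for word in text.split()])


def replace_loop(text):
    # One str.replace pass over the text per dictionary entry.
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text


# Time both approaches on progressively longer inputs.
for repeat in (1, 10, 100):
    text = sentence * repeat
    t_split = timeit.timeit(lambda: split_join(text), number=1000)
    t_replace = timeit.timeit(lambda: replace_loop(text), number=1000)
    print(f"{len(text):>6} chars  split/join: {t_split:.4f}s  str.replace: {t_replace:.4f}s")
```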

Result string is:

'you do not need a dog. but if you like dogs, you should think of getting one for your own. Or a cat?'

Remark: str.replace also replaces occurrences inside longer words, so you need to make sure that only whole words are replaced. str.replace itself has no option for this (its only optional argument is count). Another idea is to use regular expressions, as explained in this posting, since they can also handle lower and upper case. Padding the keys of your lookup dictionary with surrounding white space won't work, since you would then miss occurrences at the beginning or end of a sentence.
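A minimal sketch of the regular-expression idea, assuming all keys are lower case and start and end with a letter or digit (otherwise \b does not behave as a word boundary). The "dog" entry and the name replace_whole_words are hypothetical. Note that the replacement text is inserted as written, so the capitalization of the matched word is not carried over.

```python
import re

mapping = {"don't": "do not", "dog": "cat"}  # "dog" entry is hypothetical

# Longest keys first, so a longer key wins over a shorter one it contains.
keys = sorted(mapping, key=len, reverse=True)
# \b anchors each match at a word boundary; re.escape protects special characters.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
    flags=re.IGNORECASE,
)


def replace_whole_words(text):
    # Look up the lower-cased match, since the keys are assumed to be lower case.
    return pattern.sub(lambda m: mapping[m.group(0).lower()], text)


print(replace_whole_words("You Don't need a dog, but dogs are nice."))
# You do not need a cat, but dogs are nice.
```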

Solution 3

You need to split your sentence on ' ' with split(' '). If you simply iterate over a string, you iterate over its characters:

a = "you don't need a dog"

for word in a:  # that's what you are using as input to your dict-key-replace
    print(word) # the single characters are never matched, that's why yours does not work.

Output:

y
o
u

d
o
n
'
t

n
e
e
d

a

d
o
g

Read How to debug small programs

After that, read How to split a string into a list? or use jpp's solution.

Author by

A.Papa

By Day: Jr. Data Scientist By Night: Philosophical conversation of how Open Source can change the world

Updated on June 07, 2022

Comments

  • A.Papa
    A.Papa almost 2 years

    I have the following sentence

    a = "you don't need a dog"
    

    and a dictionary

    dict =  {"don't": "do not" }
    

    But I can't use the dictionary to map words in the sentence using the below code:

    ''.join(str(dict.get(word, word)) for word in a)
    

    Output:

    "you don't need a dog"
    

    What am I doing wrong?

  • Abdul Niyas P M
    Abdul Niyas P M about 6 years
    Do we need to build an extra list inside join?
  • Matthias
    Matthias about 6 years
    If the lookup dictionary is rather short, wouldn’t it be better to iterate through its items and then apply str.replace to the search string? I guess that performs much better than splitting the long search string into words and iterating through all of them.
  • jpp
    jpp about 6 years
    @Matthias, agreed. I suggest you can write that up as an alternative solution. Definitely a valid approach to the problem.
  • Sameh
    Sameh over 3 years
    search = [search.replace(key, value) for key, value in mapping.items()][0]: if your mapping dictionary has more than one term, this returns only the first replacement.