How do I compare two sentence strings for similarity in Python?


You can use SequenceMatcher(...).ratio() from difflib, which works on whole sentences as well as single words, e.g.:

from difflib import SequenceMatcher

a = "I love Coding"
b = "I love Codiing"

ratio = SequenceMatcher(None, a, b).ratio()
# 0.9629629629629629
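
Applied to the two tweets from the question, the same call can be wrapped in a small helper. This is only a sketch: `similarity`, `THRESHOLD`, and the variable names are hypothetical, and the 0.5 cutoff is simply the value the asker said they expected to exceed. The last lines show one possible cause of the 0.0 result. This is an assumption, since the asker's code is not shown: if the leading `None` is omitted, the first string is taken as the `isjunk` argument and the second sequence defaults to an empty string.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the strings are identical."""
    # The first positional argument is isjunk, so pass None explicitly.
    return SequenceMatcher(None, a, b).ratio()


tweet_1 = ("Trump Administration Dismisses Surgeon General Vivek Murthy "
           "(http)PUGheO7BuT5LUEtHDcgm")
tweet_2 = ("Trump Administration Dismisses Surgeon General Vivek Murthy "
           "(http)avGqdhRVOO")

THRESHOLD = 0.5  # hypothetical cutoff, taken from the asker's expectation

score = similarity(tweet_1, tweet_2)
is_near_duplicate = score > THRESHOLD
print(is_near_duplicate)  # True: the long shared prefix dominates the ratio

# Pitfall (assumed cause of the 0.0 in the question): without the leading
# None, tweet_1 becomes isjunk, tweet_2 becomes the first sequence, and the
# second sequence is the empty string, so nothing can match.
wrong_score = SequenceMatcher(tweet_1, tweet_2).ratio()
print(wrong_score)  # 0.0
```

The right cutoff depends on the data; tweets that differ only in a trailing link score well above 0.5, so a stricter value may be worth trying.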

Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin over 1 year

    I would first like to say that I am using Tweepy. I found a way to filter out identical strings, but I am having a hard time filtering out similar strings.

    I have two sentence strings that I need to compare (Tweepy keyword = "Donald Trump"):

    String 1: "Trump Administration Dismisses Surgeon General Vivek Murthy (http)PUGheO7BuT5LUEtHDcgm"

    String 2: "Trump Administration Dismisses Surgeon General Vivek Murthy (http)avGqdhRVOO"

    As you can see, they are similar but not the same. I need a way to compare the two and get a numeric value to decide whether the second tweet should be added alongside the first. I thought I had the solution when I used SequenceMatcher(), but it always printed 0.0, and I was expecting a value greater than 0.5. SequenceMatcher only seems to work for one-word strings (correct me if I am wrong).

    Now you are probably thinking, "just slice off the http portions". That won't work either, because it does not account for tweets that start with an account name, like @cars: xyz zyx and @trucks: xyz zyx.

    Is there some way to compare the two texts? It should be simple, but for some reason the solution eludes me. I just learned Python a week ago, and it still feels weird using indentation to tell what is inside a function and what is not.