Excel Exact Word Matching

18,936

Solution 1

I believe to correctly cover cases you have to pad spaces before and after the term "gas" and the search term. This will ensure that gas will be found at the beginning or end of a cell, and also prevent it from being found in the middle of any words. Your post does not indicate whether punctuation can exist in the file, but to accomodate punctuation padding spaces around the search will not work correctly, you would have to include the case of " gas. " " gas! " etc to allow for any punctuation specifically. If you are worried about catching values like "gas.cost" or similar you can use the same padding around the punctuation search.

=Or(ISNUMBER(SEARCH(" gas ", " "&A1&" ")),ISNUMBER(SEARCH(" gas. ", " "&A1&" ")))

Is a basic search that should return the word gas by itself, or "gas." By padding a space after "gas." in the search it will find it as the final word in a sentence, or at the end of a cell.

Edit: Dropped a parentheses.

Solution 2

One can also use regular expressions in VBA to accomplish this. In Regular Expressions, "\b" represents a word boundary. A word boundary is defined as the position between a word and a non-word character or the beginning or end of the line. Word characters are [A-Za-z0-9_] (letters, digits, and the underscore). Hence, one can use this UDF. You do need to be aware that words which include non-word characters (e.g. a hyphen) may be treated differently than you expect. And if you are dealing with non-English letters, the Pattern would need to be modified.

But the code is fairly compact.

Option Explicit
Function reFindWord(FindWord As String, SearchText As String, Optional MatchCase As Boolean = False) As Boolean
    Dim RE As Object
    Dim sPattern As String
Set RE = CreateObject("vbscript.regexp")
sPattern = "\b" & FindWord & "\b"
With RE
    .Pattern = sPattern
    .ignorecase = Not MatchCase
    reFindWord = .test(SearchText)
End With
End Function

Solution 3

I think the only way to cover all possible punctuation surrounding the search word is to create a custom macro function. Use the enhanced split function to tokenize the sentence into an array of words then search the array for a match.

Enhanced split function https://msdn.microsoft.com/en-us/library/aa155763

How to create custom macro http://www.wikihow.com/Create-a-User-Defined-Function-in-Microsoft-Excel

Code to create FindEngWord function

Public Function FindEngWord(ByVal TextToSearch As String, ByVal WordToFind As String) As Boolean

Dim WrdArray() As String
Dim text_string As String
Dim isFound As Boolean

isFound = False

text_string = TextToSearch

WrdArray() = Split(text_string)

isFound = False
For i = 0 To UBound(WrdArray)
    If LCase(WrdArray(i)) = LCase(WordToFind) Then
        isFound = True
    End If
Next i

FindEngWord = isFound

End Function


Public Function Split(ByVal InputText As String, _
         Optional ByVal Delimiter As String) As Variant

' This function splits the sentence in InputText into
' words and returns a string array of the words. Each
' element of the array contains one word.

    ' This constant contains punctuation and characters
    ' that should be filtered from the input string.
    Const CHARS = ".!?,;:""'()[]{}"
    Dim strReplacedText As String
    Dim intIndex As Integer

    ' Replace tab characters with space characters.
    strReplacedText = Trim(Replace(InputText, _
         vbTab, " "))

    ' Filter all specified characters from the string.
    For intIndex = 1 To Len(CHARS)
        strReplacedText = Trim(Replace(strReplacedText, _
            Mid(CHARS, intIndex, 1), " "))
    Next intIndex

    ' Loop until all consecutive space characters are
    ' replaced by a single space character.
    Do While InStr(strReplacedText, "  ")
        strReplacedText = Replace(strReplacedText, _
            "  ", " ")
    Loop

    ' Split the sentence into an array of words and return
    ' the array. If a delimiter is specified, use it.
    'MsgBox "String:" & strReplacedText
    If Len(Delimiter) = 0 Then
        Split = VBA.Split(strReplacedText)
    Else
        Split = VBA.Split(strReplacedText, Delimiter)
    End If
End Function

Can be called from your excel sheet with this.

=FindEngWord(A1,"gas")
Share:
18,936
Chris J. Vargo
Author by

Chris J. Vargo

Assistant Professor at The University of Alabama. Yes, we get Internet down here. Still learning python. That might explain the rudimentary of my questions, but hey, at least I'm trying right?

Updated on July 14, 2022

Comments

  • Chris J. Vargo
    Chris J. Vargo almost 2 years

    Let's say I have "Vegas is great" in cell A1. I want to write a formula that looks for the exact word, "gas" in cells. Vegas ≠ gas, but the only search formula I'm finding:

     =ISNUMBER(SEARCH("gas",lower(A1))
    

    returns true. Is there anyway to do do exact matching? I'd ideally like it to be non-case sensitive which I believe is satisfied by wrapping A1 in lower().

  • Ron Rosenfeld
    Ron Rosenfeld over 9 years
    When I tested, it returned FALSE when Gas was the first word in the string. e.g: gas is good --> FALSE
  • Padhraic
    Padhraic over 9 years
    I tried to reproduce your error but that string "gas is good" and other strings with "gas" at the start worked fine for me. Returned TRUE.
  • Ron Rosenfeld
    Ron Rosenfeld over 9 years
    I took a closer look and wonder if you are using the same code that you have posted. In particular, this line: For i = 1 To UBound(WrdArray) Since WrdArray is the result of VBA.Split, and since that will always be zero-based, your posted code will never check the first word.
  • Padhraic
    Padhraic over 9 years
    Yes you are right. Don't know what I did the last time I checked to make me miss it. The array is 0 based. Thanks for the save. I have edited my code, it should work now.
  • Ron Rosenfeld
    Ron Rosenfeld over 9 years
    I would also suggest, in general, not using words which are the same as VBA functions for your own functions. Especially since Split (by itself) will also invoke VBA.Split if your function were not present. Perhaps rename it mySplit or xSplit. Less confusing and easier to debug for other projects.