Regex Word Macro that finds two words within a range of each other and then italicizes those words?

I'm a long way off being a decent Word programmer, but this might get you started.

EDIT: updated to include a parameterized version.

Sub Tester()

    HighlightIfClose ActiveDocument, "panama", "canal", wdBrightGreen
    HighlightIfClose ActiveDocument, "red", "socks", wdRed

End Sub


Sub HighlightIfClose(doc As Document, word1 As String, _
                     word2 As String, clrIndex As WdColorIndex)
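    'requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    '(Tools > References in the VBA editor)
    Dim re As RegExp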
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As match

    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0,10}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    'use the document that was passed in, not whichever one is active
    For Each para In doc.Paragraphs

      txt = para.Range.Text

      'any match?
      If re.Test(txt) Then
        'get all matches
        Set allmatches = re.Execute(txt)
        'look at each match and highlight the first and last word of it
        For Each m In allmatches
            Debug.Print m.Value, m.FirstIndex, m.Length
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex
            rng.MoveEnd wdCharacter, Len(word1)
            rng.HighlightColorIndex = clrIndex
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex + (m.Length - Len(word2))
            rng.MoveEnd wdCharacter, Len(word2)
            rng.HighlightColorIndex = clrIndex
        Next m
      End If

    Next para

End Sub
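
Since the question asks for italics rather than highlighting, the same offset arithmetic can drive Font.Italic instead of HighlightColorIndex. The following is only a sketch under a few assumptions: the names ItalicizeIfClose, ItalicizeChars and maxGap are made up for illustration, the two words contain no regex metacharacters, and the paragraph text lines up character for character with the Range (fields or other special content could shift the offsets).

```vba
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
' ItalicizeIfClose, ItalicizeChars and maxGap are hypothetical names.
Sub ItalicizeIfClose(doc As Document, word1 As String, _
                     word2 As String, Optional maxGap As Long = 10)
    Dim re As RegExp
    Dim para As Paragraph
    Dim m As Match

    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0," & maxGap & "}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs
        For Each m In re.Execute(para.Range.Text)
            ' word1 sits at the start of the match, word2 at its end
            ItalicizeChars para.Range, m.FirstIndex, Len(word1)
            ItalicizeChars para.Range, _
                           m.FirstIndex + m.Length - Len(word2), Len(word2)
        Next m
    Next para
End Sub

' Italicize charCount characters starting startOffset characters
' into paraRng (offsets are zero-based, like Match.FirstIndex).
Private Sub ItalicizeChars(ByVal paraRng As Range, _
                           ByVal startOffset As Long, _
                           ByVal charCount As Long)
    Dim rng As Range
    Set rng = paraRng.Duplicate
    rng.Collapse wdCollapseStart
    rng.MoveStart wdCharacter, startOffset
    rng.MoveEnd wdCharacter, charCount
    rng.Font.Italic = True
End Sub
```

It would be called the same way as the highlighting version, for example: ItalicizeIfClose ActiveDocument, "panama", "canal"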
Author: pavja2

Updated on June 22, 2022

Comments

  • pavja2
    pavja2 almost 2 years

    So, I'm just beginning to understand regular expressions, and I've found the learning curve fairly steep. Stack Overflow has been immensely helpful while I experiment. There is a particular Word macro that I would like to write, but I have not figured out a way to do it. I would like to find two words within 10 or so words of each other in a document and then italicize just those two words. If the words are more than 10 words apart, or appear in the other order, the macro should not italicize them.

    I have been using the following regular expression:

    \bPanama\W+(?:\w+\W+){0,10}?Canal\b
    

    However, it only lets me manipulate the entire matched string, including whatever words fall in between. Also, the .Replace function only lets me replace that string with a different string, not change its formatting.

    Does any more experienced person have an idea as to how to make this work? Is it even possible to do?
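
    To make the intent of the pattern concrete, here is a small sketch of what it accepts and rejects. WordsAreClose and its maxGap parameter are hypothetical names used only for this illustration, and the sample sentences are made up. It uses late binding, so no library reference is needed.

```vba
' Hypothetical helper: True when word1 is followed by word2
' with at most maxGap words in between.
Function WordsAreClose(txt As String, word1 As String, _
                       word2 As String, maxGap As Long) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0," & maxGap & "}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    WordsAreClose = re.Test(txt)
End Function

Sub WordsAreCloseDemo()
    ' adjacent words, right order
    Debug.Print WordsAreClose("the Panama Canal zone", "Panama", "Canal", 10)
    ' wrong order: no match
    Debug.Print WordsAreClose("a canal across Panama", "Panama", "Canal", 10)
    ' four words in between but only two allowed: no match
    Debug.Print WordsAreClose("Panama has one very long Canal", "Panama", "Canal", 2)
End Sub
```

    Run from the Immediate window, the three lines print True, False and False.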


    EDIT: Here is what I have so far. I am having two problems. First, I don't know how to select only the words "Panama" and "Canal" from within a regular expression match and change only those words (and not the intermediate words). Second, I don't know how to replace a matched RegExp with different formatting rather than a different string of text, probably because I'm not familiar enough with Word macros.

    Sub RegText()
    Dim re As regExp
    Dim para As Paragraph
    Dim rng As Range
    Set re = New regExp
    re.Pattern = "\bPanama\W+(?:\w+\W+){0,10}?Canal\b"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
      Set rng = para.Range
      rng.MoveEnd unit:=wdCharacter, Count:=-1
      Text$ = rng.Text + "Modified"
      rng.Text = re.Replace(rng.Text, Text$)
    Next para
    End Sub
    

    OK, thanks to help from Tim Williams, I put the following solution together. It's more than a little clumsy in some respects and by no means pure regexp, but it gets the job done: the regex loop colors each whole match orange, then Find and Replace italicizes "Panama" and "Canal" wherever they are orange, and a final pass turns the orange text back to black. If anyone has a better solution or idea about how to go about this, I'd be fascinated to hear it. Brute-forcing the changes with search and replace is embarrassingly crude, but at least it works.

    Sub RegText()
    Dim re As regExp
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As match
    Set re = New regExp
    re.pattern = "\bPanama\W+(?:\w+\W+){0,13}?Canal\b"
    re.IgnoreCase = True
    re.Global = True
    For Each para In ActiveDocument.Paragraphs
    
      txt = para.Range.Text
    
      'any match?
      If re.Test(txt) Then
        'get all matches
        Set allmatches = re.Execute(txt)
        'look at each match and color the corresponding range
        For Each m In allmatches
            Debug.Print m.Value, m.FirstIndex, m.Length
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex
            rng.MoveEnd wdCharacter, m.Length
            rng.Font.Color = wdColorOrange
        Next m
      End If
    
    Next para
    
    Selection.Find.ClearFormatting
    Selection.Find.Font.Color = wdColorOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        .Text = "Panama"
        .Replacement.Text = "Panama"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    Selection.Find.ClearFormatting
    Selection.Find.Font.Color = wdColorOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Italic = True
    With Selection.Find
        .Text = "Canal"
        .Replacement.Text = "Canal"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    
    Selection.Find.ClearFormatting
    Selection.Find.Font.Color = wdColorOrange
    Selection.Find.Replacement.ClearFormatting
    Selection.Find.Replacement.Font.Color = wdColorBlack
    With Selection.Find
        .Text = ""
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    End Sub
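
The three Find and Replace passes in the macro above differ only in the search text and in the replacement formatting, so they could be folded into one helper. This is a rough sketch rather than a tested rewrite: ReplaceMarkedFormat, markerColor and makeItalic are hypothetical names, and using Font.Color with wdColorOrange as the marker is an assumption on my part. It works on doc.Content instead of the Selection, so the cursor is left alone.

```vba
' Hypothetical helper: find findText (or any text when findText = "")
' whose font color is markerColor, then either italicize it or
' reset its color to black.
Sub ReplaceMarkedFormat(doc As Document, findText As String, _
                        markerColor As WdColor, makeItalic As Boolean)
    With doc.Content.Find
        .ClearFormatting
        .Font.Color = markerColor
        .Replacement.ClearFormatting
        If makeItalic Then
            .Replacement.Font.Italic = True
        Else
            .Replacement.Font.Color = wdColorBlack
        End If
        .Text = findText
        .Replacement.Text = findText
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .Execute Replace:=wdReplaceAll
    End With
End Sub

' Same three passes as the macro above, assuming the regex loop
' has already colored each match with wdColorOrange.
Sub ItalicizeMarkedWords()
    ReplaceMarkedFormat ActiveDocument, "Panama", wdColorOrange, True
    ReplaceMarkedFormat ActiveDocument, "Canal", wdColorOrange, True
    ReplaceMarkedFormat ActiveDocument, "", wdColorOrange, False
End Sub
```

Like the original, this replaces each hit with the search text as typed, so a lowercase "panama" inside a match would come back as "Panama".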