Replace bookmark text in Word file using Open XML SDK

Solution 1

Here's my approach after using you guys as inspiration:

  IDictionary<String, BookmarkStart> bookmarkMap = 
      new Dictionary<String, BookmarkStart>();

  foreach (BookmarkStart bookmarkStart in file.MainDocumentPart.RootElement.Descendants<BookmarkStart>())
  {
      bookmarkMap[bookmarkStart.Name] = bookmarkStart;
  }

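  foreach (BookmarkStart bookmarkStart in bookmarkMap.Values)
  {
      // Only the first Text of the first Run after the bookmark start is replaced.
      Run bookmarkText = bookmarkStart.NextSibling<Run>();
      Text firstText = bookmarkText?.GetFirstChild<Text>();
      if (firstText != null)
      {
          firstText.Text = "blah";
      }
  }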
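
As one of the comments below points out, the change is only persisted once the document is saved. A minimal sketch of how the loop above might be wrapped, assuming the document is opened from a file path; the class name, method name, and `replacements` dictionary are hypothetical and not part of the original answer:

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkTextReplacer
{
    // Hypothetical helper: for each bookmark named in 'replacements', sets the
    // first Text of the first Run that follows the BookmarkStart.
    // Returns the number of bookmarks whose text was changed.
    public static int ReplaceAll(string path, IDictionary<string, string> replacements)
    {
        int changed = 0;

        // Open for editing (second argument true); disposing closes the package.
        using (WordprocessingDocument file = WordprocessingDocument.Open(path, true))
        {
            var root = file.MainDocumentPart.RootElement;

            foreach (BookmarkStart start in root.Descendants<BookmarkStart>().ToList())
            {
                string name = start.Name?.Value;
                if (name == null || !replacements.TryGetValue(name, out string newText))
                {
                    continue; // no replacement requested for this bookmark
                }

                Text text = start.NextSibling<Run>()?.GetFirstChild<Text>();
                if (text == null)
                {
                    continue; // nothing to overwrite, for example an empty bookmark
                }

                text.Text = newText;
                changed++;
            }

            // Write the modified tree back to the main document part.
            file.MainDocumentPart.Document.Save();
        }

        return changed;
    }
}
```

The return value makes it easy to notice the case where bookmarks are found but nothing is replaced.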

Solution 2

This replaces a bookmark's content (which may span several text blocks) with a single run of text.

public static void InsertIntoBookmark(BookmarkStart bookmarkStart, string text)
{
    OpenXmlElement elem = bookmarkStart.NextSibling();

    while (elem != null && !(elem is BookmarkEnd))
    {
        OpenXmlElement nextElem = elem.NextSibling();
        elem.Remove();
        elem = nextElem;
    }

    bookmarkStart.Parent.InsertAfter<Run>(new Run(new Text(text)), bookmarkStart);
}

First, the existing content between the start and the end is removed. Then a new run is added directly after the start (and therefore before the end).

However, I'm not sure this works when the bookmark is closed in a different section from the one where it was opened, or in a different table cell, etc.

For me it's sufficient for now.
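
The caveat above comes down to whether the BookmarkEnd is a sibling of the BookmarkStart, because the loop only walks siblings. A minimal sketch of a check that could be run before calling InsertIntoBookmark; the class and method names are hypothetical, and it assumes the start and end markers are matched by their Id attribute:

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkChecks
{
    // Hypothetical helper: finds the BookmarkEnd whose Id matches the given
    // start, searching the whole tree that contains the start element.
    public static BookmarkEnd FindMatchingEnd(BookmarkStart start)
    {
        string id = start.Id?.Value;
        if (id == null)
        {
            return null;
        }

        // Walk up to the top-most ancestor so the search is not limited to siblings.
        OpenXmlElement root = start;
        while (root.Parent != null)
        {
            root = root.Parent;
        }

        return root.Descendants<BookmarkEnd>()
                   .FirstOrDefault(end => end.Id?.Value == id);
    }

    // True when the start and its matching end share a parent, which is the
    // only case a sibling-walking loop can handle.
    public static bool EndIsSibling(BookmarkStart start)
    {
        BookmarkEnd end = FindMatchingEnd(start);
        return end != null && ReferenceEquals(end.Parent, start.Parent);
    }
}
```

If EndIsSibling returns false, the sibling loop would run to the end of the parent and remove everything after the bookmark start, so skipping such bookmarks is the safer default.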

Solution 3

After many hours, I wrote this method:

    public static void ReplaceBookmarkParagraphs(WordprocessingDocument doc, string bookmark, string text)
    {
        // Find the first BookmarkStart with the given name that is followed by a Run
        var t = (from el in doc.MainDocumentPart.RootElement.Descendants<BookmarkStart>()
                 where (el.Name == bookmark) &&
                 (el.NextSibling<Run>() != null)
                 select el).First();
        // Take the bookmark ID
        var val = t.Id.Value;
        // Find the next sibling Run
        OpenXmlElement next = t.NextSibling<Run>();
        // Set the text value
        next.GetFirstChild<Text>().Text = text;

        // Delete every node after that Text, up to the BookmarkEnd with the same ID
        deleteElement(next.GetFirstChild<Text>().Parent, next.GetFirstChild<Text>().NextSibling(), val, true);
    }

It calls this helper:

public static bool deleteElement(OpenXmlElement parentElement, OpenXmlElement elem, string id, bool seekParent)
{
    bool found = false;

    //Loop until I find BookmarkEnd or null element
    while (!found && elem != null && (!(elem is BookmarkEnd) || (((BookmarkEnd)elem).Id.Value != id)))
    {
        if (elem.ChildElements != null && elem.ChildElements.Count > 0)
        {
            found = deleteElement(elem, elem.FirstChild, id, false);
        }

        if (!found)
        {
            OpenXmlElement nextElem = elem.NextSibling();
            elem.Remove();
            elem = nextElem;
        }
    }

    if (!found)
    {
        if (elem == null)
        {
            if (!(parentElement is Body) && seekParent)
            {
                //Try to find bookmarkEnd in Sibling nodes
                found = deleteElement(parentElement.Parent, parentElement.NextSibling(), id, true);
            }
        }
        else
        {
            if (elem is BookmarkEnd && ((BookmarkEnd)elem).Id.Value == id)
            {
                found = true;
            }
        }
    }

    return found;
}

This code works well as long as you have no empty bookmarks. I hope it helps someone.

Solution 4

I just figured this out 10 minutes ago so forgive the hackish nature of the code.

First I wrote a recursive helper function to find all the bookmarks:

private static Dictionary<string, BookmarkEnd> FindBookmarks(OpenXmlElement documentPart, Dictionary<string, BookmarkEnd> results = null, Dictionary<string, string> unmatched = null )
{
    results = results ?? new Dictionary<string, BookmarkEnd>();
    unmatched = unmatched ?? new Dictionary<string,string>();

    foreach (var child in documentPart.Elements())
    {
        if (child is BookmarkStart)
        {
            var bStart = child as BookmarkStart;
            unmatched.Add(bStart.Id, bStart.Name);
        }

        if (child is BookmarkEnd)
        {
            var bEnd = child as BookmarkEnd;
            foreach (var orphanName in unmatched)
            {
                if (bEnd.Id == orphanName.Key)
                    results.Add(orphanName.Value, bEnd);
            }
        }

        FindBookmarks(child, results, unmatched);
    }

    return results;
}

That returns a Dictionary that I can use to go through my replacement list and add the text after each bookmark:

var bookMarks = FindBookmarks(doc.MainDocumentPart.Document);

foreach( var end in bookMarks )
{
    var textElement = new Text("asdfasdf");
    var runElement = new Run(textElement);

    end.Value.InsertAfterSelf(runElement);
}

From what I can tell, inserting into and replacing the bookmarks looks harder. When I used InsertAt instead of InsertAfterSelf, I got: "Non-composite elements do not have child elements." YMMV
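
That error appears consistent with BookmarkEnd being an empty marker element that cannot hold children, so new content has to be attached through its siblings or its parent. A minimal sketch, with a hypothetical class and method name, that places the run just before the end marker so the text stays inside the bookmark range rather than after it:

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkInsert
{
    // Hypothetical helper: inserts a new run immediately before the BookmarkEnd,
    // so the new text falls between the start and end markers.
    // The BookmarkEnd must already be attached to a parent element.
    public static Run InsertInside(BookmarkEnd end, string text)
    {
        var run = new Run(new Text(text));
        end.InsertBeforeSelf(run);
        return run;
    }
}
```

This only adds text; any content already inside the bookmark is left in place.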

Solution 5

I took the code from the answer and ran into several problems with edge cases:

  1. You might want to ignore hidden bookmarks. A bookmark is hidden if its name starts with an _ (underscore).
  2. If the bookmark covers one or more TableCells, you will find it in the BookmarkStart in the first cell of the row, with the property ColumnFirst referring to the 0-based column index of the cell where the bookmark starts. ColumnLast refers to the cell where the bookmark ends; in my case it was always ColumnFirst == ColumnLast (each bookmark marked only one column). In this case you also won't find a BookmarkEnd.
  3. Bookmarks can be empty, so that a BookmarkStart is followed directly by a BookmarkEnd. In this case you can just call bookmarkStart.Parent.InsertAfter(new Run(new Text("Hello World")), bookmarkStart).
  4. A bookmark can also contain many Text elements, so you might want to remove all the other elements; otherwise part of the bookmark is replaced while the following parts stay.
  5. I'm not sure whether my last hack is necessary, since I don't know all the limitations of OpenXML, but after discovering the previous four I no longer trusted that there would be a sibling Run with a child Text. So instead I look at all siblings (until the BookmarkEnd that has the same ID as the BookmarkStart) and check all their children until I find any Text. Maybe somebody with more experience with OpenXML can say whether this is necessary?
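
The points above can be sketched roughly as follows. This is not the implementation linked below, only a minimal illustration under stated assumptions: the class and method names are hypothetical, table-column bookmarks (point 2) are skipped rather than handled, and other bookmark markers nested inside the range are deliberately kept, which is a choice made for this sketch.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkEdgeCases
{
    // Hypothetical helper: replaces everything between a BookmarkStart and its
    // sibling BookmarkEnd with one run. Returns false when the bookmark is skipped.
    public static bool ReplaceContent(BookmarkStart start, string text)
    {
        string name = start.Name?.Value;

        // Point 1: ignore hidden bookmarks (name starts with an underscore).
        if (string.IsNullOrEmpty(name) || name[0] == '_') return false;

        // Point 2: table-column bookmarks are not handled by this sketch.
        if (start.ColumnFirst != null) return false;

        // Point 5: walk the siblings until the BookmarkEnd with the same Id.
        var between = new List<OpenXmlElement>();
        OpenXmlElement current = start.NextSibling();
        while (current != null && !IsMatchingEnd(current, start))
        {
            between.Add(current);
            current = current.NextSibling();
        }

        // The end marker is not a sibling: leave the document untouched.
        if (current == null) return false;

        // Point 4: remove the old content, keeping markers of other bookmarks.
        foreach (OpenXmlElement element in between)
        {
            if (element is BookmarkStart || element is BookmarkEnd) continue;
            element.Remove();
        }

        // Point 3: a single insert also covers the empty-bookmark case.
        start.Parent.InsertAfter(new Run(new Text(text)), start);
        return true;
    }

    private static bool IsMatchingEnd(OpenXmlElement element, BookmarkStart start)
    {
        return element is BookmarkEnd end && end.Id?.Value == start.Id?.Value;
    }
}
```

Because the whole range is replaced, a bookmark whose text Word has split into several runs is still substituted completely.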

You can view my specific implementation here.

Hope this helps some of you who experienced the same issues.


Author: Mr. Boy

Updated on July 09, 2022

Comments

  • Mr. Boy almost 2 years
    I assume v2.0 is better... they have some nice "how to:..." examples but bookmarks don't seem to act as obviously as say a Table... a bookmark is defined by two XML elements BookmarkStart & BookmarkEnd. We have some templates with text in as bookmarks and we simply want to replace bookmarks with some other text... no weird formatting is going on but how do I select/replace bookmark text?
  • Mr. Boy almost 14 years
    I suppose what I want to do is use start/end bookmark tags to let me select a portion of text (a run?) and modify it. It seems pretty random where the bookmarks are stored though, mine are all in doc.MainDocumentPart.Document.Body.Descendants
  • John Farrell almost 14 years
    @John They are inside the tree at the place in the document they were added. Nothing random about it at all. Everything is going to be in Body.Descendants. Body.Elements only gets first level children. Wait, maybe I should just be searching Descendants...
  • Tim Post over 12 years
    Note, I've translated this answer (with a lot of help from Google). Please check it for accuracy. In the future, please post in English.
  • Andrew Barber about 11 years
    Please note that you should post the useful points of an answer here, on this site, or your post risks being deleted as "Not an Answer". You may still include the link if you wish, but only as a 'reference'. The answer should stand on its own without needing the link.
  • Saber about 11 years
    You are following a very simple pattern here which will not work in all cases. In many cases bookmark replacing gets a lot more complicated, and this algorithm will not handle it.
  • Saad A almost 7 years
    This doesn't work for me. It doesn't give me any errors and I confirmed it's reading the bookmarks, but it is not replacing them with the text.
  • Saad A almost 7 years
    This is the one that worked for me. Just make sure to add the following lines to save the changes back to your document: file.MainDocumentPart.Document.Save(); file.Close(); where file is the document you opened using WordprocessingDocument.Open("path", true).
  • Cee McSharpface almost 6 years
    This works well but it is very brittle if you cannot limit it to text-only, single-word bookmark runs. An example: "BOOKMARK" will work, "BOOKMARK1" will not be found because it gets separated into "BOOKMARK" and "1" (tested with Microsoft Word 2016 on Windows 10, desktop version), ending up with a partial substitution.
  • confusedandamused about 5 years
    Did you ever have issues with changing/matching font styles/sizes? I can insert text, but the text size/style that was originally in the template gets overwritten with a "default" style of sorts. Any ideas?
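
On the formatting question in the last comment: nothing in this thread confirms a fix, but one possible approach, offered here as an assumption, is to copy the run properties of an existing run onto the replacement run before the old content is removed. A minimal sketch with a hypothetical class and method name:

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkFormatting
{
    // Hypothetical helper: builds a replacement run that copies the run
    // properties (font, size, bold, ...) of a template run, if it has any.
    public static Run CreateRunLike(Run template, string text)
    {
        var run = new Run();

        if (template?.RunProperties != null)
        {
            // Clone, because an element can only belong to one parent at a time.
            run.RunProperties = (RunProperties)template.RunProperties.CloneNode(true);
        }

        run.AppendChild(new Text(text));
        return run;
    }
}
```

The template would typically be the first run inside the bookmark, captured before that run is removed. Formatting that comes from a paragraph or character style rather than from the run itself is not covered by this sketch.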