Escaping new-line characters with XmlDocument


How about using HttpUtility.HtmlEncode()?
http://msdn.microsoft.com/en-us/library/73z22y6h.aspx

OK, sorry about the wrong lead there. HttpUtility.HtmlEncode() will not handle the newline issue you're facing.

This blog post should help you out, though:
http://weblogs.asp.net/mschwarz/archive/2004/02/16/73675.aspx

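Basically, newline handling is signalled by the xml:space="preserve" attribute, which asks whoever consumes the XML to keep the whitespace (including line breaks) in the element's content as-is. It does not change how XmlDocument writes the characters: they are still emitted literally rather than as &#xA;.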

Sample working code:

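XmlDocument doc = new XmlDocument();
doc.LoadXml("<ROOT/>");
doc.DocumentElement.InnerText = "1234\r\n5678";

// Add xml:space="preserve" to the root; the xml prefix must be bound
// to the reserved XML namespace URI.
XmlAttribute spaceAttr = doc.CreateAttribute(
    "xml",
    "space",
    "http://www.w3.org/XML/1998/namespace");
spaceAttr.Value = "preserve";
doc.DocumentElement.Attributes.Append(spaceAttr);

// xml:space applies to descendant elements too, so CHILD is covered.
var child = doc.CreateElement("CHILD");
child.InnerText = "1234\r\n5678";
doc.DocumentElement.AppendChild(child);

// InnerXml of the document is the full serialized markup.
Console.WriteLine(doc.InnerXml);
Console.ReadLine();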

The output will read:

<ROOT xml:space="preserve">1234
5678<CHILD>1234
5678</CHILD></ROOT>
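
Note that the line breaks in this output are still literal. If the receiver ignores xml:space and really needs character references such as &#xA;, one possible workaround is to serialize through a writer that escapes element text itself. The following is only a sketch and not part of the original answer: NewLineEntitizingWriter, NewLineDemo and Serialize are hypothetical names, and it assumes the document is written via XmlNode.WriteTo, which hands text nodes to XmlWriter.WriteString.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper (an assumption, not from the original answer):
// an XmlTextWriter that writes CR and LF inside element text as
// character references instead of literal line breaks.
public sealed class NewLineEntitizingWriter : XmlTextWriter
{
    public NewLineEntitizingWriter(TextWriter output) : base(output) { }

    public override void WriteString(string text)
    {
        // Attribute values already get &#xD; / &#xA; from the base writer,
        // so only element text needs special treatment.
        if (string.IsNullOrEmpty(text) || WriteState == WriteState.Attribute)
        {
            base.WriteString(text);
            return;
        }

        var sb = new StringBuilder(text.Length + 16);
        foreach (char c in text)
        {
            switch (c)
            {
                case '&': sb.Append("&amp;"); break;
                case '<': sb.Append("&lt;"); break;
                case '>': sb.Append("&gt;"); break;
                case '\r': sb.Append("&#xD;"); break;
                case '\n': sb.Append("&#xA;"); break;
                default: sb.Append(c); break;
            }
        }

        // WriteRaw emits the string without further escaping,
        // so the escaping done above is the only one applied.
        WriteRaw(sb.ToString());
    }
}

public static class NewLineDemo
{
    // Serializes a node through the entitizing writer.
    public static string Serialize(XmlNode node)
    {
        var sw = new StringWriter();
        var writer = new NewLineEntitizingWriter(sw);
        node.WriteTo(writer);
        writer.Flush();
        return sw.ToString();
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        XmlElement e = doc.CreateElement("e");
        doc.AppendChild(e);
        e.InnerText = "Hello\nThere";

        Console.WriteLine(e.OuterXml);    // line break is written literally
        Console.WriteLine(Serialize(e));  // <e>Hello&#xA;There</e>
    }
}
```

Because the writer bypasses the built-in escaping for element text, it has to escape &, < and > itself; attribute values are left to the base class, which already entitizes line breaks there.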
Author: Chris C.

Updated on June 04, 2022

Comments

  • Chris C., almost 2 years ago

    My application generates XML using XmlDocument. Some of the data contains newline and carriage return characters.

    When text is assigned to an XmlElement like this:

       e.InnerText = "Hello\nThere";
    

    The resulting XML looks like this:

    <e>Hello
    There</e>
    

    The receiver of the XML (which I have no control over) treats the new-line as white space and sees the above text as:

     "Hello There"
    

    For the receiver to retain the new-line it requires the encoding to be:

    <e>Hello&#xA;There</e>
    

    If the data is applied to an XmlAttribute, the new-line is properly encoded.

    I've tried assigning the text to the XmlElement through both InnerText and InnerXml, but the output is the same in each case.

    Is there a way to get XmlElement text nodes to output new-lines and carriage-returns in their encoded forms?

    Here is some sample code to demonstrate the problem:

    string s = "return[\r] newline[\n] special[&<>\"']";
    XmlDocument d = new XmlDocument();
    d.AppendChild( d.CreateXmlDeclaration( "1.0", null, null ) );
    XmlElement  r = d.CreateElement( "root" );
    d.AppendChild( r );
    XmlElement  e = d.CreateElement( "normal" );
    r.AppendChild( e );
    XmlAttribute a = d.CreateAttribute( "attribute" );
    e.Attributes.Append( a );
    a.Value = s;
    e.InnerText = s;
    s = s
        .Replace( "&" , "&amp;"  )
        .Replace( "<" , "&lt;"   )
        .Replace( ">" , "&gt;"   )
        .Replace( "\"", "&quot;" )
        .Replace( "'" , "&apos;" )
        .Replace( "\r", "&#xD;"  )
        .Replace( "\n", "&#xA;"  )
    ;
    e = d.CreateElement( "encoded" );
    r.AppendChild( e );
    a = d.CreateAttribute( "attribute" );
    e.Attributes.Append( a );
    a.InnerXml = s;
    e.InnerXml = s;
    d.Save( @"C:\Temp\XmlNewLineHandling.xml" );
    

    The output of this program is:

    <?xml version="1.0"?>
    <root>
      <normal attribute="return[&#xD;] newline[&#xA;] special[&amp;&lt;&gt;&quot;']">return[
    ] newline[
    ] special[&amp;&lt;&gt;"']</normal>
      <encoded attribute="return[&#xD;] newline[&#xA;] special[&amp;&lt;&gt;&quot;']">return[
    ] newline[
    ] special[&amp;&lt;&gt;"']</encoded>
    </root>
    

    Thanks in advance. Chris.