Problems serializing a class to XML and including a CDATA section

Solution 1

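The 'spaces' you are seeing after your edit come from an encoding mismatch: the inner writer encodes as Unicode (UTF-16, 2 bytes per character), but the bytes are then decoded with Encoding.UTF8. For ASCII text every second byte is 0x00, so each character ends up followed by a NUL character, which displays as a space.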

Try:

settings.Encoding = new UTF8Encoding(false);
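
The mismatch can be reproduced in isolation. The following is only a minimal sketch: the class and method names (EncodingMismatchDemo, EncodeUtf16DecodeUtf8) are made up for illustration and are not part of the original code. It encodes a string as UTF-16 without a BOM and then decodes the bytes as UTF-8, which is what happens when the writer settings and the GetString call disagree.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Hypothetical helper: encodes text as UTF-16 (little endian, no BOM),
    // then deliberately decodes the resulting bytes as UTF-8.
    public static string EncodeUtf16DecodeUtf8(string text)
    {
        byte[] utf16Bytes = new UnicodeEncoding(bigEndian: false, byteOrderMark: false).GetBytes(text);
        return Encoding.UTF8.GetString(utf16Bytes);
    }

    public static void Main()
    {
        string garbled = EncodeUtf16DecodeUtf8("<Shipment>");

        // Each original ASCII character is now followed by a NUL (U+0000),
        // so the string is twice as long as the input.
        Console.WriteLine(garbled.Length);
        Console.WriteLine(garbled[1] == '\0');
    }
}
```

The NUL characters are also what the outer XmlWriter rejects with the "hexadecimal value 0x00, is an invalid character" exception, since U+0000 is not allowed in XML.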

EDIT:

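Also, note that the contents of the MemoryStream are not necessarily a valid UTF-8 encoded string: the bytes are in whatever encoding the writer settings specify. You can use a StringBuilder instead of a MemoryStream as the target of your inner writer, which avoids the byte encoding step altogether.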

    public void WriteXml(XmlWriter writer)
    {
        // Suppress the default xsi/xsd namespace declarations.
        XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
        ns.Add("", "");

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        settings.Indent = true;

        // Writing to a StringBuilder yields a .NET string directly,
        // so no byte encoding (and no BOM) is involved.
        StringBuilder sb = new StringBuilder();
        using (XmlWriter innerWriter = XmlWriter.Create(sb, settings))
        {
            shipmentInfoSerializer.Serialize(innerWriter, this.Shipment, ns);
            innerWriter.Flush();
            writer.WriteCData(sb.ToString());
        }
    }

Solution 2

Could this be of any help: http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.createcdatasection.aspx

//Create a CData section.
XmlCDataSection CData;
CData = doc.CreateCDataSection("All Jane Austen novels 25% off starting 3/23!");    

//Add the new node to the document.
XmlElement root = doc.DocumentElement;
root.AppendChild(CData);  

Console.WriteLine("Display the modified XML...");        
doc.Save(Console.Out);

Also, what Exception did you get when using the attribute?

-- edit --

You could try adding a custom class, and do something like this:

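 // Any XML-serializable class of your own (the class name is a placeholder).
 public class SomeSerializableClass
 {
    // ... other members ...

    private CDATA _payLoad;

    [XmlElement("PayLoad", Type=typeof(CDATA))]
    public CDATA PayLoad
    {
       get { return _payLoad; }
       set { _payLoad = value; }
    }
 }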


 public class CDATA : IXmlSerializable
 {
    private string text;
    public CDATA()
    {}

    public CDATA(string text)
    {
       this.text = text;
    }

    public string Text
    {
       get { return text; }
    }

    /// <summary>
    /// Interface implementation not used here.
    /// </summary>
    XmlSchema IXmlSerializable.GetSchema()
    {
       return null;
    }

    /// <summary>
    /// Interface implementation, which reads the content of the CDATA tag
    /// </summary>
    void IXmlSerializable.ReadXml(XmlReader reader)
    {
       this.text = reader.ReadElementString();
    }

    /// <summary>
    /// Interface implementation, which writes the CDATA tag to the xml
    /// </summary>
    void IXmlSerializable.WriteXml(XmlWriter writer)
    {
       writer.WriteCData(this.text);
    }
 }

As found here http://bytes.com/topic/net/answers/530724-cdata-xmltextattribute

Solution 3

Wrapping ShipmentInfo in a type that implements IXmlSerializable (ValueInfo in the example) will get close to what you need - see example below.

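using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class StackOverflow_11471676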
{
    public class UpdateOrderStatus
    {
        public int Action { get; set; }
        public ValueInfo Value { get; set; }
    }
    [XmlType(TypeName = "Shipment")]
    public class ShipmentInfo
    {
        public string Header { get; set; }
        public string Body { get; set; }
    }
    public class ValueInfo : IXmlSerializable
    {
        public ShipmentInfo Shipment { get; set; }
        private XmlSerializer shipmentInfoSerializer = new XmlSerializer(typeof(ShipmentInfo));

        public System.Xml.Schema.XmlSchema GetSchema()
        {
            return null;
        }

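        public void ReadXml(XmlReader reader)
        {
            // The reader is positioned on the <Value> start tag, so read the
            // element's text (the CDATA content) and advance past its end tag.
            using (MemoryStream ms = new MemoryStream(
                Encoding.UTF8.GetBytes(
                    reader.ReadElementContentAsString())))
            {
                Shipment = (ShipmentInfo)this.shipmentInfoSerializer.Deserialize(ms);
            }
        }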

        public void WriteXml(XmlWriter writer)
        {
            using (MemoryStream ms = new MemoryStream())
            {
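                // UTF8Encoding(false) keeps the writer from emitting a BOM, which would
                // otherwise appear as a stray character at the start of the CDATA text.
                using (XmlWriter innerWriter = XmlWriter.Create(ms, new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = new UTF8Encoding(false) }))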
                {
                    shipmentInfoSerializer.Serialize(innerWriter, this.Shipment);
                    innerWriter.Flush();
                    writer.WriteCData(Encoding.UTF8.GetString(ms.ToArray()));
                }
            }
        }
    }
    public static void Test()
    {
        UpdateOrderStatus obj = new UpdateOrderStatus
        {
            Action = 1,
            Value = new ValueInfo
            {
                Shipment = new ShipmentInfo
                {
                    Header = "Shipment header",
                    Body = "Shipment body"
                }
            }
        };

        XmlSerializer xs = new XmlSerializer(typeof(UpdateOrderStatus));
        MemoryStream ms = new MemoryStream();
        xs.Serialize(ms, obj);
        Console.WriteLine(Encoding.UTF8.GetString(ms.ToArray()));
    }
}

Solution 4

The example below applies only when the structure of the schema is fixed and you have no option of altering it.

When you deserialize or serialize an [XmlText] value, it is very difficult to keep the text enclosed in a CDATA section. You can use a compiled transform to carry the CDATA value through in the XML as it is, but the CDATA wrapper is lost as soon as you deserialize the document in C# and load it into memory.

This is one of the easiest ways to do it:

  1. Deserialize/serialize as usual.
  2. Once the final XML output has been produced, parse it as shown below and replace the text of the target element (test1 here) with a CDATA node, then save the document or return it as a string.

using System.Linq;
using System.Xml.Linq;

string xml = "<test><test1>@#@!#!#!@#!@#%$%@#$%#$%</test1></test>";
XNamespace ns = ""; // set this to the document's namespace, if it has one
XDocument doc = XDocument.Parse(xml);

// Materialize the matches first, then swap each element's text for a CDATA node.
foreach (XElement element in doc.Descendants(ns + "test1").ToList())
{
    element.ReplaceNodes(new XCData(element.Value));
}

doc.Save("test.xml"); // or use doc.ToString() to get the XML as a string
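
The reason the CDATA wrapper does not survive a round trip is that, to an XML parser, a CDATA section and the equivalent escaped text carry the same character data. The sketch below illustrates this with LINQ to XML; the class and method names (CDataEquivalenceDemo, TextOf) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Xml.Linq;

public static class CDataEquivalenceDemo
{
    // Hypothetical helper: returns the text content of the root element
    // of the given XML fragment.
    public static string TextOf(string xml)
    {
        return XElement.Parse(xml).Value;
    }

    public static void Main()
    {
        string fromCData = TextOf("<Value><![CDATA[<test />]]></Value>");
        string fromEscaped = TextOf("<Value>&lt;test /&gt;</Value>");

        // Both forms yield the same string once parsed.
        Console.WriteLine(fromCData == fromEscaped);
    }
}
```

Because the parsed values are identical, anything that needs the literal CDATA markers in the output has to add them at write time, as the post-processing step above does.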

Author by

Robert H

Updated on June 04, 2022

Comments

  • Robert H
    Robert H almost 2 years

    I have a class that is serialized into XML for consumption by a web service. In this class's case the XML must include a CDATA section for the web service to read it, but I am at a loss as to how to implement this.

    The XML needs to look like:

    <UpdateOrderStatus> 
        <Action>2</Action> 
            <Value> 
                <![CDATA[ 
                    <Shipment> 
                        <Header> 
                            <SellerID>
                                ...
                 ]]>
             </Value>
     </UpdateOrderStatus>
    

    I am able to generate the appropriate XML, except for the CDATA part.

    My class structure looks like:

    public class UpdateOrderStatus
    {
        public int Action { get; set; }
    
    
        public ValueInfo Value { get; set; }
    
        public UpdateOrderStatus()
        {
            Value = new ValueInfo();
        }
    
    
        public class ValueInfo
        {
            public ShipmentInfo Shipment { get; set; }
    
            public ValueInfo()
            {
                Shipment = new ShipmentInfo();
            }
    
            public class ShipmentInfo
            {
                public PackageListInfo PackageList { get; set; }
                public HeaderInfo Header { get; set; }
                public ShipmentInfo()
                {
                    PackageList = new PackageListInfo();
                    Header = new HeaderInfo();
                }
    
             ....
    

    I have seen some suggestions on using:

    [XmlElement("node", typeof(XmlCDataSection))]
    

    but that causes an exception

    I have also tried

     [XmlElement("Value" + "<![CDATA[")]
    

    but the resulting XML is incorrect showing

     <Value_x003C__x0021__x005B_CDATA_x005B_>
     ....
     </Value_x003C__x0021__x005B_CDATA_x005B_>
    

    Can anyone show me what I am doing wrong, or where I need to go with this?

    --Edit--

    Making ShipmentInfo serializable per carlosfigueira works for the most part; however, I get extra ? characters in the resulting XML (see the post "Writing an XML fragment using XmlWriterSettings and XmlSerializer is giving an extra character" for details).

    As such, I changed the WriteXml method to:

    public void WriteXml(XmlWriter writer)
            {
                using (MemoryStream ms = new MemoryStream())
                {
                    XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
                    ns.Add("", "");
    
                    XmlWriterSettings settings = new XmlWriterSettings();
    
                    settings.OmitXmlDeclaration = true;
                    settings.Encoding = new UnicodeEncoding(bigEndian: false, byteOrderMark: false);
                    settings.Indent = true;
    
                    using (XmlWriter innerWriter = XmlWriter.Create(ms, settings))
                    {
                        shipmentInfoSerializer.Serialize(innerWriter, this.Shipment,ns);
                        innerWriter.Flush();
                        writer.WriteCData(Encoding.UTF8.GetString(ms.ToArray()));
                    }
                }
            }
    

    However, I am now getting an exception:

    System.InvalidOperationException: There was an error generating the XML document. ---> System.ArgumentException: '.', hexadecimal
    value 0x00, is an invalid character.
    

    --Edit --

    The exception was caused by the inclusion of my previous serializeToString method. Since removing that, the CDATA output is correct except for a spacing issue, but I am also getting a namespace and XML declaration that should be removed by the XML settings specified. The output is:

    <?xml version="1.0"?>
    <UpdateOrderStatus xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Action>1</Action>
      <Value><![CDATA[< S h i p m e n t I n f o >
         < P a c k a g e L i s t >
             < P a c k a g e >
                 < S h i p D a t e > 2 0 1 2 - 0 7 - 1 3 T 1 1 : 5 8 : 5 1 . 0 9 2 5 6 1 5 - 0 4 : 0 0 < / S h i p D a t e >
                 < I t e m L i s t >
                     < I t e m >
                         < S h i p p e d Q t y > 0 < / S h i p p e d Q t y >
                     < / I t e m >
                 < / I t e m L i s t >
             < / P a c k a g e >
         < / P a c k a g e L i s t >
         < H e a d e r >
             < S e l l e r I d > S h i p m e n t   h e a d e r < / S e l l e r I d >
             < S O N u m b e r > 0 < / S O N u m b e r >
         < / H e a d e r >
     < / S h i p m e n t I n f o > ]]></Value>
    </UpdateOrderStatus>
    

    Any ideas of avoiding the BOM using the new class?

    --Edit 3 -- SUCCESS!

    I have implemented changes suggested below and now have the following writer class and test methods:

     UpdateOrderStatus obj = new UpdateOrderStatus();
    
            obj.Action = 1;
            obj.Value = new UpdateOrderStatus.ValueInfo();
            obj.Value.Shipment = new UpdateOrderStatus.ValueInfo.ShipmentInfo();
            obj.Value.Shipment.Header.SellerId = "Shipment header";
            obj.Value.Shipment.PackageList = new UpdateOrderStatus.ValueInfo.ShipmentInfo.PackageListInfo();
            obj.Value.Shipment.PackageList.Package = new UpdateOrderStatus.ValueInfo.ShipmentInfo.PackageListInfo.PackageInfo();
            obj.Value.Shipment.PackageList.Package.ShipDate = DateTime.Now;
    
    
    
            XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
            ns.Add("", "");
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.OmitXmlDeclaration = true;
            settings.Encoding = new UTF8Encoding(false);
            settings.Indent = true;
            XmlSerializer xs = new XmlSerializer(typeof(UpdateOrderStatus));
            MemoryStream ms = new MemoryStream();
    
    
            XmlWriter writer = XmlWriter.Create(ms, settings);
            xs.Serialize(writer, obj, ns);
            Console.WriteLine(Encoding.UTF8.GetString(ms.ToArray()));
        }
    
    
    public void WriteXml(XmlWriter writer)
            {
    
                XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
                ns.Add("", "");
    
                XmlWriterSettings settings = new XmlWriterSettings();
    
                settings.OmitXmlDeclaration = true;
                settings.Indent = true;
    
                StringBuilder sb = new StringBuilder();
                using (XmlWriter innerWriter = XmlWriter.Create(sb, settings))
                {
                    shipmentInfoSerializer.Serialize(innerWriter, this.Shipment, ns);
                    innerWriter.Flush();
                    writer.WriteCData(sb.ToString());
                }   
            }
    

    This produces the following XML:

    <UpdateOrderStatus>
      <Action>1</Action>
      <Value><![CDATA[<ShipmentInfo>
      <PackageList>
        <Package>
          <ShipDate>2012-07-13T14:05:36.6170802-04:00</ShipDate>
          <ItemList>
            <Item>
              <ShippedQty>0</ShippedQty>
            </Item>
          </ItemList>
        </Package>
      </PackageList>
      <Header>
        <SellerId>Shipment header</SellerId>
        <SONumber>0</SONumber>
      </Header>
    </ShipmentInfo>]]></Value>
    </UpdateOrderStatus>
    
    • Pawel
      Pawel almost 12 years
      Note that CDATA section is a way of escaping xml to make it more readable. The content of CDATA is not Xml. <![CDATA[<test />]]> is an equivalent of &lt;test /&gt; If you are sure that the content is always a valid Xml document you should be able to pre-process the document to remove CDATA and unescape the content of CDATA section. Note that doing so if the content is not valid Xml will make the whole document invalid. Another option is to implement IXmlSerializable as noted below but once you start it will grow and can be hard to maintain.
  • Robert H
    Robert H almost 12 years
    I get exception: System.InvalidOperationException: Unable to generate a temporary class (result=1).
  • Gerald Versluis
    Gerald Versluis almost 12 years
    Try putting the attribute like this [XmlElement("CDataElement")]
  • Robert H
    Robert H almost 12 years
    when I do that it renames Value to CDataElement, which will fail the web service validation
  • Robert H
    Robert H almost 12 years
    I have also tried: UpdateString.Replace(@"<Value>", @"<Value><![CDATA["); UpdateString.Replace(@"</Value>", @"]]></Value>"); with no luck
  • Gerald Versluis
    Gerald Versluis almost 12 years
    See edit :-) Add your own CDATA type and use the attribute like the above class in your own class
  • Robert H
    Robert H almost 12 years
    I get the same exception: System.InvalidOperationException: Unable to generate a temporary class (result=1). error CS0030: Cannot convert type 'UpdateOrderStatus.ValueInfo' to 'CDATA' error CS0029: Cannot implicitly convert type 'CDATA' to 'UpdateOrderStatus.ValueInfo'
  • Robert H
    Robert H almost 12 years
    So I made some changes, as your code almost does what I need it to. I need to remove some extra ? characters (see another post by me), but it's throwing an exception. Please see edit for more information.
  • Robert H
    Robert H almost 12 years
    Ok, in reading through the document you linked to, my line " XmlWriter writer = XmlWriter.Create(ms, settings);" should keep the encoding listed in the settings as I do not specify encoding any place else, however when I execute I get an exception: "System.InvalidOperationException: There was an error generating the XML document. ---> System.ArgumentException: '.', hexadecimal value 0x00, is an invalid character."
  • Monroe Thomas
    Monroe Thomas almost 12 years
    @RobertH I think this is because the MemoryStream contents are not necessarily a valid UTF8 encoded string. I've updated my answer.
  • carlosfigueira
    carlosfigueira almost 12 years
    To remove the BOM, set the Encoding property of the XmlWriterSettings to an instance without the BOM: new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = new UTF8Encoding(false) }
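
The difference between the two encoding instances mentioned in the last comment can be checked directly. This is a minimal sketch, assuming an in-memory stream and a single empty element; the names BomDemo and WriteElement are invented for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class BomDemo
{
    // Hypothetical helper: writes one empty <a /> element with the given
    // encoding and returns the raw bytes the XmlWriter produced.
    public static byte[] WriteElement(Encoding encoding)
    {
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Encoding = encoding };
        using (var ms = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                writer.WriteStartElement("a");
                writer.WriteEndElement();
            }
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        byte[] withBom = WriteElement(Encoding.UTF8);
        byte[] withoutBom = WriteElement(new UTF8Encoding(false));

        // Encoding.UTF8 carries a preamble, so the output starts with the
        // BOM bytes; UTF8Encoding(false) starts directly with '<'.
        Console.WriteLine(BitConverter.ToString(withBom, 0, 3));
        Console.WriteLine((char)withoutBom[0]);
    }
}
```

When the bytes are later turned back into a string with Encoding.UTF8.GetString, the BOM becomes a U+FEFF character at the start of the text, which is the extra ? character described in the question.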