XML Serialization and Inherited Types


Solution 1

Problem Solved!

OK, so I finally got there (admittedly with a lot of help from here!).

To summarise:

Goals:

  • I didn't want to go down the XmlInclude route due to the maintenance headache.
  • Once a solution was found, I wanted it to be quick to implement in other applications.
  • Collections of Abstract types may be used, as well as individual abstract properties.
  • I didn't really want to bother with having to do "special" things in the concrete classes.

Identified Issues/Points to Note:

  • XmlSerializer does some pretty cool reflection, but it is very limited when it comes to abstract types (i.e. it only works with the declared type it was built for, so it rejects instances of subclasses it was not told about).
  • The Xml attribute decorators define how the XmlSerializer treats the properties it finds. The concrete type can also be specified there, but this creates a tight coupling between the class and the serializer (not good).
  • We can implement our own serialization logic by creating a class that implements IXmlSerializable.

The Solution

I created a generic class, in which you specify the generic type as the abstract type you will be working with. This gives the class the ability to "translate" between the abstract type and the concrete type since we can hard-code the casting (i.e. we can get more info than the XmlSerializer can).

I then implemented the IXmlSerializable interface. This is pretty straightforward, but when serializing we need to ensure we write the type of the concrete class to the XML, so we can cast it back when de-serializing. It is also important to note that the type name must be assembly-qualified, as the assemblies that the two classes live in are likely to differ. There is of course a little type checking and stuff that needs to happen here.

Since the XmlSerializer cannot cast between the abstract type and our wrapper, we need to provide the code to do that, so implicit conversion operators are defined in both directions (I never even knew you could do this!).
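For anyone who has not met user-defined conversions before, here is a minimal sketch of the idea on its own, with no XML involved. `Shape`, `Circle`, `Wrapper<T>` and `ImplicitDemo` are hypothetical names made up for this illustration; only the two operators mirror what the real class does.

```csharp
using System;

// Hypothetical types, used only for this illustration.
public abstract class Shape
{
    public string Name { get; set; }
}

public class Circle : Shape
{
    public double Radius { get; set; }
}

// A stripped-down wrapper showing just the two conversion operators.
public class Wrapper<T>
{
    public T Data { get; set; }

    public Wrapper() { }

    public Wrapper(T data)
    {
        Data = data;
    }

    // Wrapper<T> -> T: hand back the wrapped object.
    public static implicit operator T(Wrapper<T> wrapper)
    {
        return wrapper.Data;
    }

    // T -> Wrapper<T>: wrap the object, passing null straight through.
    public static implicit operator Wrapper<T>(T value)
    {
        return value == null ? null : new Wrapper<T>(value);
    }
}

public static class ImplicitDemo
{
    public static void Main()
    {
        Shape circle = new Circle { Name = "c1", Radius = 2.0 };

        Wrapper<Shape> wrapped = circle; // uses T -> Wrapper<T>
        Shape unwrapped = wrapped;       // uses Wrapper<T> -> T

        Console.WriteLine(unwrapped.GetType().Name);
        Console.WriteLine(ReferenceEquals(circle, unwrapped));
    }
}
```

Because the conversions are implicit, code that assigns one type to the other compiles without an explicit cast, which is the property the serializer wrapper relies on.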

The code for the AbstractXmlSerializer is this:

using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Serialization;

namespace Utility.Xml
{
    public class AbstractXmlSerializer<AbstractType> : IXmlSerializable
    {
        // Override the Implicit Conversions Since the XmlSerializer
        // Casts to/from the required types implicitly.
        public static implicit operator AbstractType(AbstractXmlSerializer<AbstractType> o)
        {
            return o.Data;
        }

        public static implicit operator AbstractXmlSerializer<AbstractType>(AbstractType o)
        {
            return o == null ? null : new AbstractXmlSerializer<AbstractType>(o);
        }

        private AbstractType _data;
        /// <summary>
        /// [Concrete] Data to be stored/is stored as XML.
        /// </summary>
        public AbstractType Data
        {
            get { return _data; }
            set { _data = value; }
        }

        /// <summary>
        /// **DO NOT USE** This is only added to enable XML Serialization.
        /// </summary>
        /// <remarks>DO NOT USE THIS CONSTRUCTOR</remarks>
        public AbstractXmlSerializer()
        {
            // Default Ctor (Required for Xml Serialization - DO NOT USE)
        }

        /// <summary>
        /// Initialises the Serializer to work with the given data.
        /// </summary>
        /// <param name="data">Concrete Object of the AbstractType Specified.</param>
        public AbstractXmlSerializer(AbstractType data)
        {
            _data = data;
        }

        #region IXmlSerializable Members

        public System.Xml.Schema.XmlSchema GetSchema()
        {
            return null; // this is fine as schema is unknown.
        }

        public void ReadXml(System.Xml.XmlReader reader)
        {
            // Cast the Data back from the Abstract Type.
            string typeAttrib = reader.GetAttribute("type");

            // Ensure the Type was Specified
            if (typeAttrib == null)
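                // Note: the single-string ArgumentNullException constructor takes a parameter name, not a message.
                throw new ArgumentNullException("type", "Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because no 'type' attribute was specified in the XML.");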

            Type type = Type.GetType(typeAttrib);

            // Check the Type is Found.
            if (type == null)
                throw new InvalidCastException("Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because the type specified in the XML was not found.");

            // Check the Type is a Subclass of the AbstractType.
            if (!type.IsSubclassOf(typeof(AbstractType)))
                throw new InvalidCastException("Unable to Read Xml Data for Abstract Type '" + typeof(AbstractType).Name +
                    "' because the Type specified in the XML differs ('" + type.Name + "').");

            // Read the Data, Deserializing based on the (now known) concrete type.
            reader.ReadStartElement();
            this.Data = (AbstractType)new
                XmlSerializer(type).Deserialize(reader);
            reader.ReadEndElement();
        }

        public void WriteXml(System.Xml.XmlWriter writer)
        {
            // Write the Type Name to the XML Element as an Attrib and Serialize
            Type type = _data.GetType();

            // BugFix: Assembly must be FQN since Types can/are external to current.
            writer.WriteAttributeString("type", type.AssemblyQualifiedName);
            new XmlSerializer(type).Serialize(writer, _data);
        }

        #endregion
    }
}

So, from there, how do we tell the XmlSerializer to work with our serializer rather than the default? We pass our type via the Type property of the Xml attributes, for example:

[XmlRoot("ClassWithAbstractCollection")]
public class ClassWithAbstractCollection
{
    private List<AbstractType> _list;
    [XmlArray("ListItems")]
    [XmlArrayItem("ListItem", Type = typeof(AbstractXmlSerializer<AbstractType>))]
    public List<AbstractType> List
    {
        get { return _list; }
        set { _list = value; }
    }

    private AbstractType _prop;
    [XmlElement("MyProperty", Type=typeof(AbstractXmlSerializer<AbstractType>))]
    public AbstractType MyProperty
    {
        get { return _prop; }
        set { _prop = value; }
    }

    public ClassWithAbstractCollection()
    {
        _list = new List<AbstractType>();
    }
}

Here you can see we have a collection and a single property being exposed, and all we need to do is add the Type named parameter to the Xml attribute declaration, easy! :D

NOTE: If you use this code, I would really appreciate a shout-out. It will also help drive more people to the community :)

Now, I'm unsure what to do with the answers here, since they all had their pros and cons. I'll upmod those that I feel were useful (no offence to those that weren't) and close this off once I have the rep :)

Interesting problem and good fun to solve! :)

Solution 2

One thing to look at is the fact that in the XmlSerializer constructor you can pass an array of types that the serialiser might be having difficulty resolving. I've had to use that quite a few times where a collection or complex set of data structures needed to be serialised and those types lived in different assemblies etc.

XmlSerializer Constructor with extraTypes param

EDIT: I would add that this approach has the benefit over XmlInclude attributes etc that you can work out a way of discovering and compiling a list of your possible concrete types at runtime and stuff them in.
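As a rough sketch of that idea (not taken from the original answer), the concrete types can be discovered by reflection and handed to the constructor as `extraTypes`. `Shape`, `Circle`, `Square`, `Drawing` and `ExtraTypesDemo` are hypothetical names, and the sketch assumes all subclasses live in the same assembly as the base class. The serializer is built once and reused, since a comment further down this page warns that serializers created through this constructor overload are not cached by the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical model, used only for this illustration.
public abstract class Shape { public string Name { get; set; } }
public class Circle : Shape { public double Radius { get; set; } }
public class Square : Shape { public double Side { get; set; } }

public class Drawing
{
    public List<Shape> Shapes { get; set; }
    public Drawing() { Shapes = new List<Shape>(); }
}

public static class ExtraTypesDemo
{
    // Built once and reused, so the generated serialization code is not recreated per call.
    private static readonly XmlSerializer Serializer = CreateSerializer();

    private static XmlSerializer CreateSerializer()
    {
        // Assumption: every concrete Shape is defined in the same assembly as Shape.
        Type[] concreteTypes = typeof(Shape).Assembly.GetTypes()
            .Where(t => t.IsVisible && !t.IsAbstract && t.IsSubclassOf(typeof(Shape)))
            .ToArray();

        return new XmlSerializer(typeof(Drawing), concreteTypes);
    }

    public static string Serialize(Drawing drawing)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, drawing);
            return writer.ToString();
        }
    }

    public static Drawing Deserialize(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (Drawing)Serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var drawing = new Drawing();
        drawing.Shapes.Add(new Circle { Name = "c1", Radius = 2.0 });
        drawing.Shapes.Add(new Square { Name = "s1", Side = 3.0 });

        string xml = Serialize(drawing);
        Console.WriteLine(xml);

        Drawing copy = Deserialize(xml);
        Console.WriteLine(copy.Shapes[0].GetType().Name);
    }
}
```

With this approach each list item is written with an `xsi:type` attribute naming the concrete type, so the XML stays free of .NET assembly names.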

Solution 3

Just a quick update on this, I have not forgotten!

Just doing some more research, looks like I am on to a winner, just need to get the code sorted.

So far, I have the following:

  • The XmlSerializer is basically a class that does some nifty reflection on the classes it is serializing. It determines the properties that are serialized based on the Type.
  • The reason the problem occurs is that a type mismatch is occurring: it is expecting the BaseType but in fact receives the DerivedType. While you may think that it would treat it polymorphically, it doesn't, since that would involve a whole extra load of reflection and type-checking, which it is not designed to do.

This behaviour appears to be overridable (code pending) by creating a proxy class to act as the go-between for the serializer. The proxy will basically determine the type of the derived class and then serialize that as normal, then feed that XML back up the line to the main serializer.
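The core of that proxy idea can be sketched in a few lines: build the serializer from the object's runtime type instead of the declared one. This is only an illustration under assumed names (`Shape`, `Circle`, `RuntimeTypeDemo` are hypothetical), not the finished proxy.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical types, used only for this illustration.
public abstract class Shape
{
    public string Name { get; set; }
}

public class Circle : Shape
{
    public double Radius { get; set; }
}

public static class RuntimeTypeDemo
{
    // Serialize using the object's runtime type rather than its declared type.
    public static string SerializeByRuntimeType(object value)
    {
        var serializer = new XmlSerializer(value.GetType());
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Returns true when a serializer built for the abstract type rejects a derived instance.
    public static bool DeclaredTypeFails(Shape value)
    {
        try
        {
            using (var writer = new StringWriter())
            {
                new XmlSerializer(typeof(Shape)).Serialize(writer, value);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Shape shape = new Circle { Name = "c1", Radius = 2.0 };

        Console.WriteLine(DeclaredTypeFails(shape));
        Console.WriteLine(SerializeByRuntimeType(shape));
    }
}
```

What this sketch leaves out is the reverse direction: on reading, the concrete type has to be recovered from somewhere, which is why the proxy needs to record it in the XML.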

Watch this space! ^_^

Solution 4

It's certainly a solution to your problem, but there is another problem, which somewhat undermines your intention to use a "portable" XML format. Bad things happen when you decide to change classes in the next version of your program and you need to support both serialization formats, the new one and the old one (because your clients still use their old files/databases, or they connect to your server using an old version of your product). But you can't use this serializer anymore, because you used

type.AssemblyQualifiedName

which looks like

TopNamespace.SubNameSpace.ContainingClass+NestedClass, MyAssembly, Version=1.3.0.0, Culture=neutral, PublicKeyToken=b17a5c561934e089

that is, it contains your assembly attributes and version...

Now if you try to change your assembly version, or you decide to sign it, this deserialization is not going to work...
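One possible way around this, sketched here under assumptions rather than as a tested fix for the class above, is to write only the type's full name plus the simple assembly name. `TypeNames`, `ToVersionlessName`, `Sample` and `TypeNameDemo` are hypothetical names. Whether `Type.GetType` can resolve the shortened name depends on how the assembly is located at runtime (strong-named assemblies loaded from the GAC generally need the full name), and the sketch does not handle generic type arguments, whose full names embed assembly-qualified names of their own.

```csharp
using System;

public static class TypeNames
{
    // Produces "Namespace.Type, AssemblyName", leaving out
    // Version, Culture and PublicKeyToken.
    public static string ToVersionlessName(Type type)
    {
        return type.FullName + ", " + type.Assembly.GetName().Name;
    }
}

// Hypothetical type, used only for this illustration.
public class Sample
{
}

public static class TypeNameDemo
{
    public static void Main()
    {
        string shortName = TypeNames.ToVersionlessName(typeof(Sample));
        Console.WriteLine(typeof(Sample).AssemblyQualifiedName);
        Console.WriteLine(shortName);

        // Resolution by simple assembly name; may return null if the
        // assembly cannot be found that way in your hosting setup.
        Type resolved = Type.GetType(shortName);
        Console.WriteLine(resolved != null);
    }
}
```

Renaming or moving a class still breaks old files with this scheme, so a stable, hand-chosen identifier per type may be the more robust choice if the format has to survive refactoring.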

Solution 5

I've done things similar to this. What I normally do is make sure all the XML serialization attributes are on the concrete class, and just have the properties on that class call through to the base classes (where required) to retrieve information that will be de/serialized when the serializer calls on those properties. It's a bit more coding work, but it does work much better than attempting to force the serializer to just do the right thing.

Author: Rob Cooper

I R GUY Available on twitter @robcthegeek

Updated on August 22, 2020

Comments

  • Rob Cooper
    Rob Cooper over 3 years

    Following on from my previous question I have been working on getting my object model to serialize to XML. But I have now run into a problem (quelle surprise!).

    The problem I have is that I have a collection, which is of an abstract base class type, which is populated by the concrete derived types.

    I thought it would be fine to just add the XML attributes to all of the classes involved and everything would be peachy. Sadly, that's not the case!

    So I have done some digging on Google and I now understand why it's not working: the XmlSerializer is in fact doing some clever reflection in order to serialize objects to/from XML, and since it's based on the abstract type, it cannot figure out what the hell it's talking to. Fine.

    I did come across this page on CodeProject, which looks like it may well help a lot (yet to read/consume fully), but I thought I would like to bring this problem to the StackOverflow table too, to see if you have any neat hacks/tricks in order to get this up and running in the quickest/lightest way possible.

    One thing I should also add is that I DO NOT want to go down the XmlInclude route. There is simply too much coupling with it, and this area of the system is under heavy development, so it would be a real maintenance headache!

  • Thorarin
    Thorarin almost 15 years
    I ran into this problem myself some time ago. Personally, I ended up abandoning XmlSerializer and using the IXmlSerializable interface directly, since all my classes needed to implement it anyway. Otherwise, the solutions are quite similar. Good write-up though :)
  • Arcturus
    Arcturus about 14 years
    We use XML_ properties where we convert the list to Arrays :)
  • Frederik Gheysels
    Frederik Gheysels about 14 years
    why is it necessary to declare the default constructor, if you're not allowed to use it ?
  • Silas Hansen
    Silas Hansen almost 14 years
    Because a parameterless constructor is needed in order to dynamically instantiate the class.
  • Daniel
    Daniel almost 14 years
    Hello! I've been looking for a solution like this for quite some time now. I think it's brilliant! Although I'm not able to figure out how to use it. Would you mind giving an example? Are you serializing your class or the list containing your objects?
  • Luca
    Luca over 13 years
    This is what I'm trying to do, but it is not easy as I was thinking: stackoverflow.com/questions/3897818/…
  • tcovo
    tcovo over 12 years
    Nice code. Note that the parameterless constructor could be declared private or protected to enforce that it not be available to other classes.
  • Fueled
    Fueled over 12 years
    I'm not able to make your class work in my project, but it seems to be exactly what I need! I would definitely like to see an example of how you call the Serialize method.
  • Rob Cooper
    Rob Cooper over 12 years
    @Daniel / @Fueled - Unfortunately I jumped the .NET ship a few months ago and I currently don't have a working Windows build to put some code together for you. The key bit is to ensure that you are setting the Type in the XML attribs that decorate your properties. Ensure they are also generics of the "Abstract Type". The serialiser will pick up the rest.
  • Rob Cooper
    Rob Cooper over 12 years
    @tcovo - read the code, the default paramless constructor HAS to be public - it's a requirement from the framework.
  • tcovo
    tcovo over 12 years
    @RobCooper - Sorry, I guess my earlier comment was a bit vague. Since .NET 2.0 (ca. 2006), the parameterless constructor can be private or protected for the purpose of serialization by XmlSerializer. Reference (last comment)
  • Nikodemus RIP
    Nikodemus RIP about 12 years
    After following the steps linked in the OP (Article by Hewitt), and using the fqn as done here, I had one problem left: It failed on grand children (double inherited). Adding another implicit operator solved it.
  • Justin Aquadro
    Justin Aquadro about 12 years
    Great solution. A potential "gotcha" if you're not paying attention (like me), the Xml node that represents the AbstractXmlSerializer cannot be the same node representing your concrete type. E.g. reading <MyNode Type="MyClass" MyProp="abc"> won't work, but reading <MyNode Type="MyClass"><MyClass MyProp="abc"> will. It's obvious after the fact, but my original source XML had those nodes merged.
  • Dmitriy Konovalov
    Dmitriy Konovalov almost 12 years
    Such an elegant solution! Thanks a lot!
  • Zack Jannsen
    Zack Jannsen over 11 years
    Great Solution ... but I have a question. If we are not dealing with legacy code would it make sense to avoid the "abstract" class hierarchy altogether? My concern is IXmlSerializable is the LEAST efficient choice for serialization (albeit the most flexible). I would assume it preferable to focus on more efficient serializers where possible. If IXmlSerializable is the only option, or you expect a lot of communication to external (non-.net) systems & devices ... then this can be a great solution for helping maintain class hierarchies.
  • Vladimir
    Vladimir almost 9 years
    This is great if you know your classes, it's the most elegant solution. If you load new inherited classes from an external source then you can't use it unfortunately.
  • wexman
    wexman over 8 years
    Excellent solution! I just replaced type.AssemblyQualifiedName in the WriteXml method with type.Assembly.GetName().Name (same, but without version numbers) to avoid versioning issues...
  • wexman
    wexman over 8 years
    Just noticed this fails if the collection is empty. Adding if(data != null) to the WriteXml method fixes that.
  • Julien Lebot
    Julien Lebot over 7 years
    This is a very old post but for anyone looking into implementing this like we did, please note XmlSerializer's constructor with extraTypes param does not cache the assemblies it generates on the fly. This cost us weeks of debugging that memory leak. So if you are to use the extra types with the accepted answer's code, cache the serializer. This behavior is documented here: support.microsoft.com/en-us/kb/886385