How to ignore the validation of Unknown tags?


Solution 1

Conclusion:

This is not possible with XSD. Every approach I tried to meet the requirement was rejected as "ambiguous" by the validation tools, along with a bunch of other errors.

Solution 2

In case you're not already done with this, you might try the following:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="root" type="root"></xs:element>
  <xs:complexType name="root">
    <xs:sequence>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
      <xs:element name="node" type="xs:string"/>
      <xs:any maxOccurs="2" minOccurs="0" processContents="skip"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

Under Linux this works fine with xmllint using libxml version 20706.

Solution 3

You could make use of a new feature in XSD 1.1 (XML Schema 1.1) called "Open Content". In short, it lets you specify that additional "unknown" elements can be added to a complex type in various positions, and what the parser should do when it hits any of those elements.

Using XSD 1.1, your complex type would become:

<xs:element name="root" type="root" />
<xs:complexType name="root"> 
  <xs:openContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:openContent>

  <xs:sequence> 
    <xs:element name="node" type="xs:string"/> 
  </xs:sequence> 
</xs:complexType>

If you have a lot of complex types, you can also set a "default" open content mode at the top of your schema:

<xs:schema ...>
  <xs:defaultOpenContent mode="interleave">
    <xs:any namespace="##any" processContents="skip"/>
  </xs:defaultOpenContent>

  ...
</xs:schema>

The W3C spec for Open Content can be found at http://www.w3.org/TR/xmlschema11-1/#oc, and there's a good writeup of this at http://www.ibm.com/developerworks/library/x-xml11pt3/#N102BA.

Unfortunately, .NET doesn't support XSD 1.1 yet, and I can't find any free XSD 1.1 processors, but a couple of paid-for options are:

Solution 4

Maybe it is possible to use namespaces:

<xs:element name="root" type="root"></xs:element>
<xs:complexType name="root">
  <xs:sequence>
    <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns1.com"/>
    <xs:element name="node" type="xs:string"/>
    <xs:any maxOccurs="2" minOccurs="0" namespace="http://ns2.com"/>
  </xs:sequence>
</xs:complexType>

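This will probably validate, provided the unknown elements are in the http://ns1.com and http://ns2.com namespaces. Because "node" is in neither of them, the wildcards no longer overlap with it. Note that processContents defaults to "strict", so the wildcard elements must either be declared somewhere or the wildcards need processContents="skip".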

Solution 5

I faced the same problem.

Since I called the validation from .NET, I decided to suppress the specific validation error in the ValidationEventHandler as a workaround. It worked for me.

    private void ValidationEventHandler(object sender, ValidationEventArgs e)
    {
        switch (e.Severity)
        {
            case XmlSeverityType.Warning:
                // Processing warnings
                break;
            case XmlSeverityType.Error:
                if (IgnoreUnknownTags
                    && e.Exception is XmlSchemaValidationException
                    && new Regex(
                        @"The element '.*' has invalid child element '.*'\."
                        + @" List of possible elements expected:'.*'\.")
                       .IsMatch(e.Exception.Message))
                {
                    return;
                }
                // Processing errors
                break;
            default:
                throw new InvalidEnumArgumentException("Severity should be one of the valid values");
        }
    }

Note that the regex matches the English text of the error message, so Thread.CurrentUICulture must be set to an English culture or CultureInfo.InvariantCulture on the validating thread for this to work.
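
For context, here is a minimal sketch of how such a handler might be wired up, including the culture setting. The class name, the file names "schema.xsd" and "input.xml", and the shortened regex (it matches only the first sentence of the message) are assumptions for illustration, not part of the original answer.

    using System;
    using System.Globalization;
    using System.Text.RegularExpressions;
    using System.Threading;
    using System.Xml;
    using System.Xml.Schema;

    class LenientValidator
    {
        // Assumption: a looser pattern than the one above; it only checks
        // the first sentence of the English "invalid child element" message.
        static readonly Regex UnknownChild = new Regex(
            @"The element '.*' has invalid child element '.*'\.");

        static int errorCount;

        static void Main(string[] args)
        {
            // Hypothetical default paths; pass real ones on the command line.
            string xsdPath = args.Length > 0 ? args[0] : "schema.xsd";
            string xmlPath = args.Length > 1 ? args[1] : "input.xml";

            // Ask for culture-neutral (English) messages so the regex can match.
            Thread.CurrentThread.CurrentUICulture = CultureInfo.InvariantCulture;

            var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
            settings.Schemas.Add(null, xsdPath); // null: use the schema's own targetNamespace
            settings.ValidationEventHandler += OnValidationEvent;

            using (XmlReader reader = XmlReader.Create(xmlPath, settings))
            {
                while (reader.Read()) { } // validation runs as the reader advances
            }

            Console.WriteLine(errorCount == 0 ? "valid (unknown tags ignored)" : "invalid");
        }

        static void OnValidationEvent(object sender, ValidationEventArgs e)
        {
            // Count every error except the "invalid child element" one.
            if (e.Severity == XmlSeverityType.Error && !UnknownChild.IsMatch(e.Message))
            {
                errorCount++;
                Console.Error.WriteLine(e.Message);
            }
        }
    }

One caveat with this approach: the suppressed message is not specific to unknown tags, so a known element that appears in the wrong place may be silently accepted as well.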

Author by

InfantPro'Aravind'


Updated on July 20, 2022

Comments

  • InfantPro'Aravind'
    InfantPro'Aravind' almost 2 years

    One more challenge to the XSD capability,

    My clients send me XML files that contain zero or more undefined (call them unexpected) tags, which may appear anywhere in the hierarchy. They are redundant for me, so I have to ignore their presence, but alongside them there is a set of tags that must be validated.

    This is a sample XML:

    <root>
      <undefined_1>one</undefined_1>
      <undefined_2>two</undefined_2>
      <node>to_be_validated</node>
      <undefined_3>two</undefined_3>
      <undefined_4>two</undefined_4>
    </root>
    

    And the XSD I tried with:

      <xs:element name="root" type="root"></xs:element>
      <xs:complexType name="root">
        <xs:sequence>
          <xs:any maxOccurs="2" minOccurs="0"/>
          <xs:element name="node" type="xs:string"/>
          <xs:any maxOccurs="2" minOccurs="0"/>
        </xs:sequence>
      </xs:complexType>
    

    XSD doesn't allow this, because the wildcards make the content model ambiguous.
    The example above is just a sample. The real XML comes with a complex hierarchy of tags.

    Kindly let me know if you have a hack for it.

    By the way, the alternative solution is to insert an XSL transformation before the validation process. I am avoiding it because it would mean changing the .NET code that triggers validation, which my company is reluctant to support.

  • InfantPro'Aravind'
    InfantPro'Aravind' about 14 years
    well thought [+1].. but unfortunately doesn't work in my case. thanx for the response. :-)
  • InfantPro'Aravind'
    InfantPro'Aravind' over 12 years
    however it still doesn't allow first element to be ANY! :(
  • alk
    alk over 12 years
    What exactly is the issue, please?
  • InfantPro'Aravind'
    InfantPro'Aravind' over 12 years
    this is the error I am getting : Wildcard '##any' allows element 'node', and causes the content model to become ambiguous. so and so
  • alk
    alk over 12 years
    This looks like a conceptual problem. For details please see here: w3.org/TR/xmlschema-1/#cos-nonambig An interesting fact to me is that different tools obviously handle this case differently. As I wrote, the solution provided by me does work with the tools mentioned.
  • InfantPro'Aravind'
    InfantPro'Aravind' over 12 years
    Well. Thanks for the info and ur valuable time :) it was helpful though I cannot implement it coz I have got to deal only with .net :)