schema validation with msxml in delphi

Solution 1

I've come up with an approach that seems to work. I first load the schemas explicitly, then add them to the schema collection. Next I create the XML document, assign the schema collection to its schemas property, and load the file. The solution now looks like this:

uses MSXML2_TLB  
That is:  
// Type Lib: C:\Windows\system32\msxml4.dll  
// LIBID: {F5078F18-C551-11D3-89B9-0000F81FE221}

function TfrmMain.ValidXML(
    const xmlFile: String; 
    out err: IXMLDOMParseError): Boolean;
var
    xml, xsd: IXMLDOMDocument2;
    cache: IXMLDOMSchemaCollection;
begin
    xsd := CoDOMDocument40.Create;
    xsd.Async := False;
    xsd.load('http://the.uri.com/schemalocation/schema.xsd');

    cache := CoXMLSchemaCache40.Create;
    cache.add('http://the.uri.com/schemalocation', xsd);

    xml := CoDOMDocument40.Create;
    xml.async := False;
    xml.schemas := cache;

    Result := xml.load(xmlFile);
    if not Result then
      err := xml.parseError
    else
      err := nil;
end;

It is important to use XMLSchemaCache40 or later. Earlier versions do not support the W3C XML Schema standard and only validate against XDR schemas, a Microsoft specification.
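For illustration, a caller might use the function like this. This is only a sketch: the method name CheckFile is hypothetical (it would need to be declared in TfrmMain), the file path is borrowed from the question, and it assumes the Dialogs and SysUtils units are in the uses clause. The line, linepos and reason members are properties of IXMLDOMParseError.

```delphi
// Hypothetical caller for the ValidXML method above.
// Assumes: uses SysUtils, Dialogs, MSXML2_TLB;
procedure TfrmMain.CheckFile;
var
  err: IXMLDOMParseError;
begin
  if ValidXML('Data/file.xml', err) then
    ShowMessage('The document is valid.')
  else
    // err is only assigned when ValidXML returns False.
    ShowMessage(Format('Validation failed at line %d, column %d: %s',
      [err.line, err.linepos, err.reason]));
end;
```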

The disadvantage of this solution is that I need to load the schemas explicitly. It seems to me that it should be possible to retrieve them automatically.
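One possible way to find the schemas automatically, assuming the document lists them in an xsi:schemaLocation attribute on its root element, is to read that attribute and add each namespace/location pair to the cache. The sketch below is not taken from the answer above: the function name SchemaCacheFromFile is hypothetical, it assumes the attribute uses the conventional xsi prefix, and it assumes every location is a URL or path that MSXML can resolve on its own.

```delphi
uses SysUtils, Classes, Variants, MSXML2_TLB;

// Hypothetical helper: builds a schema cache from the namespace/location
// pairs found in the root element's xsi:schemaLocation attribute.
function SchemaCacheFromFile(const xmlFile: String): IXMLDOMSchemaCollection;
var
  doc: IXMLDOMDocument2;
  attr: OleVariant;
  pairs: TStringList;
  i: Integer;
begin
  Result := CoXMLSchemaCache40.Create;

  // Load without validation; only the attribute value is needed here.
  doc := CoDOMDocument40.Create;
  doc.async := False;
  doc.validateOnParse := False;
  doc.resolveExternals := False;
  if not doc.load(xmlFile) then
    Exit;

  // getAttribute returns Null when the attribute is absent.
  attr := doc.documentElement.getAttribute('xsi:schemaLocation');
  if VarIsNull(attr) then
    Exit;

  pairs := TStringList.Create;
  try
    // The value is a whitespace-separated list: namespace, location, ...
    pairs.Delimiter := ' ';
    pairs.DelimitedText := attr;
    i := 0;
    while i + 1 < pairs.Count do
    begin
      // add accepts a schema location string as well as a loaded document.
      Result.add(pairs[i], pairs[i + 1]);
      Inc(i, 2);
    end;
  finally
    pairs.Free;
  end;
end;
```

With such a helper, ValidXML could assign xml.schemas := SchemaCacheFromFile(xmlFile) before calling xml.load, instead of hard-coding the schema URL. If the attribute is missing, the returned cache is empty.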

Solution 2

I worked on Miel's solution to remove that disadvantage. I open the XML file twice: once to get the namespaces, and a second time, after creating the schema collection, to validate the file. It works for me. It seems that an IXMLDOMDocument2, once loaded, no longer accepts an assignment to its schemas property.

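function TForm1.ValidXML2(const xmlFile: String;
  out err: IXMLDOMParseError): Boolean;
var
  xml, xml2, xsd: IXMLDOMDocument2;
  schemas, cache: IXMLDOMSchemaCollection;
begin
  Result := False;
  err := nil;

  // First pass: load the file only to discover the namespaces it uses.
  xml := CoDOMDocument.Create;
  xml.async := False;  // load synchronously so the namespaces are available
  if xml.load(xmlFile) then
  begin
    schemas := xml.namespaces;
    if schemas.length > 0 then
    begin
      // The indices 0 and 1 are kept as posted; which entry holds the
      // schema URL and which the target namespace depends on the document
      // being validated.
      xsd := CoDOMDocument40.Create;
      xsd.async := False;
      xsd.load(schemas.namespaceURI[0]);
      cache := CoXMLSchemaCache40.Create;
      cache.add(schemas.namespaceURI[1], xsd);

      // Second pass: assign the schema cache before loading, then validate.
      xml2 := CoDOMDocument40.Create;
      xml2.async := False;
      xml2.schemas := cache;
      Result := xml2.load(xmlFile);
      if not Result then
        err := xml2.parseError;
    end;
  end
  else
    err := xml.parseError;  // the file is not even well-formed
end;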

Solution 3

While BennyBechDk might be on the right track, I have a few problems with his code that I'm going to correct below:

uses Classes, XMLIntf, xmlDoc, SysUtils;

function IsValidXMLDoc(aXmlDoc: IXMLDocument): boolean;
var
  validateDoc: IXMLDocument;
begin
  result := false;  // eliminate any sense of doubt, it starts false period.
  validateDoc := TXMLDocument.Create(nil);
  try   
    validateDoc.ParseOptions := [poResolveExternals, poValidateOnParse];
    validateDoc.XML := aXmlDoc.XML;
    validateDoc.Active := true;
    Result := True;
  except
    // for this example, I am going to eat the exception, normally this
    // exception should be handled and the message saved to display to 
    // the user.
  end;
end;

If you wanted the system to just raise the exception, then there is no reason to make it a function in the first place.

uses Classes, XMLIntf, XMLDoc, SysUtils;

procedure ValidateXMLDoc(aXmlDoc: IXMLDocument);
var
  validateDoc: IXMLDocument;
begin
  validateDoc := TXMLDocument.Create(nil);
  validateDoc.ParseOptions := [poResolveExternals, poValidateOnParse];
  validateDoc.XML := aXmlDoc.XML;
  validateDoc.Active := true;
end;

Because validateDoc is an interface, it is released automatically when the function or procedure exits, so there is no need to free it yourself. If you call ValidateXMLDoc and no exception is raised, the document is valid. Personally I prefer the first version, IsValidXMLDoc, which returns true if the document is valid and false if not, and never lets an exception escape.
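As a rough usage sketch, the procedure version could be wrapped like this to capture the error message. The name CheckDocument is hypothetical, LoadXMLDocument comes from the XMLDoc unit, and the example assumes a console application in which COM has already been initialized. Whether a schema violation actually raises an exception may depend on the DOM vendor in use and on the document referencing its schema, so treat the result with care.

```delphi
uses SysUtils, XMLIntf, XMLDoc;

// Hypothetical wrapper around the ValidateXMLDoc procedure above.
procedure CheckDocument(const fileName: string);
var
  doc: IXMLDocument;
begin
  try
    // LoadXMLDocument itself raises if the file is not well-formed.
    doc := LoadXMLDocument(fileName);
    ValidateXMLDoc(doc);
    Writeln('No parse errors were reported.');
  except
    on E: Exception do
      Writeln('Parsing failed: ' + E.Message);
  end;
end;
```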

Author: Miel

Updated on June 20, 2022

Comments

  • Miel
    Miel almost 2 years

    I'm trying to validate an XML file against the schemas it references. (Using Delphi and MSXML2_TLB.) The (relevant part of the) code looks something like this:

    procedure TfrmMain.ValidateXMLFile;
    var
        xml: IXMLDOMDocument2;
        err: IXMLDOMParseError;
        schemas: IXMLDOMSchemaCollection;
    begin
        xml := ComsDOMDocument.Create;
        if xml.load('Data/file.xml') then
        begin
            schemas := xml.namespaces;
            if schemas.length > 0 then
            begin
                xml.schemas := schemas;
                err := xml.validate;
            end;
        end;
    end;
    

    This has the result that cache is loaded (schemas.length > 0), but then the next assignment raises an exception: "only XMLSchemaCache-schemacollections can be used."

    How should I go about this?

  • Miel
    Miel over 15 years
    I can't get this to work. I get an "Undeclared identifier" error for TXMLDocument. Do I need to import something other than msxml to get this to work?
  • Miel
    Miel over 12 years
    sorry, I think it should be xml, not xmlDoc. Just to be sure I'll check before I edit.
  • Günther the Beautiful
    Günther the Beautiful over 3 years
    This does not validate an XML document against an XSD schema at all. It throws no exception or anything even on xml content like <xml />.