XML to XML using XSLT

11,907

Solution 1

Remove your second template rule. The first template rule (the identity rule) already copies attributes for you. By adding the second one (with its explicit <xsl:attribute> instruction), you create a conflict: both rules match attributes with the same priority, which is an error condition, and the XSLT processor recovers by picking the rule that comes later in your stylesheet. The "id" attribute is empty because your second rule creates a new attribute with the same name but no value. Since that second rule is unnecessary anyway, just delete it; that will fix the missing attribute value.
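
For reference, here is a sketch of the stylesheet from the question with the second rule removed. It is untested against your processor and changes nothing else from the original stylesheet; the comments are mine.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no"/>

  <!-- Identity rule: copies every node and attribute, values included -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for AccountName only: replace its content with the masked value -->
  <xsl:template match="AccountName">
    <AccountName>acc_no</AccountName>
  </xsl:template>

</xsl:stylesheet>
```

The AccountName rule needs no explicit priority here: a pattern that names an element has a higher default priority than node(), so it wins over the identity rule without any conflict.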

As for the output encoding, it sounds like your XSLT processor is not honoring the <xsl:output> directive you've given it, or it's being invoked in a context (such as a server-side framework?) where the encoding is determined by the framework, rather than the XSLT code. What XSLT processor are you using and how are you invoking it?

UPDATE (re: character encoding):

The save Method (DOMDocument) documentation says this:

Character encoding is based on the encoding attribute in the XML declaration, such as <?xml version="1.0" encoding="windows-1252"?>. When no encoding attribute is specified, the default setting is UTF-8.

I would try using transformNodeToObject() and save() instead of outputting to a string.

I haven't tested this, but you probably want something like this:

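// Create an empty DOM document to receive the result tree
var result = new ActiveXObject("Microsoft.XMLDOM");

// Transform into the result document instead of a string
xml.transformNodeToObject(xsl, result);

// save() bases the character encoding on the XML declaration
result.save("Output.xml");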

UPDATE (re: unwanted whitespace):

If you want to have ultimate control over what whitespace appears in the result, you should not specify indent="yes" on the <xsl:output> element. Try removing that.
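
As a sketch, the <xsl:output> element from the question would then look like this. It assumes the extra line breaks really do come from indentation, which I haven't verified with your processor.

```xml
<!-- Same as the original xsl:output, minus indent="yes";
     when indent is omitted, the default for the xml method is "no" -->
<xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no"/>
```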

Solution 2

Try this:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no" />

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- You don't actually need this template -->
  <!-- but I think this was what you were trying to do -->
  <xsl:template match="@*" priority="2">
    <xsl:attribute namespace="{namespace-uri()}" name="{name()}"><xsl:value-of select="."/></xsl:attribute>
  </xsl:template>

  <xsl:template match="AccountName" priority="2">
    <AccountName>acc_no</AccountName>
  </xsl:template>

</xsl:stylesheet>

As for the UTF issue, you are doing the right thing.

From www.w3.org/TR/xslt: The encoding attribute specifies the preferred encoding to use for outputting the result tree. XSLT processors are required to respect values of UTF-8 and UTF-16.

Author by

Sumit

Updated on June 04, 2022

Comments

  • Sumit
    Sumit almost 2 years

    I am trying to create a new XML file from an existing one using XSL. When writing the new file, I want to mask the data appearing in the AccountName field.

    This is what my XML looks like:

    <?xml version="1.0" encoding="UTF-8"?>
    <Sumit>
        <AccountName>Sumit</AccountName>
          <CCT_datasetT id="Table">
           <row>
             <CCTTitle2>Title</CCTTitle2>
           </row>
           </CCT_datasetT>
    </Sumit>
    

    Here is my XSL Code:

    <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8" indent="yes" omit-xml-declaration="no" />
    
      <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    
      <xsl:template match="@*">
        <xsl:attribute namespace="{namespace-uri()}" name="{name()}"/>
      </xsl:template>
    
    <xsl:template match="AccountName">
    <AccountName>acc_no</AccountName>
    </xsl:template>
    
    </xsl:stylesheet>
    

    When I apply the XSL code to my XML, I get the following output:

    <?xml version="1.0" encoding="UTF-16"?>
    <Sumit>
    <AccountName>acc_no</AccountName>
    <CCT_datasetT id="">
    <row>
    <CCTTitle2>Title</CCTTitle2>
    </row>
    </CCT_datasetT>
    </Sumit>
    

    with the following issues:

    1) It creates the output using UTF-16 encoding

    2) The output of the second line is:

    <CCT_datasetT id="">
    

    The attribute value (Table) is missing.

    Can anyone please tell me how to get rid of these two issues? Many thanks.


    @Evan Lenz:

    Here is the javascript code:

    var oArgs = WScript.Arguments;

    if (oArgs.length == 0)
    {
       WScript.Echo("Usage : cscript xslt.js xml xsl");
       WScript.Quit();
    }
    xmlFile = oArgs(0) + ".xml";
    xslFile = oArgs(1) + ".xsl";

    // Load the XML
    var xml = new ActiveXObject("Microsoft.XMLDOM");
    xml.async = false;
    xml.load(xmlFile);

    // Load the XSL
    var xsl = new ActiveXObject("Microsoft.XMLDOM");
    xsl.async = false;
    xsl.load(xslFile);

    // Transform
    var msg = xml.transformNode(xsl);

    var fso = new ActiveXObject("Scripting.FileSystemObject");

    // Open the text file at the specified location with write mode
    var txtFile = fso.OpenTextFile("Output.xml", 2, false, 0);

    txtFile.Write(msg);
    txtFile.close();
    

    It creates the output in a new file, "Output.xml", but I don't know why the encoding is getting changed. I am more concerned about this for the following reason:

    My input XML contains the following code:

    <Status></Status>
    

    And in the output it appears as

    <Status>
    </Status>
    

    A carriage return is introduced for all empty tags. I am not sure whether it has something to do with the encoding. Please suggest.

    Many Thanks.

  • Sumit
    Sumit almost 15 years
    Thanks for the help on the first point. Removing the template makes it work. For the second, I am using JavaScript to apply the XSL code to my XML. I can't post more than 600 characters here, so I am posting the code as a reply.
  • Sumit
    Sumit almost 15 years
    Mark, as you and Evan pointed out, I don't need the template. But including it with your code does solve the issue. Many thanks for the reply. :) Any idea why the encoding is getting changed?