Set HTML5 doctype with XSLT

Solution 1

I think this is currently only supported by writing the doctype out as text:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="utf-8" indent="yes" />

  <xsl:template match="/">
    <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;</xsl:text>
    <html>
    </html>
  </xsl:template>

</xsl:stylesheet>

This will produce the following output:

<!DOCTYPE html>
<html>
</html>
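
If the processor writes the doctype and the opening html tag on the same line, a line feed can be appended inside the xsl:text element. The sketch below is a variant of the stylesheet above and assumes the processor honours disable-output-escaping. XML has no \n escape, so the character reference &#10; is used instead. The title and paragraph are placeholder content, not part of the original answer.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="utf-8" indent="yes" />

  <xsl:template match="/">
    <!-- Emit the doctype as raw text; &#10; is a line feed character -->
    <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;&#10;</xsl:text>
    <html>
      <!-- Placeholder content for illustration only -->
      <head><title>Example</title></head>
      <body><p>Hello</p></body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```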

Solution 2

To use the simple HTML doctype <!DOCTYPE html>, you have to use the disable-output-escaping feature: <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;</xsl:text>. However, disable-output-escaping is an optional feature in XSLT, so your XSLT engine or serialization pipeline might not support it.

For this reason, HTML5 provides an alternative doctype for compatibility with HTML5-unaware XSLT versions (i.e. all the currently existing versions of XSLT) and other systems that have the same problem. The alternative doctype is <!DOCTYPE html SYSTEM "about:legacy-compat">. To output this doctype, use the attribute doctype-system="about:legacy-compat" on the xsl:output element without using a doctype-public attribute at all.

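<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- doctype-system alone yields <!DOCTYPE html SYSTEM "about:legacy-compat"> -->
   <xsl:output method="html" doctype-system="about:legacy-compat"/>

   <!-- Literal result elements must sit inside a template -->
   <xsl:template match="/">
      <html>
      </html>
   </xsl:template>
</xsl:stylesheet>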

Solution 3

<xsl:output
     method="html"
     doctype-system="about:legacy-compat"
     encoding="UTF-8"
     indent="yes" />

This outputs:

<!DOCTYPE html SYSTEM "about:legacy-compat">

This is a modified version of the approach described at http://ukchill.com/technology/generating-html5-using-xslt/, adapted as my fix.

Solution 4

With Saxon 9.4 you can use:

<xsl:output method="html" version="5.0" encoding="UTF-8" indent="yes" />

This generates:

<!DOCTYPE HTML>
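
For context, a complete stylesheet using this declaration might look like the sketch below. It assumes Saxon 9.4 or later, as in this answer; a processor that does not recognise version="5.0" for the html output method may ignore it or report an error. The title and paragraph are placeholder content.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- version="5.0" asks the HTML serializer for HTML5 output -->
  <xsl:output method="html" version="5.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="/">
    <html>
      <!-- Placeholder content for illustration only -->
      <head><title>Example</title></head>
      <body><p>Hello</p></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```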

Solution 5

Use the doctype-system attribute instead of doctype-public.
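
Applied to the xsl:output declaration from the question, that change could look like the following sketch. The value about:legacy-compat is taken from the other answers here, and the remaining attributes are copied from the question unchanged. The fragment belongs inside an xsl:stylesheet element that declares the xsl namespace.

```xml
<!-- doctype-public is dropped; doctype-system carries the legacy-compat value -->
<xsl:output
    method="html"
    doctype-system="about:legacy-compat"
    omit-xml-declaration="yes"
    encoding="UTF-8"
    indent="yes" />
```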

Author: Jon Hadley

Man of science, more or less agog.

Updated on February 12, 2020

Comments

  • Jon Hadley
    Jon Hadley about 4 years

    How would I cleanly set the doctype of a file to HTML5 <!DOCTYPE html> via XSLT (in this case with collective.xdv)?

    The following, which is the best my Google-fu has been able to find:

    <xsl:output
        method="html"
        doctype-public="XSLT-compat"
        omit-xml-declaration="yes"
        encoding="UTF-8"
        indent="yes" />
    

    produces:

    <!DOCTYPE html PUBLIC "XSLT-compat" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    
  • Jon Hadley
    Jon Hadley over 13 years
    That still leaves "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" in the doctype.
  • Admin
    Admin over 13 years
    This is the only standard way. With MSXSL, however, there is a non-standard way: use empty xsl:output/@doctype-public and xsl:output/@doctype-system.
  • Admin
    Admin over 13 years
    Yes, but as I commented on 0xA3's answer, an empty @doctype-system or @doctype-public is not standard (it is also against the spec!).
  • Jirka Kosek
    Jirka Kosek over 13 years
    If <xsl:output doctype-system="about:legacy-compat" method="html"/> produces what you say, then there is definitely a bug in the XSLT processor you use.
  • Jon Hadley
    Jon Hadley over 13 years
    I appreciate this is probably the correct, standards-driven way to accomplish what I want (I've upvoted it as such). But the former isn't supported (my processor falls over) and the latter still results in "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" in my doctype. As @Jirka Kosek suggested, I think my XSLT processor might be broken.
  • Jon Hadley
    Jon Hadley over 13 years
    The Deliverance (the XSLT processor I am using) mailing list discussion of this problem is here: coactivate.org/projects/deliverance/lists/…
  • Admin
    Admin about 13 years
    If your XSLT processor adds elements to your stylesheets or has some non-standard default attribute values, that would mean it's broken.
  • Laurence Rowe
    Laurence Rowe about 13 years
    @Alejandro: XDV (now renamed diazo) is not an XSLT processor, it is a theme -> XSLT compiler. It is XDV which is adding the default values into the compiled XSLT. I know this because I wrote it ;)
  • yegor256
    yegor256 over 12 years
    disable-output-escaping was meant by Casey
  • Jon Hadley
    Jon Hadley almost 12 years
    I am no longer working on this project, so unable to test - however, marking this as best answer based on up-votes.
  • jgroenen
    jgroenen over 11 years
    I use this all the time. Thanks.
  • Paulb
    Paulb almost 10 years
    Unfortunately, it's specific to Saxon. On the other hand, it is simply the most concise answer to the question. I wonder if this works with the other XSLT 2.0 processors?
  • cgatian
    cgatian almost 9 years
    Saved me... Thank you
  • rustyx
    rustyx over 8 years
    Where is this behavior specified? This definitely doesn't work in JAXP XSLT.
  • greenland
    greenland over 8 years
    This worked great once I removed both internal and public doc type attributes from the output method tag. Thanks!
  • Igor Mironenko
    Igor Mironenko almost 8 years
    Xalan-J (xml.apache.org/xalan-j) gives nowhere near what you're expecting, maybe just because of its age.
  • Tom Hillman
    Tom Hillman over 7 years
    This will work most of the time, but it is a hack, and it is unlikely to (i.e. won't) work as expected if you are not serialising your result back to a text file on disk (e.g. if the result of the transform is being passed on to another process without serialisation).
  • sideshowbarker
    sideshowbarker about 7 years
    This is no longer specific to Saxon; it is also supported in the libxslt/xsltproc sources. See the details at the end of stackoverflow.com/questions/3387127/set-html5-doctype-with-xslt/…
  • Adrian W
    Adrian W almost 6 years
    The w3c validator service issues a warning when the document starts with <!DOCTYPE html SYSTEM "about:legacy-compat">
  • earcam
    earcam almost 5 years
    If the doctype and opening html tag end up on the same line, then you can simply add a newline: <xsl:text disable-output-escaping='yes'>&lt;!DOCTYPE html&gt;&#10;</xsl:text> (at least in Java's JAX, some 9 years later).
  • mistertodd
    mistertodd over 2 years
    @AdrianW The warning is "Documents should not use about:legacy-compat, except if generated by legacy systems that can't output the standard <!DOCTYPE html> doctype.", which is exactly what is happening here with xslt. This system is a legacy system that must emit a System ID. The HTML spec makes it very clear that <!DOCTYPE html SYSTEM "about:legacy-compat"> is the correct html5 doctype.