XML to Fixed width text file with xsl style sheet

Solution 1

The key to doing this in XSLT 1.0 is to combine a "padding strategy" with a "substring strategy", which together either pad or cut off a piece of text to reach a desired width. In particular, XPath expressions of this form:

substring(concat('value to pad or cut', '       '), 1, 5)

...where concat is used to add a number of padding characters to a string and substring is used to limit the overall width, are helpful. With that said, here's an XSLT 1.0 solution that accomplishes what you want.

Please note that in your expected output, some of the character widths do not match your requirements; for example, according to the requirements, <LastName> should be sized to 16 characters, whereas your output appears to cut it off at 13. That said, I believe my solution below outputs what you expect.
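
To see the pad-or-cut idiom in isolation, here is a minimal sketch. The literal values 'abc' and 'abcdefgh' and the square brackets are only there for illustration; they make the trailing blanks visible. It can be run against any well-formed XML document, since it ignores the input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Shorter than 5 characters: padded on the right with blanks -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="substring(concat('abc', '     '), 1, 5)"/>
    <xsl:text>]&#10;</xsl:text>

    <!-- Longer than 5 characters: cut off after the fifth character -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="substring(concat('abcdefgh', '     '), 1, 5)"/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This should print `[abc  ]` followed by `[abcde]`. The padding string needs at least as many blanks as the target width; otherwise a short or empty value comes out narrower than the column.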

When this XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Detail">
    <xsl:apply-templates />
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template match="SSN">
    <xsl:value-of
      select="substring(concat(., '         '), 1, 9)"/>
  </xsl:template>

  <xsl:template match="DOB">
    <xsl:value-of
      select="substring(concat(translate(., '/', ''), '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of
      select="substring(concat(., '                '), 1, 16)"/>
  </xsl:template>

  <xsl:template match="FirstName">
    <xsl:value-of
      select="substring(concat(., '             '), 1, 13)"/>
  </xsl:template>

  <xsl:template match="Date">
    <xsl:value-of
      select="substring(concat(translate(., '/', ''), '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="Time">
    <xsl:value-of
      select="substring(concat(., '        '), 1, 8)"/>
  </xsl:template>

  <xsl:template match="CurrentStreetAddress1">
    <xsl:value-of
      select="substring(concat(., '                            '), 1, 28)"/>
  </xsl:template>

  <xsl:template match="CurrentCity">
    <xsl:value-of
      select="substring(concat(., '                         '), 1, 25)"/>
  </xsl:template>

  <!-- The input element is CurrentState; columns 116-131 are 16 characters wide -->
  <xsl:template match="CurrentState">
    <xsl:value-of
      select="substring(concat(., '                '), 1, 16)"/>
  </xsl:template>

</xsl:stylesheet>

...is run against the provided XML (with a </Detail> added to make the document well-formed):

<Report>
  <table1>
    <Detail_Collection>
      <Detail>
        <SSN>*********</SSN>
        <DOB>1980/11/11</DOB>
        <LastName>user</LastName>
        <FirstName>test</FirstName>
        <Date>2013/02/26</Date>
        <Time>14233325</Time>
        <CurrentStreetAddress1>53 MAIN STREET</CurrentStreetAddress1>
        <CurrentCity>san diego</CurrentCity>
        <CurrentState>CA</CurrentState>
      </Detail>
    </Detail_Collection>
  </table1>
</Report>

...the wanted result is produced:

*********19801111user            test         201302261423332553 MAIN STREET              san diego                CA

Solution 2

Here is (in my view) a slightly more reliable and maintainable version:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
    <xsl:output method="text" indent="no"/>

    <xsl:variable name="some_spaces" select="'                                                                  '" />

    <xsl:template match="/">
        <xsl:apply-templates select="//Detail_Collection/Detail" />
    </xsl:template>

    <xsl:template match="Detail_Collection/Detail">
        <xsl:apply-templates mode="format" select="SSN">
            <xsl:with-param name="width" select="number(9-1)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format_date" select="DOB">
            <xsl:with-param name="width" select="number(17-10)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="LastName">
            <xsl:with-param name="width" select="number(33-18)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="FirstName">
            <xsl:with-param name="width" select="number(46-34)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format_date" select="Date">
            <xsl:with-param name="width" select="number(54-47)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="Time">
            <xsl:with-param name="width" select="number(62-55)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentStreetAddress1">
            <xsl:with-param name="width" select="number(90-63)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentCity">
            <xsl:with-param name="width" select="number(115-91)"/>
        </xsl:apply-templates>
        <xsl:apply-templates mode="format" select="CurrentState">
            <xsl:with-param name="width" select="number(131-116)"/>
        </xsl:apply-templates>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>

    <xsl:template  match="node()" mode ="format">
        <xsl:param name="width" />
        <xsl:value-of select="substring(concat(text(),$some_spaces ), 1, $width+1)"/>
    </xsl:template>
    <xsl:template  match="node()" mode="format_date">
        <xsl:param name="width" />
        <xsl:value-of select="substring(concat(translate(text(),'/',''),$some_spaces ), 1, $width+1)"/>
    </xsl:template>

</xsl:stylesheet>

It produces the right output even if the fields in the input are not in the same order as the requested output, or if a field is present but empty. It also handles more than one Detail entry. (An element that is absent altogether produces no output at all, so the columns after it shift left.)
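
If absent elements must still occupy their columns, one option is to pass the value to a named template instead of applying templates to the element, because `xsl:apply-templates` emits nothing when its `select` matches no node. The following is a minimal sketch under that assumption, not a drop-in replacement: the named template `pad` and its parameters `value` and `width` are hypothetical names introduced here, and only the first three fields are shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- 30 blanks: at least as long as the widest field used below -->
  <xsl:variable name="some_spaces" select="'                              '"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//Detail_Collection/Detail"/>
  </xsl:template>

  <xsl:template match="Detail">
    <!-- Each call writes exactly $width characters, even when the
         selected element does not exist (an empty node-set converts
         to the empty string). -->
    <xsl:call-template name="pad">
      <xsl:with-param name="value" select="SSN"/>
      <xsl:with-param name="width" select="9"/>
    </xsl:call-template>
    <xsl:call-template name="pad">
      <xsl:with-param name="value" select="translate(DOB, '/', '')"/>
      <xsl:with-param name="width" select="8"/>
    </xsl:call-template>
    <xsl:call-template name="pad">
      <xsl:with-param name="value" select="LastName"/>
      <xsl:with-param name="width" select="16"/>
    </xsl:call-template>
    <!-- The remaining fields would follow the same pattern. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Hypothetical helper: pad with blanks, then cut to the width. -->
  <xsl:template name="pad">
    <xsl:param name="value"/>
    <xsl:param name="width"/>
    <xsl:value-of select="substring(concat($value, $some_spaces), 1, $width)"/>
  </xsl:template>
</xsl:stylesheet>
```

Here the widths are written directly as TO - FROM + 1 (for example 33 - 18 + 1 = 16 for LastName), which avoids the subtract-one, add-one bookkeeping at the cost of not showing the column positions in the code.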

Solution 3

To pad a string to a given length in XSLT 1.0, I'd use a combination of concat() and substring(). In a template for Detail, for example, I might write something like

<xsl:value-of 
  select="substring(concat(SSN,'          '),1,9)"/>
<xsl:value-of 
  select="substring(concat(translate(DOB,'/',''),'          '),1,8)"/>
<xsl:value-of 
  select="substring(concat(LastName,'                '),1,16)"/>
...
<xsl:text>&#xA;</xsl:text>

If you know very little about XSLT, you will also need to learn how to construct the stylesheet: XSLT typically uses template matching to drive flow of control in the stylesheet, which is often difficult for people coming from imperative programming languages to get their heads around.

If you know that every Detail element will have the same children in the same sequence (this is one thing DTDs and schemas are good for), then the simplest thing to do is to write a template for each element type that can occur in the input. The following stylesheet illustrates the pattern for some but not all elements:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="text"/>

  <!--* Ten blanks; note the inner quotes, which make this a
      * string literal rather than an (empty) XPath expression. *-->
  <xsl:variable name="blanks10" select="'          '"/>
  <xsl:variable name="blanks" 
    select="concat($blanks10, $blanks10, $blanks10)"/>

  <!--* For Report, table1, and Detail_Collection, we just 
      * recur on the children *-->
  <xsl:template match="Report | table1 | Detail_Collection">
    <xsl:apply-templates select="*"/>
  </xsl:template>

  <!--* For Detail, we recur on the children and supply a
      * line-ending newline. *-->
  <xsl:template match="Detail">
    <xsl:apply-templates select="*"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <!--* For SSN, DOB, etc., we pad the value with blanks and
      * truncate at the appropriate length. *-->
  <xsl:template match="SSN">
    <xsl:value-of select="substring(concat(.,$blanks),1,9)"/>
  </xsl:template>

  <!--* For DOB, we assume input is yyyy/mm/dd and output should
      * be yyyymmdd. *-->
  <xsl:template match="DOB">
    <xsl:value-of 
      select="substring(concat(translate(.,'/',''),$blanks),1,8)"/>
  </xsl:template>

  <xsl:template match="LastName">
    <xsl:value-of select="substring(concat(.,$blanks),1,16)"/>
  </xsl:template>

  <!--* FirstName etc. left as exercise for the reader. *-->

</xsl:stylesheet>

If Detail can vary in order or population, the variation can be normalized by replacing the call to xsl:apply-templates in the template for Detail with code like that shown in the first code fragment here. That style of code also feels more natural to some procedural programmers; for that reason, I advise you to avoid it consciously while learning XSLT. If you want to learn XSLT well, it pays to become friends with xsl:apply-templates.

If you don't care about learning XSLT, then my advice is to hope that someone answers your query by giving you a complete solution to your task.

Author by

user973671

Updated on June 20, 2022

Comments

  • user973671 almost 2 years

    I need help formatting this XML into a fixed-width text file using an XSL style sheet. I know very little about XSL and have found very little information online on how this can be done.

    Basically I need this xml

    <?xml version="1.0" encoding="UTF-8"?>
    <Report>
       <table1>
          <Detail_Collection>
             <Detail>
                <SSN>*********</SSN>
                <DOB>1980/11/11</DOB>
                <LastName>user</LastName>
                <FirstName>test</FirstName>
                <Date>2013/02/26</Date>
                <Time>14233325</Time>
                <CurrentStreetAddress1>53 MAIN STREET</CurrentStreetAddress1>
                <CurrentCity>san diego</CurrentCity>
                <CurrentState>CA</CurrentState>
          </Detail_Collection>
       </table1>
    </Report>
    

    In this format, all on the same line

    *********19801111user         test       201302261423332553 MAIN STREET                                    san diego          CA
    

    These are the fixed widths

    From  To    Field
    1     9     SSN
    10    17    DOB
    18    33    LastName
    34    46    FirstName
    47    54    Date
    55    62    Time
    63    90    CurrentStreetAddress1
    91    115   CurrentCity
    116   131   CurrentState
    

    All help is much appreciated! Thanks in advance!