A Quick Intro to XSLT

The other type of markup file you’ll come to know intimately is XSLT. Where CSS tells the application how to render a piece of marked-up text, XSLT tells the application how to change the markup itself. This is incredibly useful when you want to convert a file from one format to another, e.g. from the XML inside LibreOffice’s .odt files to the HTML-based markup used by Amazon’s .mobi format.

In a nutshell, you create templates that identify bits of XML, describe how to transform them, and say which of their children should be processed next. When the processor reads the original XML file, each time it finds something that matches a template it applies the changes. Because of this, it generally doesn’t matter in what order you write the templates in the XSLT file. However, this can get a bit confusing, because the order in which things are output depends on a combination of the order in which things appear in the original XML file, which templates match, and the order in which those templates allow the children of the matched XML to be processed.

To give a better idea of what a real XSLT stylesheet looks like, we’ll create one that converts our Hello, World example from XML to HTML (the language used on the web).

The first thing to note about XSLT is that it is written in XML. This has the advantage that you can use all the same tools to work on stylesheets as you would on other XML files. However, because XML isn’t really intended for marking up programs, it can be a bit clunky, and things that are very common and very easy to do in other languages can be less straightforward. An example of this is looping. In most languages, looping over a set of values is pretty trivial. It is possible in XSLT, but it isn’t intuitive, especially when you consider that program flow in XSLT is largely controlled by the original XML file (rather than by the code defining the loop).
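
To make the looping point concrete, here is a minimal sketch of a loop in XSLT. It assumes a hypothetical raw:greetings wrapper element holding several raw:greeting elements, which is not part of our Hello, World file; the top-level element and the templates are explained further down.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <!-- Hypothetical input: a raw:greetings element containing several raw:greeting children -->
  <xsl:template match="raw:greetings">
    <ul>
      <!-- The "loop": the body is instantiated once for each raw:greeting child -->
      <xsl:for-each select="raw:greeting">
        <li><xsl:value-of select="."/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:transform>
```

Notice that there is no counter and no exit condition. How many times the body runs is decided entirely by how many raw:greeting elements the input file contains.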

Another thing worth noting before we begin our example is that, because the stylesheet is itself an XML file, any markup you write inside a template needs to be well-formed XML. This doesn’t necessarily mean that you have to output XML (though it is by far the most common thing to do); the markup just needs to be compatible with it. For example, from our Hello, World XML file we will be outputting HTML, so we just have to be careful not to use features of HTML that are invalid in XML (such as omitting closing tags).
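
As a small illustration of that constraint, here is a sketch of a stylesheet that writes out an HTML line break. The paragraph text is made up for the example, and the xsl:output element is optional; it simply asks the processor to write the result as HTML.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- method="html" asks the processor to serialise the result as HTML -->
  <xsl:output method="html"/>
  <xsl:template match="/">
    <!-- HTML would allow a bare <br>, but inside the stylesheet it must be closed as <br/> -->
    <p>First line<br/>Second line</p>
  </xsl:template>
</xsl:transform>
```

With an unclosed br element the stylesheet would no longer be well-formed XML, and the processor would refuse to load it at all.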

So let’s begin. The first thing to do is tell helloWorld.xml that we are going to be transforming it. This is done by adding an xml-stylesheet processing instruction (this step isn’t necessary if you use a command-line XSLT processor, to which you supply both the original XML file and the XSLT).

<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xml" href="./helloWorld.xslt"?>
<!-- This is a simple example of an XML file -->
<raw:greeting xmlns:raw="http://www.robertawood.com/xmlExample" raw:language="English">
  Hello, World!
</raw:greeting>

We then need to create our XSLT file (called helloWorld.xslt). (N.B. if you want to test this yourself in your browser, as of this writing, Chrome doesn’t support XSLT, but it works fine in Firefox.)

<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
</xsl:transform>

This should look very familiar to you. As you can see, all we have done is create a top-level transform element that maps the XSLT namespace to the xsl prefix and our raw namespace to the raw prefix (we could map them to different prefixes if we wished). You can use either “stylesheet” or “transform” as your top-level element. I’ll use transform throughout, but you might see stylesheet used if you look at other examples on the web.

This is a very minimal yet functional XSLT file. You might expect that, as we haven’t defined any templates, it won’t have any effect on the XML file. However, surprisingly, the output of this XSLT is:

Hello, World!

All the markup has been stripped. How did this happen? Well, XSLT has a set of built-in templates that are applied whenever you don’t supply your own for a given node. Most notably, these match every element and instruct the processor to process its children, and they match any text that is found and output it. Thus <raw:greeting>Hello, World!</raw:greeting> becomes plain old “Hello, World!”
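
Written out by hand, the built-in rules behave roughly like the templates below. This is a sketch adapted from the XSLT 1.0 specification to show what is going on behind the scenes, not something you need to add to your own file.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Elements and the document root: output nothing themselves, just process the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- Text and attribute nodes: output their value -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>
  <!-- Comments and processing instructions: output nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:transform>
```

Attributes are not counted among an element’s children, so the rule for elements never reaches them. That is why the value of the language attribute (“English”) doesn’t appear in the output above.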

OK, let’s add a template to our XSLT file so we can control the output ourselves.

<?xml version="1.0"?>
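<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <!-- The greeting element is in the raw namespace, so the match needs the raw prefix -->
  <xsl:template match="raw:greeting">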
    We found a greeting!
  </xsl:template>
</xsl:transform>

We now get the output “We found a greeting!” Here we matched based on element name, but more complicated matches are possible by utilising XPath (though we shouldn’t need that functionality). However, we will need to be able to match attributes. To do this you simply prefix the name with the @ symbol to indicate that you are matching an attribute, not an element.

<?xml version="1.0"?>
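<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <xsl:template match="raw:greeting">
    We found a greeting!
  </xsl:template>
  <!-- The attribute is written raw:language in the XML file, so it needs the prefix too -->
  <xsl:template match="@raw:language">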
    We found a language attribute!
  </xsl:template>
</xsl:transform>

And ta daa!

We found a greeting!

Oh, we seem to have got the same output we did before. Why is this? It’s because of the slightly confusing program flow of the XSLT processor. Once a template has matched an element, you have to explicitly tell the processor to carry on and apply templates to that element’s children and attributes. We do this by adding an “apply-templates” element to our template, which also marks where any output from those nodes should be placed. With no “select” attribute it applies templates to all of the element’s children (but not to its attributes); with “select” we can pick out just the nodes we want, including attributes.

<?xml version="1.0"?>
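<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <xsl:template match="raw:greeting">
    We found a greeting!
    <!-- Explicitly hand the language attribute on to its own template -->
    <xsl:apply-templates select="@raw:language"/>
  </xsl:template>
  <xsl:template match="@raw:language">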
    We found a language attribute!
  </xsl:template>
</xsl:transform>

We now get the expected output of:

We found a greeting! We found a language attribute!

We need to add one more feature to our template before we actually get it to output HTML: we want to get at the text contained in our greeting element and language attribute. There are two ways of doing this. The first is to create a rule that matches the text and tell the XSLT processor to process the text children of the element (or to rely on the built-in rule that does this). The second is to use the “value-of” element. As you can probably guess, the “value-of” element returns the value of the element or attribute that its “select” attribute picks out. In our case, using the “value-of” element is slightly clearer, so that is what we’ll do. Note that “.” means the current node, i.e. whatever the template has just matched.

<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <xsl:template match="raw:greeting">
    We found a greeting! The greeting was: <xsl:value-of select="."/>
    <xsl:apply-templates select="@raw:language"/>
  </xsl:template>
  <xsl:template match="@raw:language">
    <!-- Inside this template the current node is the attribute itself, so "." is its value -->
    We found a language attribute! The language was: <xsl:value-of select="."/>
  </xsl:template>
</xsl:transform>
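
For comparison, the first approach might look something like the sketch below: the greeting template hands its text children on to a separate template that matches text. This is only an illustration of the alternative, with the language attribute left out to keep it short.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <xsl:template match="raw:greeting">
    We found a greeting! The greeting was: <xsl:apply-templates select="text()"/>
  </xsl:template>
  <!-- Matches the text children selected above; if this template were removed, the built-in rule for text would output the same value -->
  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:transform>
```

It does the same job, but the reader has to jump between two templates to see where the text ends up, which is why “value-of” reads a little more clearly here.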

OK, now we have all the pieces we need to create the XSLT that will convert helloWorld.xml into HTML. The most common way to convert XML to HTML is to convert the XML element name to an HTML class name, and that is what we’ll do (note that XSLT has a built-in template that outputs the value of an attribute, so we don’t need to supply our own).

<?xml version="1.0"?>
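<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:raw="http://www.robertawood.com/xmlExample">
  <xsl:template match="raw:greeting">
    <html>
      <head><title>Greetings</title></head>
      <body>
        <!-- No template of our own for the attribute: the built-in rule outputs its value -->
        <h1>Greetings in <xsl:apply-templates select="@raw:language"/></h1>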
        <div class="greeting"><xsl:value-of select="."/></div>
      </body>
    </html>
  </xsl:template>
</xsl:transform>