Template, Plugin and Test Files

Here you can find the files needed to export your novel from LibreOffice to print format and to .mobi. Please read the preceding sections for information on how to use them.

Note, you need to right click and select “Save As” or the equivalent in your browser. If you just click them, most will open in the browser window in a way that doesn’t make sense.

Here is my example book.

And finally, here are some test documents to help if you are writing your own plugin. They will look for odtToXhtml.xsl in the same directory that they are in.

Have fun!

Installing the Plugin

Right, it is getting exciting. Let’s install the plugin in LibreOffice. Click on “Tools->XML Filter Settings” and then select “New”. This brings up the “XML Filter: New Filter” dialogue. In the “Filter name” field, enter the name that you want displayed in the filter list. “Writer to XHTML for Kindle” is my preference. In the “Application” dropdown, select “LibreOffice Writer (.odt)”. “Name of filter type” is the name you see in the export dialogue. I would keep this the same as “Filter name”. Change “File extension” to .html and add any comments you feel you might need. On the “Transformation” tab, all you need to do is set the “XSLT for export” field to the file you just created (or my one, which you can download on the next page) by hitting the “Browse…” button. Click “OK”, and you are done.

LibreOffice's New Filter Dialog
Installing the stylesheet in LibreOffice

Now to export your novel! Open it up (having already converted it to the Novel template!). Then go to “File->Export…”. Click the “All Formats” dropdown and select “Writer to XHTML for Kindle” (or whatever you called it). Click “Save” and you’ve done it! You can open up the resulting file in your web browser to check that all is well.

Note that if your book has any images, you will need to create an “images” directory for them manually, as the export plugin doesn’t extract them; it only creates the html that points to them.

If everything looks good, then it is time for the final step, turning it into a mobi. This can be done by uploading your file to Amazon’s KDP platform or with Amazon’s kindlegen program, available here. Using kindlegen is simply a case of:

kindlegen AnExampleBook.html

It will spit out a load of information about the conversion and then produce your book, in this case AnExampleBook.mobi.

If you want to include a cover and are using kindlegen to create your mobi file, you will need to add a link to it in your .html file. The link goes in the <head> section and looks like this:

<meta name="cover" content="images/cover.jpg">

And that’s it, we’re done! All you need to do is proofread it carefully to make sure no errors have crept in, and you can start selling your book!

Neatening Things Up

Well, we are pretty much on the home straight now. All that is left is a bit of neatening up. Firstly, we need to deal with any character styles you might be using. There is one new complication with character styles: LibreOffice sometimes creates blank automatic ones. These don’t really hurt and could be left in, but for the sake of a couple of lines of code, we might as well remove them. If you are using any character styles, you will need to add them to the css in the root template and define how they should be rendered.


<xsl:template match="text:span">

	<xsl:variable name="style"> 
	<xsl:choose>
		<xsl:when test="contains(@text:style-name,'Novel')">
			<xsl:value-of select="@text:style-name"/>
		</xsl:when>
		<xsl:when test="@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-')]/@style:name">
			<xsl:value-of select="/office:document/office:automatic-styles/style:style[@style:name = current()/@text:style-name]/@style:parent-style-name"/>
		</xsl:when>
	</xsl:choose>
	</xsl:variable>

	<!--Sometimes LibreOffice inserts unnecessary auto character styles.
	    We want to only include intended styles -->
	<xsl:choose>
		<xsl:when test="contains($style, 'Novel-')">
			<span class="{$style}"><xsl:apply-templates /></span>
		</xsl:when>
		<xsl:otherwise>
			<xsl:apply-templates />
		</xsl:otherwise>
	</xsl:choose>
	
</xsl:template>

As you can see, as there are now three possible things in the “text:style-name” attribute (a Novel- character style, an automatic style based on a Novel- character style and an automatic style based on nothing), we have had to add an extra <xsl:when> clause to our <xsl:choose> statement. In the third case, where there is a rogue automatic style, $style won’t contain any text and, therefore, won’t contain “Novel-”.

Any links in the document are nice and easy to deal with. It is a simple case of renaming them.


<xsl:template match="text:a">
	<a href="{@xlink:href}"><xsl:apply-templates /></a>
</xsl:template>

Images are equally easy or really tricky depending on your point of view. Assuming you only have one image, perhaps a lovely picture of the author in the author section or, in the case of the example document, a copyright graphic, you can deal with it as a special case and just put a copy of the image suitable for the kindle in an images folder manually.


<xsl:template match="draw:frame">
	<img alt="Copyleft Image" src="images/copyleft.jpg" />
</xsl:template>

If you have lots of images, that is a little beyond the scope of this plugin, I’m afraid. That just leaves us with one last little bit of polish. The kindle doesn’t support drop caps (nb. this is no longer true for kindles that support KF8, but it is still true for older kindles). We can get a similar effect, though, by simply making the first character of each chapter bigger. This is made simple by the use of Novel-Paragraph-Section-First. If your layout necessitated the use of a Novel-Section-Very-First or similar additions, you just need to change the match statement to a suitable contains() one, which, seeing as you’re an old hand at this by now, will be a breeze!


<!-- Poor man's drop caps. -->
<xsl:template match="text:p[@text:style-name = 'Novel-Paragraph-Section-First' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Paragraph-Section-First']/@style:name]">
	<p class="Novel-Paragraph-Section-First">
		<xsl:apply-templates select="node()[1]"/><xsl:apply-templates select="node()[not(position()=1)]"/>
	</p>
</xsl:template>

Here we do something a bit new: we apply templates to the first child node of our element and to all the rest of the nodes separately. Child nodes consist of the text inside the element plus any <text:span> elements in there. We do it this way as it will preserve any character styles you have in the first paragraph. We then have to have two templates to match this first node, one that matches if it is a text node and one that matches if it is a span element (i.e. if the first character has a character style applied to it). We tell them apart by checking whether the node has any text children. As a text node can’t have text children, if this node does have them, it must be a <text:span> rather than a text node. The template for the span element is a bit messy because, as in the normal span element template, we have to check whether this is a real span or an unnecessary automatic span.


<xsl:template match="text:p[@text:style-name = 'Novel-Paragraph-Section-First' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Paragraph-Section-First']/@style:name]/node()[1]">
	<span class="Novel-Char-First"><xsl:value-of select="substring(.,1,1)"/></span><xsl:value-of select="substring(., 2)"/> 
</xsl:template>

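<!-- The first character must be in a span. Both first-node templates would
     otherwise have the same default priority, so we set one explicitly here
     to make sure this template wins for spans. -->
<xsl:template priority="1" match="text:p[@text:style-name = 'Novel-Paragraph-Section-First' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Paragraph-Section-First']/@style:name]/node()[1][text()]">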
	<xsl:variable name="style"> 
	<xsl:choose>
		<xsl:when test="contains(@text:style-name,'Novel')">
			<xsl:value-of select="@text:style-name"/>
		</xsl:when>
		<xsl:when test="@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-')]/@style:name">
			<xsl:value-of select="/office:document/office:automatic-styles/style:style[@style:name = current()/@text:style-name]/@style:parent-style-name"/>
		</xsl:when>
	</xsl:choose>
	</xsl:variable>
	<!--Sometimes LibreOffice inserts unnecessary auto character styles.
	    We want to only include intended styles -->
	<xsl:choose>
		<xsl:when test="contains($style, 'Novel-')">
			<span class="{$style}"><span class="Novel-Char-First"><xsl:value-of select="substring(text(),1,1)"/></span><xsl:value-of select="substring(text(), 2)"/></span>
		</xsl:when>
		<xsl:otherwise>
			<span class="Novel-Char-First"><xsl:value-of select="substring(text(),1,1)"/></span><xsl:value-of select="substring(text(), 2)"/>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>

And that’s it! Have a quick test of it in your browser, and if all is well, you can change the output format to “xml” and install the plugin in LibreOffice (nb. you might need to convert the exported text to Latin-1 for older kindles. We’ll discuss that and show you how to install the plugin in a second).
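For reference, both the output format and the encoding are controlled by the top-level <xsl:output> element. The line below is only a sketch: it assumes your stylesheet doesn’t already declare an <xsl:output> with other settings, and whether a given processor (including the one LibreOffice uses) honours the encoding request is something you would need to check for yourself.

```xml
<!-- Top-level element: a direct child of <xsl:stylesheet>.
     ISO-8859-1 is the official name for Latin-1. Characters that
     Latin-1 can't represent should be written as numeric character
     references by a conforming processor. -->
<xsl:output method="xml" encoding="ISO-8859-1" indent="no"/>
```

If you don’t need to support older kindles, you can leave the encoding attribute off.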

A Firefox window showing the structure of our document
You can now see that we have paragraphs inside chapters inside parts.

Title Templates

Right, we now need to create some templates that will match our Novel-Forward-Title, Novel-Part-Title and Novel-Chapter-Title titles and process their child paragraphs. They are all pretty similar. We’ll start by taking a look at the template for Novel-Forward-Title.


<xsl:template match="text:p[@text:style-name = 'Novel-Forward-Title' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Forward-Title']/@style:name]">
	<xsl:variable name="forwardID" select="generate-id(.)"/>
	<div id="{$forwardID}" class="forward">
		<p class="Novel-Forward-Title"><xsl:apply-templates /></p>
		<xsl:apply-templates select="key('forward', $forwardID)" />
	</div>
	<mbp:pagebreak />
</xsl:template>

The match statement is exactly the same as the one used in our $forwards variable (because we can’t use variables in match statements in xslt 1.0 we can’t just use the variable here). We then just use another variable to create an ID for the current element using “generate-id()”. This will be used both to create an “id” attribute that our table of contents links to and to select the paragraphs that are related to this forward, which we do in the select attribute of <apply-templates>. These matched paragraphs are then dealt with by our generic <text:p> template.

Our Novel-Part-Title template is very similar. The big difference you’ll see is that we have an extra <xsl:choose> statement that enables us to deal with the case where the book is divided into parts but not chapters.


<xsl:template match="text:p[@text:style-name = 'Novel-Part-Title' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Part-Title']/@style:name]">

	<xsl:variable name="partID" select="generate-id(.)"/>
	<div id="{$partID}" class="part">
		<p class="Novel-Part-Title"><xsl:apply-templates /></p>
		<mbp:pagebreak />
		<!-- Check whether this part is divided into chapters -->
		<xsl:choose>
			<xsl:when test="key('chapters', $partID)">
				<xsl:apply-templates select="key('chapters', $partID)" />
			</xsl:when>			
			<xsl:otherwise>
				<xsl:apply-templates select="key('part-paragraphs', $partID)"/>
			</xsl:otherwise>
		</xsl:choose>
	</div>
	<mbp:pagebreak />
</xsl:template>

For completeness, here is our chapters template. This should all look very familiar to you by now. The only difference here is that, as we don’t know exactly what the style name is going to be (as it could be Novel-Chapter-Title-First etc.), we have to use the same technique as our generic <text:p> template to get the style name.


<xsl:template match="text:p[contains(@text:style-name, 'Novel-Chapter-Title') or @text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Chapter-Title')]/@style:name]">

	<xsl:variable name="style"> 
	<xsl:choose>
		<xsl:when test="contains(@text:style-name,'Novel-Chapter-Title')">
			<xsl:value-of select="@text:style-name"/>
		</xsl:when>
		<xsl:otherwise>
			<xsl:value-of select="/office:document/office:automatic-styles/style:style[@style:name = current()/@text:style-name]/@style:parent-style-name"/>
		</xsl:otherwise>
	</xsl:choose>
	</xsl:variable>

	<xsl:variable name="chapterID" select="generate-id(.)"/>
	<div id="{$chapterID}" class="chapter">
		<p class="{$style}"><xsl:apply-templates /></p>
		<xsl:apply-templates select="key('paragraphs', $chapterID)" />
	</div>
</xsl:template>

If you take a look at your test document now, you should see that it is all nicely hierarchical.

Creating a Hierarchical HTML File

In order to get the hierarchical structure in our html that we desire, we can’t just let our “text:p” template match everything; we have to be more selective in how our templates are applied. This starts by creating an explicit “office:text” template that will match the <xsl:apply-templates /> call in our root template.

The first two elements we have to deal with in this template are the ones for the title and author, so let’s have a look.


<xsl:template match="office:document/office:body/office:text">
	<div id="Novel-FrontMatter">
		<p class="Novel-FrontMatter-Title"><xsl:value-of select="text:p[@text:style-name='Novel-FrontMatter-Title' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-FrontMatter-Title']/@style:name]"/></p>
		<p class="Novel-FrontMatter-Author"><xsl:value-of select="text:p[@text:style-name='Novel-FrontMatter-Author' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-FrontMatter-Author']/@style:name]"/></p>
	</div>

	<mbp:pagebreak />

Hopefully that all makes sense to you. We haven’t used <xsl:value-of/> before. It just outputs the content of the node indicated by the “select” attribute. We are assuming here that all books have one title and one author. You may need to change this if your book has a subtitle or is a writing collaboration. Note the use of the “or” statements to deal with automatic styles. The <mbp:pagebreak/> is new too. It is one of Amazon’s special elements. As you might expect, it adds a page break when the file is viewed on a kindle.

Note that that isn’t the end of our template. The template continues throughout the rest of this section.

Next in it we need to deal with the copyright page, which has any number of copyright elements. However, as an added bonus, we want to exclude from our ebook the ISBN the print version is using (as this is unique to the print version). Typically kindle books don’t have ISBNs, but if you are using one, rather than ignoring the ISBN entries, you just need to create a template that matches them and use it to replace them with the correct ones for your ebook.


<div id="Novel-Copyright">
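	<!-- The brackets matter: in XPath "and" binds more tightly than "or",
	     so without them the ISBN check would only apply to automatic styles -->
	<xsl:apply-templates select="text:p[(contains(@text:style-name,'Copyright') or @text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name, 'Copyright')]/@style:name) and not(contains(., 'ISBN'))]"/>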
</div>
	
<mbp:pagebreak />

As you can see, we excluded the ISBNs from the copyright page by adding an “and” statement that checks that the text in the elements doesn’t contain “ISBN”. The elements selected by this statement are matched by the generic “text:p” template we made earlier, which turns the matched elements into <p> elements with the style name as the class.
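If you do have an ISBN for your ebook, the replacement template mentioned above might look something like the sketch below. It rests on a few assumptions: that your ISBN paragraphs use a style with “Copyright” in its name directly (rather than through an automatic style), that the Novel-FrontMatter-Copyright class from our css is the one you want, and that you have removed the not(contains(…)) check from the select statement so the ISBN paragraphs actually get processed. The ISBN text itself is just a placeholder.

```xml
<!-- Sketch only: EBOOK-ISBN-GOES-HERE is a placeholder, not a real ISBN.
     The predicate gives this template a higher default priority than the
     generic text:p template, so it wins for the ISBN paragraphs. -->
<xsl:template match="text:p[contains(@text:style-name, 'Copyright') and contains(., 'ISBN')]">
	<p class="Novel-FrontMatter-Copyright">ISBN EBOOK-ISBN-GOES-HERE</p>
</xsl:template>
```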

Next we come to the table of contents. The table of contents is just a list of links to the various bits of the book. Most of the uris are created with our old friend “generate-id()”. We will call it again when we get round to creating the actual entries and store the result in their “id” attributes.


<div id="toc">
	<p class="Novel-TOC-Heading">Contents</p>
	<ul>	
	<xsl:for-each select="$forward">
		<li><a href="#{generate-id(.)}"><xsl:value-of select="."/></a></li>	
	</xsl:for-each>

Hopefully that was all pretty readable despite it introducing a few new things like <xsl:for-each>. You probably saw that we got some use out of one of our variables for the first time, $forward, which contains all the <text:p> elements that have a forward-title style. We then used <xsl:for-each> to loop over each of them and our old friend “generate-id()” to create the uri. We passed it “.”, which just means “the current element”, i.e. the one we are currently looping over, and then we print out the current element’s value using <xsl:value-of> (remember that $forward contains all the elements with a forward-title style, so their value is the title of the forward itself).

Next we need to build the table of contents for the actual content of our novel. Here we are going to have to use the <xsl:choose> statement again so we can deal with books that have parts and chapters, chapters only, no parts or chapters etc.


<li><a href="#Novel-Content">Content</a></li>		
<xsl:choose>
	<xsl:when test="$parts">
		<ul>
			<xsl:for-each select="$parts">
				<xsl:variable name="partID" select="generate-id(.)"/>
				<li><a href="#{$partID}"><xsl:value-of select="."/></a></li>
				<xsl:if test="key('chapters', $partID)">
					<ul>
						<xsl:for-each select="key('chapters', $partID)">
							<li><a href="#{generate-id(.)}"><xsl:value-of select="."/></a></li>
						</xsl:for-each>
					</ul>
				</xsl:if>
			</xsl:for-each>
		</ul>
	</xsl:when>
	<xsl:when test="$chapters">
		<ul>
			<xsl:for-each select="$chapters">
				<li><a href="#{generate-id(.)}"><xsl:value-of select="."/></a></li>
			</xsl:for-each>
		</ul>
	</xsl:when>
</xsl:choose>

As you can see, a book with chapters but no parts is dealt with in pretty much the same way as the forward was. However, dealing with books with parts is slightly more complex and requires the use of one of our keys. It starts off the same as the forward and chapters sections, but once we have written the entry for a part, we check to see if that part has chapters by seeing if our chapters key for the current part contains anything (we could have just checked $chapters, but this way covers the unlikely case of a book in which some parts are divided into chapters and others are not).

Finally, our table of contents has a link to our about section. Compared to the last section, this is pretty straightforward.


<xsl:if test="$about">
	<li><a href="#Novel-About">About the Author</a></li>
</xsl:if>
</ul>
</div>
<mbp:pagebreak />

Now to actually deal with the content of our novel. This is actually simpler than the table of contents, as all the messy stuff is dealt with in the templates for the various elements.


<xsl:if test="$forward">
	<div id="Novel-Forward">
		<xsl:apply-templates select="$forward"/>
	</div>
</xsl:if>	
<div id="Novel-Content">
	<xsl:choose>
		<xsl:when test="$parts">	
			<xsl:apply-templates select="$parts"/>
		</xsl:when>
		<xsl:when test="$chapters">
			<xsl:apply-templates select="$chapters"/>
		</xsl:when>
		<xsl:otherwise>
			<xsl:apply-templates select="$paragraphs"/>
		</xsl:otherwise>
	</xsl:choose>
</div>
<xsl:if test="$about">
	<div id="Novel-About">
		<xsl:apply-templates select="$about"/>
	</div>
</xsl:if>
</xsl:template>

As you can see, we just use <xsl:if> to see if we have forward and about sections, and our friend <xsl:choose> to figure out whether or not our book has parts and chapters. We then start the processing at the correct level. Notice too that this is the end of our “office:text” template.

Using XSL Keys and Variables

Right, let’s get rid of that horrible flatness. To do this, we are going to use a feature of xsl called keys. Keys enable you to access a group of elements (or any other sort of node) that are identified by said key. So, for example, all the paragraphs in chapter one will use the ID of chapter one as their key. We can then select them all by using that key.

Let’s see how this magic happens (in this first example I’ll pretend we don’t have automatic styles, so you can see how it all works without the added complexity the automatic styles add). Let’s start by linking the chapters to the parts.


<xsl:key name="chapters" 
     match="/office:document/office:body/office:text/text:p[contains(@text:style-name,'Novel-Chapter-Title')]"
     use="generate-id(preceding-sibling::text:p[@text:style-name='Novel-Part-Title'][1])" />

The first couple of bits are hopefully self-explanatory: we create a key called chapters, and in it we stuff all <text:p> elements with Novel-Chapter-Title in their @text:style-name attribute (we use “contains()” rather than “=” so we catch elements that use “Novel-Chapter-Title-First” etc. as their style name). The clever bit is the next line. With each chapter we store the ID of the part it is related to (the “use” attribute). To get at the part it belongs to, we select all the preceding siblings, i.e. all the <text:p> elements that come before it in the document. We then throw away all those that don’t have “Novel-Part-Title” as their style name and discard all but the first one of the elements left. Because “preceding-sibling” starts at the current item and goes backwards, the first “Novel-Part-Title” is the one that is closest to the current <text:p> element. Now, when we want only the chapters in a certain part, all we need to do is call generate-id() on that part and use it as our key (within a single run, generate-id() always returns the same unique id when used with the same element).
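To make the lookup side of this concrete, here is a minimal sketch of a template that uses the key. Like the key above, it pretends automatic styles don’t exist, and the plain <p> output is only there for illustration; it isn’t the template we will end up with.

```xml
<!-- Simplified sketch: ignores automatic styles, like the key above -->
<xsl:template match="text:p[@text:style-name = 'Novel-Part-Title']">
	<!-- generate-id(.) gives the same ID that the key stored
	     against each of this part's chapters -->
	<xsl:for-each select="key('chapters', generate-id(.))">
		<p><xsl:value-of select="."/></p>
	</xsl:for-each>
</xsl:template>
```

Run against a document with two parts, each part title would be replaced by a list of just its own chapter titles.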

We do a similar thing to get all the paragraphs that belong to a chapter.


<xsl:key name="paragraphs" 
     match="/office:document/office:body/office:text/text:p[contains(@text:style-name, 'Novel-Paragraph')]"
     use="generate-id(preceding-sibling::text:p[contains(@text:style-name,'Novel-Chapter-Title')][1])" />

We also do the same thing to link the forward paragraphs to each forward title.
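In the same simplified form (again pretending automatic styles don’t exist), the forward key would look roughly like this. It is just the paragraphs key with the style names swapped.

```xml
<!-- Simplified sketch: each forward paragraph is stored under the ID
     of the nearest forward title that comes before it -->
<xsl:key name="forward"
     match="/office:document/office:body/office:text/text:p[contains(@text:style-name, 'Novel-Forward-Paragraph')]"
     use="generate-id(preceding-sibling::text:p[contains(@text:style-name, 'Novel-Forward-Title')][1])" />
```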

Now you understand the general principle, let’s see what the real code looks like when we have to deal with automatic styles.

 
<xsl:key name="chapters" 
  match="/office:document/office:body/office:text/text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Chapter-Title')]/@style:name or contains(@text:style-name, 'Novel-Chapter-Title')]"
  use="generate-id(preceding-sibling::text:p[@text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Part-Title']/@style:name or @text:style-name = 'Novel-Part-Title'][1])"/>

Basically we’ve just added in an “or” statement to both the match and use statements that checks the @style:parent-style-name of the <text:p> element’s automatic style. As you can see, it makes the code a lot less readable.

For completeness, here are the other three keys. The part-paragraphs key is used in cases where a book is divided into parts but those parts aren’t divided into chapters:


<xsl:key name="paragraphs" 
     match="/office:document/office:body/office:text/text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Paragraph')]/@style:name or contains(@text:style-name, 'Novel-Paragraph')]"
     use="generate-id(preceding-sibling::text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Chapter-Title')]/@style:name or contains(@text:style-name, 'Novel-Chapter-Title')][1])" />

<xsl:key name="part-paragraphs" 
     match="/office:document/office:body/office:text/text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Paragraph')]/@style:name or contains(@text:style-name, 'Novel-Paragraph')]"
     use="generate-id(preceding-sibling::text:p[@text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Part-Title']/@style:name or @text:style-name = 'Novel-Part-Title'][1])" />

<xsl:key name="forward" 
     match="/office:document/office:body/office:text/text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Forward-Paragraph')]/@style:name or contains(@text:style-name, 'Novel-Forward-Paragraph')]"
     use="generate-id(preceding-sibling::text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Forward-Title')]/@style:name or contains(@text:style-name, 'Novel-Forward-Title')][1])" />

Before we give our new keys a test, we are going to define a few new variables to save us a bit of typing as we go. These variables are just a shorthand way of writing “all chapters”, “all paragraphs” etc. They help to keep our code tidy, especially with the added complexity automatic styles bring. You might be wondering why we didn’t use these variables in our keys to make them neater. The reason is that you can’t use variables in match statements in xsl 1.0. This is fixed in xsl 2.0, but when I wrote this script there wasn’t much support for xsl 2.0. This has changed now, so feel free to upgrade to xsl 2.0 and use these variables in match statements.


<xsl:variable name="forward" select="/office:document/office:body/office:text/text:p[@text:style-name = 'Novel-Forward-Title' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Forward-Title']/@style:name]"/>

<xsl:variable name="parts" select="/office:document/office:body/office:text/text:p[@text:style-name = 'Novel-Part-Title' or @text:style-name = /office:document/office:automatic-styles/style:style[@style:parent-style-name = 'Novel-Part-Title']/@style:name]"/>

<xsl:variable name="chapters" select="/office:document/office:body/office:text/text:p[contains(@text:style-name,'Novel-Chapter-Title') or @text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name, 'Novel-Chapter-Title')]/@style:name]"/>

<xsl:variable name="paragraphs" select="/office:document/office:body/office:text/text:p[@text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name,'Novel-Paragraph')]/@style:name or contains(@text:style-name, 'Novel-Paragraph')]"/>

<xsl:variable name="about" select="/office:document/office:body/office:text/text:p[contains(@text:style-name, 'Novel-About') or @text:style-name = /office:document/office:automatic-styles/style:style[contains(@style:parent-style-name, 'Novel-About')]/@style:name]"/>

To access the variables you just use their name preceded by the dollar sign e.g $parts.
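As a quick illustration, the fragment below could be dropped into any template to check that the variables are picking up what you expect. The output format is made up purely for testing; it isn’t part of the finished stylesheet.

```xml
<!-- Hypothetical debugging fragment: prints how many parts and
     chapters the variables defined above have found -->
<p>
	<xsl:text>Parts: </xsl:text>
	<xsl:value-of select="count($parts)"/>
	<xsl:text>, chapters: </xsl:text>
	<xsl:value-of select="count($chapters)"/>
</p>
```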

Outputting HTML

Right, let’s get our stylesheet to output something useful. The first thing to do is to actually get it to output an html document rather than just text. We do this by matching the root node of our input document.


<!-- We match the root node and set up the html and our css-->
<xsl:template match="/">

	<html xmlns="http://www.w3.org/1999/xhtml" xmlns:mbp="http://www.amazon.com">
		<head>
			<style>
			.Novel-FrontMatter-Title {font-size:200%; font-weight:bold; text-align:center;}
			.Novel-FrontMatter-Author {text-align:center; }
			.Novel-FrontMatter-Copyright {margin-bottom:5%; text-align:center; }
			.Novel-Part-Title { font-size:150%; margin-bottom:5%; text-indent:0; margin-left:0;}
			#Novel-Forward {font-style: italic;}
			.Novel-Chapter-Title, .Novel-Chapter-Title-First, .Novel-Chapter-Title-Very-First, .Novel-Forward-Title { font-size:150%; margin-bottom:5%; text-indent:0; margin-left:0; text-align:center;}   
			.Novel-Paragraph-Section, .Novel-Paragraph-Section-First, .Novel-Forward-Paragraph-Section-First, .Novel-Forward-Section, .Novel-About-First {margin-top:3%; margin-bottom:0; text-indent:0;}

			.Novel-Paragraph, .Novel-Forward-Paragraph, .Novel-About {text-indent:5%; margin-bottom:0; margin-top:0;}
			.Novel-Char-First { font-size:150%; }
			#Novel-About {font-family:monospace;}
			.Novel-Char-Italics {font-style: italic;}
		</style>
		</head>
	<body>
		<xsl:apply-templates select="office:document/office:body/office:text"/>
	</body>
	</html>
</xsl:template>

As you can see, this is where we define our styles. Feel free to tweak as you see fit and to add in any extra styles that you need, but remember that less is probably more. Also note that we use % values in all the sizes rather than ones relative to the font size (such as “em”). This is so that if, for example, the reader needs the font set really big because they have poor eyesight, the margins etc. don’t get ridiculously big too.

You can give this a test now if you like, but you won’t see much difference, as we are still just outputting all our content in one long line. Let’s do something about this now.


<!-- Match all <text:p> elements that aren't matched anywhere else -->
<xsl:template match="text:p">

	<xsl:variable name="style"> 
	<xsl:choose>
		<xsl:when test="contains(@text:style-name,'Novel-')">
			<xsl:value-of select="@text:style-name"/>
		</xsl:when>
		<xsl:otherwise>
			<xsl:value-of select="/office:document/office:automatic-styles/style:style[@style:name = current()/@text:style-name]/@style:parent-style-name"/>
		</xsl:otherwise>
	</xsl:choose>
	</xsl:variable>

	<p class="{$style}"><xsl:apply-templates /></p>
</xsl:template>

This is probably going to need a bit of explaining. What we are doing here is matching all the <text:p> elements in our document, which are the elements that contain all our novel’s text. We are then defining a variable called style (to save us a bit of typing) and using a choose statement to see if our element is using a real style or an automatic style. We do this by checking the @text:style-name attribute of the <text:p> element. If it is using a real style (i.e. the @text:style-name attribute contains “Novel-”), we set the value of the style variable to the attribute’s value. If it is using an automatic style, we have to look up which style the automatic style is based on. This information is stored in the @style:parent-style-name attribute of the automatic style, and if you remember, automatic styles are located in /office:document/office:automatic-styles/. We then output a <p> element that has the real style name as its class. Finally, we apply templates to the children of the current element, which will be mostly just text, and we know what happens to text by default: it just gets copied into the output document.

If you have a look at your test document in your browser now, it’ll look very nearly finished, as if we only need to add a few bits of polish here and there (a table of contents, dealing with links etc.). However, we have one big problem: if you look at the source code, you’ll see our html is still flat rather than nested properly.

A Firefox window showing the structure of our document
Inspecting the elements in Firefox reveals that our document is still not hierarchical.

Inside your Test Document and Starting Your Stylesheet

Let’s take a look inside our test file. Open it up in your text editor, and if you ignore all the info we’re not really interested in, you should be able to see that the LibreOffice document has the following basic structure (in simplified form):

 
<office:document>
	<office:automatic-styles>
		<style:style>Here are the automatic styles.</style:style>
		…
	</office:automatic-styles>
	<office:body>
		<office:text>
			<text:p text:style-name="Novel-Part-Title"/>
			<text:p text:style-name="Novel-Chapter-Title"/>	…
			<text:p text:style-name="Novel-Paragraph">Here is the text of our book.</text:p>
			…
		</office:text>
	</office:body>
</office:document>

All we need to do is transform it into something resembling (in simplified form) this:


<html>
	<head>
		<style>The styles that format our book.</style>
	</head>
	<body>
		<div id="Novel-Part">
			<div id="Novel-Chapter">
				<p>Here is the text of our book.</p>
				…
			</div>
		</div>
	</body>
</html>

As you can see, they aren’t a million miles away from each other. It is mostly a case of just renaming the elements. The big exception to this is that, as I have said before, the .odt file has a flat layout that we want to turn into a nice hierarchical one. We are also going to have to do a bit of work to get rid of the automatic styles.

Now, there are a lot of different ways we can accomplish the export of our novel with xslt. Luckily we don’t really need to consider the efficiency of our stylesheet as it will only be run once every few years!

Let’s get to writing our template. If you are writing your own, I’m sure you’re aware, but just in case, you need to do so in a text editor (like gedit) and, for the sake of clarity, it should have the file extension “.xsl”. I won’t go back over any other basics, as you can recap them in the Quick Intro to XSLT article if need be.

Right, the first thing we need to deal with are the namespaces. You might have noticed there are quite a lot of them mentioned in the LibreOffice document. Luckily we only need to deal with the ones that we are actually using. A quick look at the example document will show you that we need a minimum of three namespace prefixes for the input document: office, style and text (if there are any images to deal with we will also need a fourth, draw, and a fifth, xlink, if there are any links). The output file uses the default namespace for the xhtml elements so we don’t need to worry about that, but it will also use some extras from Amazon. These use the mbp namespace prefix. As far as I know, there isn’t an official namespace for it, so I just use “http://www.amazon.com”. That just leaves the xsl namespace itself.

Now, we don’t want all of these namespaces to show up in our output, so we need to exclude those that are only relevant to the stylesheet. We do that with the “exclude-result-prefixes” attribute.

If you are going to try this out in your browser, you need to take one further step. You need to tell it that it is getting html. You do this with the method attribute of the output element (remember to remove it again when your template is finished). However, if you are going to be using a command line parser to test with (such as xsltproc), you are probably better off leaving it at xml.

Altogether this gives us:


<?xml version="1.0"?>
<!-- First set up the namespaces. -->
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xlink="http://www.w3.org/1999/xlink"
	
	xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
	xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"	
	xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
	xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"

	xmlns:mbp="http://www.amazon.com"

	exclude-result-prefixes="xlink office style draw text">
		<xsl:output method="html" indent="yes"/> 
</xsl:stylesheet>

This is a fully functional stylesheet, though not a very useful one yet. Still, give it a quick test to check that it is all working. As we don’t have any templates defined, the default ones will be used. These just output the text content of all of the elements, so you should see all the text of your test document output in a long line.
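If you would rather run this check from the command line than in a browser, a minimal sketch with xsltproc might look like the following. It assumes xsltproc is installed and that your stylesheet and test document are called odtToXhtml.xsl and testBook.xml. The function name run_transform and the output file name are made up for illustration.

```shell
# Apply a stylesheet to a test document and write the result to a file.
# run_transform is a hypothetical helper name, not part of any tool.
run_transform() {
    stylesheet="$1"   # e.g. odtToXhtml.xsl
    input="$2"        # e.g. testBook.xml
    output="$3"       # e.g. testBook.html (assumed name)
    # xsltproc takes the stylesheet first and the document second;
    # -o names the file the result is written to.
    xsltproc -o "$output" "$stylesheet" "$input"
}
```

You would call it as `run_transform odtToXhtml.xsl testBook.xml testBook.html` and then open the output file in your text editor or browser.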

Getting a Test File

Note, quite a lot of time has passed between me writing this and having the time to polish it and get it online. During this period, Amazon has improved the file format used in kindles (they currently use KF8). This means that a lot of the limitations I mention no longer exist (at least for newer devices). They have also improved the tools used to create mobi files. One big difference is that you can now convert from epub. This has quite a few advantages, and future versions of this plugin will take this route, but for now, this method will still work.

Right, things are about to get a bit hairy. What we are going to do is write an export plugin for LibreOffice that will convert our novel to xhtml suitable for submission to the Amazon kindle store. The first thing we need to know is how the novel is currently laid out by LibreOffice. A simple way to do this is just to have a look at a .odt file in a text editor (such as gedit). You can either use my test document below or use your actual novel. In order to extract your novel’s text (using Ubuntu Linux) simply right click on it and open with Archive Manager. Select the content.xml file and extract it somewhere.

Unzip showing the contents of an .odt file
The contents of an .odt file. The text is stored in content.xml.
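An .odt file is an ordinary zip archive, so the same extraction can also be done from a terminal. The following is a minimal sketch, assuming the unzip command is available. The helper name extract_content and the file names in the usage line are placeholders, not anything defined by LibreOffice.

```shell
# Pull only content.xml out of an .odt file (which is a zip archive).
# extract_content is a hypothetical helper name used for illustration.
extract_content() {
    odt_file="$1"     # e.g. myNovel.odt (placeholder name)
    target_dir="$2"   # directory to extract into
    # -o overwrites an existing copy without prompting,
    # -d chooses the directory the file is extracted to.
    unzip -o "$odt_file" content.xml -d "$target_dir"
}
```

For example, `extract_content myNovel.odt extracted` should leave you with extracted/content.xml.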

Before you open it in your text editor, we need to make it a bit more human friendly. Currently all the text is in one very long line, which is a bit difficult to read and will probably exceed the maximum line length the text editor can deal with. To neaten it up, just pass it through tidy (if you can’t install it for some reason, you can use the online version of tidy). This will add line breaks and indentation that will make the file much more readable.


tidy -xml -i -o testBook.xml content.xml
 

You should now have a file called testBook.xml. Open it up (or open my test document below) in gedit (or the text editor of your choice).

A document for testing

You will see the file starts with the usual xml declarations and then has a load of namespace declarations followed by lots of style bookkeeping. I have added the stylesheet declaration for easy testing.

The xml and namespace declarations in test.xml

We’ll come back to these in a minute. The actual guts of your book start inside the <office:text> element, which is inside the <office:body> one.

The contents of content.xml
The text of our novel is in <office:text>.

One of the things you might be a bit surprised by is that quite a lot of your lovely styles seem to have gone missing, to be replaced with styles with names like “P1”. Don’t despair, this is normal. These weird styles are automatic styles. By default, LibreOffice will use automatic styles to make it easier for it to compare documents. You can turn them off in LibreOffice 5 with the catchily titled “Random number to improve accuracy of document comparison: Store it when changing the document” setting. Be aware though that this won’t delete any that already exist in a document. If you are planning on editing the plugin a lot, you will find it simpler to create a document that doesn’t use automatic styles. Otherwise, don’t worry too much about it; they just make the plugin a bit harder to read.

The reason we are looking at these files is that, when LibreOffice exports a file, it first converts it to almost this exact format in memory before applying the export transformation, so these files are effectively what our plugin needs to transform. The only difference is that <office:document-content> is replaced by <office:document>. I don’t know why. So if you are using your own novel’s text to test with, do a quick search and replace on “office:document-content”, replacing it simply with “office:document” (make sure to replace both opening and closing tags). Whilst you’re at it, add in a stylesheet declaration (it goes right after the xml declaration). We can then conveniently test our plugin as we write it by opening our test file in a browser.


<?xml-stylesheet type="text/xsl" href="odtToXhtml.xsl"?>

This assumes that the stylesheet we are about to write is going to be called “odtToXhtml.xsl” and is in the same directory as the test file. As of this writing, this works fine with Firefox on Linux but not Chrome. Other browsers/platforms are untested. Testing the transformation as we go in a browser ensures that, once the plugin is finished, we can then install it into LibreOffice confident that it is going to work as expected.
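The search and replace and the added stylesheet declaration can also be scripted. The sketch below uses sed and assumes that the xml declaration sits on the first line of the tidied file, which is worth checking in your own file before relying on it. The helper name prepare_test_file and the output file name in the usage line are made up for illustration.

```shell
# Turn an extracted content.xml into a file we can test in a browser:
#  1. rename office:document-content to office:document
#     (this catches both the opening and the closing tag)
#  2. append the xml-stylesheet declaration after line 1,
#     which is ASSUMED to hold the xml declaration
# prepare_test_file is a hypothetical helper name.
prepare_test_file() {
    input="$1"    # e.g. testBook.xml
    output="$2"   # e.g. test.xml (assumed name)
    # The appended text must start at the beginning of its own line.
    sed -e 's/office:document-content/office:document/g' \
        -e '1a\
<?xml-stylesheet type="text/xsl" href="odtToXhtml.xsl"?>' \
        "$input" > "$output"
}
```

For example, `prepare_test_file testBook.xml test.xml` writes the adjusted copy to test.xml and leaves the original file untouched.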

Let’s Get Pasting

Before we begin, it is a good idea to save our document (and to make sure you have autosave etc. set up sanely), so do that quickly, and then let’s get going. The new document should be ready for you to enter your book’s title: the Novel-FrontMatter-Title paragraph style should be selected and the page style set to Novel-Page-FrontMatter (if they’re not, select them). Take a deep breath, enter the title and hit enter. Automatically the cursor moves down a bit and the style changes to the Novel-FrontMatter-Author style. Enter your name, hit enter again, and you’ll notice that we are magically at the top of a new page thanks to the “page break after” setting of Novel-FrontMatter-Author. You’ll also notice that we have now switched to the Novel-FrontMatter-Copyright paragraph style. Enter all your copyright info (and ISBN). When you are done, hit enter one last time, and change the style of this empty paragraph to Novel-Forward-Title (or Novel-Part-Title or Novel-Chapter-Title, depending on how your book is structured), which will start a new page thanks to “page break before”.

The first page of the example book.

Foreword

When you changed the paragraph style, it will have triggered a change in page style to Novel-Page-Forward and reset the page count. To make use of this, click on the footer and change the paragraph style to Novel-Footer. Then click on “Insert->Fields->Page Number”; this will add a page number to all pages with the same page style (i.e. Novel-Page-Forward). When our page style changes in a minute to Novel-Page-First and again to Novel-Page, we will have to add them again. We don’t want a header for our foreword, so leave that blank.

Right, back to adding stuff. Enter the title of your foreword, hit enter and paste in the text. You can either paste it all in one go and then go back and change any paragraphs that begin sections to Novel-Forward-Section, or you can just paste them in section by section. Once you are done, hit enter a final time and change the style of the empty paragraph to Novel-Part-Title (or Novel-Chapter-Title if applicable).

The first page of the foreword with page numbers inserted.

Part One

If your foreword finished on a left hand page, you’ll notice that a new page has been started thanks to “page break before”, but if your foreword finished on a right hand page, a blank page has also been inserted. This is because our Novel-Part-Title style is set to “right only” so it skips the left hand page, leaving it blank. Our part title has also had its page style changed (to Novel-Page-Part). You’ll notice that this removes the page numbers from the bottom of the page.

Enter the part title and hit enter again. A new page is created and a blank page is inserted.

The part one title with the automatically inserted blank page to its left.

Zoom

At this point, I would change the page layout of LibreOffice (if you haven’t done so already) to make it easier to see what you are doing. Go to “View->Zoom->Zoom…” and choose “Columns : 2” and check “Book mode”. This will lay out your pages in pairs (the two columns bit) that are joined together (book mode) with the odd pages on the right and the even ones on the left (book mode again). It should look like you have opened up a printed copy of your book. This makes it easier to check that everything is to your liking and that you’re getting things in the right place. You also need to set a suitable Zoom Factor. This will vary depending on your monitor size and preference, but will ideally be small enough to fit both pages on the screen and big enough that you can still read the text. I like using 100% so I can see exactly what it is going to look like.

LibreOffice's Zoom Dialog
The zoom dialogue. Set columns to two, check Book mode and select your preferred Zoom factor.

Chapter One

Right, back to our chapter title. Before we move on, we need to change it from Novel-Chapter-First to Novel-Chapter-Very-First. This style will set our page numbers back to 1. So do that now. To check that it worked, repeat the procedure to insert page numbers that we used for the foreword (click the footer, change the paragraph style to Novel-Footer and then go “Insert->Fields->Page Number”). You should find that our first page is numbered page 1.

Ok, enter your first chapter title, hit enter, and we are ready to enter the text of your novel!

Like the foreword, you can either paste in the whole chapter in one go and then go back and change the style of the paragraph sections, or go section by section (you could actually paste in the whole novel in one go and just go through changing the styles over, but doing things that way is a bit error prone). Once you’ve finished the chapter, hit enter again and change this new empty paragraph to Novel-Chapter-Title. Before we do the rest of the chapters though, it’s a good idea to add the page numbers and the header because, if there are problems, this is where they are likely to be.

Header and Footer

The page numbers for the Novel-Page pages are done just as you did on the first page. You’ll see the page numbers carry on from the first page with 2 (it doesn’t matter which of the Novel-Page pages you add it to, as it will be added to them all).

To add the header, click on a left page header, change its style to Novel-Header-Left and enter your name. On the right hand page (but not the one with the chapter title), click the header, change the style to Novel-Header-Right and type the novel’s title (or the chapter title or whatever you are using). Now all Novel-Page pages will have the same header.

Rinse and Repeat and About

That is pretty much all there is to it. Now it is just a case of repeating the procedure for the rest of your chapters and parts. When the last paragraph of the last chapter has been done, hit enter one more time and, this time, change this empty paragraph into a Novel-About-First. This will start a new page on the next left hand side page and is where you can enter some info about yourself.

Getting Rid of Extra Lines

If you pasted in large sections of your book, we have one more small bit of housekeeping to do. We need to get rid of the empty lines. Annoyingly and confusingly, LibreOffice only uses “\n” to mean newline in the “Replace” section of its “Find & Replace” dialogue. In the search section, it means Line Break, which is similar to a newline but different. Anyway, to get round this we need to use a couple of new regex characters: “^”, which means only match at the beginning of a paragraph, and “$”, which means only match at the end of a paragraph. Putting the two together (“^$”) will find us any empty paragraphs. So bring up the Find & Replace dialogue (“Edit->Find & Replace”), click the “Other options” drop down and check “Regular expressions”. Enter “^$” in the search box, leave the replace box empty, and then click “Replace All”.

LibreOffice's find and replace dialog
The Find & replace dialogue. Make sure to check Regular expressions.
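To see what the two anchors do outside LibreOffice, here is a rough command-line analogy using grep. Bear in mind the assumption: grep works on the lines of a plain text file rather than on the paragraphs of a document, so this only illustrates the pattern and says nothing about LibreOffice's own behaviour. The helper name count_empty_lines is made up.

```shell
# Count the empty lines in a plain text file.
# "^" anchors the match at the start of a line and "$" at the end,
# so "^$" only matches lines with nothing in between.
# count_empty_lines is a hypothetical helper name.
count_empty_lines() {
    grep -c '^$' "$1"
}
```

Running `count_empty_lines chapter.txt` on a plain text copy of a chapter (chapter.txt being a placeholder name) prints how many completely empty lines it contains.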

The End

And that’s it. You’re done! Hit save, and then export to pdf (“File->Export to PDF”, selecting “Archive PDF/A-1a”, as this embeds the fonts in the document). Upload the exported pdf to CreateSpace and you can have a copy of your book in your hands in a day or two. It is probably a good idea to check it quite thoroughly first, though, to make sure you haven’t made any mistakes, but well done!

Once you have checked it all, you can also now export your novel to .doc (using “File->Save As” and selecting .doc as the format) and submit it to Smashwords.

Whilst you wait for your book to get to you from CreateSpace and the royalties to start rolling in from Smashwords, let’s set up this export plugin for the kindle.