DocBook - Tutorial

Lars Vogel

Version 3.7

11.12.2014

DocBook

This tutorial explains how to write DocBook files in Eclipse and how to convert these files into various output formats, e.g., to HTML and PDF. It also explains how to configure XInclude to divide the information into different source files.

This tutorial uses DocBook 4.5, the Saxon XSLT processor in version 6.5.5, Eclipse 4.3 and the XSLT stylesheets in version 1.77.1.


Table of Contents

1. Introduction to DocBook
1.1. Overview
1.2. Document classes
1.3. DocBook versions
1.4. DocBook Example
2. The required toolset
3. Installation
3.1. Eclipse
3.2. Docbook and Stylesheets
3.3. XSL processor
3.4. Issues
4. Tutorial: convert Docbook to HTML5
4.1. Project Setup
4.2. Write your first DocBook document
4.3. Use Ant to convert DocBook to HTML5
5. Convert Docbook to plain text
6. DocBook Tags
6.1. Markup
6.2. Warnings
6.3. Line break
6.4. References to other elements
6.5. Legal Notice
6.6. Index
6.7. Tables
6.8. Lists
6.9. Links
6.10. Graphics
6.11. Menus
6.12. Keyboard Shortcuts
7. Creating epub
7.1. Overview of EPUB
7.2. Creating EPUB files with Apache Ant
8. Create pdf output
8.1. Overview
8.2. Installation
8.3. Define the Ant Task
9. Influencing the output result
9.1. HTML Parameters
9.2. PDF Parameters
9.3. Add content into the HTML output
10. Advanced Features
10.1. Syntax Highlighting
10.2. Remove certain parts
10.3. Using own stylesheets
11. Visual Editor for XML (Vex)
11.1. What is Vex
11.2. Installation
11.3. Usage
12. LanguageTool
12.1. What is LanguageTool?
12.2. Installation
12.3. Usage
13. Using XInclude with Eclipse XSL
13.1. Overview
13.2. Eclipse XSL Tools
13.3. Using the XInclude ant task
14. Support this website
14.1. Thank you
14.2. Questions and Discussion
15. Links and Literature

1. Introduction to DocBook

1.1. Overview

DocBook is a standard for writing structured, well-formatted documents. DocBook files are plain-text XML files.

For further processing, DocBook files are transformed into other output formats. This is typically done via XSLT (Extensible Stylesheet Language Transformations).

XSLT stylesheets for converting DocBook into common output formats, e.g., HTML or PDF, are available.

1.2. Document classes

DocBook has two main document classes, the book class and the article class.

  • Article: Used for writing technical articles. The main tag is <article>. The first example in this tutorial uses an article.

  • Book: Used for longer documents. The main tag is <book>. In addition to the sections available in articles, you can also use the <chapter> tag and the <part> tag.
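As a rough sketch of how these tags nest, the following skeleton shows a book with one part, one chapter and one section. All titles and texts are placeholders.

```xml
<book>
  <title>Example Book</title>
  <!-- A part groups several chapters -->
  <part>
    <title>First Part</title>
    <chapter>
      <title>First Chapter</title>
      <!-- Sections work the same way in books and articles -->
      <section>
        <title>First Section</title>
        <para>Text of the section.</para>
      </section>
    </chapter>
  </part>
</book>
```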

1.3. DocBook versions

DocBook is currently available in two major versions: 4.5 and 5.0. This tutorial is based on version 4.5, which still seems to be heavily used.

1.4. DocBook Example

The following listing shows an example of a DocBook article.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<article>
  <title>Docbook Article Example</title>
  <articleinfo>
    <title>DocBook Intro</title>
    <author>
      <firstname>Lars</firstname>
      <surname>Vogel</surname>
    </author>
    <volumenum>1234</volumenum>
  </articleinfo>
  <!-- Articles are structured with sections; <chapter> is only valid inside a book -->
  <section>
    <title>This is the first section</title>
    <section>
      <title>First subsection</title>
      <para>Test</para>
      <section>
        <title>First sub-subsection</title>
        <para>Subsection1</para>
      </section>
      <section>
        <title>Second sub-subsection</title>
        <para>Subsection2</para>
      </section>
    </section>
    <section>
      <title>Second subsection</title>
      <para>Other random text</para>
      <para>
        <mediaobject>
          <imageobject>
            <imagedata fileref="images/title.png"/>
          </imageobject>
          <textobject>
            <phrase>Image description</phrase>
          </textobject>
        </mediaobject>
      </para>
    </section>
  </section>
  <section>
    <title>This is the second section</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </section>
</article>

The following listing shows an example of a DocBook book.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text.</para>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </chapter>
</book>

The DOCTYPE declaration at the top of the file points to the location of the DTD file from the DocBook download.
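As an alternative sketch, the system identifier can also point to the DTD published by OASIS instead of a local copy. This assumes that the machine running the parser has network access, or an XML catalog that maps the URL to a local file. The setup in this tutorial does not require either.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Assumption: network access or an XML catalog is available to resolve this URL -->
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article>
  <title>Docbook Article Example</title>
  <para>Test</para>
</article>
```

A local copy, as used in this tutorial, keeps the build independent of the network.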

2. The required toolset

To create DocBook files and to convert them into other formats, you need the following tools.

  • The DocBook DTD which defines the structure of a DocBook document.

  • An XSLT stylesheet to convert your DocBook file into another format.

  • An XSLT processor.

In this tutorial you use the Eclipse IDE as an XML editor, Saxon as the XSLT processor and Apache Ant to run the XSLT transformation.

3. Installation

3.1. Eclipse

Install Eclipse. See Eclipse IDE for installing and using Eclipse.

Eclipse has Apache Ant integrated. Therefore, no additional installation for Ant is required.

3.2. Docbook and Stylesheets

Download the DocBook DTD in version 4.5 and the latest version of the XSLT stylesheets.

DocBook DTD in version 4.5

XSLT stylesheets

3.3. XSL processor

The download link for Saxon is: http://saxon.sourceforge.net/.

Download version 6.5.5, as newer Saxon versions do not work well with DocBook 4.5. Saxon 9 is an XSLT 2.0 processor, whereas the current official version of the XSL stylesheets is based on XSLT 1.0.

Download the saxon.zip file. You need the saxon.jar file later.
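By default, Ant's <xslt> task uses whatever XSLT processor the Java runtime provides. A minimal sketch for requesting Saxon 6.5.x explicitly is to name its transformer factory class, com.icl.saxon.TransformerFactoryImpl, in a nested <factory> element. The property and path names are the ones used in the build file later in this tutorial. Whether the explicit factory is needed in your setup is an assumption you should verify.

```xml
<xslt style="${xhtml5.stylesheet}" extension=".html"
  basedir="${input.dir}" destdir="${output.dir}">
  <include name="**/*book.xml" />
  <!-- Ask Ant to use the Saxon 6.5.x transformer factory found in lib/saxon.jar -->
  <factory name="com.icl.saxon.TransformerFactoryImpl" />
  <classpath refid="saxon.class.path" />
</xslt>
```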

3.4. Issues

Sometimes running the XSLT conversion results in the following error message.

javax.xml.transform.TransformerConfigurationException: java.net.MalformedURLException: no protocol: ../common/entities.ent 

In this case try adding the xerces-j XML parser to your build path and see if that resolves the error.
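One possible way to do this in an Ant build file is to add the Xerces-J JAR files to the class path that is handed to the <xslt> task. This is only a sketch: it assumes that you copied xercesImpl.jar and xml-apis.jar from the Xerces-J distribution into the lib folder, and it is not guaranteed to resolve the error in every setup.

```xml
<path id="saxon.class.path">
  <pathelement location="lib/saxon.jar" />
  <!-- Assumption: these JAR files were copied from the Xerces-J download into lib -->
  <pathelement location="lib/xercesImpl.jar" />
  <pathelement location="lib/xml-apis.jar" />
</path>
```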

4. Tutorial: convert Docbook to HTML5

The following tutorial describes how you can use Eclipse to convert a DocBook input file into HTML output using Apache Ant.

4.1. Project Setup

In Eclipse, select File → New → Project from the menu and choose the General → Project entry from the proposed list.

Name the new project de.vogella.docbook.first.

Create the following folder structure:

  • input

  • input/images

  • css

  • output

  • docbook-xml-4.5

  • docbook-xsl

  • lib

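Place the DocBook DTD and the XSLT stylesheets into the corresponding directories. The Ant build files in this tutorial locate the stylesheets via the docbook.xsl.dir property, so its value must match the name of your stylesheet folder.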

Copy the jar files from your Saxon download into the lib folder.

4.2. Write your first DocBook document

In your input folder, create a file called book.xml with the following content.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text.</para>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </chapter>
</book>

We also want to use images in our example. Place a PNG graphic called title.png in the input/images folder.

The ../docbook-xml-4.5/docbookx.dtd path in the DOCTYPE declaration corresponds to the directory you created earlier during the project setup.

Create the following CSS file called style.css in the css folder.

h1 {
  text-transform: uppercase;

} 

4.3. Use Ant to convert DocBook to HTML5

Create the following buildhtml.xml file in your project directory.

<?xml version="1.0"?>
<!-- - Author: Lars Vogel -->
<project name="build HTML5" default="build-html">

  <description>
    Used to transform DocBook XML to HTML5 output
  </description>

  <!-- Define base properties -->
  <property name="input.dir" value="input" />
  <property name="output.dir" value="output" />
  <!-- Must match the name of the folder that contains the XSLT stylesheets -->
  <property name="docbook.xsl.dir" value="docbook-xsl" />
  <property name="xhtml5.stylesheet" value="${docbook.xsl.dir}/xhtml5/docbook.xsl" />

  <!-- Making saxon available -->
  <path id="saxon.class.path">
    <pathelement location="lib/saxon.jar" />
  </path>


  <!-- - target: usage -->
  <target name="usage" description="Prints help">
    <echo message="Use -projecthelp to get a list of the available targets." />
  </target>

  <!-- - target: clean -->
  <target name="clean" description="Cleans up generated files.">
    <delete dir="${output.dir}" />
  </target>

  <!-- - target: createtargetdir -->
  <target name="createtargetdir" description="Generate targetdir.">
    <mkdir dir="${output.dir}" />
    <mkdir dir="${output.dir}/images" />
  </target>


  <target name="build-html" depends="clean, createtargetdir"
    description="Generates HTML5 files">
    <echo message="Building HTML5 output" />

    <!-- Copy the stylesheet to the same directory as the HTML files -->
    <copy todir="${output.dir}">
      <fileset dir="css">
        <include name="style.css" />
      </fileset>
    </copy>

    <!-- Copy the images to the images subdirectory of the HTML output;
         forward slashes work on all platforms -->
    <copy todir="${output.dir}/images">
      <fileset dir="${input.dir}/images">
        <include name="title.png" />
      </fileset>
    </copy>
    <!-- Transfer to HTML -->
    <xslt style="${xhtml5.stylesheet}" extension=".html" basedir="${input.dir}"
      destdir="${output.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article*.xml" />
      <param name="html.stylesheet" expression="style.css" />
      <param name="docbook.css.source" expression="" />
      <param name="section.autolabel" expression="1" />
      <param name="make.clean.html" expression="1" />
      <outputproperty name="indent" value="yes" />
      <classpath refid="saxon.class.path" />
    </xslt>
  </target>

</project> 

Run the buildhtml.xml file by right-clicking it and selecting Run As → Ant Build.

Afterwards, check the output directory. It should contain the book.html file.

Congratulations, you created your first DocBook document and converted it into an HTML document.

5. Convert Docbook to plain text

A convenient way to convert DocBook files to plain text is to convert them to HTML first and then use the text browser Lynx to convert the HTML to text with the following command.

lynx -dump myfile.html > myfile.txt 

This way the text stays well structured, e.g., tables are rendered nicely.
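If you want to run this step from the Ant build as well, a target along the following lines could call Lynx via Ant's <exec> task. This is a sketch under several assumptions: lynx is installed and on the PATH, the target is added to the buildhtml.xml file from the previous chapter, and the target name build-text is made up for this example.

```xml
<!-- Hypothetical target: converts the generated HTML to plain text with Lynx -->
<target name="build-text" depends="build-html"
  description="Generates a plain text file from the HTML output">
  <!-- The output attribute redirects the standard output of lynx into a file -->
  <exec executable="lynx" output="${output.dir}/book.txt" failonerror="true">
    <arg value="-dump" />
    <arg file="${output.dir}/book.html" />
  </exec>
</target>
```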

6. DocBook Tags

So far you have used a very limited set of DocBook tags. This chapter presents more tags which you typically need in a DocBook document.

6.1. Markup

Table 1. Important DocBook tags

Tag | Explanation
<![CDATA[ SPECIAL_SIGN_HERE, e.g. & ]]> | Allows you to enter special characters into the text which would otherwise be interpreted as XML markup.
<programlisting> </programlisting> | Marks the text as source code. You can also specify the programming language of the listing, e.g., language="java" or language="xml". This information can, for example, be used for syntax highlighting.
<wordasword></wordasword> | Indicates a special word.
<parameter class='command'>/w</parameter> | Describes a parameter; class can be command, function or option.
<guilabel> </guilabel> | Label on a GUI.
<guibutton> </guibutton> | Button on a GUI.
<filename class="directory">/usr/bin</filename>, <filename>db2html</filename> | Directory or filename.
<emphasis> </emphasis> | Emphasizes the text.
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="example1.txt" /> | Includes example1.txt as text, so the file can contain tags, etc.
<ulink url="http://www.heise.de/newsticker">German IT News</ulink> | Inserts a hypertext link into the document.
&amp; | Creates the ampersand (&) sign. Can, for example, be used in links.
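To illustrate the two XInclude modes, the following sketch includes one file as XML and one file as text. The file names chapter1.xml and example1.txt are placeholders. Note that xi:include is not part of the DocBook 4.5 DTD, so an editor that validates against the DTD may flag it until the includes have been resolved.

```xml
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Docbook Book Example</title>
  <!-- Without parse="text" the file is included as XML, e.g., a complete chapter -->
  <xi:include href="chapter1.xml" />
  <chapter>
    <title>Listing</title>
    <!-- With parse="text" the file content is inserted as escaped text -->
    <programlisting language="java"><xi:include parse="text" href="example1.txt" /></programlisting>
  </chapter>
</book>
```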


6.2. Warnings

To define warning messages, DocBook provides several tags. The following list gives an overview of them.

  • tip

  • note

  • important

  • warning

  • caution

The following snippet shows how to use these.

<note>
  <title>This is a title</title>
  <para>Lalala</para>
</note> 

6.3. Line break

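You can create an explicit line break inside a command synopsis (<cmdsynopsis>) with the <sbr/> element. DocBook has no general-purpose line break element; for text in which line breaks matter, use <literallayout>.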
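As a small sketch: in DocBook 4.5, <sbr/> is defined for use inside a command synopsis, while <literallayout> preserves the line breaks of ordinary text. The command and the texts below are only placeholders.

```xml
<!-- sbr forces a line break within a command synopsis -->
<cmdsynopsis>
  <command>lynx</command>
  <arg choice="plain">-dump</arg>
  <sbr/>
  <arg choice="plain"><replaceable>myfile.html</replaceable></arg>
</cmdsynopsis>

<!-- literallayout keeps the line breaks exactly as written -->
<literallayout>First line
Second line</literallayout>
```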

6.4. References to other elements

You link to other elements in DocBook with <xref linkend="id_of_the_element"/>. The target element must define the matching id attribute.
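For example, the following sketch gives a section an id and refers to it from another section. The ids and texts are made up for this example. The stylesheets generate the link text from the title of the target.

```xml
<section id="installation">
  <title>Installation</title>
  <para>Install Eclipse first.</para>
</section>
<section id="usage">
  <title>Usage</title>
  <para>
    <!-- linkend must match the id attribute of the target element -->
    Make sure you completed <xref linkend="installation"/> before you continue.
  </para>
</section>
```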

6.5. Legal Notice

You can define legal notices in which you state the conditions for reproduction, such as traditional copyright or an open content license. These tags are only allowed inside meta-information elements such as <bookinfo>.

<bookinfo>
  <title>Docbook Book Example</title>
  <subtitle>Demonstrating Copyright Information</subtitle>
  <author>
    <firstname>Lars</firstname>
    <surname>Vogel</surname>
  </author>
  <copyright>
    <year>2013</year>
    <holder>Lars Vogel</holder>
  </copyright>
  <pubdate>01.10.2013</pubdate>
  <releaseinfo>Version 1.0</releaseinfo>
  <legalnotice>
    <para>
      ALL RIGHTS RESERVED. This book contains material protected
      under International and Federal Copyright Laws and Treaties. Any
      unauthorized reprint or use of this material is prohibited. No part
      of this book may be reproduced or transmitted in any form or by any
      means, electronic or mechanical, including photocopying, recording,
      or by any information storage and retrieval system without express
      written permission from the author.
    </para>
  </legalnotice>
</bookinfo> 

6.6. Index

You create an index in your document with the <index/> element.

To create index elements, you can use the following:

<indexterm>
  <primary></primary>
</indexterm> 

You can add a secondary index term.

<indexterm>
  <primary></primary>
  <secondary></secondary>
</indexterm> 

You can also place a reference to another index term.

<indexterm>
  <primary>Export</primary>
  <see>Deployment</see>
</indexterm> 
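The following sketch shows how these pieces can fit together: the index term is placed in the text it refers to, and the <index/> element marks the position where the stylesheets generate the index. The chapter and its texts are placeholders.

```xml
<chapter>
  <title>Deployment</title>
  <para>
    <!-- The index entry points to the place where the indexterm is located -->
    <indexterm>
      <primary>Deployment</primary>
      <secondary>Export wizard</secondary>
    </indexterm>
    You deploy the application via the export wizard.
  </para>
</chapter>
<!-- Typically placed at the end of the book -->
<index/>
```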

6.7. Tables

You can create a table with the following markup.

<table frame='all'>
  <title>Sample Table</title>
  <tgroup cols='2' align='left' colsep='1' rowsep='1'>
    <colspec colname='c1'  colwidth="1*"/>
    <colspec colname='c2'  colwidth="1*"/>
    <thead>
      <row>
        <entry>a4</entry>
        <entry>a5</entry>
      </row>
    </thead>
    <tfoot>
      <row>
        <entry>f4</entry>
        <entry>f5</entry>
      </row>
    </tfoot>
    <tbody>
      <row>
        <entry>b1</entry>
        <entry>b2</entry>
      </row>
      <row>
        <entry>d1</entry>
        <entry>d5</entry>
      </row>
    </tbody>
  </tgroup>
</table> 

6.8. Lists

You can create non-numbered lists like this:

<itemizedlist>
  <listitem>
    <para>Item1</para>
  </listitem>
  <listitem>
    <para>Item2</para>
  </listitem>
  <listitem>
    <para>Item3</para>
  </listitem>
  <listitem>
    <para>Item4</para>
  </listitem>
</itemizedlist> 

You can create numbered lists like this:

<orderedlist>
  <listitem>
    <para>This is a list entry</para>
  </listitem>
  <listitem>
    <para>This is another list entry</para>
  </listitem>
</orderedlist> 

6.9. Links

You can create links like this:

<para>
  We use the Ant integrated into Eclipse. See
  <ulink url="http://www.vogella.com/tutorials/ApacheAnt/article.html">
    Apache Ant Tutorial</ulink>
  for an introduction into Apache Ant.
</para> 

6.10. Graphics

DocBook does not restrict the graphic format you use, e.g., JPEG, PNG or SVG. You can include graphics via the following tags. The optional <phrase> element is used in the HTML output to define the mandatory "alt" attribute of the image.

<para>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/antview10.gif"/>
    </imageobject>
    <textobject>
      <phrase>A text for the graphic</phrase>
    </textobject>
  </mediaobject>
</para> 

You can also specify different graphics for different output formats.

<para>
  <mediaobject>
    <imageobject role="html">
      <imagedata fileref="images/antview10.gif" />
    </imageobject>
    <imageobject role="fo">
      <imagedata fileref="images/antview10.gif" />
    </imageobject>
    <textobject>
      <phrase>A text for the graphic</phrase>
    </textobject>
  </mediaobject>
</para> 

You can also embed a numbered figure with a title.

<figure>
  <title>Logo</title>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/logo.png"/>
    </imageobject>
    <textobject>
      <phrase>Our Company Logo</phrase>
    </textobject>
  </mediaobject>
</figure> 

6.11. Menus

To define menu paths such as File → New Project, use the following:

<menuchoice>
  <guimenu>File</guimenu>
  <guisubmenu>New Project</guisubmenu>
</menuchoice> 
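For a menu path with three levels, such as File → New → Project, a sketch could add a <guimenuitem> for the final entry:

```xml
<menuchoice>
  <guimenu>File</guimenu>
  <guisubmenu>New</guisubmenu>
  <!-- guimenuitem marks the entry that is finally selected -->
  <guimenuitem>Project</guimenuitem>
</menuchoice>
```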

6.12. Keyboard Shortcuts

To define keyboard shortcuts such as Ctrl+Space, use the following:

<keycombo>
  <keycap>Ctrl</keycap>
  <keycap>Space</keycap>
</keycombo> 

7. Creating epub

7.1. Overview of EPUB

EPUB is a format for electronic books defined by the International Digital Publishing Forum (IDPF). EPUB is based on XHTML and supports styling via CSS. An EPUB file is a ZIP file with predefined content. The ZIP file must contain a META-INF folder which contains a container.xml file. This file contains a pointer to the OEBPS/content.opf file. The content.opf file contains the meta information about the book and points to the content pages, which are defined as HTML pages.
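For illustration, a container.xml file typically looks like the following sketch. The DocBook EPUB stylesheets generate this file, so you normally do not write it by hand. The full-path value assumes the OEBPS/content.opf location described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <!-- Points the reading system to the package file of the book -->
    <rootfile full-path="OEBPS/content.opf"
      media-type="application/oebps-package+xml" />
  </rootfiles>
</container>
```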

The DocBook XSLT stylesheets support conversion into EPUB. This conversion is based on the XHTML stylesheets and therefore supports the same parameters as the HTML conversion. The final EPUB document also requires an additional file called mimetype with predefined content, next to the OEBPS and META-INF folders. The XSLT transformation creates neither the mimetype file nor the ZIP file automatically. We will use Apache Ant to create them.

To validate an EPUB file, you can use the JAR file from the EPubCheck Validation Tool. Download the latest 1.x version and put it into the classpath of your Ant file. Make sure that you also extract the lib folder included in the ZIP file so that it keeps its location relative to the epub*.jar. After the conversion you can validate your EPUB file via the following command. We will also include this check in our Ant build.

java -jar epubcheck-1.2.jar book.epub 

7.2. Creating EPUB files with Apache Ant

The following example is based on the same file and directory structure as the other examples. Create the following book.xml file in your input directory.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <bookinfo>
    <title>DocBook Intro</title>
    <author>
      <firstname>Lars</firstname>
      <surname>Vogel</surname>
    </author>
  </bookinfo>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text. </para>
      <para>
        <mediaobject>
          <imageobject>
            <imagedata fileref="images/vogella_current_logo.png"/>
          </imageobject>
          <textobject>
            <phrase> </phrase>
          </textobject>
        </mediaobject>
      </para>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text </para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More... </para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla </para>
    </section>
  </chapter>
</book>

This book refers to the vogella_current_logo.png image in the images folder. Either create such an image or delete the <mediaobject> element that refers to it.

Also create a folder called epubinput which contains a file called mimetype. This file should have only the following content:

application/epub+zip 

The following Ant build.xml file will create EPUB output.

<?xml version="1.0"?>
<!--
  - Author:  Lars Vogel
  -->

<project name="docbook-src" default="build-epub">
  <description>
            This Ant file is used to transform DocBook XML to epub output
    </description>

  <!--
      - Configure basic properties that will be used in the file.
      -->

  <property name="input.dir" value="input" />
  <property name="output.dir" value="output" />
  <!-- Must match the name of the folder that contains the XSLT stylesheets -->
  <property name="docbook.xsl.dir" value="docbook-xsl" />

  <property name="epub.stylesheet" value="${docbook.xsl.dir}/epub/docbook.xsl" />

  <property name="destfilename" value="book" />

  <!-- Making saxon available -->
  <path id="saxon.class.path">
    <pathelement location="lib/saxon.jar" />
  </path>

  <property name="epubcheck.jar" value="lib/epubcheck/epubcheck-1.2.jar" />
  <!--
      - target:  usage
      -->

  <target name="usage" description="Prints the Ant build.xml usage">
    <echo message="Use -projecthelp to get a list of the available targets." />
  </target>

  <!--
      - target:  clean
      -->

  <target name="clean" description="Cleans up generated files.">
    <delete dir="${output.dir}" />
  </target>

  <!--
      - target:  depends
      -->

  <target name="depends">
    <mkdir dir="${output.dir}" />
    <mkdir dir="${output.dir}/tmp" />
    <copy todir="${output.dir}/tmp">
      <fileset dir="epubinput">
        <include name="mimetype" />
      </fileset>
    </copy>
    <copy todir="${output.dir}/tmp/OEBPS/images">
      <fileset dir="images">
        <include name="vogella_current_logo.png" />
      </fileset>
    </copy>
  </target>

  <!--
      - target:  build-html
      - description:  Iterates through a directory and transforms
      -     .xml files into .html files using the DocBook XSL.
      -->

  <!--
     - target:  build-epub
     - description:  Iterates through a directory and transforms
     -     .xml files into .epub files using the DocBook XSL.
   -->
  <target name="build-epub" depends="clean, depends" description="Generates EPUB files from DocBook XML">

    <xslt style="${epub.stylesheet}" extension=".html" 
      basedir="${input.dir}" destdir="${output.dir}/tmp">
      <include name="**/*book.xml" />
      <param name="epub.stylesheet" expression="style.css" />
      <!-- The following parameter do not work currently
      
      <param name="epub.metainf.dir" expression="${output.dir}/META-INF/" />
      <param name="epub.oebps.dir" expression="${output.dir}/OEBPS/" />
      -->
      <classpath refid="saxon.class.path" />
    </xslt>

    <copy todir="${output.dir}/tmp/OEBPS">
      <fileset dir="OEBPS">
      </fileset>
    </copy>

    <copy todir="${output.dir}/tmp/META-INF">
      <fileset dir="META-INF">
      </fileset>
    </copy>

    <!-- Don't know how to avoid the generation of "${destfilename}.html" by Saxon -->
    <delete file="${output.dir}/tmp/book.html" />

    <echo message="Generating book.epub" level="info" />

    <!-- We create temporary zips so that mimetype is the first entry in the final zip -->

    <zip destfile="${output.dir}/temp.mimetype" basedir="${output.dir}/tmp" compress="false" includes="mimetype" />
    <zip destfile="${output.dir}/temp.zip" basedir="${output.dir}/tmp/" level="9" compress="true" excludes="mimetype" includes="OEBPS/** META-INF/**" />
    <zip destfile="${output.dir}/book.epub" update="true" keepcompression="true" encoding="UTF-8" excludes="*.html">
      <zipfileset src="${output.dir}/temp.mimetype" />
      <zipfileset src="${output.dir}/temp.zip" />
    </zip>

    <!-- These directories have to be deleted; it would be nicer to generate them in the tmp output dir -->
    <delete dir="./OEBPS" />
    <delete dir="./META-INF" />

    <!-- Make sure the epubcheck folder has a lib subfolder containing saxon.jar and jing.jar -->
    <epub.check epub="book" />
    
  </target>

  <!-- epub check macro definition -->
  <macrodef name="epub.check" description="Check an epub">
    <attribute name="epub" description="Name of the EPUB" />
    <sequential>
      <java jar="${epubcheck.jar}" fork="true">
        <arg value="${output.dir}/@{epub}.epub" />
      </java>
    </sequential>
  </macrodef>
</project> 

I personally see the following issues. Please let me know if you have a solution for them.

  • Target location of META-INF/ can be specified via epub.metainf.dir, but if you do so this path is also used in the container.xml.

  • Same issue with epub.oebps.dir.

You can find another example Ant file in the Ant for EPUB blog entry from Tony Graham.

8. Create pdf output

8.1. Overview

You can convert DocBook to XSL-FO via the DocBook XSL stylesheets. XSL-FO stands for XSL Formatting Objects and is an XML standard which is optimized for print media. XSL-FO can then be translated into PDF via the Apache FOP library.

8.2. Installation

In addition to the existing setup you also require the Apache FOP library. Download the binary FOP distribution from http://xmlgraphics.apache.org/fop/.

Copy all the JAR files from the FOP distribution into your lib directory and add them to the Ant build path. See the Apache Ant Tutorial on how to modify the Ant build path.
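If you prefer not to change the Ant build path in Eclipse, a possible alternative is to hand the JAR files to the task definition directly. This sketch assumes that all FOP JAR files are located in the lib folder; the path id fop.class.path is made up for this example.

```xml
<!-- Assumption: all JAR files of the FOP distribution were copied into lib -->
<path id="fop.class.path">
  <fileset dir="lib">
    <include name="*.jar" />
  </fileset>
</path>

<!-- classpathref tells Ant where to find the class that implements the task -->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop"
  classpathref="fop.class.path" />
```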

8.3. Define the Ant Task

You have to define the FOP task in your Ant build file and then call it. The following snippet shows how to define the task and how to call it. The second listing is a full example Ant build.xml file.

<!--
  - Defines the Ant task for FOP
-->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />


<!-- Transformation into pdf
  - Two steps
  - 1.) First create the FO files 
  - 2.) Then transform the FO files into pdf files
-->

<!--
  - target:  build-pdf
  - description:  Iterates through a directory and transforms
  -     .xml files into .fo files using the DocBook XSL.
-->
<target name="build-pdf" depends="depends, xinclude"
  description="PDF from DocBook XML">
  <!-- Convert DocBook Files into FO -->
  <xslt style="${fo.stylesheet}" extension=".fo" basedir="${src.tmp}"
    destdir="${src.tmp}">
    <include name="**/*book.xml" />
    <include name="**/*article.xml" />
    <param name="section.autolabel" expression="1" />
  </xslt>
  <!-- Convert FO Files into pdf -->
  <fop format="application/pdf" outdir="${doc.dir}">
    <fileset dir="${src.tmp}">
      <include name="**/*.fo" />
    </fileset>
  </fop>
</target> 

<?xml version="1.0"?>
<!--
  - Author: Lars Vogel
-->
<project name="docbook-src" default="all">

  <description>
    This Ant build.xml file is used to transform DocBook XML to
    various output formats
  </description>

  <!--
    - Defines the ant task for xinclude
  -->

  <taskdef name="xinclude" classname="de.vogella.xinclude.XIncludeTask" />

  <!--
    - Defines the Ant task for FOP
  -->
  <taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />

  <!--
    - Configure basic properties that will be used in the file.
  -->


  <property name="javahelp.dir" value="${basedir}/../Documentation/output/vogella/javahelp" />
  <property name="src" value="${basedir}/documentation" />
  <property name="output.dir" value="${basedir}/../Documentation/output/vogella/articles" />
  <property name="output.tmp" value="${basedir}/output.tmp" />
  <property name="lib" value="${basedir}/lib/" />
  <property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />
  <property name="xinclude.lib.dir" value="${basedir}/lib/" />

  <!--
    - The different stylesheets which are used for the transformation
  -->
  <property name="eclipse.stylesheet" value="${docbook.xsl.dir}/eclipse/eclipse.xsl" />
  <property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
  <property name="fo.stylesheet" value="${docbook.xsl.dir}/fo/docbook.xsl" />
  <property name="javahelp.stylesheet" value="${docbook.xsl.dir}/javahelp/javahelp.xsl" />



  <property name="chunk-html.stylesheet" value="${docbook.xsl.dir}/html/chunk.xsl" />




  <!--
    - target: usage
  -->
  <target name="usage" description="Prints the Ant build.xml usage">
    <echo message="Use -projecthelp to get a list of the available targets." />
  </target>

  <!--
    - target: clean
  -->
  <target name="clean" description="Cleans up generated files.">
    <delete dir="${output.dir}" />
  </target>

  <!--
    - target: depends
  -->
  <target name="depends">
    <mkdir dir="${output.dir}" />
  </target>

  <!--
    - target: copy
    - Copies the images from the subdirectories to the target folder
  -->
  <target name="copy">
    <echo message="Copy the images" />
    <copy todir="${output.dir}">
      <fileset dir="${src}">
        <include name="**/images/*.*" />
      </fileset>
    </copy>
  </target>


  <!--
    - target: xinclude
    - description: Creates one combined temporary file for each set of input files.
    - The combined file will then be processed via different ant tasks
  -->
  <target name="xinclude">

    <xinclude in="${src}/DocBook/article.xml" out="${output.tmp}/DocBook/article.xml" />

    <xinclude in="${src}/JavaConventions/article.xml" out="${output.tmp}/JavaConventions/article.xml" />

    <xinclude in="${src}/JUnit/article.xml" out="${output.tmp}/JUnit/article.xml" />

    <xinclude in="${src}/EclipseReview/article.xml" out="${output.tmp}/EclipseReview/article.xml" />

    <xinclude in="${src}/HTML/article.xml" out="${output.tmp}/HTML/article.xml" />

    <xinclude in="${src}/Eclipse/article.xml" out="${output.tmp}/Eclipse/article.xml" />

    <xinclude in="${src}/Logging/article.xml" out="${output.tmp}/Logging/article.xml" />
    <!--
    <xinclude in="${src}/ant/article.xml" out="${src.tmp}/ant/article.xml" />
    -->

  </target>


  <!--
    - target: build-html
    - description: Iterates through a directory and transforms
    - .xml files into .html files using the DocBook XSL.
  -->
  <target name="build-html" depends="depends, xinclude" description="Generates HTML files from DocBook XML">
    <xslt style="${html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
      <param name="html.stylesheet" expression="styles.css" />
      <param name="section.autolabel" expression="1" />
      <param name="html.cleanup" expression="1" />
      <outputproperty name="indent" value="yes" />
    </xslt>
    <!-- Copy the stylesheet to the same directory as the HTML files -->
    <copy todir="${output.dir}">
      <fileset dir="lib">
        <include name="styles.css" />
      </fileset>
    </copy>
  </target>

  <!--
    - target: build-javahelp
    - description: Iterates through a directory and transforms
    - .xml files into .html files using the DocBook XSL.
    -->
  <target name="build-javahelp" depends="depends, xinclude" description="JavaHelp from DocBook XML">
    <xslt style="${javahelp.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${javahelp.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
      <outputproperty name="indent" value="yes" />
    </xslt>
  </target>





  <!--
    - target: build-chunks
    - description: Iterates through a directory and transforms
    - .xml files into separate .html files using the DocBook XSL.
  -->
  <target name="build-chunks" depends="depends, xinclude" description="chunk HTML from DocBook XML">
    <!-- Use the chunking stylesheet so that the output is split into several HTML files -->
    <xslt style="${chunk-html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
      <param name="html.stylesheet" expression="styles.css" />
      <param name="section.autolabel" expression="1" />
      <param name="html.cleanup" expression="1" />
      <param name="chunk.first.sections" expression="1" />
    </xslt>
    <!-- Copy the stylesheet to the same directory as the HTML files -->
    <copy todir="${output.dir}">
      <fileset dir="lib">
        <include name="styles.css" />
      </fileset>
    </copy>
  </target>


  <!-- Transformation into pdf
    - Two steps
    - 1.) First create the FO files
    - 2.) Then transform the FO files into pdf files
  -->

  <!--
    - target: build-pdf
    - description: Iterates through a directory and transforms
    - .xml files into fo files using the DocBook XSL.
    - Relativebase is set to true to enable FOP to find the graphics which are included
    - in the images directory
  -->
  <target name="build-pdf" depends="depends, xinclude" description="PDF from DocBook XML">
    <!-- Convert DocBook Files into FO -->
    <xslt style="${fo.stylesheet}" extension=".fo" basedir="${output.tmp}" destdir="${output.tmp}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
      <param name="section.autolabel" expression="1" />
    </xslt>
    <!-- Convert FO Files into pdf -->
    <fop format="application/pdf" outdir="${output.dir}" relativebase="true">
      <fileset dir="${output.tmp}">
        <include name="**/*.fo" />
      </fileset>
    </fop>
  </target>

  <!--
    - target: build-eclipse
    - description: Iterates through a directory and transforms
    - .xml files into Eclipse help files using the DocBook XSL.
  -->
  <target name="build-eclipse" depends="depends, xinclude" description="Eclipse help from DocBook XML">
    <xslt style="${eclipse.stylesheet}" basedir="${output.tmp}" destdir="${output.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
    </xslt>
  </target>

  <target name="all" depends="copy, build-html, build-pdf, build-chunks, build-eclipse">
  </target>

</project> 

9. Influencing the output result

The XSLT stylesheets have several parameters which can influence the result of the conversion.

9.1. HTML Parameters

You can find all HTML-related parameters at http://docbook.sourceforge.net/release/xsl/current/doc/html/.

Table 2. HTML Parameters

Parameter                                        Description
name="section.autolabel" expression="1"          Turns on automatic numbering of sections (1. Title, 1.1. Subtitle, etc.).
name="chapter.autolabel" expression="1"          Turns on automatic numbering of chapters.
name="html.stylesheet" expression="styles.css"   Defines the CSS stylesheet which the generated HTML should reference.
name="html.cleanup" expression="1"               Tries to clean up the generated HTML code for better readability.
name="chunk.first.sections" expression="0"       Controls whether the first section of each chapter or article gets its own chunk (non-zero) or stays in the chunk of its parent (0). Only relevant for chunked output. [TODO: Does not work yet]
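
As an alternative to passing these parameters from Ant, they can be set once in a small customization stylesheet. The following is only a minimal sketch; the relative import path is an assumption and has to match the location of your DocBook XSL distribution.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- Assumed location of the stock HTML stylesheet; adjust to your setup -->
  <xsl:import href="../docbook-xsl-1.77.1/html/docbook.xsl" />

  <!-- Same settings as in the table above -->
  <xsl:param name="section.autolabel" select="1" />
  <xsl:param name="chapter.autolabel" select="1" />
  <!-- String values need an extra pair of quotes inside select -->
  <xsl:param name="html.stylesheet" select="'styles.css'" />
  <xsl:param name="html.cleanup" select="1" />

</xsl:stylesheet>
```

If you use such a file, point the style attribute of the xslt task to it instead of the stock stylesheet.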


9.2. PDF Parameters

You can find all FO / PDF-related parameters at http://docbook.sourceforge.net/release/xsl/current/doc/fo/.

Table 3. PDF Parameters

Parameter                                        Description
name="section.autolabel" expression="1"          Turns on automatic numbering of sections (1. Title, 1.1. Subtitle, etc.).
name="chapter.autolabel" expression="1"          Turns on automatic numbering of chapters.
name="html.stylesheet" expression="styles.css"   Defines the CSS stylesheet which should be used. This is an HTML parameter; the FO stylesheets do not use it.
name="html.cleanup" expression="1"               Tries to clean up the HTML code for better readability. This is an HTML parameter; the FO stylesheets do not use it.
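
The following sketch shows how FO-specific parameters could be passed in the first step of the PDF conversion. It reuses the fo.stylesheet and output.tmp properties from the build file above; paper.type and fop1.extensions are parameters of the FO stylesheets, and the chosen values are only examples.

```xml
<!-- Sketch only: convert DocBook to FO with a few FO parameters set -->
<xslt style="${fo.stylesheet}" extension=".fo" basedir="${output.tmp}" destdir="${output.tmp}">
  <include name="**/*article.xml" />
  <param name="section.autolabel" expression="1" />
  <!-- Page size of the generated PDF -->
  <param name="paper.type" expression="A4" />
  <!-- Use the extensions for FOP 0.90 and later, e.g., for PDF bookmarks -->
  <param name="fop1.extensions" expression="1" />
</xslt>
```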


9.3. Add content into the HTML output

DocBook allows you to include external HTML files into the HTML output. You could use this, for example, to add JavaScript to the generated pages.

The following processing instruction includes the content of an HTML file.

<?dbhtml-include href="../../myadditonalcontent.html"?> 

See Inserting external HTML code for details.
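
The included file has to be well-formed XML. The following sketch shows what such a file could look like; the content and the script name are made up. To my understanding of the DocBook XSL stylesheets, an outer html element is treated as a wrapper and only its content is copied into the output, but please verify this with your stylesheet version.

```xml
<!-- myadditonalcontent.html: hypothetical content, must be well-formed XML -->
<html>
  <!-- Use an explicit closing tag for script, browsers expect it -->
  <script type="text/javascript" src="myscript.js"></script>
  <p>This paragraph is copied into the generated page.</p>
</html>
```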

10. Advanced Features

10.1. Syntax Highlighting

You can also enable syntax highlighting for source code listings. This requires a customization stylesheet layer, an external library (XSLTHL) and a configuration file. See Source Code Syntax Highlighting with DocBook for a description of the setup.

To change how the highlighting is done, you could adjust the following template file: your_xslt_installation_dir/html/highlight.xsl

To add highlighting for another language, create a language description file in the highlighting folder of your DocBook XSLT installation. For example, the following file, called bourne-hl.xml, defines syntax highlighting for the Bourne shell.

<?xml version="1.0" encoding="utf-8"?>
<!--

Syntax highlighting definition for SH

xslthl - XSLT Syntax Highlighting
http://sourceforge.net/projects/xslthl/
Copyright (C) 2010 Mathieu Malaterre

This software is provided 'as-is', without any express or implied
warranty.  In no event will the authors be held liable for any damages
arising from the use of this software.

Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not
   claim that you wrote the original software. If you use this software
   in a product, an acknowledgment in the product documentation would be
   appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
   misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.

-->
<highlighters>
  <highlighter type="oneline-comment">#</highlighter>
  <highlighter type="heredoc">
    <start>&lt;&lt;</start>
    <quote>'</quote>
    <quote>"</quote>
    <flag>-</flag>
    <noWhiteSpace />
    <looseTerminator />
  </highlighter>
  <highlighter type="string">
    <string>"</string>
    <escape>\</escape>
  </highlighter>
  <highlighter type="string">
    <string>'</string>
    <escape>\</escape>
    <spanNewLines />
  </highlighter>
  <highlighter type="hexnumber">
    <prefix>0x</prefix>
    <ignoreCase />
  </highlighter>
  <highlighter type="number">
    <point>.</point>
    <pointStarts />
    <ignoreCase />
  </highlighter>
  <highlighter type="keywords">
    <!-- reserved words -->
    <keyword>if</keyword>
    <keyword>then</keyword>
    <keyword>else</keyword>
    <keyword>elif</keyword>
    <keyword>fi</keyword>
    <keyword>case</keyword>
    <keyword>esac</keyword>
    <keyword>for</keyword>
    <keyword>while</keyword>
    <keyword>until</keyword>
    <keyword>do</keyword>
    <keyword>done</keyword>
    <!-- built-ins -->
    <keyword>exec</keyword>
    <keyword>shift</keyword>
    <keyword>exit</keyword>
    <keyword>times</keyword>
    <keyword>break</keyword>
    <keyword>export</keyword>
    <keyword>trap</keyword>
    <keyword>continue</keyword>
    <keyword>readonly</keyword>
    <keyword>wait</keyword>
    <keyword>eval</keyword>
    <keyword>return</keyword>
    <!-- other commands -->
    <keyword>cd</keyword>
    <keyword>echo</keyword>
    <keyword>hash</keyword>
    <keyword>pwd</keyword>
    <keyword>read</keyword>
    <keyword>set</keyword>
    <keyword>test</keyword>
    <keyword>type</keyword>
    <keyword>ulimit</keyword>
    <keyword>umask</keyword>
    <keyword>unset</keyword>
  </highlighter>
</highlighters> 

Then register the new file in the highlighting/xslthl-config.xml file, for example:

<?xml version="1.0" encoding="UTF-8"?>
<!-- 

xslthl - XSLT Syntax Highlighting
http://sourceforge.net/projects/xslthl/
Copyright (C) 2005-2008 Michal Molhanec, Jirka Kosek, Michiel Hendriks

This software is provided 'as-is', without any express or implied
warranty.  In no event will the authors be held liable for any damages
arising from the use of this software.

Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not
   claim that you wrote the original software. If you use this software
   in a product, an acknowledgment in the product documentation would be
   appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
   misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.

Michal Molhanec <mol1111 at users.sourceforge.net>
Jirka Kosek <kosek at users.sourceforge.net>
Michiel Hendriks <elmuerte at users.sourceforge.net>

-->
<xslthl-config>
  <highlighter id="java" file="java-hl.xml" />
  <highlighter id="delphi" file="delphi-hl.xml" />
  <highlighter id="pascal" file="delphi-hl.xml" />
  <highlighter id="ini" file="ini-hl.xml" />
  <highlighter id="php" file="php-hl.xml" />
  <highlighter id="myxml" file="myxml-hl.xml" />
  <highlighter id="m2" file="m2-hl.xml" />
  <highlighter id="tcl" file="tcl-hl.xml" />
  <highlighter id="c" file="c-hl.xml" />
  <highlighter id="cpp" file="cpp-hl.xml" />
  <highlighter id="csharp" file="csharp-hl.xml" />
  <highlighter id="python" file="python-hl.xml" />
  <highlighter id="ruby" file="ruby-hl.xml" />
  <highlighter id="perl" file="perl-hl.xml" />  
  <highlighter id="javascript" file="javascript-hl.xml" />
  <highlighter id="bourne" file="bourne-hl.xml" />
  <namespace prefix="xslthl" uri="http://xslthl.sf.net" />
</xslthl-config> 
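
Once the highlighter is registered, a listing selects it via the language attribute, whose value has to match the id from the configuration file. The following sketch assumes that syntax highlighting has been set up as described in the linked article, in particular that the highlight.source parameter is set to 1; the shell script itself is just an example.

```xml
<!-- language="bourne" refers to the highlighter id registered above -->
<programlisting language="bourne">#!/bin/sh
# print all arguments
for arg in "$@"
do
  echo "$arg"
done</programlisting>
```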

10.2. Remove certain parts

Sometimes you want to remove certain parts of your document before processing it. The following stylesheet removes every section element marked with role="wrapper". The wrapper element itself and its direct content, such as its title, are dropped, while the sections nested inside it are kept. All other content is copied unchanged.

Write the result to a temporary folder and run your real conversion on that folder.

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:output method="xml" />

  <xsl:template match="section[@role='wrapper']">
    <xsl:apply-templates select="section" />
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet> 
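
Such a filter step could be wired into the Ant build roughly as follows. This is a sketch under assumptions: the stylesheet file name remove-wrapper.xsl and the src and src.filtered properties are hypothetical and need to be defined in your build file.

```xml
<!-- Sketch: remove-wrapper.xsl, src and src.filtered are assumed names -->
<target name="remove-wrapper" description="Strips wrapper sections before the real conversion">
  <xslt style="${basedir}/remove-wrapper.xsl" extension=".xml" basedir="${src}" destdir="${src.filtered}">
    <include name="**/*article.xml" />
  </xslt>
</target>
```

The real conversion targets would then depend on this target and read their input from the filtered folder.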

10.3. Using own stylesheets

You can also create your own stylesheet which imports the default one and overrides the parts you want to change.

The following example shows such a custom stylesheet. It adds header and footer content from external HTML files, adds a favicon link to the HTML head and places a logo image before the title page of articles.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:import href="../docbook-xsl-1.77.1/html/docbook.xsl" />
  <xsl:template name="user.header.content">
    <xsl:variable name="codefile"
      select="document('../../../de.vogella.publishing/mystylesheets/headerstandalone.html',/)" />
    <xsl:copy-of select="$codefile/htmlcode/node()" />
  </xsl:template>
  <xsl:template name="user.footer.content">
    <xsl:variable name="codefile"
      select="document('../../../de.vogella.publishing/mystylesheets/footerstandalone.html',/)" />
    <xsl:copy-of select="$codefile/htmlcode/node()" />
  </xsl:template>

  <xsl:template name="user.head.content">
    <link rel="shortcut icon" href="http://www.vogella.com/favicon.ico" />
  </xsl:template>

  <xsl:template name="article.titlepage.before.recto">
    <xsl:if test="articleinfo">
      <div class="vogellalogo">
        <a rel="author" href="http://www.vogella.com/people/larsvogel.html">
          <img src="http://www.vogella.com/img/logo/preface.png" height="67"
            width="202" alt="About Lars Vogel" />
        </a>
      </div>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet> 
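
The stylesheet copies everything below the htmlcode root element of the external files. A header file could therefore look like the following sketch; only the htmlcode root element follows from the stylesheet above, the content is made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical headerstandalone.html; must be well-formed XML -->
<htmlcode>
  <div class="header">
    <a href="http://www.vogella.com/">Home</a>
  </div>
</htmlcode>
```

Because / is passed as the second argument of document(), the relative paths are resolved against the location of the DocBook source document and not against the stylesheet.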

11. Visual Editor for XML (Vex)

11.1. What is Vex

Vex is an Open Source plug-in for Eclipse that augments the IDE with a WYSIWYG XML editor. Since August 2008 the plug-in has been part of the Eclipse project.

11.2. Installation

You can install Vex via an Eclipse update site:

http://download.eclipse.org/vex/releases/1.0/

http://download.eclipse.org/vex/milestones/1.1/ 

11.3. Usage

To switch between the standard Eclipse XML editor and Vex, right-click on the file in the Project Explorer and select Visual XML Editor.

The plug-in is helpful as a quick preview, but it is still a work in progress. At the time of writing, content included via XInclude is ignored.

12. LanguageTool

12.1. What is LanguageTool?

LanguageTool is an Open Source style and grammar checker written in Java. You can replace the basic JDT spell-checker with an enhanced one that supports HunSpell dictionaries and comes with thousands of rules to detect grammar and style pitfalls as well as common typos. The Eclipse integration is a work in progress, but it is already useful as it filters XML tags and indicates problems in plain text with tooltips and red underlines.

12.2. Installation

You can install LanguageTool via an Eclipse update site:

http://download.vogella.com/p2/C-MASTER-Eclipse-LanguageTool/workspace/cx.ath.remisoft.languagetool.p2updatesite/target/repository/ 

Hacking

Development takes place on GitHub for both the Eclipse plugin and the main LanguageTool project. Contributing new rules for undetected errors is easy and only requires editing XML files which can be developed using a web-based Rule Editor. You can also write your own detection algorithms in Java or use the LanguageTool API to embed the functionality in your own Java projects.

12.3. Usage

To enable LanguageTool, select Window → Preferences and select LanguageTool in the drop-down box.

Spell Checking

Be sure to specify a language variant, e.g., American English instead of generic English, to also enable the spell checking dictionary and additional rules.

13. Using XInclude with Eclipse XSL

13.1. Overview

XInclude can be used to structure the DocBook source files so that you have one file per chapter / section and one master file which includes these files. Via XInclude these separate files can be combined into one file.

You can, for example, include a file foo.xml into another one via the following statement:

<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="foo.xml" /> 

If the file should be included as plain text instead of being parsed as XML, add parse="text":

<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="bar.xml" /> 
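
Put together, a master file could look like the following sketch. All file names are hypothetical. The XInclude namespace is declared once on the root element, so the individual include statements do not have to repeat it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<article xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>My article</title>
  <!-- Each section lives in its own file (hypothetical names) -->
  <xi:include href="introduction.xml" />
  <xi:include href="installation.xml" />
  <section id="example">
    <title>Example</title>
    <!-- Include a source file verbatim as text -->
    <programlisting><xi:include parse="text" href="examples/build.xml" /></programlisting>
  </section>
</article>
```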

An XInclude Ant task is provided by the Eclipse XSL project. I can proudly say that this Ant task was contributed by me to the Eclipse XSL project. :-)

13.2. Eclipse XSL Tools

The Eclipse XSL Tools provide support for XSLT transformations, including XSL editing and debugging. We will only use the XInclude task, but you have to install the whole package.

Install the XSL tools via the update manager from the standard Eclipse update site. See Using the Eclipse update manager for details.

13.3. Using the XInclude ant task

Take the org.eclipse.wst.xsl.core.jar file from your Eclipse installation and add it to your Ant classpath. See Apache Ant Tutorial - Classpath for details.

You should now be able to create and run the xinclude task. Below is an example Ant build.xml file.

<?xml version="1.0"?>
<!-- Author: Lars Vogel -->
<project name="docbook-src" default="usage">

  <description>
  This Ant build.xml file is used to transform DocBook XML to various
  output formats
  </description>

  <!--Configure basic properties that will be used in the file. -->

  <property name="doc.dir" value="${basedir}/output" />
  <property name="src" value="${basedir}/src" />
  <property name="src.tmp" value="${basedir}/src.tmp" />
  <property name="lib" value="${basedir}/lib/" />
  <property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />

  <property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
  <property name="xinclude.lib.dir" value="${basedir}/lib/" />


  <!-- target: usage -->
  <target name="usage" description="Prints the Ant build.xml usage">
    <echo message="Use -projecthelp to get a list of the available targets." />
  </target>

  <!-- target: clean -->
  <target name="clean" description="Cleans up generated files.">
    <delete dir="${doc.dir}" />
  </target>

  <!-- target: depends -->
  <target name="depends">
    <mkdir dir="${doc.dir}" />
  </target>


  <!--
  - target: xinclude
  - description: Creates one combined temporary file from the different input files.
  - The combined file will then be processed via different ant tasks
  -->
  <target name="xinclude">
    <xsl.xinclude in="${src}/DocBook/article.xml" out="${src.tmp}/DocBook/article.xml" />
  </target>


  <!--
  - target: build-html
  - description: Iterates through a directory and transforms
  - .xml files into .html files using the DocBook XSL.
  -->
  <target name="build-html" depends="depends, xinclude" description="HTML from DocBook XML">
    <xslt style="${html.stylesheet}" extension=".html" basedir="${src.tmp}" destdir="${doc.dir}">
      <include name="**/*book.xml" />
      <include name="**/*article.xml" />
      <param name="html.stylesheet" expression="styles.css" />
    </xslt>
    <!-- Copy the stylesheet to the same directory as the HTML files -->
    <copy todir="${doc.dir}">
      <fileset dir="lib">
        <include name="styles.css" />
      </fileset>
    </copy>
  </target>

</project> 

14. Support this website

This tutorial is Open Content under the CC BY-NC-SA 3.0 DE license. Source code in this tutorial is distributed under the Eclipse Public License. See the vogella License page for details on the terms of reuse.

Writing and updating these tutorials is a lot of work. If this free community service was helpful, you can support the cause by giving a tip as well as reporting typos and factual errors.

14.1. Thank you

Please consider a contribution if this article helped you. It will help to maintain our content and our Open Source activities.

14.2. Questions and Discussion

If you find errors in this tutorial, please notify me (see the top of the page). Please note that due to the high volume of feedback I receive, I cannot answer questions about your implementation. Ensure you have read the vogella FAQ, as I don't respond to questions already answered there.

15. Links and Literature

The XSLT stylesheets

DocBook XSL Online Book from Bob Stayton

Eclipse XSLT

XSLT mailing list

Publican

oXygen XML Editor

Vex - A Visual Editor for XML

XMLmind XML Editor

XSLT 2.0 Stylesheets

Reference of the XSLT stylesheet parameters

Syntax Highlighting with XSLTHL

Reference of the DocBook parameters

The Apache FOP Distribution