DocBook. This tutorial explains how to write DocBook files in Eclipse and how to convert these files into various output formats, e.g., to HTML and PDF. It also explains how to configure XInclude to divide the information into different source files. This tutorial uses DocBook 4.5, the Saxon XSLT processor in version 6.5.5, Eclipse 4.3 and the XSLT stylesheets in version 1.77.1.
1. Introduction to DocBook
1.1. Overview
DocBook is a standard for creating well-formatted documents. DocBook files are written as plain text in XML.
For further processing, DocBook files are transformed into other output formats. This is typically done via XSLT (Extensible Stylesheet Language Transformations).
XSLT stylesheets for converting DocBook into common output formats, e.g., HTML or PDF, are available.
1.2. Document classes
DocBook has two main document classes, the book class and the article class.
- Article: Used for writing technical articles. The main tag is <article>. Article is used in the following example.
- Book: Used for longer descriptions. The main tag is <book>. In addition to the sections available in articles, you also have the <chapter> tag and the <part> tag.
1.3. DocBook versions
DocBook is currently available in two versions, 4.5 and 5.0. This tutorial is based on version 4.5, which still seems to be heavily used.
1.4. DocBook Example
The following listing shows an example of a DocBook article.
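<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<article>
  <title>Docbook Article Example</title>
  <articleinfo>
    <title>DocBook Intro</title>
    <author>
      <firstname>Lars</firstname>
      <surname>Vogel</surname>
    </author>
    <volumenum>1234</volumenum>
  </articleinfo>
  <!-- An article has no <chapter> element, top-level sections are used instead -->
  <section>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Test</para>
      <section>
        <title>First sub section</title>
        <para>Subsection1</para>
      </section>
      <section>
        <title>second sub section</title>
        <para>Subsection2</para>
      </section>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
      <mediaobject>
        <imageobject>
          <imagedata fileref="images/title.png"/>
        </imageobject>
        <textobject>
          <phrase>Image description</phrase>
        </textobject>
      </mediaobject>
    </section>
  </section>
  <section>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </section>
</article>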
The following listing shows an example of a DocBook book.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text.</para>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </chapter>
</book>
The DOCTYPE declaration at the top of the file points to the location of the DTD file from the DocBook download.
2. The required toolset
To create DocBook files and to convert them into other formats, you need the following tools.
- The DocBook DTD which defines the structure of a DocBook document.
- An XSLT stylesheet to convert your DocBook file into another format.
- An XSLT processor.
In this tutorial you use the Eclipse IDE as an XML editor, Saxon as the XSLT processor and Apache Ant to run the XSLT transformation.
3. Installation
3.1. Eclipse
Install Eclipse. See Eclipse IDE for installing and using Eclipse.
Eclipse has Apache Ant integrated. Therefore, no additional installation for Ant is required.
3.2. DocBook and Stylesheets
Download the DocBook DTD in version 4.5 and the latest version of the XSLT stylesheets.
3.3. XSL processor
The download link for Saxon is: http://saxon.sourceforge.net/ .
Download version 6.5.5, as newer Saxon versions do not work well with DocBook 4.5. Saxon 9 is an XSLT 2.0 processor, while the current official version of the XSL stylesheets is based on XSLT 1.0.
Download the saxon.zip file. The saxon.jar file is needed later.
3.4. Issues
Sometimes running the XSLT conversion results in the following error message.
javax.xml.transform.TransformerConfigurationException: java.net.MalformedURLException: no protocol: ../common/entities.ent
In this case try adding the xerces-j XML parser to your build path and see if that resolves the error.
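With the Ant setup used later in this tutorial, one way to try this is to add the Xerces JAR files to the classpath that is passed to the XSLT task. The following is only a sketch: the file names xercesImpl.jar and xml-apis.jar and their location in the lib folder are assumptions, so adjust them to the files of your Xerces download.

```xml
<!-- Sketch: classpath for the xslt task, extended by the Xerces parser -->
<!-- Assumption: the Xerces JAR files were copied into the lib folder -->
<path id="saxon.class.path">
  <pathelement location="lib/saxon.jar" />
  <pathelement location="lib/xercesImpl.jar" />
  <pathelement location="lib/xml-apis.jar" />
</path>
```

Because the xslt task references the path by its id, no further change to the task itself should be necessary.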
4. Tutorial: convert DocBook to HTML5
The following tutorial describes how you can use Eclipse to convert a DocBook input file into HTML output using Apache Ant.
4.1. Project Setup
In Eclipse, create a new project via the new project wizard.
The new project is called de.vogella.docbook.first.
Create the following folder structure:
- input
- input/images
- css
- output
- docbook-xml-4.5
- docbook-xsl
- lib
Place the DocBook DTD and the XSLT stylesheets into the corresponding directories.
Copy the JAR files from your Saxon download into the lib folder.
4.2. Write your first DocBook document
In your input folder, create a file called book.xml with the following content.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text.</para>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </chapter>
</book>
We also want to use images in our example. Place a PNG graphic called title.png in the input/images folder.
The path ../docbook-xml-4.5/docbookx.dtd corresponds to the directory you created earlier during the project setup.
Create the following CSS file called style.css in the css folder.
h1 {
text-transform: uppercase;
}
4.3. Use Ant to convert DocBook to HTML5
Create the following buildhtml.xml file in your project directory.
<?xml version="1.0"?>
<!-- - Author: Lars Vogel -->
<project name="build HTML5" default="build-html">
<description>
Used to transform DocBook XML to HTML5 output
</description>
<!-- Define base properties -->
<property name="input.dir" value="input" />
<property name="output.dir" value="output" />
<property name="docbook.xsl.dir" value="docbook-xsl-1.76.1" />
<property name="xhtml5.stylesheet" value="${docbook.xsl.dir}/xhtml5/docbook.xsl" />
<!-- Making saxon available -->
<path id="saxon.class.path">
<pathelement location="lib/saxon.jar" />
</path>
<!-- - target: usage -->
<target name="usage" description="Prints help">
<echo message="Use -projecthelp to get a list of the available targets." />
</target>
<!-- - target: clean -->
<target name="clean" description="Cleans up generated files.">
<delete dir="${output.dir}" />
</target>
<!-- - target: depends -->
<target name="createtargetdir" description="Generate targetdir.">
<mkdir dir="${output.dir}" />
<mkdir dir="${output.dir}/images" />
</target>
<target name="build-html" depends="clean, createtargetdir"
description="Generates HTML5 files">
<echo message="Building HTML5 output" />
<!-- Copy the stylesheet to the same directory as the HTML files -->
<copy todir="${output.dir}">
<fileset dir="css">
<include name="style.css" />
</fileset>
</copy>
<!-- Copy the images to the same directory as the HTML files -->
<copy todir="${output.dir}/images">
<fileset dir="${input.dir}/images">
<include name="title.png" />
</fileset>
</copy>
<!-- Transfer to HTML -->
<xslt style="${xhtml5.stylesheet}" extension=".html" basedir="${input.dir}"
destdir="${output.dir}">
<include name="**/*book.xml" />
<include name="**/*article*.xml" />
<param name="html.stylesheet" expression="style.css" />
<param name="docbook.css.source" expression="" />
<param name="section.autolabel" expression="1" />
<param name="make.clean.html" expression="1" />
<outputproperty name="indent" value="yes" />
<classpath refid="saxon.class.path" />
</xslt>
</target>
</project>
Run the buildhtml.xml file via right-click on it and by running it as an Ant build.
Afterwards, check the output directory. You should find the book.html file there.
Congratulations, you created your first DocBook document and converted it into an HTML document.
5. Convert DocBook to plain text
A convenient way to convert DocBook files to plain text is to first convert them to HTML and then use the text browser Lynx to convert the HTML to text with the following command.
lynx -dump myfile.html > myfile.txt
This way the text is well structured, e.g., tables look nice.
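If you prefer to run this step from the same Ant build file, the exec task can call Lynx for you. The following target is a sketch under several assumptions: the target name build-text and the output file book.txt are made up, lynx must be available on the PATH, and the build-html target from the previous chapter must have produced book.html in the output directory.

```xml
<!-- Sketch: hypothetical target that dumps the generated HTML as plain text -->
<!-- Assumptions: lynx is on the PATH, build-html created ${output.dir}/book.html -->
<target name="build-text" depends="build-html" description="Generates plain text via Lynx">
  <!-- The output attribute redirects the standard output of lynx into a file -->
  <exec executable="lynx" output="${output.dir}/book.txt" failonerror="true">
    <arg value="-dump" />
    <arg value="${output.dir}/book.html" />
  </exec>
</target>
```

The output attribute replaces the shell redirection used on the command line, as Ant does not start a shell for the exec task.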
6. DocBook Tags
So far you have used a very limited set of DocBook tags. The following chapter presents more tags which you typically need in a DocBook document.
6.1. Markup
| Tag | Explanation |
|---|---|
| <![CDATA[ SPECIAL_SIGN_HERE, e.g. & ]]> | Allows you to enter special characters into the text which would otherwise be interpreted as markup. |
| <programlisting> </programlisting> | Marks the text as code. You can also specify the programming language of this listing, e.g., language="java" or language="xml". This information can, for example, be used for syntax highlighting. |
| <wordasword></wordasword> | Indicates a special word. |
| <parameter class='command'>/w</parameter> | Describes a parameter. |
| <guilabel> </guilabel> | Label on a GUI. |
| <guibutton> </guibutton> | Button on a GUI. |
| <filename> </filename> | Directory or filename. |
| <emphasis> </emphasis> | Highlights the text. |
| <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="example1.txt" /> | Includes the content of another file, here as plain text. |
| <ulink url="http://www.heise.de/newsticker">German IT News</ulink> | Inserts a hypertext link into the document. |
| &amp; | Creates the ampersand (&) sign. Can, for example, be used in links. |
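As an illustration of the XInclude entry in the table, the following sketch splits a book into separate chapter files. The file names chapter1.xml and chapter2.xml are hypothetical, and the includes only take effect if the XML parser or a preprocessing step resolves XInclude before the XSLT transformation runs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Docbook Book Example</title>
  <!-- Assumption: each referenced file contains one complete <chapter> element -->
  <xi:include href="chapter1.xml" />
  <xi:include href="chapter2.xml" />
</book>
```

Without the parse="text" attribute the referenced file is included as XML. The DocBook 4.5 DTD does not know the xi:include element, so a validating editor may flag it until the includes are resolved.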
6.2. Warnings
To define warning messages, DocBook provides several tags. The following list gives an overview of them.
- tip
- note
- important
- warning
- caution
The following snippet shows how to use these.
<note>
  <title>This is a title</title>
  <para>Lalala</para>
</note>
6.3. Line break
You can create an explicit line break with the <sbr/> tag. In DocBook 4.5 this element is only allowed inside a <cmdsynopsis>.
6.4. References to other elements
You link to other elements in DocBook with <xref linkend="id_of_the_element"/>.
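The following fragment sketches how the two sides of such a reference fit together in DocBook 4.5: the target element carries an id attribute, and the xref names that id in linkend. The section titles and the id value are made up for illustration.

```xml
<!-- Fragment, to be placed inside an article or chapter -->
<section id="installation">
  <title>Installation</title>
  <para>Describes the required downloads.</para>
</section>
<section>
  <title>Usage</title>
  <!-- The stylesheets generate the link text from the target, e.g., its number and title -->
  <para>See <xref linkend="installation"/> for the required setup.</para>
</section>
```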
6.5. Legal Notice
You can define legal notices where you state the conditions for reproduction like traditional copyright or an open content license. These tags are confined to meta-information.
<bookinfo>
<title>Docbook Book Example</title>
<subtitle>Demonstrating Copyright Information</subtitle>
<author>
<firstname>Lars</firstname>
<surname>Vogel</surname>
</author>
<copyright>
<year>2013</year>
<holder>vogella GmbH</holder>
</copyright>
<pubdate>01.10.2013</pubdate>
<releaseinfo>Version 1.0</releaseinfo>
<legalnotice>
<para>
ALL RIGHTS RESERVED. This book contains material protected
under International and Federal Copyright Laws and Treaties. Any
unauthorized reprint or use of this material is prohibited. No part
of this book may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording,
or by any information storage and retrieval system without express
written permission from the author.
</para>
</legalnotice>
</bookinfo>
6.6. Index
You create an index in your document with the <index/> entry.
To create index elements, you can use the following:
<indexterm>
<primary></primary>
</indexterm>
You can add a secondary index term.
<indexterm>
<primary></primary>
<secondary></secondary>
</indexterm>
You can also place a reference to another index term.
<indexterm>
<primary>Export</primary>
<see>Deployment</see>
</indexterm>
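To show how these pieces work together, the following fragment sketches a chapter with an index term and the index element that collects it. The chapter title, the terms and the sentence are hypothetical; the fragment is meant to be placed inside a book.

```xml
<!-- Fragment, to be placed inside a <book> -->
<chapter>
  <title>Deployment</title>
  <para>
    <!-- The index term is not rendered in the text, it only marks the position -->
    <indexterm>
      <primary>Deployment</primary>
      <secondary>Export wizard</secondary>
    </indexterm>
    You deploy the application via the export wizard.
  </para>
</chapter>
<!-- Placed at the end of the book; the stylesheets fill in the collected terms -->
<index />
```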
6.7. Tables
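You can create a table via the following code.
<table>
  <title>Sample Table</title>
  <tgroup cols='2' align='left' colsep='1' rowsep='1'>
    <colspec colname='c1' colwidth="1*"/>
    <colspec colname='c2' colwidth="1*"/>
    <thead>
      <row>
        <entry>a4</entry>
        <entry>a5</entry>
      </row>
    </thead>
    <tfoot>
      <row>
        <entry>f4</entry>
        <entry>f5</entry>
      </row>
    </tfoot>
    <tbody>
      <row>
        <entry>b1</entry>
        <entry>b2</entry>
      </row>
      <row>
        <entry>d1</entry>
        <entry>d5</entry>
      </row>
    </tbody>
  </tgroup>
</table>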
6.8. Lists
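You can create non-numbered lists like this:
<itemizedlist>
  <listitem>
    <para>Item1</para>
  </listitem>
  <listitem>
    <para>Item2</para>
  </listitem>
  <listitem>
    <para>Item3</para>
  </listitem>
  <listitem>
    <para>Item4</para>
  </listitem>
</itemizedlist>
You can create numbered lists like this:
<orderedlist>
  <listitem>
    <para>This is a list entry</para>
  </listitem>
  <listitem>
    <para>This is another list entry</para>
  </listitem>
</orderedlist>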
6.9. Links
You can create links like this:
We use the Ant integrated into Eclipse. See
<ulink url="http://www.vogella.com/tutorials/ApacheAnt/article.html">
Apache Ant Tutorial</ulink>
for an introduction into Apache Ant.
6.10. Graphics
DocBook places no restrictions on the graphic format you use, e.g., JPEG, PNG or SVG. You can include graphics via the following tag. The optional "phrase" is used in HTML output to define the mandatory "alt" attribute of the image.
<mediaobject>
<imageobject>
<imagedata fileref="images/antview10.gif"/>
</imageobject>
<textobject>
<phrase>A text for the graphic</phrase>
</textobject>
</mediaobject>
You can also specify different graphics for different output formats.
<mediaobject>
<imageobject role="html">
<imagedata fileref="images/antview10.gif"/>
</imageobject>
<imageobject role="fo">
<imagedata fileref="images/antview10.gif"/>
</imageobject>
<textobject>
<phrase>A text for the graphic</phrase>
</textobject>
</mediaobject>
You can also embed a figure with numbering and a title.
<figure>
<mediaobject>
<imageobject>
<imagedata fileref="images/antview10.gif"/>
</imageobject>
<textobject>
<phrase>Our Company Logo</phrase>
</textobject>
</mediaobject>
</figure>
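In DocBook 4.5 a figure requires a title, and this title is what the stylesheets use for the numbered caption. The following sketch adds one and also gives the figure an id so that it can be referenced via xref. The id value and the title text are made up for illustration.

```xml
<!-- Sketch: figure with a title (used for the caption) and an id (used for references) -->
<figure id="fig_company_logo">
  <title>Our Company Logo</title>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/antview10.gif"/>
    </imageobject>
    <textobject>
      <phrase>Our Company Logo</phrase>
    </textobject>
  </mediaobject>
</figure>
```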
6.12. Keyboard Shortcuts
To define keyboard shortcuts, for example Ctrl+Space, use the following:
<keycombo>
<keycap>Ctrl</keycap>
<keycap>Space</keycap>
</keycombo>
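A key combination can also be attached to a menu entry. The following sketch uses the menuchoice element for this, which is the only place where DocBook 4.5 allows the shortcut element. The menu names and the sentence are hypothetical.

```xml
<para>
  Save the file via
  <menuchoice>
    <!-- The shortcut belongs to the menu entry described below -->
    <shortcut>
      <keycombo>
        <keycap>Ctrl</keycap>
        <keycap>S</keycap>
      </keycombo>
    </shortcut>
    <guimenu>File</guimenu>
    <guimenuitem>Save</guimenuitem>
  </menuchoice>.
</para>
```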
7. Creating EPUB
7.1. Overview of EPUB
EPUB is a format for electronic books defined by the International Digital Publishing Forum (IDPF). EPUB is based on XHTML and supports styling via CSS. An EPUB file is a ZIP file with predefined content.
The ZIP file must contain a folder META-INF which contains a file container.xml. This file contains a pointer to the OEBPS/content.opf file. The content.opf file contains the meta information about the book and points to the content pages, which are defined as HTML pages.
The DocBook XSLT stylesheets support a conversion into EPUB. This conversion is based on the XHTML stylesheets and therefore supports the same parameters as the HTML conversion. The final EPUB document also requires an additional file called mimetype with predefined content, as well as the content of OEBPS and META-INF. The XSLT transformation creates neither the mimetype file nor the ZIP file automatically. We will use Apache Ant to create them for us.
To validate an EPUB file, you can use the JAR file from the EPubCheck Validation Tool. Download the latest 1.x version and put it into the classpath of your Ant file. Make sure that you also extract the lib folder included in the ZIP file relative to the epub*.jar.
We will also include this check in our Ant task. After the conversion you can validate your EPUB file via the following command.
java -jar epubcheck-1.2.jar book.epub
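To make the role of container.xml more concrete, the following sketch shows the typical content of this file for the layout described in the overview, with the pointer to OEBPS/content.opf. The stylesheets are expected to generate this file during the transformation, so it is shown only to illustrate the structure and not as a file you have to write yourself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of META-INF/container.xml; the full-path value assumes the OEBPS layout described above -->
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <!-- Points the reading system to the package file of the book -->
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
```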
7.2. Creating EPUB files with Apache Ant
The following example is based on the same file and directory structure as the other examples. Create the following book.xml file in your input directory.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<book>
  <title>Docbook Book Example</title>
  <bookinfo>
    <title>DocBook Intro</title>
    <author>
      <firstname>Lars</firstname>
      <surname>Vogel</surname>
    </author>
  </bookinfo>
  <chapter>
    <title>This is the first chapter</title>
    <section>
      <title>First section in the chapter</title>
      <para>Random text.</para>
      <mediaobject>
        <imageobject>
          <imagedata fileref="images/vogella_current_logo.png"/>
        </imageobject>
        <textobject>
          <phrase></phrase>
        </textobject>
      </mediaobject>
    </section>
    <section>
      <title>Second section in the chapter</title>
      <para>Other random text</para>
    </section>
  </chapter>
  <chapter>
    <title>This is the second chapter</title>
    <section>
      <title>My Title</title>
      <para>More...</para>
    </section>
    <section>
      <title>Other title</title>
      <para>Blabla</para>
    </section>
  </chapter>
</book>
This book refers to the vogella_current_logo.png image in the images folder. Either create such an image or delete the image part from the file.
Also create a folder epubinput with a file called mimetype. This file should have only the following content:
application/epub+zip
The following Ant build.xml file will create EPUB output.
<?xml version="1.0"?>
<!--
- Author: Lars Vogel
-->
<project name="docbook-src" default="build-epub">
<description>
This Ant file is used to transform DocBook XML to epub output
</description>
<!--
- Configure basic properties that will be used in the file.
-->
<property name="input.dir" value="input" />
<property name="output.dir" value="output" />
<property name="docbook.xsl.dir" value="docbook-xsl-1.77.1" />
<property name="epub.stylesheet" value="${docbook.xsl.dir}/epub/docbook.xsl" />
<property name="destfilename" value="book" />
<!-- Making saxon available -->
<path id="saxon.class.path">
<pathelement location="lib/saxon.jar" />
</path>
<property name="epubcheck.jar" value="lib/epubcheck/epubcheck-1.2.jar" />
<!--
- target: usage
-->
<target name="usage" description="Prints the Ant build.xml usage">
<echo message="Use -projecthelp to get a list of the available targets." />
</target>
<!--
- target: clean
-->
<target name="clean" description="Cleans up generated files.">
<delete dir="${output.dir}" />
</target>
<!--
- target: depends
-->
<target name="depends">
<mkdir dir="${output.dir}" />
<mkdir dir="${output.dir}/tmp" />
<copy todir="${output.dir}/tmp">
<fileset dir="epubinput">
<include name="mimetype" />
</fileset>
</copy>
<copy todir="${output.dir}/tmp/OEBPS/images">
<fileset dir="images">
<include name="vogella_current_logo.png" />
</fileset>
</copy>
</target>
<!--
- target: build-html
- description: Iterates through a directory and transforms
- .xml files into .html files using the DocBook XSL.
-->
<!--
- target: build-epub
- description: Iterates through a directory and transforms
- .xml files into .epub files using the DocBook XSL.
-->
<target name="build-epub" depends="clean, depends" description="Generates EPUB files from DocBook XML">
<xslt style="${epub.stylesheet}" extension=".html"
basedir="${input.dir}" destdir="${output.dir}/tmp">
<include name="**/*book.xml" />
<param name="epub.stylesheet" expression="style.css" />
<!-- The following parameter do not work currently
<param name="epub.metainf.dir" expression="${output.dir}/META-INF/" />
<param name="epub.oebps.dir" expression="${output.dir}/OEBPS/" />
-->
<classpath refid="saxon.class.path" />
</xslt>
<copy todir="${output.dir}/tmp/OEBPS">
<fileset dir="OEBPS">
</fileset>
</copy>
<copy todir="${output.dir}/tmp/META-INF">
<fileset dir="META-INF">
</fileset>
</copy>
<!-- Don't know how to avoid generation of "${destfilename}.html" by Saxon -->
<delete file="${output.dir}/tmp/book.html" />
<echo message="Generating book.epub" level="info" />
<!-- We create temporary zips so that mimetype is the first entry in the final zip -->
<zip destfile="${output.dir}/temp.mimetype" basedir="${output.dir}/tmp" compress="false" includes="mimetype" />
<zip destfile="${output.dir}/temp.zip" basedir="${output.dir}/tmp/" level="9" compress="true" excludes="mimetype" includes="OEBPS/** META-INF/**" />
<zip destfile="${output.dir}/book.epub" update="true" keepcompression="true" encoding="UTF-8" excludes="*.html">
<zipfileset src="${output.dir}/temp.mimetype" />
<zipfileset src="${output.dir}/temp.zip" />
</zip>
<!-- Have to delete these directories; it would be nicer to place them in the tmp output dir -->
<delete dir="./OEBPS" />
<delete dir="./META-INF" />
<!-- Make sure the epubcheck lib has a subfolder lib with saxon.jar and jing.jar in it
-->
<epub.check epub="book" />
</target>
<!-- epub check macro definition -->
<macrodef name="epub.check" description="Check an epub">
<attribute name="epub" description="Name of the EPUB" />
<sequential>
<java jar="${epubcheck.jar}" fork="true">
<arg value="${output.dir}/@{epub}.epub" />
</java>
</sequential>
</macrodef>
</project>
I personally see the following issues. Please let me know if you have a solution for them.
- The target location of META-INF/ can be specified via epub.metainf.dir, but if you do so, this path is also used in the container.xml.
- The same issue exists with epub.oebps.dir.
You find another example Ant file in the Ant for EPUB blog entry from Tony Graham.
8. Create PDF output
8.1. Overview
You can convert DocBook to XSL-FO via the DocBook XSL stylesheets. XSL-FO stands for XSL Formatting Objects and is an XML standard which is optimized for print media. XSL-FO can then be translated into PDF via the Apache FOP library.
8.2. Installation
In addition to the existing setup you also require the Apache FOP library. Download the binary FOP distribution from http://xmlgraphics.apache.org/fop/.
Copy all the JAR files from the FOP distribution into your library directory and add them to the Ant build path. See the Apache Ant Tutorial on how to modify the Ant build path.
8.3. Define the Ant Task
You have to add the task to your Ant build file and then call it. The following snippet shows how to define the task and how to call it. The second listing is a full example Ant build.xml file.
<!--
- Defines the ant task for fop
-->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />
<!-- Transformation into pdf
- Two steps
- 1.) First create the FO files
- 2.) Then transform the FO files into pdf files
-->
<!--
- target: build-pdf
- description: Iterates through a directory and transforms
- .xml files into .fo files using the DocBook XSL.
-->
<target name="build-pdf" depends="depends, xinclude"
description="PDF from DocBook XML">
<!-- Convert DocBook Files into FO -->
<xslt style="${fo.stylesheet}" extension=".fo" basedir="${src.tmp}"
destdir="${src.tmp}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<param name="section.autolabel" expression="1" />
</xslt>
<!-- Convert FO Files into pdf -->
<fop format="application/pdf" outdir="${doc.dir}">
<fileset dir="${src.tmp}">
<include name="**/*.fo" />
</fileset>
</fop>
</target>
<?xml version="1.0"?>
<!--
- Author: Lars Vogel
-->
<project name="docbook-src" default="all">
<description>
This Ant build.xml file is used to transform DocBook XML to
various output formats
</description>
<!--
- Defines the ant task for xinclude
-->
<taskdef name="xinclude" classname="de.vogella.xinclude.XIncludeTask" />
<!--
- Defines the ant task for fop
-->
<taskdef name="fop" classname="org.apache.fop.tools.anttasks.Fop" />
<!--
- Configure basic properties that will be used in the file.
-->
<property name="javahelp.dir" value="${basedir}/../Documentation/output/vogella/javahelp" />
<property name="src" value="${basedir}/documentation" />
<property name="output.dir" value="${basedir}/../Documentation/output/vogella/articles" />
<property name="output.tmp" value="${basedir}/output.tmp" />
<property name="lib" value="${basedir}/lib/" />
<property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />
<property name="xinclude.lib.dir" value="${basedir}/lib/" />
<!--
- The different stylesheets which will be used for the transformation
-->
<property name="eclipse.stylesheet" value="${docbook.xsl.dir}/eclipse/eclipse.xsl" />
<property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
<property name="fo.stylesheet" value="${docbook.xsl.dir}/fo/docbook.xsl" />
<property name="javahelp.stylesheet" value="${docbook.xsl.dir}/javahelp/javahelp.xsl" />
<property name="chunk-html.stylesheet" value="${docbook.xsl.dir}/html/chunk.xsl" />
<!--
- target: usage
-->
<target name="usage" description="Prints the Ant build.xml usage">
<echo message="Use -projecthelp to get a list of the available targets." />
</target>
<!--
- target: clean
-->
<target name="clean" description="Cleans up generated files.">
<delete dir="${output.dir}" />
</target>
<!--
- target: depends
-->
<target name="depends">
<mkdir dir="${output.dir}" />
</target>
<!--
- target: copy
- Copies the images from the subdirectories to the target folder
-->
<target name="copy">
<echo message="Copy the images" />
<copy todir="${output.dir}">
<fileset dir="${src}">
<include name="**/images/*.*" />
</fileset>
</copy>
</target>
<!--
- target: xinclude
- description: Creates one combined temporary files for the different inputs files.
- The combined file will then be processed via different ant tasks
-->
<target name="xinclude">
<xinclude in="${src}/DocBook/article.xml" out="${output.tmp}/DocBook/article.xml" />
<xinclude in="${src}/JavaConventions/article.xml" out="${output.tmp}/JavaConventions/article.xml" />
<xinclude in="${src}/JUnit/article.xml" out="${output.tmp}/JUnit/article.xml" />
<xinclude in="${src}/EclipseReview/article.xml" out="${output.tmp}/EclipseReview/article.xml" />
<xinclude in="${src}/HTML/article.xml" out="${output.tmp}/HTML/article.xml" />
<xinclude in="${src}/Eclipse/article.xml" out="${output.tmp}/Eclipse/article.xml" />
<xinclude in="${src}/Logging/article.xml" out="${output.tmp}/Logging/article.xml" />
<!--
<xinclude in="${src}/ant/article.xml" out="${src.tmp}/ant/article.xml" />
-->
</target>
<!--
- target: build-html
- description: Iterates through a directory and transforms
- .xml files into .html files using the DocBook XSL.
-->
<target name="build-html" depends="depends, xinclude" description="Generates HTML files from DocBook XML">
<xslt style="${html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<param name="html.stylesheet" expression="styles.css" />
<param name="section.autolabel" expression="1" />
<param name="html.cleanup" expression="1" />
<outputproperty name="indent" value="yes" />
</xslt>
<!-- Copy the stylesheet to the same directory as the HTML files -->
<copy todir="${output.dir}">
<fileset dir="lib">
<include name="styles.css" />
</fileset>
</copy>
</target>
<!--
- target: build-javahelp
- description: Iterates through a directory and transforms
- .xml files into .html files using the DocBook XSL.
-->
<target name="build-javahelp" depends="depends, xinclude" description="JavaHelp from DocBook XML">
<xslt style="${javahelp.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${javahelp.dir}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<outputproperty name="indent" value="yes" />
</xslt>
</target>
<!--
- target: chunks-html
- description: Iterates through a directory and transforms
- .xml files into separate .html files using the DocBook XSL.
-->
<target name="build-chunks" depends="depends, xinclude" description="chunk HTML from DocBook XML">
<xslt style="${chunk-html.stylesheet}" extension=".html" basedir="${output.tmp}" destdir="${output.dir}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<param name="html.stylesheet" expression="styles.css" />
<param name="section.autolabel" expression="1" />
<param name="html.cleanup" expression="1" />
<param name="chunk.first.sections" expression="1" />
</xslt>
<!-- Copy the stylesheet to the same directory as the HTML files -->
<copy todir="${output.dir}">
<fileset dir="lib">
<include name="styles.css" />
</fileset>
</copy>
</target>
<!-- Transformation into pdf
- Two steps
- 1.) First create the FO files
- 2.) Then transform the FO files into pdf files
-->
<!--
- target: build-pdf
- description: Iterates through a directory and transforms
- .xml files into fo files using the DocBook XSL.
- Relativebase is set to true to enable FOP to find the graphics which are included
- in the images directory
-->
<target name="build-pdf" depends="depends, xinclude" description="PDF from DocBook XML">
<!-- Convert DocBook Files into FO -->
<xslt style="${fo.stylesheet}" extension=".fo" basedir="${output.tmp}" destdir="${output.tmp}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<param name="section.autolabel" expression="1" />
</xslt>
<!-- Convert FO Files into pdf -->
<fop format="application/pdf" outdir="${output.dir}" relativebase="true">
<fileset dir="${output.tmp}">
<include name="**/*.fo" />
</fileset>
</fop>
</target>
<!--
- target: build-eclipse
- description: Iterates through a directory and transforms
- .xml files into Eclipse help files using the DocBook XSL.
-->
<target name="build-eclipse" depends="depends, xinclude" description="Eclipse help from DocBook XML">
<xslt style="${eclipse.stylesheet}" basedir="${output.tmp}" destdir="${output.dir}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
</xslt>
</target>
<target name="all" depends="copy, build-html, build-pdf, build-chunks, build-eclipse">
</target>
</project>
9. Influencing the output result
The XSLT stylesheets have several parameters which can influence the result of the conversion.
9.1. HTML Parameters
You find all HTML relevant parameters at http://docbook.sourceforge.net/release/xsl/current/doc/html/.
| Parameter | Description |
|---|---|
| name="section.autolabel" expression="1" | Turns on the autolabeling for sections (1. Title, 1.1. Subtitle, etc.). |
| name="chapter.autolabel" expression="1" | Turns on the autolabeling for chapters. |
| name="html.stylesheet" expression="styles.css" | Defines the stylesheet which should be used. |
| name="html.cleanup" expression="1" | Tries to clean up the HTML code for better readability. |
| name="chunk.first.sections" expression="0" | Controls whether the first top-level section of a chapter or article gets its own chunk; with 0 it stays in the chunk of its parent. [TODO: Does not work yet] |
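Instead of passing every parameter from the Ant file, you can also collect them in a small customization stylesheet that imports the original DocBook stylesheet. The following is a sketch under assumptions: the file is assumed to be stored in the project directory next to the stylesheet folder, and the import path must match the folder name of your stylesheet download.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a customization layer; the relative import path is an assumption -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Import the original stylesheet first, then override its parameters -->
  <xsl:import href="docbook-xsl-1.77.1/html/docbook.xsl"/>
  <xsl:param name="section.autolabel" select="1"/>
  <xsl:param name="chapter.autolabel" select="1"/>
  <!-- String values need the additional single quotes inside select -->
  <xsl:param name="html.stylesheet" select="'styles.css'"/>
</xsl:stylesheet>
```

To use it, point the style attribute of the xslt task to this file instead of the original docbook.xsl.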
9.2. PDF Parameters
You find all FO / PDF relevant parameters at http://docbook.sourceforge.net/release/xsl/current/doc/fo/.
| Parameter | Description |
|---|---|
| name="section.autolabel" expression="1" | Turns on the autolabeling for sections (1. Title, 1.1. Subtitle, etc.). |
| name="chapter.autolabel" expression="1" | Turns on the autolabeling for chapters. |
| name="html.stylesheet" expression="styles.css" | Defines the stylesheet which should be used. |
| name="html.cleanup" expression="1" | Tries to clean up the HTML code for better readability. |
9.3. Add content into the HTML output
DocBook allows you to include external HTML files into the HTML output. For example, you could use this to add JavaScript into your HTML output.
For example, use the following statement to include some HTML code.
<?dbhtml-include href="../../myadditonalcontent.html"?>
See Inserting external HTML code for details.
10. Advanced Features
10.1. Syntax Highlighting
You can also enable syntax highlighting. This involves creating a customization stylesheet layer, the usage of an external lib and configuration file. Please see Source Code Syntax Highlighting with DocBook for a description of the setup.
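As a rough orientation, such a customization layer for HTML output could look like the following sketch. The import paths assume that the stylesheet folder is called docbook-xsl-1.77.1 and that the file is stored next to it; both are assumptions. In addition, the external highlighting library has to be available on the classpath of the XSLT processor, which is not shown here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a customization layer that switches on syntax highlighting -->
<!-- Assumption: the paths match the location of your stylesheet download -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="docbook-xsl-1.77.1/html/docbook.xsl"/>
  <!-- Templates which turn the highlighter output into HTML markup -->
  <xsl:import href="docbook-xsl-1.77.1/html/highlight.xsl"/>
  <!-- A non-zero value activates highlighting for listings with a language attribute -->
  <xsl:param name="highlight.source" select="1"/>
</xsl:stylesheet>
```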
To change how the highlighting is done, you can adjust the following template file:
your_xslt_installation_dir/html/highlight.xsl
To add highlighting for a different language, create a language description file in the highlighting folder of your DocBook XSLT folder, e.g., add the following file called bourne-hl.xml for syntax highlighting of the Bourne shell.
<?xml version="1.0" encoding="utf-8"?>
<!--
Syntax highlighting definition for SH
xslthl - XSLT Syntax Highlighting
http://sourceforge.net/projects/xslthl/
Copyright (C) 2010 Mathieu Malaterre
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
-->
<highlighters>
<highlighter type="oneline-comment">#</highlighter>
<highlighter type="heredoc">
<start>&lt;&lt;</start>
<quote>'</quote>
<quote>"</quote>
<flag>-</flag>
<noWhiteSpace />
<looseTerminator />
</highlighter>
<highlighter type="string">
<string>"</string>
<escape>\</escape>
</highlighter>
<highlighter type="string">
<string>'</string>
<escape>\</escape>
<spanNewLines />
</highlighter>
<highlighter type="hexnumber">
<prefix>0x</prefix>
<ignoreCase />
</highlighter>
<highlighter type="number">
<point>.</point>
<pointStarts />
<ignoreCase />
</highlighter>
<highlighter type="keywords">
<!-- reserved words -->
<keyword>if</keyword>
<keyword>then</keyword>
<keyword>else</keyword>
<keyword>elif</keyword>
<keyword>fi</keyword>
<keyword>case</keyword>
<keyword>esac</keyword>
<keyword>for</keyword>
<keyword>while</keyword>
<keyword>until</keyword>
<keyword>do</keyword>
<keyword>done</keyword>
<!-- built-ins -->
<keyword>exec</keyword>
<keyword>shift</keyword>
<keyword>exit</keyword>
<keyword>times</keyword>
<keyword>break</keyword>
<keyword>export</keyword>
<keyword>trap</keyword>
<keyword>continue</keyword>
<keyword>readonly</keyword>
<keyword>wait</keyword>
<keyword>eval</keyword>
<keyword>return</keyword>
<!-- other commands -->
<keyword>cd</keyword>
<keyword>echo</keyword>
<keyword>hash</keyword>
<keyword>pwd</keyword>
<keyword>read</keyword>
<keyword>set</keyword>
<keyword>test</keyword>
<keyword>type</keyword>
<keyword>ulimit</keyword>
<keyword>umask</keyword>
<keyword>unset</keyword>
</highlighter>
</highlighters>
Then register it in the highlighting/xslthl-config.xml file, for example:
<?xml version="1.0" encoding="UTF-8"?>
<!--
xslthl - XSLT Syntax Highlighting
http://sourceforge.net/projects/xslthl/
Copyright (C) 2005-2008 Michal Molhanec, Jirka Kosek, Michiel Hendriks
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
Michal Molhanec <mol1111 at users.sourceforge.net>
Jirka Kosek <kosek at users.sourceforge.net>
Michiel Hendriks <elmuerte at users.sourceforge.net>
-->
<xslthl-config>
<highlighter id="java" file="java-hl.xml" />
<highlighter id="delphi" file="delphi-hl.xml" />
<highlighter id="pascal" file="delphi-hl.xml" />
<highlighter id="ini" file="ini-hl.xml" />
<highlighter id="php" file="php-hl.xml" />
<highlighter id="myxml" file="myxml-hl.xml" />
<highlighter id="m2" file="m2-hl.xml" />
<highlighter id="tcl" file="tcl-hl.xml" />
<highlighter id="c" file="c-hl.xml" />
<highlighter id="cpp" file="cpp-hl.xml" />
<highlighter id="csharp" file="csharp-hl.xml" />
<highlighter id="python" file="python-hl.xml" />
<highlighter id="ruby" file="ruby-hl.xml" />
<highlighter id="perl" file="perl-hl.xml" />
<highlighter id="javascript" file="javascript-hl.xml" />
<highlighter id="bourne" file="bourne-hl.xml" />
<namespace prefix="xslthl" uri="http://xslthl.sf.net" />
</xslthl-config>
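Once registered, the new highlighter can be referenced from a listing. The following sketch assumes that the stylesheets select the highlighter by comparing the language attribute of programlisting with the id from the configuration file. The shell script itself is only sample content.

```xml
<!-- language="bourne" is assumed to match the highlighter id registered above -->
<programlisting language="bourne"><![CDATA[#!/bin/sh
# Print the numbers 1 to 3
for i in 1 2 3
do
    echo "Number $i"
done]]></programlisting>
```

The CDATA section avoids having to escape characters such as < and & which are common in shell scripts.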
10.2. Remove certain parts
Sometimes you want to remove certain parts of your document before processing it. In the following example, section elements marked with role="wrapper" are removed, while the sections nested inside them are kept.
The following stylesheet performs this removal. Write its output to a temporary folder and run your real conversion on that folder.
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" />
<xsl:template match="section[@role='wrapper']">
<xsl:apply-templates select="section" />
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
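To illustrate the effect, consider the following input document. It is a made-up example, and the titles and text are placeholders.

```xml
<article>
  <title>Wrapper example</title>
  <!-- This section and its own title are dropped by the stylesheet -->
  <section role="wrapper">
    <title>Wrapper title</title>
    <!-- Nested sections are copied to the output unchanged -->
    <section>
      <title>Kept section</title>
      <para>This content survives the transformation.</para>
    </section>
  </section>
</article>
```

After the transformation the inner section is a direct child of article. The wrapper element and its title are gone, because the first template only continues with the nested section elements, while the second template copies everything else as it is.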
10.3. Using your own stylesheets
You can also create your own stylesheets that import the default ones and override the parts you want to change.
The following example shows such a custom stylesheet. It pulls in HTML content for the header and footer and changes the title page by adding an image.
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:import href="../docbook-xsl-1.77.1/html/docbook.xsl" />
<xsl:template name="user.header.content">
<xsl:variable name="codefile"
select="document('../../../de.vogella.publishing/mystylesheets/headerstandalone.html',/)" />
<xsl:copy-of select="$codefile/htmlcode/node()" />
</xsl:template>
<xsl:template name="user.footer.content">
<xsl:variable name="codefile"
select="document('../../../de.vogella.publishing/mystylesheets/footerstandalone.html',/)" />
<xsl:copy-of select="$codefile/htmlcode/node()" />
</xsl:template>
<xsl:template name="user.head.content">
<link rel="shortcut icon" href="http://www.vogella.com/favicon.ico" />
</xsl:template>
<xsl:template name="article.titlepage.before.recto">
<xsl:if test="articleinfo">
<div class="vogellalogo">
<a rel="author" href="http://www.vogella.com/people/larsvogel.html">
<img src="http://www.vogella.com/img/logo/preface.png" height="67"
width="202" alt="About Lars Vogel" />
</a>
</div>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
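The stylesheet above loads the header and footer files with document() and copies everything below an htmlcode root element. These files therefore have to be well-formed XML. A hypothetical headerstandalone.html could look like the following sketch. The div and the link are placeholders and not the actual content used on the site.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The root element name must match the path $codefile/htmlcode used in the stylesheet -->
<htmlcode>
  <!-- Everything inside htmlcode is copied into the generated page -->
  <div id="header">
    <a href="http://www.vogella.com/">Home</a>
  </div>
</htmlcode>
```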
11. Visual editor support for DocBook in Eclipse
11.1. Oxygen
Oxygen is a commercial set of plug-ins for working with XML and XSLT. It also provides a visual editor for editing DocBook files.
The Oxygen editor can be installed via the following update site: http://www.oxygenxml.com/InstData/Editor/Eclipse/site.xml
You require a license to use Oxygen; see the Oxygen website for details.
11.3. What is Vex
Vex is an open source plug-in for Eclipse that adds a WYSIWYG XML editor to the IDE. Since August 2008 the plug-in has been part of the Eclipse project.
11.4. Installation
You can install Vex via an Eclipse update site:
http://download.eclipse.org/vex/releases/1.0/
http://download.eclipse.org/vex/milestones/1.1/
11.5. Usage
To switch between the standard Eclipse XML editor and Vex, right-click the file in the Project Explorer and select Visual XML Editor.
The plug-in is helpful as a quick preview but is still a work in progress. At the time of writing, text embedded via XInclude is simply ignored.
12. LanguageTool
12.1. What is LanguageTool?
LanguageTool is an open source tool for checking the style and grammar of a text. It is written in Java and contains thousands of rules to detect grammar and style pitfalls as well as common typos. It is hosted on GitHub.
The Eclipse plug-in for LanguageTool integrates this checker into the Eclipse IDE. It allows you to replace the default spell checker with an enhanced one based on LanguageTool. The Eclipse integration is a work in progress but already useful, as it filters XML tags and indicates problems in plain text with tooltips and red underlines.
12.2. Installation
You can install LanguageTool via an Eclipse update site:
http://download.vogella.com/p2/C-MASTER-Eclipse-LanguageTool/workspace/cx.ath.remisoft.languagetool.p2updatesite/target/repository/
12.3. Usage
To enable LanguageTool click and select LanguageTool in the drop-down box.
Spell checking: Be sure to specify a language variant, e.g., American English instead of generic English, to also enable the spell checking dictionary and additional rules.
12.4. Developing custom LanguageTool rules
LanguageTool development takes place on GitHub in the LanguageTool project. Contributing new rules for undetected errors is easy: it only requires editing XML files, which can be developed using a web-based Rule Editor.
The Eclipse plug-in is available on the LanguageTool Eclipse plug-in GitHub site.
You can also write your own detection algorithms in Java or use the LanguageTool API to embed the functionality in your own Java projects.
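As an illustration of the XML rule files mentioned above, the following sketch shows what a simple pattern rule could look like. The rule id, name and message are made up for this example. Check the element names against the current LanguageTool rule documentation before using it.

```xml
<!-- Hypothetical rule that flags the spelling "Docbook" -->
<rule id="DOCBOOK_SPELLING" name="Spelling of DocBook">
  <pattern case_sensitive="yes">
    <token>Docbook</token>
  </pattern>
  <message>The name is spelled <suggestion>DocBook</suggestion>.</message>
  <!-- The example documents an incorrect sentence and its correction -->
  <example correction="DocBook">This tutorial explains <marker>Docbook</marker>.</example>
</rule>
```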
12.5. Reporting new words
Words missing from the dictionary should be reported at https://github.com/kevina/wordlist/issues. Please first check whether these words are already in the following dictionary: https://github.com/marcoagpinto/aoo-mozilla-en-dict.
12.6. LanguageTool forum
LanguageTool has a public forum: LanguageTool forum.
13. Using XInclude with Eclipse XSL
13.1. Overview
XInclude can be used to structure the DocBook source files so that you have one file per chapter or section and one master file which includes these files. Via XInclude these separate files can be combined into one file.
You can, for example, include a file foo.xml into another one via the following statement:
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="foo.xml" />
If the included file should be treated as plain text instead of XML, add parse="text":
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="bar.xml" />
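Put together, a master file could look like the following sketch. The file names introduction.xml, installation.xml and build.xml are placeholders. The DTD path is the one used in the earlier DocBook example of this tutorial.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "../docbook-xml-4.5/docbookx.dtd">
<article>
  <title>XInclude example</title>
  <!-- Each section lives in its own file and is pulled in as XML -->
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="introduction.xml" />
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="installation.xml" />
  <section>
    <title>Build file</title>
    <!-- parse="text" inserts the file content as escaped text, useful for listings -->
    <programlisting><xi:include xmlns:xi="http://www.w3.org/2001/XInclude" parse="text" href="build.xml" /></programlisting>
  </section>
</article>
```

Each included XML file must itself be well-formed and have a single root element, for example a section.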
An XInclude Ant task is provided by the Eclipse XSL project. I can proudly say that this Ant task was contributed by me to the Eclipse XSL project. :-)
13.2. Eclipse XSL Tools
Eclipse XSL Tools provide support for XSLT transformations, including XSL editing and debugging. We will only use the XInclude task, but you have to install the whole package.
Install the XSL tools via the update manager from the standard Eclipse update site. See Using the Eclipse update manager for details.
13.3. Using the XInclude ant task
From your Eclipse installation take the org.eclipse.wst.xsl.core.jar and add this JAR file to your Ant classpath. See Apache Ant Tutorial - Classpath for details.
You should now be able to create and run the xinclude task. Below is an example Ant build.xml file.
<?xml version="1.0"?>
<!-- Author: Lars Vogel -->
<project name="docbook-src" default="usage">
<description>
This Ant build.xml file is used to transform DocBook XML to various
output formats
</description>
<!--Configure basic properties that will be used in the file. -->
<property name="doc.dir" value="${basedir}/output" />
<property name="src" value="${basedir}/src" />
<property name="src.tmp" value="${basedir}/src.tmp" />
<property name="lib" value="${basedir}/lib/" />
<property name="docbook.xsl.dir" value="${basedir}/docbook-xsl-1.72.0" />
<property name="html.stylesheet" value="${docbook.xsl.dir}/html/docbook.xsl" />
<property name="xinclude.lib.dir" value="${basedir}/lib/" />
<!-- target: usage -->
<target name="usage" description="Prints the Ant build.xml usage">
<echo message="Use -projecthelp to get a list of the available targets." />
</target>
<!-- target: clean -->
<target name="clean" description="Cleans up generated files.">
<delete dir="${doc.dir}" />
</target>
<!-- target: depends -->
<target name="depends">
<mkdir dir="${doc.dir}" />
</target>
<!--
- target: xinclude
- description: Creates one combined temporary file from the different input files.
- The combined file will then be processed via different ant tasks
-->
<target name="xinclude">
<xsl.xinclude in="${src}/DocBook/article.xml" out="${src.tmp}/DocBook/article.xml" />
</target>
<!--
- target: build-html
- description: Iterates through a directory and transforms
- .xml files into .html files using the DocBook XSL.
-->
<target name="build-html" depends="depends, xinclude" description="HTML from DocBook XML">
<xslt style="${html.stylesheet}" extension=".html" basedir="${src.tmp}" destdir="${doc.dir}">
<include name="**/*book.xml" />
<include name="**/*article.xml" />
<param name="html.stylesheet" expression="styles.css" />
</xslt>
<!-- Copy the stylesheet to the same directory as the HTML files -->
<copy todir="${doc.dir}">
<fileset dir="lib">
<include name="styles.css" />
</fileset>
</copy>
</target>
</project>
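The build file can be extended with further pre-processing steps. The following sketch shows how the wrapper removal from section 10.2 could run after the XInclude step. The target name remove-wrapper, the property src.clean and the stylesheet file name removewrapper.xsl are assumptions made for this example.

```xml
<!-- Hypothetical folder that receives the cleaned files -->
<property name="src.clean" value="${basedir}/src.clean" />

<!--
 - target: remove-wrapper
 - description: Runs the wrapper removal stylesheet on the combined files
 -              and writes the result to a second temporary folder.
-->
<target name="remove-wrapper" depends="xinclude">
  <xslt style="${basedir}/removewrapper.xsl" extension=".xml"
        basedir="${src.tmp}" destdir="${src.clean}">
    <include name="**/*article.xml" />
  </xslt>
</target>
```

To use it, build-html would depend on remove-wrapper instead of xinclude and read its input from ${src.clean}.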
Appendix A: Copyright and License
Copyright © 2012-2017 vogella GmbH. Free use of the software examples is granted under the terms of the EPL License. This tutorial is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Germany license.
See Licence.