Java and XML. This article gives an introduction to XML and its usage with Java. The Java Streaming API for XML (StAX) and the Java XPath library are explained and demonstrated.

1. XML Introduction

1.1. XML Overview

XML is the abbreviation for Extensible Markup Language and is an established data exchange format. XML was defined in 1998 by the World Wide Web Consortium (W3C).

An XML document consists of elements; each element has a start tag, content and an end tag. An XML document must have exactly one root element (i.e., one element which encloses all other elements). XML is case-sensitive, i.e., it differentiates between uppercase and lowercase letters.

An XML file must be well-formed. This means that it must satisfy the following conditions:

  • An XML document should start with a prolog (see below for an explanation of what a prolog is); parsers generally also accept XML 1.0 documents without one

  • Every opening tag has a closing tag.

  • All tags are completely nested.
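The conditions above can be checked by simply handing a document to a parser. The following is a minimal sketch, assuming the JAXP parser that ships with the JDK; the class name WellFormedCheck and the sample strings are made up for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial projects
class WellFormedCheck {

    // Returns true if the parser accepts the text as well-formed XML
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // violations of well-formedness are reported as SAXException
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b>text</b></a>")); // prints true
        System.out.println(isWellFormed("<a><b>text</a></b>")); // prints false, tags overlap
        System.out.println(isWellFormed("<a>text"));            // prints false, no closing tag
    }
}
```

This check only covers well-formedness; it does not say anything about whether the document follows a particular schema.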

A valid XML file is well-formed, contains a link to an XML schema and is valid according to that schema.

The following is a well-formed XML file.

<?xml version="1.0"?>
<!-- This is a comment -->
<address>
        <name>Lars </name>
        <street> Test </street>
        <telephone number= "0123"/>
</address>
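Validity, in contrast to well-formedness, can only be checked against a schema. The following is a minimal sketch using the javax.xml.validation package of the JDK. The class name SchemaCheck and the tiny inline schema are assumptions made for this illustration, and the schema is supplied by the program here instead of being linked from the document.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial projects
class SchemaCheck {

    // Made-up schema: the document must consist of a single <name> element
    // which contains a string
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name='name' type='xs:string'/>"
            + "</xs:schema>";

    // Returns true if the document is valid according to the schema above
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(
                    new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // validate() throws a SAXException for invalid documents
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<name>Lars</name>"));     // prints true
        System.out.println(isValid("<street>Test</street>")); // prints false
    }
}
```

The second document is well-formed but not valid, because the schema does not declare a street element.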

1.2. Comparison of XML to other formats

It is relatively easy to process an XML document, compared with a binary or unstructured format. This is because of the following characteristics:

  • XML is plain text

  • XML represents data without defining how the data should be displayed

  • XML can be transformed into other formats via XSL

  • XML can be easily processed via standard parsers

  • XML files are hierarchical

The XML format is relatively verbose, i.e., data represented as XML is relatively large compared to the same data in other formats. On the Internet, JSON or binary formats are frequently used instead of XML if data throughput is important.

1.3. XML Elements

An XML document normally starts with a prolog which describes the XML file. This prolog can be minimal, e.g. <?xml version="1.0"?>. It can also contain other information, e.g. the encoding <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>.

A tag which does not enclose any content is known as an "empty tag", for example <flag/>.

Comments in XML are defined as: <!-- COMMENT -->.

2. Java XML overview

The Java programming language provides several APIs for reading and writing XML.

Older Java versions supported only the DOM (Document Object Model) API and the SAX (Simple API for XML) API.

In DOM you access the XML document via an object tree which is held in memory. DOM can be used to read and write XML files.
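As a small illustration of the object tree, the following sketch parses a document from a string, reads an element and adds a new one in memory. It assumes the DOM parser of the JDK; the class name DomSketch and its helper methods are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the tutorial projects
class DomSketch {

    // Parses the text into a DOM tree
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    // Returns the text of the first element with the given name, or null
    static String textOf(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse("<address><name>Lars</name></address>");
        System.out.println(textOf(doc, "name")); // prints Lars

        // change the tree in memory: append a new child to the root element
        Element street = doc.createElement("street");
        street.setTextContent("Test");
        doc.getDocumentElement().appendChild(street);
        System.out.println(textOf(doc, "street")); // prints Test
    }
}
```

Because the whole document is loaded into the tree, this approach is convenient for small files but needs memory proportional to the document size.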

SAX (Simple API for XML) is a Java API for sequential reading of XML files. SAX can only read XML documents. SAX provides event-driven XML processing following the push-parsing model. In this model you register listeners in the form of handlers with the parser. These are notified through callback methods.
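The push model can be sketched as follows: a handler is passed to the parser, and the parser calls it back for every start tag. This is only an illustration, assuming the SAX parser of the JDK; the class name SaxSketch is made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the tutorial projects
class SaxSketch {

    // Returns the names of all elements in document order
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // the parser drives the processing and pushes events to the handler
            parser.parse(new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes attributes) {
                            names.add(qName);
                        }
                    });
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c>text</c></a>")); // prints [a, b, c]
    }
}
```

Note that the application cannot pause or stop the parser between two callbacks without throwing an exception; this is the main difference to the pull model.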

Both DOM and SAX are older APIs and I recommend not using them anymore.

StAX (Streaming API for XML) is an API for reading and writing XML documents. It was introduced in Java 6 and is considered superior to SAX and DOM.

Java Architecture for XML Binding (JAXB) is a Java standard that allows you to convert Java objects to XML and vice versa. JAXB defines a programmer API for reading and writing Java objects to and from XML documents. It also defines a service provider which allows the selection of the JAXB implementation. JAXB applies a lot of defaults, thus making reading and writing of XML via Java very easy.

The following explains the StAX interface; for an introduction to JAXB, please see the JAXB tutorial.

3. Streaming API for XML (StAX)

3.1. Overview

Streaming API for XML, called StAX, is an API for reading and writing XML documents.

StAX follows the pull-parsing model. The application takes control of parsing the XML document by pulling (taking) the events from the parser.
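One consequence of the pull model is that the application decides when to stop. The following sketch pulls events only until the first element with a date attribute has been found. The class name PullSketch and the sample document are assumptions for this illustration.

```java
import java.io.StringReader;

import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of the tutorial projects
class PullSketch {

    // Returns the value of the first "date" attribute, or null if there is none
    static String firstDate(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // the application asks for the next event
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartElement()) {
                        Attribute date = event.asStartElement()
                                .getAttributeByName(new QName("date"));
                        if (date != null) {
                            // stop pulling, no further events are requested
                            return date.getValue();
                        }
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<config><item date=\"January 2009\"/>"
                + "<item date=\"February 2009\"/></config>";
        System.out.println(firstDate(xml)); // prints January 2009
    }
}
```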

The core StAX API falls into two categories:

  • Cursor API

  • Event Iterator API

Applications can use either of these two APIs for parsing XML documents. The following will focus on the event iterator API as I consider it more convenient to use.
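For comparison, here is a minimal sketch of the cursor API, which is not covered further in this article. It uses XMLStreamReader from the JDK; the class name CursorSketch and the sample document are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the tutorial projects
class CursorSketch {

    // Collects the text of every element with the given name
    static List<String> elementTexts(String xml, String tag) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // next() moves the cursor and returns the type of the new event
                int eventType = reader.next();
                if (eventType == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(tag)) {
                    // reads the text and leaves the cursor on the end tag
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<config><mode>1</mode><unit>901</unit></config>";
        System.out.println(elementTexts(xml, "unit")); // prints [901]
    }
}
```

The cursor API does not create an object per event; the data is read directly from the reader at its current position.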

3.2. Event Iterator API

The event iterator API has two main interfaces: XMLEventReader for parsing XML and XMLEventWriter for generating XML.

3.3. XMLEventReader - Read XML Example

This example is stored in project "de.vogella.xml.stax.reader".

Applications loop over the entire document, requesting the next event in each iteration. The Event Iterator API is implemented on top of the Cursor API.

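In this example we will read the following XML document and create objects from it.

<?xml version="1.0" encoding="UTF-8"?>
<config>
        <!-- each item contains the child elements mode, unit, current and interactive -->
        <item date="January 2009"></item>
        <item date="February 2009"></item>
        <item date="December 2009"></item>
</config>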

Define the following class to store the individual entries of the XML file.

package de.vogella.xml.stax.model;

public class Item {
        private String date;
        private String mode;
        private String unit;
        private String current;
        private String interactive;

        public String getDate() {
                return date;
        }

        public void setDate(String date) {
                this.date = date;
        }

        public String getMode() {
                return mode;
        }

        public void setMode(String mode) {
                this.mode = mode;
        }

        public String getUnit() {
                return unit;
        }

        public void setUnit(String unit) {
                this.unit = unit;
        }

        public String getCurrent() {
                return current;
        }

        public void setCurrent(String current) {
                this.current = current;
        }

        public String getInteractive() {
                return interactive;
        }

        public void setInteractive(String interactive) {
                this.interactive = interactive;
        }

        @Override
        public String toString() {
                return "Item [current=" + current + ", date=" + date + ", interactive="
                                + interactive + ", mode=" + mode + ", unit=" + unit + "]";
        }
}

The following reads the XML file and creates a list of Item objects from the entries in the XML file.

package de.vogella.xml.stax.read;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

import de.vogella.xml.stax.model.Item;

public class StaXParser {
        static final String DATE = "date";
        static final String ITEM = "item";
        static final String MODE = "mode";
        static final String UNIT = "unit";
        static final String CURRENT = "current";
        static final String INTERACTIVE = "interactive";

        @SuppressWarnings("unchecked")
        public List<Item> readConfig(String configFile) {
                List<Item> items = new ArrayList<Item>();
                try {
                        // First, create a new XMLInputFactory
                        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
                        // Setup a new eventReader
                        InputStream in = new FileInputStream(configFile);
                        XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
                        // read the XML document
                        Item item = null;

                        while (eventReader.hasNext()) {
                                XMLEvent event = eventReader.nextEvent();

                                if (event.isStartElement()) {
                                        StartElement startElement = event.asStartElement();
                                        String name = startElement.getName().getLocalPart();
                                        // If we have an item element, we create a new item
                                        if (name.equals(ITEM)) {
                                                item = new Item();
                                                // We read the attributes from this tag and add the date
                                                // attribute to our object
                                                Iterator<Attribute> attributes = startElement
                                                                .getAttributes();
                                                while (attributes.hasNext()) {
                                                        Attribute attribute = attributes.next();
                                                        if (attribute.getName().toString().equals(DATE)) {
                                                                item.setDate(attribute.getValue());
                                                        }
                                                }
                                        } else if (item != null) {
                                                // For the child elements of an item we read the text
                                                // which follows the start tag
                                                if (name.equals(MODE)) {
                                                        item.setMode(readText(eventReader));
                                                } else if (name.equals(UNIT)) {
                                                        item.setUnit(readText(eventReader));
                                                } else if (name.equals(CURRENT)) {
                                                        item.setCurrent(readText(eventReader));
                                                } else if (name.equals(INTERACTIVE)) {
                                                        item.setInteractive(readText(eventReader));
                                                }
                                        }
                                }
                                // If we reach the end of an item element, we add it to the list
                                if (event.isEndElement()) {
                                        EndElement endElement = event.asEndElement();
                                        if (endElement.getName().getLocalPart().equals(ITEM)) {
                                                items.add(item);
                                        }
                                }
                        }
                } catch (FileNotFoundException e) {
                        e.printStackTrace();
                } catch (XMLStreamException e) {
                        e.printStackTrace();
                }
                return items;
        }

        // Reads the event after a start tag and returns its text; returns an
        // empty string if the element has no text content
        private String readText(XMLEventReader eventReader)
                        throws XMLStreamException {
                XMLEvent next = eventReader.nextEvent();
                return next.isCharacters() ? next.asCharacters().getData() : "";
        }
}

You can test the parser via the following test program. Please note that the file config.xml must exist in the Java project folder.

package de.vogella.xml.stax.read;

import java.util.List;

import de.vogella.xml.stax.model.Item;

public class TestRead {
        public static void main(String[] args) {
                StaXParser read = new StaXParser();
                List<Item> readConfig = read.readConfig("config.xml");
                for (Item item : readConfig) {
                        System.out.println(item);
                }
        }
}
3.4. Write XML File Example

This example is stored in project "de.vogella.xml.stax.writer".

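Let’s assume you would like to write the following simple XML file.

<?xml version="1.0" encoding="UTF-8"?>
<config>
        <mode>1</mode>
        <unit>901</unit>
        <current>0</current>
        <interactive>0</interactive>
</config>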

StAX does not provide functionality to automatically format the XML file, so you have to add the end-of-line and tab characters to your XML file yourself.

package de.vogella.xml.stax.writer;

import java.io.FileOutputStream;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartDocument;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxWriter {
        private String configFile;

        public void setFile(String configFile) {
                this.configFile = configFile;
        }

        public void saveConfig() throws Exception {
                // create an XMLOutputFactory
                XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
                // create XMLEventWriter
                XMLEventWriter eventWriter = outputFactory
                                .createXMLEventWriter(new FileOutputStream(configFile));
                // create an EventFactory
                XMLEventFactory eventFactory = XMLEventFactory.newInstance();
                // a DTD event is used to write the line break as raw text
                XMLEvent end = eventFactory.createDTD("\n");
                // create and write Start Tag
                StartDocument startDocument = eventFactory.createStartDocument();
                eventWriter.add(startDocument);
                eventWriter.add(end);

                // create config open tag
                StartElement configStartElement = eventFactory.createStartElement("",
                                "", "config");
                eventWriter.add(configStartElement);
                eventWriter.add(end);
                // Write the different nodes
                createNode(eventWriter, "mode", "1");
                createNode(eventWriter, "unit", "901");
                createNode(eventWriter, "current", "0");
                createNode(eventWriter, "interactive", "0");

                eventWriter.add(eventFactory.createEndElement("", "", "config"));
                eventWriter.add(end);
                eventWriter.add(eventFactory.createEndDocument());
                eventWriter.close();
        }

        private void createNode(XMLEventWriter eventWriter, String name,
                        String value) throws XMLStreamException {

                XMLEventFactory eventFactory = XMLEventFactory.newInstance();
                XMLEvent end = eventFactory.createDTD("\n");
                XMLEvent tab = eventFactory.createDTD("\t");
                // create Start node
                StartElement sElement = eventFactory.createStartElement("", "", name);
                eventWriter.add(tab);
                eventWriter.add(sElement);
                // create Content
                Characters characters = eventFactory.createCharacters(value);
                eventWriter.add(characters);
                // create End node
                EndElement eElement = eventFactory.createEndElement("", "", name);
                eventWriter.add(eElement);
                eventWriter.add(end);
        }
}

And a little test.

package de.vogella.xml.stax.writer;

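public class TestWrite {

        public static void main(String[] args) {
                StaxWriter configFile = new StaxWriter();
                // name of the output file, chosen for this test
                configFile.setFile("config2.xml");
                try {
                        configFile.saveConfig();
                } catch (Exception e) {
                        e.printStackTrace();
                }
        }
}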
For another (more complex) example of using StAX, please see Reading and creating RSS feeds via Java (with StAX).
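The same kind of document can also be written with the cursor-style XMLStreamWriter, which needs no event objects. The following is a sketch only; the class name StreamWriterSketch and the method toXml are invented for this illustration, and the output is written to a string instead of a file to keep the example self-contained.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, not part of the tutorial projects
class StreamWriterSketch {

    // Writes each map entry as a child element of a <config> root element
    static String toXml(Map<String, String> settings) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("config");
            for (Map.Entry<String, String> entry : settings.entrySet()) {
                // formatting has to be added manually, as with the event API
                writer.writeCharacters("\n\t");
                writer.writeStartElement(entry.getKey());
                writer.writeCharacters(entry.getValue());
                writer.writeEndElement();
            }
            writer.writeCharacters("\n");
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write document", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mode", "1");
        settings.put("unit", "901");
        System.out.println(toXml(settings));
    }
}
```

A LinkedHashMap is used so that the elements are written in insertion order.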

4. XPath

4.1. Overview

XPath (XML Path Language) is a language for selecting nodes from an XML document. Java 5 introduced the javax.xml.xpath package which provides an XPath library.

The following explains how to use XPath to query an XML document via Java.

4.2. Using XPath

Create a new Java project called UsingXPath.

Create the following XML file and save it as person.xml in the project folder. The queries below expect person elements with firstname and lastname child elements.

<?xml version="1.0" encoding="UTF-8"?>

Create a new package "myxml" and a new Java class "QueryXML".

package myxml;

import java.io.IOException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class QueryXML {
        public void query() throws ParserConfigurationException, SAXException,
                        IOException, XPathExpressionException {
                // standard for reading an XML file
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                DocumentBuilder builder = factory.newDocumentBuilder();
                Document doc = builder.parse("person.xml");

                // create an XPathFactory
                XPathFactory xFactory = XPathFactory.newInstance();

                // create an XPath object
                XPath xpath = xFactory.newXPath();

                // compile the XPath expression
                XPathExpression expr = xpath
                                .compile("//person[firstname='Lars']/lastname/text()");
                // run the query and get a nodeset
                Object result = expr.evaluate(doc, XPathConstants.NODESET);

                // cast the result to a DOM NodeList
                NodeList nodes = (NodeList) result;
                for (int i = 0; i < nodes.getLength(); i++) {
                        System.out.println(nodes.item(i).getNodeValue());
                }

                // new XPath expression to get the number of people with name Lars
                expr = xpath.compile("count(//person[firstname='Lars'])");
                // run the query and get the number of nodes
                Double number = (Double) expr.evaluate(doc, XPathConstants.NUMBER);
                System.out.println("Number of objects " + number);

                // do we have more than 2 people with name Lars?
                expr = xpath.compile("count(//person[firstname='Lars']) >2");
                // run the query and get the result as a boolean
                Boolean check = (Boolean) expr.evaluate(doc, XPathConstants.BOOLEAN);
                System.out.println(check);
        }

        public static void main(String[] args) throws XPathExpressionException,
                        ParserConfigurationException, SAXException, IOException {
                QueryXML process = new QueryXML();
                process.query();
        }
}
Copyright © 2012-2016 vogella GmbH. Free use of the software examples is granted under the terms of the EPL License. This tutorial is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Germany license.

See Licence.