This tutorial explains how to use jsoup as an HTML parser.

1. jsoup

1.1. What is jsoup?

jsoup is a Java library for working with real-world HTML. It provides a convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.
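
As a rough illustration of extraction and manipulation, the following sketch parses a small HTML string, selects an element with a CSS selector and changes it. The class name `JsoupSketch`, the method `highlightFirstParagraph`, the HTML snippet and the `highlight` CSS class are made up for this example; it assumes a reasonably recent jsoup version on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSketch {

  // Hypothetical helper: parses the HTML, modifies the first <p> element
  // and returns that element's HTML.
  static String highlightFirstParagraph(String html) {
    Document doc = Jsoup.parse(html);

    // CSS selector, similar to jQuery: first <p> element in the document
    Element paragraph = doc.selectFirst("p");
    if (paragraph == null) {
      // no paragraph found, return the body unchanged
      return doc.body().html();
    }

    // DOM-style manipulation: add a class and replace the text
    paragraph.addClass("highlight");
    paragraph.text(paragraph.text().toUpperCase());

    return paragraph.outerHtml();
  }

  public static void main(String[] args) {
    System.out.println(highlightFirstParagraph("<div><p>Hello jsoup</p></div>"));
  }
}
```

Parsing from a string keeps the sketch independent of any network access; the same selector and manipulation calls work on a document fetched from a URL.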

1.2. Using jsoup

The latest version of jsoup can be found via

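To use jsoup in a Maven build, add the following dependency to your pom.xml file.

<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.13.1</version>
</dependency>
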
To use jsoup in your Gradle build, add the following dependency to your build.gradle file.

implementation 'org.jsoup:jsoup:1.13.1'

1.3. Example

The following code demonstrates how to read a webpage and how to extract its links.


import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ParseLinksExample {

  public static void main(String[] args) {

    Document doc;
    try {

        // fetch and parse the page; put the URL to read between the quotes
        doc = Jsoup.connect("").get();

        // get title of the page
        String title = doc.title();
        System.out.println("Title: " + title);

        // get all links
        Elements links = doc.select("a[href]");
        for (Element link : links) {

            // get the value from href attribute
            System.out.println("\nLink : " + link.attr("href"));
            System.out.println("Text : " + link.text());
        }

    } catch (IOException e) {
        // the page could not be fetched or read
        e.printStackTrace();
    }
  }
}


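The href values extracted from a page are often relative to that page. The following sketch shows one way to resolve them to absolute URLs with jsoup's absUrl method. The class name `AbsoluteLinksSketch`, the method `absoluteLinks`, the HTML snippet and the example.com base URI are placeholders chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AbsoluteLinksSketch {

  // Hypothetical helper: returns the absolute URL of every link in the HTML.
  static List<String> absoluteLinks(String html, String baseUri) {
    // the base URI is used to resolve relative links
    Document doc = Jsoup.parse(html, baseUri);

    List<String> result = new ArrayList<>();
    for (Element link : doc.select("a[href]")) {
      // absUrl resolves the attribute value against the base URI
      result.add(link.absUrl("href"));
    }
    return result;
  }

  public static void main(String[] args) {
    String html = "<a href=\"/docs\">Docs</a> <a href=\"https://example.org/\">Other</a>";
    for (String url : absoluteLinks(html, "https://example.com/")) {
      System.out.println(url);
    }
  }
}
```

When a document is loaded with Jsoup.connect, the base URI is set from the fetched URL, so absUrl can be called on the links in the same way.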

2. jsoup Resources

