Use Dom4j to manipulate XML


Link to this article: bytelife.net/articles/47… Copyright notice: all articles on this blog are licensed under BY-NC-SA unless otherwise stated. Please credit the source when reprinting.

Introduction: XML (Extensible Markup Language) is widely used in software development. Java offers many ways to manipulate XML; the most common is to use a third-party library such as JDom or Dom4j. This article briefly covers the basics of manipulating XML with Dom4j.

This article uses Dom4j version 1.6.1, which can be downloaded at the end of this article. The XML file used throughout is shown below:


      
<class id="1">
	<student>
		<num>0001</num>
		<name>Zhang SAN</name>
		<age>19</age>
	</student>
	
	<student>
		<num>0002</num>
		<name>Li si</name>
		<age>21</age>
		<hobby>
			<name>football</name>
			<name>basketball</name>
		</hobby>
	</student>
	
	<teacher>
		<name>Teacher wang</name>
		<age>40</age>
		<course>Java</course>
	</teacher>
</class>

In this XML file, the root node is class, which has student and teacher child nodes. Each student node contains num, name, and age nodes, and the second student also contains a hobby node. The teacher node contains name, age, and course nodes. The following sections use Dom4j to work with this file.

Parsing XML

When working with XML in Dom4j, the first step is usually to parse an XML document. The following code parses the XML file and returns a Document object.

package cn.javacodes.dom4j;

import java.io.File;

import org.dom4j.Document;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class TestDom4j {

	public static void main(String[] args) throws Exception {
		// Get the SAX reader
		SAXReader reader = new SAXReader();
		// Get the Document object
		Document doc = reader.read(new File("d:/DemoXML.xml"));
		// Get the root node
		Element root = doc.getRootElement();
		// Output test
		System.out.println("Root node:" + root.getName() + ",id="
				+ root.attributeValue("id"));
	}
}

Output result:

Root node:class,id=1

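Using iterators

An Element object can return a standard Java iterator in several ways. The snippets below need java.util.Iterator imported, and the attribute example also needs org.dom4j.Attribute.

(1) Iterate over all child elements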

// Iterate over all children of the root element
for (Iterator i = root.elementIterator(); i.hasNext(); ) {
      Element element = (Element) i.next();
      System.out.println(element.getName());
}

Output result:

student
student
teacher

(2) Iterate over child elements with a given name

// Iterate over child elements with the element name "student"
for ( Iterator i = root.elementIterator( "student" ); i.hasNext(); ) {
    Element foo = (Element) i.next();
    System.out.println(foo.getName());
}

Output result:

student
student

(3) Iterate over all attributes

// Iterate over all attributes of the root element
for ( Iterator i = root.attributeIterator(); i.hasNext(); ) {
      Attribute attribute = (Attribute) i.next();
      System.out.println(attribute.getName() + ":" + attribute.getValue());
}

Output result:

id:1

Get element values

Usually we need the text inside an XML element's tags, that is, the element value. Here is a simple example that recursively prints the value of every leaf element:

public static void showAllElementText(Element e) {
      for (Iterator i = e.elementIterator(); i.hasNext(); ) {
           Element element = (Element) i.next();
           if (!element.elements().isEmpty()) {
                // The element has children, so descend into it
                showAllElementText(element);
           } else {
                // Leaf element: print its name and trimmed text
                System.out.println(element.getName() + "=" + element.getTextTrim());
           }
      }
}

Output result (when called with the root element):

num=0001
name=Zhang SAN
age=19
num=0002
name=Li si
age=21
name=football
name=basketball
name=Teacher wang
age=40
course=Java
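When only a few known child values are needed, recursion is not required. The following is a minimal sketch, assuming a shortened copy of the sample document is parsed from a string rather than from a file. The class name StudentSummary and the method summarize are illustrative and not part of Dom4j; elements(name) and elementText(name) are the Dom4j calls being demonstrated.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Illustrative class name; not part of Dom4j.
public class StudentSummary {

    // A shortened copy of the sample document, inlined so no file is needed.
    static final String XML =
            "<class id=\"1\">"
          + "<student><num>0001</num><name>Zhang SAN</name><age>19</age></student>"
          + "<student><num>0002</num><name>Li si</name><age>21</age></student>"
          + "</class>";

    // Builds one "num name age" line per <student> child of the root element.
    public static List<String> summarize(String xml) throws DocumentException {
        Document doc = DocumentHelper.parseText(xml);
        Element root = doc.getRootElement();
        List<String> lines = new ArrayList<String>();
        // elements(name) returns the direct children with that name.
        for (Object o : root.elements("student")) {
            Element student = (Element) o;
            // elementText(name) reads the text of the first child with that name.
            lines.add(student.elementText("num") + " "
                    + student.elementText("name") + " "
                    + student.elementText("age"));
        }
        return lines;
    }

    public static void main(String[] args) throws DocumentException {
        for (String line : summarize(XML)) {
            System.out.println(line);
        }
    }
}
```

Note that elementText returns null when the named child does not exist, so real code may want a null check before using the value.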

Using XPath expressions

XPath expressions make XML documents easier to work with in Dom4j: an operation that would otherwise need nested loops often fits in a single line of code. A few simple examples of XPath in Dom4j follow.

(1) Query a single node (the first match is returned):

// Get the SAX reader
SAXReader reader = new SAXReader();
// Get the Document object
Document doc = reader.read(new File("d:/DemoXML.xml"));
// Get the name node of the student element
Node node = doc.selectSingleNode("//student/name");
// Output test
System.out.println(node.getName() + "=" + node.getText());

(2) Query multiple nodes

// Get all student elements
List<Node> list = doc.selectNodes("//student");
// Output test: print each element's name and the text of its name child
for (Node node : list) {
    System.out.println(node.getName()
                 + ":" + node.valueOf("name"));
}

These are the two most commonly used methods. As another example, to find all hyperlinks in an XHTML document, you can select the href attributes directly:

    public void findLinks(Document document) throws DocumentException {
        List list = document.selectNodes( "//a/@href" );
        for (Iterator iter = list.iterator(); iter.hasNext(); ) {
            Attribute attribute = (Attribute) iter.next();
            String url = attribute.getValue();
            // Process the URL here
        }
    }

If you need help learning the XPath expression language, the Zvon tutorial provides a variety of examples.
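XPath predicates can also filter nodes by value. The sketch below assumes a shortened inline copy of the sample document and that the Jaxen library is on the classpath, since Dom4j delegates XPath evaluation to it. The class name XPathFilter and the method namesOlderThan are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;

// Illustrative class name; not part of Dom4j.
public class XPathFilter {

    // A shortened copy of the sample document, inlined so no file is needed.
    static final String XML =
            "<class id=\"1\">"
          + "<student><num>0001</num><name>Zhang SAN</name><age>19</age></student>"
          + "<student><num>0002</num><name>Li si</name><age>21</age></student>"
          + "</class>";

    // Returns the names of all students whose <age> is greater than minAge.
    public static List<String> namesOlderThan(String xml, int minAge)
            throws DocumentException {
        Document doc = DocumentHelper.parseText(xml);
        List<String> names = new ArrayList<String>();
        // The predicate [age > N] keeps only students whose age child exceeds N.
        String xpath = "/class/student[age > " + minAge + "]/name";
        for (Object o : doc.selectNodes(xpath)) {
            names.add(((Node) o).getText());
        }
        return names;
    }

    public static void main(String[] args) throws DocumentException {
        // Only the second student is older than 20.
        System.out.println(namesOlderThan(XML, 20));
        // valueOf evaluates an XPath expression and returns its string value.
        Document doc = DocumentHelper.parseText(XML);
        System.out.println("class id = " + doc.valueOf("/class/@id"));
    }
}
```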

Fast looping

If you need to walk a very large XML document, use the fast looping approach, which accesses nodes by index and avoids creating an Iterator object for each loop. Here is a simple example:

    public void treeWalk(Document document) {
        treeWalk( document.getRootElement() );
    }

    public void treeWalk(Element element) {
        for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
            Node node = element.node(i);
            if ( node instanceof Element ) {
                treeWalk( (Element) node );
            }
            else {
                // Write the operation you want to do here
            }
        }
    }

Create an XML Document object

When using Dom4j, it is often necessary to create a new document. Here is a simple example:

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Foo {

    public Document createDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement( "root" );

        Element author1 = root.addElement( "author" )
            .addAttribute( "name", "James" )
            .addAttribute( "location", "UK" )
            .addText( "James Strachan" );
        
        Element author2 = root.addElement( "author" )
            .addAttribute( "name", "Bob" )
            .addAttribute( "location", "US" )
            .addText( "Bob McWhirter" );

        return document;
    }
}

Write to an XML file

Writing a Document object to an XML file with Dom4j takes only a few lines. Close the writer when you are done so that buffered output is flushed to the file:

FileWriter out = new FileWriter( "foo.xml" );
document.write( out );
out.close();

If you want to change the output format, for example to pretty-printed or compact output, or if you want to write to a Writer or an OutputStream, you can use the XMLWriter class:

import java.io.FileWriter;
import java.io.IOException;

import org.dom4j.Document;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

public class Foo {

    public void write(Document document) throws IOException {

        // Write to a file
        XMLWriter writer = new XMLWriter(
            new FileWriter( "output.xml" )
        );
        writer.write( document );
        writer.close();

        // Pretty-printed output to System.out
        OutputFormat format = OutputFormat.createPrettyPrint();
        writer = new XMLWriter( System.out, format );
        writer.write( document );

        // More compact layout
        format = OutputFormat.createCompactFormat();
        writer = new XMLWriter( System.out, format );
        writer.write( document );
    }
}
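XMLWriter accepts any java.io.Writer, so the same approach can render a document to a string, for example for logging. This is a minimal sketch under that assumption; the class name PrettyPrinter and the method toPrettyString are illustrative and not part of Dom4j.

```java
import java.io.IOException;
import java.io.StringWriter;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Illustrative class name; not part of Dom4j.
public class PrettyPrinter {

    // Renders the document as an indented XML string.
    public static String toPrettyString(Document document) throws IOException {
        OutputFormat format = OutputFormat.createPrettyPrint();
        // The encoding name is written into the XML declaration.
        format.setEncoding("UTF-8");
        StringWriter out = new StringWriter();
        XMLWriter writer = new XMLWriter(out, format);
        try {
            writer.write(document);
        } finally {
            // Closing flushes any buffered output into the StringWriter.
            writer.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Document document = DocumentHelper.createDocument();
        document.addElement("root")
                .addElement("author")
                .addText("James Strachan");
        System.out.println(toPrettyString(document));
    }
}
```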

Converting between Document objects and XML strings

If you want to pass around a Document object or any other node object (such as an Attribute or Element), you can convert it to an XML text string with the asXML() method, for example:

Document document = ... ;
String text = document.asXML();

If you want to turn an XML text string back into a Document object, parse it with DocumentHelper.parseText():

        String text = "<person> <name>James</name> </person>";
        Document document = DocumentHelper.parseText(text);
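Combining the two calls gives a simple way to make an independent copy of a document by passing through text. The following is a minimal sketch; the class name RoundTrip and the method copyViaText are illustrative and not part of Dom4j.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

// Illustrative class name; not part of Dom4j.
public class RoundTrip {

    // Serializes the document to XML text and parses that text into a new Document.
    public static Document copyViaText(Document original) throws DocumentException {
        String text = original.asXML();
        return DocumentHelper.parseText(text);
    }

    public static void main(String[] args) throws DocumentException {
        Document original = DocumentHelper.createDocument();
        original.addElement("person").addElement("name").addText("James");

        Document copy = copyViaText(original);
        // Changing the copy does not affect the original document.
        copy.getRootElement().element("name").setText("Bob");

        System.out.println(original.getRootElement().elementText("name"));
        System.out.println(copy.getRootElement().elementText("name"));
    }
}
```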

XSLT

Applying XSLT to a Document is straightforward with the JAXP API provided by Sun. Here is an example that creates a transformer using JAXP and applies it to a Document:

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

public class Foo {

    public Document styleDocument( Document document, String stylesheet ) throws Exception {

        // Load the transformer using JAXP
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer( 
            new StreamSource( stylesheet ) 
        );

        // Style document
        DocumentSource source = new DocumentSource( document );
        DocumentResult result = new DocumentResult();
        transformer.transform( source, result );

        // Return the converted document
        Document transformedDoc = result.getDocument();
        return transformedDoc;
    }
}
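Note that new StreamSource( String ) interprets its argument as a system identifier (a URI such as a file URL), not as stylesheet text. For quick experiments it can be convenient to supply the stylesheet inline through a StringReader instead. The sketch below does that under the assumption of a shortened sample document; the class name NameListTransform, the stylesheet, and the output element names names and n are all made up for illustration.

```java
import java.io.StringReader;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;

// Illustrative class name; not part of Dom4j.
public class NameListTransform {

    // A shortened copy of the sample document, inlined so no file is needed.
    static final String XML =
            "<class id=\"1\">"
          + "<student><name>Zhang SAN</name></student>"
          + "<student><name>Li si</name></student>"
          + "</class>";

    // Hypothetical stylesheet: emits one <n> element per student name.
    static final String XSLT =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"/class\">"
          + "<names><xsl:for-each select=\"student\">"
          + "<n><xsl:value-of select=\"name\"/></n>"
          + "</xsl:for-each></names>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies stylesheet text (not a URI) to a Dom4j document.
    public static Document transform(Document input, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        DocumentResult result = new DocumentResult();
        transformer.transform(new DocumentSource(input), result);
        return result.getDocument();
    }

    public static void main(String[] args) throws Exception {
        Document out = transform(DocumentHelper.parseText(XML), XSLT);
        System.out.println(out.asXML());
    }
}
```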
