Parsing XML files with DOM, DOM4J, SAX, and XPath

XML parsing comes up often when reading the source code of Spring and related frameworks, so this article reviews the main XML parsing techniques.

XML files

1.1 What is an XML file

XML is the Extensible Markup Language. "Extensible" means that developers can define their own tags to suit their needs, as long as the tags conform to the XML naming rules.

1.2 Functions of XML files

XML files are mainly used to store data.

1.3 Methods for parsing XML files: DOM, DOM4J, SAX
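For comparison with the DOM4J and SAX sections, plain DOM parsing can be sketched with the JDK's own javax.xml.parsers API. This is only a minimal sketch: the class name DomSketch, the studentNames helper, and the inline XML string are made up for illustration, and the XML is held in a string so that no file on disk is assumed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DomSketch {
    // Hypothetical inline sample so the sketch does not depend on a file on disk.
    static final String XML =
        "<students>"
        + "<student><name>Ha ha</name><college>java</college></student>"
        + "<student><name>search</name><college>c++</college></student>"
        + "</students>";

    // DOM loads the whole document into memory as a tree, then the tree is walked.
    static List<String> studentNames(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList students = document.getDocumentElement().getElementsByTagName("student");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < students.getLength(); i++) {
                Element student = (Element) students.item(i);
                // Text of the first <name> element inside this <student>
                names.add(student.getElementsByTagName("name").item(0).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(studentNames(XML));
    }
}
```

Because the whole tree is built before any node is read, this style is convenient for small files and random access, and less suitable for very large documents.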

Parsing XML files with DOM4J

2.1 Importing Dom4J dependencies

2.2 Common objects for Dom4j

2.2.1 SAXReader: reads an XML file and builds a Document object tree from it

2.2.2 Document: the object tree of an XML document, analogous to the Document object of an HTML page

2.2.3 Element: an element node, which can be looked up through the Document object

2.3 Dom4j Parsing Procedure

2.3.1 Create the parser: SAXReader saxReader = new SAXReader();

2.3.2 Obtain the Document object: Document document = saxReader.read("students.xml");

2.3.3 Obtain the root node: Element root = document.getRootElement();

2.3.4 Traverse and parse the child nodes

2.4 Example 1: Parsing the students.xml file using Dom4j

students.xml


      
<students>
    <student>
        <name>Ha ha</name>
        <college>java</college>
    </student>
    <student>
        <name>search</name>
        <college>c++</college>
    </student>
    <student>
        <name>Ha ha</name>
        <college>app</college>
    </student>
</students>

Add the Maven dependency

<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.3</version>
</dependency>

Dom4jTest.java

package com.qinghong;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

import java.util.Iterator;

public class Dom4jTest {
    public static void main(String[] args) {
        // Create a parser
        SAXReader reader = new SAXReader();
        // The parser reads the configuration file into memory to generate a Document[org.dom4j] object tree
        try {
            Document document = reader.read("students.xml");
            // Get the root node
            Element root = document.getRootElement();
            // Traverse the <student> elements under the root
            for (Iterator<Element> rootIter = root.elementIterator(); rootIter.hasNext();) {
                Element element = rootIter.next();
                // Traverse the children of each <student> and print their text
                for (Iterator<Element> innerIter = element.elementIterator(); innerIter.hasNext();) {
                    Element next = innerIter.next();
                    String innerValue = next.getStringValue();
                    System.out.println(innerValue);
                }
                System.out.println("------------");
            }
        } catch (DocumentException e) {
            e.printStackTrace();
        }
    }
}

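Result: the program prints the name and college of each student, followed by a separator line after each student.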

Parsing XML files with SAX

3.1 SAX approach: event-driven, parsing the document while reading it

3.1.2 Advantages: SAX does not need to load the entire document into memory, so it consumes less memory and is suitable for parsing extremely large XML files.

3.1.3 SAX parsing steps:

(1) Obtain a SAXParserFactory through the newInstance() method: SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();

(2) Create the parser: SAXParser saxParser = saxParserFactory.newSAXParser();

(3) Call the parser's parse method, passing it the XML file and a handler whose callbacks receive the parsing events.

Code implementation:

MySAXParser.java

package com.qinghong;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class MySAXParser {
    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
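            // Parse the file; the handler's callbacks are invoked as the parser reads it
            saxParser.parse("students.xml", new MyDefaultHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

// Handler that prints each start tag, end tag, and run of character data
class MyDefaultHandler extends DefaultHandler {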
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        System.out.println("<" + qName + ">");
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        System.out.println("</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.println(new String(ch, start, length));
    }
}
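One detail worth knowing about the characters callback: the parser also reports the whitespace between elements through it, and it may deliver a single run of text in several chunks. A handler that needs real values therefore usually accumulates text between the start and end events. The following is a minimal sketch of that idea, assuming only the JDK; the class names SaxNamesSketch and NameHandler, the studentNames helper, and the inline XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxNamesSketch {
    // Hypothetical inline sample, including the whitespace a real file would contain.
    static final String XML = "<students>\n"
        + "  <student><name>Ha ha</name><college>java</college></student>\n"
        + "  <student><name>search</name><college>c++</college></student>\n"
        + "</students>";

    // Collects the text of every <name> element and ignores everything else.
    static final class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean insideName;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if ("name".equals(qName)) {
                insideName = true;
                text.setLength(0); // start a fresh buffer for this element
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append instead of assigning
            if (insideName) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                names.add(text.toString().trim());
                insideName = false;
            }
        }
    }

    static List<String> studentNames(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            NameHandler handler = new NameHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(studentNames(XML));
    }
}
```

Only the list of names is kept in memory here, which is the property that makes SAX attractive for very large files.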

Parsing XML files using Dom4j's XPath support

4.1 XPath syntax

Tutorial: www.w3school.com.cn/xpath/index…

XPath uses path expressions to select nodes or sets of nodes in an XML document. Nodes are selected by following a path or steps.

The most useful path expressions are listed below:

Selecting nodes

Expression	Description
nodename	Selects the child elements of the current node that are named nodename.
/	Selects from the root node.
//	Selects matching nodes anywhere in the document below the current node, regardless of their location.
.	Selects the current node.
..	Selects the parent of the current node.
@	Selects attributes.
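The expressions in the table can be tried out with the JDK's javax.xml.xpath API. The sketch below is only an illustration: the class name PathExpressionSketch, the evalString helper, and the inline bookstore XML are assumptions made for this example. Evaluating a path as a string returns the text of the first selected node.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class PathExpressionSketch {
    // Hypothetical inline sample with two books.
    static final String XML =
        "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    // Evaluates the expression against the sample and returns the result as a string.
    // For a node set, that is the string value of its first node.
    static String evalString(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evalString("/bookstore/book/title")); // absolute path from the root
        System.out.println(evalString("//price"));               // price elements anywhere
        System.out.println(evalString("//title/@lang"));         // the lang attribute of a title
        System.out.println(evalString("name(//title/..)"));      // name of a title's parent
        System.out.println(evalString("name(/bookstore/.)"));    // name of the current node
    }
}
```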

Predicates

A predicate is used to find a specific node, or a node that contains a specific value.

Predicates are enclosed in square brackets.

Examples

The table below lists some path expressions with predicates and the result of each expression:

Path expression	Result
/bookstore/book[1]	Selects the first book element that is a child of the bookstore element.
/bookstore/book[last()]	Selects the last book element that is a child of the bookstore element.
/bookstore/book[last()-1]	Selects the second-to-last book element that is a child of the bookstore element.
/bookstore/book[position()<3]	Selects the first two book elements that are children of the bookstore element.
//title[@lang]	Selects all title elements that have an attribute named lang.
//title[@lang='eng']	Selects all title elements whose lang attribute has the value eng.
/bookstore/book[price>35.00]	Selects all book elements of the bookstore element whose price element has a value greater than 35.00.
/bookstore/book[price>35.00]/title	Selects the title elements of all book elements of the bookstore element whose price element has a value greater than 35.00.
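To see predicates in action, here is a small sketch using the JDK's XPath API. The class name PredicateSketch, the select helper, and the three-book sample (whose third title deliberately has no lang attribute) are all hypothetical and exist only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PredicateSketch {
    // Hypothetical sample data; the third title has no lang attribute.
    static final String XML =
        "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
        + "<book><title>Everyday Italian</title><price>30.00</price></book>"
        + "</bookstore>";

    // Evaluates the expression as a node set and returns the text of each selected node.
    static List<String> select(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/bookstore/book[1]/title",
            "/bookstore/book[last()]/title",
            "/bookstore/book[position()<3]/title",
            "/bookstore/book[price>35.00]/title",
            "//title[@lang='eng']"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + select(expression));
        }
    }
}
```

Note that positions in XPath predicates start at 1, not 0, so book[1] is the first book.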

Selecting unknown nodes

XPath wildcards can be used to select unknown XML elements.

Wildcard	Description
*	Matches any element node.
@*	Matches any attribute node.
node()	Matches any type of node.

Examples

The table below lists some path expressions and their results:

Path expression	Result
/bookstore/*	Selects all the child elements of the bookstore element.
//*	Selects all elements in the document.
//title[@*]	Selects all title elements that have at least one attribute.
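A quick way to check what a wildcard matches is to count the selected nodes with XPath's count() function. The sketch below assumes only the JDK; the class name WildcardSketch, the count helper, and the inline sample (in which only the first title carries an attribute) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class WildcardSketch {
    // Hypothetical sample: seven elements in total, and only the first title has an attribute.
    static final String XML =
        "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title>Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    // Wraps the node-set expression in count(...) and returns how many nodes it selects.
    static int count(String nodeSetExpression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            Double result = (Double) xPath.evaluate(
                "count(" + nodeSetExpression + ")",
                new InputSource(new StringReader(XML)),
                XPathConstants.NUMBER);
            return result.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + nodeSetExpression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("/bookstore/* selects " + count("/bookstore/*") + " nodes");
        System.out.println("//* selects " + count("//*") + " nodes");
        System.out.println("//title[@*] selects " + count("//title[@*]") + " nodes");
        System.out.println("//@* selects " + count("//@*") + " nodes");
    }
}
```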

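Selecting several paths

By using the "|" operator in a path expression, you can select several paths at once.

Examples

The table below lists some path expressions and their results:

Path expression	Result
//book/title | //book/price	Selects all the title and price elements of all book elements.
//title | //price	Selects all the title and price elements in the document.
/bookstore/book/title | //price	Selects all the title elements of the book elements of the bookstore element, and all the price elements in the document.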
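The union operator can be sketched in the same way with the JDK's XPath API. As before, the class name UnionSketch, the select helper, and the inline sample are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class UnionSketch {
    // Hypothetical inline sample with two books.
    static final String XML =
        "<bookstore>"
        + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
        + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
        + "</bookstore>";

    // Evaluates the expression as a node set and returns the text of each selected node.
    static List<String> select(String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Titles and prices are returned together in a single node set
        System.out.println(select("//title | //price"));
        System.out.println(select("/bookstore/book/title | //price"));
    }
}
```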

4.2 Example: Obtaining configuration information from an XML file

Preparation: book.xml


      

<bookstore>

<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>

<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>

</bookstore>

SysConfigParser.java

package com.qinghong;

import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class SysConfigParser {
    public static void main(String[] args) throws DocumentException {
        // Build the parser
        SAXReader saxReader = new SAXReader();
        // Read book.xml into a Document [org.dom4j] object tree
        Document document = saxReader.read("book.xml");
        // XPath paths that reach the title element:
        //   /bookstore/book/title
        //   bookstore//title
        //   //title
        // Note: dom4j's XPath support relies on the Jaxen library being on the classpath
        Element title = (Element) document.selectSingleNode("//title");
        // Text content of the first matching title element
        String stringValue = title.getStringValue();
        System.out.println(stringValue);
        // Get the lang attribute as an object, then read its value
        Attribute lang = title.attribute("lang");
        String langValue = lang.getStringValue();
        // Or read the attribute value directly
        String s = title.attributeValue("lang");
    }
}

Parsing XML files using XPath objects

bookstore.xml


      
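<!--
    1. Get the title text of the second book under bookstore whose category is web
       bookstore -> book[@category='web'][2] -> title
       xpath: /bookstore/book[@category='web'][2]/title/text()
    2. Get the title text, where the title's lang is eng, of the book under bookstore whose category is web
       bookstore -> book[@category='web'] -> title[@lang='eng']
       xpath: /bookstore/book[@category='web']/title[@lang='eng']/text()
    3. Get the lang attribute of the title of the book under bookstore whose category is cooking
       bookstore -> book[@category='cooking'] -> title -> @lang
       xpath: /bookstore/book[@category='cooking']/title/@lang
    4. Get the collection of book nodes under bookstore
       xpath: /bookstore/book
-->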
<bookstore>
    <book category="children">
        <title lang="eng">Harry Potter</title>
        <price>29.99</price>
        <year>2005</year>
    </book>
    <book category="web">
        <title lang="eng">Learning XML</title>
        <price>39.95</price>
        <year>2003</year>
    </book>
    <book category="cooking">
        <title lang="eng">Everyday Italian</title>
        <price>39.95</price>
        <year>2003</year>
    </book>
    <book category="web">
        <title lang="uk">XQuery Kick Start</title>
        <price>39.95</price>
        <year>2003</year>
    </book>
</bookstore>

Example:

package com.qinghong;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.IOException;

public class MyXPathTest {
    public static void main(String[] args){
        try {
            // Create a parse factory
            DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
            // Create a parser
            DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
            // Parse the file with the parser to build an org.w3c.dom.Document object tree
            Document document = documentBuilder.parse("bookstore.xml");

            // Create an XPath object
            XPath xPath = XPathFactory.newInstance().newXPath();

            // 1. Get the title text of the second book under bookstore whose category is web
            // bookstore -> book[@category='web'][2] -> title
            // xpath path: /bookstore/book[@category='web'][2]/title/text()

            String titleXpath = "/bookstore/book[@category='web'][2]/title/text()";
            String titleValue = (String) xPath.evaluate(titleXpath, document, XPathConstants.STRING);
            System.out.println(titleValue);

            // 2. Get the title text, where the title's lang is eng, of the book under bookstore whose category is web
            // (bookstore.xml uses the lang values "eng" and "uk", so 'en' would match nothing)
            // bookstore -> book[@category='web'] -> title[@lang='eng']
            // xpath path: /bookstore/book[@category='web']/title[@lang='eng']/text()
            String titleXpath2 = "/bookstore/book[@category='web']/title[@lang='eng']/text()";
            String titleValue2 = (String) xPath.evaluate(titleXpath2, document, XPathConstants.STRING);
            System.out.println(titleValue2);

            // 3. Get the lang attribute of the title of the book under bookstore whose category is cooking
            // bookstore -> book[@category='cooking'] -> title -> @lang
            // xpath path: /bookstore/book[@category='cooking']/title/@lang
            String titleLangAttrXpath = "/bookstore/book[@category='cooking']/title/@lang";
            String titleLangAttrValue = (String) xPath.evaluate(titleLangAttrXpath, document, XPathConstants.STRING);
            System.out.println(titleLangAttrValue);

            // 4. Get the collection of book nodes under bookstore
            // /bookstore/book
            NodeList nodeList = (NodeList) xPath.evaluate("/bookstore/book", document, XPathConstants.NODESET);
            // Start traversal
            for (int i = 0; i < nodeList.getLength(); i++) {
                Element item = (Element) nodeList.item(i);
                // Evaluate a relative path against each book element
                String title = (String) xPath.evaluate("title", item, XPathConstants.STRING);
                System.out.println(title);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
