  • Working with XML using Dom4j

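    Dom4j also makes it easy to create XML documents, modify elements, and query or traverse documents. It is slightly more complex to use than JDOM, but on large documents dom4j performs better than JDOM.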

     

    Preparation

    First, get the required jar files.

    Dom4j jar download:

    http://sourceforge.net/projects/dom4j/files/dom4j-2.0.0-ALPHA-2/

    jaxen jar download (needed for XPath queries):

    http://repo1.maven.org/maven2/jaxen/jaxen/1.1.1/jaxen-1.1.1.jar

    Jars that dom4j depends on or is related to:

    http://dom4j.sourceforge.net/dependencies.html

    JUnit jar download:

    http://ebr.springsource.com/repository/app/bundle/version/download?name=com.springsource.org.junit&version=4.8.1&type=binary

     

    Next, prepare the skeleton of the test case:

    package com.hoo.test;

    import java.io.File;
    import java.io.FileWriter;     // only needed if the file-writing code is enabled
    import java.io.IOException;    // only needed if the file-writing code is enabled
    import java.util.Iterator;
    import java.util.List;
    import org.dom4j.Attribute;
    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.DocumentHelper;
    import org.dom4j.Element;
    import org.dom4j.Node;
    import org.dom4j.QName;
    import org.dom4j.dom.DOMAttribute;
    import org.dom4j.io.SAXReader;
    import org.dom4j.io.XMLWriter; // only needed if the file-writing code is enabled
    import org.dom4j.tree.BaseElement;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    /**
     * <b>function:</b> Working with XML using Dom4j
     * @author hoojo
     * @createDate 2011-8-5 06:15:40 PM
     * @file DocumentTest.java
     * @package com.hoo.test
     * @project Dom4jTest
     * @blog http://blog.csdn.net/IBM_hoojo
     * @email hoojo_@126.com
     * @version 1.0
     */
    public class DocumentTest {

        private SAXReader reader = null;

        @Before
        public void init() {
            reader = new SAXReader();
        }

        @After
        public void destroy() {
            reader = null;
            System.gc();
        }

        // Prints a value to stdout. Despite its name, it does not fail the test.
        public void fail(Object o) {
            if (o != null)
                System.out.println(o);
        }
    }

     

    Creating an XML document

    The target document looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <catalog>
        <!--An XML Catalog-->
        <?target instruction?>
        <journal title="XML Zone" publisher="IBM developerWorks">
            <article level="Intermediate" date="December-2001">
                <title>Java configuration with XML Schema</title>
                <author>
                    <firstname>Marcello</firstname>
                    <lastname>Vitaletti</lastname>
                </author>
            </article>
        </journal>
    </catalog>

     

     

    The code that creates the document:

    /**
     * <b>function:</b> Create a document
     * @author hoojo
     * @createDate 2011-8-5 06:18:18 PM
     */
    @Test
    public void createDocument() {
        // Create a new, empty document
        Document doc = DocumentHelper.createDocument();

        // Add the root element
        Element root = doc.addElement("catalog");
        // Add a comment to the root element
        root.addComment("An XML Catalog");
        // Add a processing instruction
        root.addProcessingInstruction("target", "instruction");

        // Create a standalone element
        Element journalEl = new BaseElement("journal");
        // Add attributes
        journalEl.addAttribute("title", "XML Zone");
        journalEl.addAttribute("publisher", "IBM developerWorks");
        root.add(journalEl);

        // Add child elements
        Element articleEl = journalEl.addElement("article");
        articleEl.addAttribute("level", "Intermediate");
        articleEl.addAttribute("date", "December-2001");

        Element titleEl = articleEl.addElement("title");
        // Set the text content
        titleEl.setText("Java configuration with XML Schema");
        //titleEl.addText("Java configuration with XML Schema");

        Element authorEl = articleEl.addElement("author");
        authorEl.addElement("firstname").setText("Marcello");
        authorEl.addElement("lastname").addText("Vitaletti");

        // addDocType() adds a DOCTYPE declaration to the document
        doc.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");

        fail(doc.getRootElement().getName());

        // Serialize the document to an XML string
        fail(doc.asXML());

        // Write to a file. Uncomment to produce file/catalog.xml,
        // which the later tests read.
        /*XMLWriter output;
        try {
            output = new XMLWriter(new FileWriter(new File("file/catalog.xml")));
            output.write(doc);
            output.close();
        } catch (IOException e) {
            e.printStackTrace();
        }*/
    }

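    * DocumentHelper is a helper (utility) class. It can create documents, elements, text, attributes, comments, CDATA sections, namespaces, and XPath objects, and it can also evaluate XPath expressions against a document and parse text into a Document.

    parseText converts an XML string into a Document:

    Document doc = DocumentHelper.parseText("<root></root>");

    createDocument creates a document:

    Document doc = DocumentHelper.createDocument();

    When it is passed a root Element as an argument, it creates a document that already contains that root element.

     

    createElement creates an element:

    Element el = DocumentHelper.createElement("el");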
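
    The DocumentHelper calls described above can be combined as in the following minimal sketch. It assumes the dom4j jar is on the classpath; the class name DocumentHelperSketch and its two helper methods are hypothetical names chosen for illustration, not part of dom4j.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class; only the DocumentHelper calls come from dom4j.
public class DocumentHelperSketch {

    // Parse an XML string and return the name of its root element.
    static String rootNameOf(String xml) {
        try {
            Document doc = DocumentHelper.parseText(xml);
            return doc.getRootElement().getName();
        } catch (DocumentException e) {
            // parseText reports malformed input through a checked DocumentException
            throw new IllegalArgumentException("Not well-formed XML: " + xml, e);
        }
    }

    // Build a detached element first, then hand it to createDocument(Element)
    // so the new document starts out with that element as its root.
    static Document documentWithRoot(String rootName) {
        Element root = DocumentHelper.createElement(rootName);
        return DocumentHelper.createDocument(root);
    }

    public static void main(String[] args) {
        System.out.println(rootNameOf("<root><child/></root>"));

        Document doc = documentWithRoot("catalog");
        doc.getRootElement().addElement("journal").addAttribute("title", "XML Zone");
        System.out.println(doc.getRootElement().asXML());
    }
}
```

    Wrapping DocumentException in an unchecked exception is only a convenience for this sketch; real code may prefer to let the checked exception propagate.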

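    * Document.addElement adds a child element to the document (for an empty document, this becomes the root element):

    Element root = doc.addElement("catalog");

    * addComment adds a comment:

    root.addComment("An XML Catalog");

    This adds a comment to the root element.

     

    * addProcessingInstruction adds a processing instruction:

    root.addProcessingInstruction("target", "instruction");

    This adds a processing instruction to the root element.

     

    * new BaseElement creates a standalone element:

    Element journalEl = new BaseElement("journal");

     

    * addAttribute adds an attribute:

    journalEl.addAttribute("title", "XML Zone");

    * add attaches an existing element:

    root.add(journalEl);

    This adds the journalEl element to the root element.

     

    * addElement adds a child element and returns the newly added element:

    Element articleEl = journalEl.addElement("article");

    This adds an article child element to journalEl.

     

    * setText replaces an element's text, while addText appends to it; on an empty element both have the same effect:

    authorEl.addElement("firstname").setText("Marcello");
    authorEl.addElement("lastname").addText("Vitaletti");

    * addDocType sets the document's DOCTYPE:

    doc.addDocType("catalog", null, "file://c:/Dtds/catalog.dtd");

    * asXML converts a document or an element into an XML string:

    doc.asXML();
    root.asXML();

    * XMLWriter writes a document to a file:

    output = new XMLWriter(new FileWriter(new File("file/catalog.xml")));
    output.write(doc);
    output.close();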
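
    As a hedged illustration of XMLWriter, the sketch below writes a document through OutputFormat.createPrettyPrint() into a StringWriter so the result can be inspected without touching the file system. The class name PrettyPrintSketch and its methods are hypothetical; dom4j is assumed to be on the classpath. One thing to keep in mind with the FileWriter form above: FileWriter uses the platform default charset, so on a JVM whose default is not UTF-8 the bytes on disk may not match the encoding named in the XML declaration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

// Hypothetical demo class; XMLWriter and OutputFormat are the dom4j parts.
public class PrettyPrintSketch {

    // Build a small in-memory document shaped like the catalog example.
    static Document sampleDocument() {
        Document doc = DocumentHelper.createDocument();
        Element article = doc.addElement("catalog")
                .addElement("journal")
                .addElement("article");
        article.addElement("title").setText("Java configuration with XML Schema");
        return doc;
    }

    // Serialize the document with indentation and line breaks.
    static String toPrettyXml(Document doc) {
        OutputFormat format = OutputFormat.createPrettyPrint();
        format.setEncoding("UTF-8");
        StringWriter buffer = new StringWriter();
        try {
            XMLWriter writer = new XMLWriter(buffer, format);
            writer.write(doc);
            writer.close();
        } catch (IOException e) {
            // Not expected for an in-memory StringWriter
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPrettyXml(sampleDocument()));
    }
}
```

    To write to a file instead, the same XMLWriter can be constructed over a FileOutputStream together with the OutputFormat, which lets the writer apply the declared encoding itself.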

     

    Modifying XML document content

    /**
     * <b>function:</b> Modify XML content
     * @author hoojo
     * @createDate 2011-8-9 03:37:04 PM
     */
    @SuppressWarnings("unchecked")
    @Test
    public void modifyDoc() {
        try {
            Document doc = reader.read(new File("file/catalog.xml"));

            // Modify attribute values
            List list = doc.selectNodes("//article/@level");
            Iterator<Attribute> iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attr = iter.next();
                fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());
                if ("Intermediate".equals(attr.getValue())) {
                    // Change the attribute value
                    attr.setValue("Introductory");
                    fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());
                }
            }

            list = doc.selectNodes("//article/@date");
            iter = list.iterator();
            while (iter.hasNext()) {
                Attribute attr = iter.next();
                fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());
                if ("December-2001".equals(attr.getValue())) {
                    // Change the attribute value
                    attr.setValue("December-2011");
                    fail(attr.getName() + "#" + attr.getValue() + "#" + attr.getText());
                }
            }

            // Modify element content
            list = doc.selectNodes("//article");
            Iterator<Element> it = list.iterator();
            while (it.hasNext()) {
                Element el = it.next();
                fail(el.getName() + "#" + el.getText() + "#" + el.getStringValue());
                // Modify the title child elements
                Iterator<Element> elIter = el.elementIterator("title");
                while (elIter.hasNext()) {
                    Element titleEl = elIter.next();
                    fail(titleEl.getName() + "#" + titleEl.getText() + "#" + titleEl.getStringValue());
                    if ("Java configuration with XML Schema".equals(titleEl.getTextTrim())) {
                        // Change the element's text
                        titleEl.setText("Modify the Java configuration with XML Schema");
                        fail(titleEl.getName() + "#" + titleEl.getText() + "#" + titleEl.getStringValue());
                    }
                }
            }

            // Modify the content of an element's children
            list = doc.selectNodes("//article/author");
            it = list.iterator();
            while (it.hasNext()) {
                Element el = it.next();
                fail(el.getName() + "#" + el.getText() + "#" + el.getStringValue());
                List<Element> childs = el.elements();
                for (Element e : childs) {
                    fail(e.getName() + "#" + e.getText() + "#" + e.getStringValue());
                    if ("Marcello".equals(e.getTextTrim())) {
                        e.setText("Ayesha");
                    } else if ("Vitaletti".equals(e.getTextTrim())) {
                        e.setText("Malik");
                    }
                    fail(e.getName() + "#" + e.getText() + "#" + e.getStringValue());
                }
            }

            // Write to a file. Uncomment to produce file/catalog-modified.xml,
            // which the removeNode test reads.
            /*XMLWriter output = new XMLWriter(new FileWriter(new File("file/catalog-modified.xml")));
            output.write(doc);
            output.close();*/
        } catch (DocumentException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

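    * reader.read(new File("file/catalog.xml")); reads the given XML file into a Document.

    * selectNodes is the XPath query method: pass it an XPath expression and it returns the matching nodes. Its usage is similar to XPath in JDOM, which is covered here:

         http://www.cnblogs.com/hoojo/archive/2011/08/11/2134638.html

    * getName returns the name of an element or attribute; getValue and getText return its value or text content.

    * elementIterator("title"); returns an Iterator over the title elements that are direct children of the current element.

    * elements returns all direct child elements as a List.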
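
    The query methods listed above can be tried on an in-memory document, as in this minimal sketch. It assumes both dom4j and jaxen are on the classpath (selectNodes needs jaxen at run time). The class XPathQuerySketch, its methods, and the sample titles "First" and "Second" are hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class; selectNodes, elements and getValue are dom4j calls.
public class XPathQuerySketch {

    // Build a catalog with two articles so the queries return several nodes.
    static Document sampleCatalog() {
        Document doc = DocumentHelper.createDocument();
        Element journal = doc.addElement("catalog").addElement("journal");

        Element first = journal.addElement("article");
        first.addAttribute("level", "Intermediate");
        first.addElement("title").setText("First");

        Element second = journal.addElement("article");
        second.addAttribute("level", "Advanced");
        second.addElement("title").setText("Second");
        return doc;
    }

    // Collect the value of every level attribute found on an article element.
    static List<String> levels(Document doc) {
        List<String> values = new ArrayList<String>();
        // Iterating as Object and casting compiles whether selectNodes
        // is declared to return a raw List or a List<Node>.
        for (Object o : doc.selectNodes("//article/@level")) {
            values.add(((Attribute) o).getValue());
        }
        return values;
    }

    // Collect the trimmed text of each article's direct title children.
    static List<String> titles(Document doc) {
        List<String> values = new ArrayList<String>();
        for (Object o : doc.selectNodes("//article")) {
            Element article = (Element) o;
            for (Object t : article.elements("title")) {
                values.add(((Element) t).getTextTrim());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Document doc = sampleCatalog();
        System.out.println(levels(doc));
        System.out.println(titles(doc));
    }
}
```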

     

     

    Displaying document information

    // Builds an indentation prefix of i "--" units
    private String format(int i) {
        StringBuilder temp = new StringBuilder();
        while (i > 0) {
            temp.append("--");
            i--;
        }
        return temp.toString();
    }

    /**
     * <b>function:</b> Recursively print the document content
     * @author hoojo
     * @createDate 2011-8-9 03:43:45 PM
     * @param i depth of the parent level
     * @param els elements to print at this level
     */
    private void print(int i, List<Element> els) {
        i++;
        for (Element el : els) {
            fail(format(i) + "##" + el.getName() + "#" + el.getTextTrim());
            if (el.hasContent()) {
                print(i, el.elements());
            }
        }
    }

    /**
     * <b>function:</b> Display document information
     * @author hoojo
     * @createDate 2011-8-9 03:44:10 PM
     */
    @Test
    public void printInfo() {
        try {
            Document doc = reader.read(new File("file/catalog.xml"));
            fail("asXML: " + doc.asXML());

            fail(doc.asXPathResult(new BaseElement("article")));
            List<Node> list = doc.content();
            for (Node node : list) {
                fail("Node: " + node.getName() + "#" + node.getText() + "#" + node.getStringValue());
            }

            fail("-----------------------------");
            print(0, doc.getRootElement().elements());

            fail("getDocType: " + doc.getDocType());
            fail("getNodeTypeName: " + doc.getNodeTypeName());
            fail("getPath: " + doc.getRootElement().getPath());
            fail("getPath: " + doc.getRootElement().getPath(new BaseElement("journal")));
            fail("getUniquePath: " + doc.getRootElement().getUniquePath());
            fail("getXMLEncoding: " + doc.getXMLEncoding());
            fail("hasContent: " + doc.hasContent());
            fail("isReadOnly: " + doc.isReadOnly());
            fail("nodeCount: " + doc.nodeCount());
            fail("supportsParent: " + doc.supportsParent());
        } catch (DocumentException e) {
            e.printStackTrace();
        }
        // Settings of the SAXReader itself
        fail("getEncoding: " + reader.getEncoding());
        fail("isIgnoreComments: " + reader.isIgnoreComments());
        fail("isMergeAdjacentText: " + reader.isMergeAdjacentText());
        fail("isStringInternEnabled: " + reader.isStringInternEnabled());
        fail("isStripWhitespaceText: " + reader.isStripWhitespaceText());
        fail("isValidating: " + reader.isValidating());
    }

     

     

    Deleting document content

    /**
     * <b>function:</b> Remove node content
     * @author hoojo
     * @createDate 2011-8-9 03:47:44 PM
     */
    @Test
    public void removeNode() {
        try {
            Document doc = reader.read(new File("file/catalog-modified.xml"));
            fail("comment: " + doc.selectSingleNode("//comment()"));
            // Remove the comment
            doc.getRootElement().remove(doc.selectSingleNode("//comment()"));

            Element node = (Element) doc.selectSingleNode("//article");
            // Remove an attribute
            // (node.remove(node.attribute("level")) is a simpler alternative)
            node.remove(new DOMAttribute(QName.get("level"), "Introductory"));
            // Remove an element node
            node.remove(doc.selectSingleNode("//title"));

            // remove() only removes direct children, so a deeper node
            // has to be removed through its own parent element.
            // Note: a path starting with // searches the whole document, not just node's subtree.
            Node lastNameNode = node.selectSingleNode("//lastname");
            lastNameNode.getParent().remove(lastNameNode);

            fail("Text: " + doc.selectObject("//*[text()='Ayesha']"));
            // The cast assumes the query matches exactly one firstname element
            Element firstNameEl = (Element) doc.selectObject("//firstname");
            fail("Text: " + firstNameEl.selectSingleNode("text()"));

            // Remove the text node (three equivalent ways)
            //firstNameEl.remove(firstNameEl.selectSingleNode("text()"));
            //firstNameEl.remove(doc.selectSingleNode("//firstname/text()"));
            firstNameEl.remove(doc.selectSingleNode("//*[text()='Ayesha']/text()"));

            // Remove the author child element
            //node.remove(node.selectSingleNode("//author"));

            fail(doc.asXML());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

     

    * Removing a comment

    doc.getRootElement().remove(doc.selectSingleNode("//comment()"));

    This removes the comment under the root element.

     

    * Removing an attribute

    node.remove(new DOMAttribute(QName.get("level"), "Introductory"));

    This removes from node the attribute named level whose value is Introductory.

     

    * Removing an element

    node.remove(doc.selectSingleNode("//title"));

    This removes the title element under node.

     

    * Removing text

    firstNameEl.remove(firstNameEl.selectSingleNode("text()"));
    firstNameEl.remove(doc.selectSingleNode("//firstname/text()"));
    firstNameEl.remove(doc.selectSingleNode("//*[text()='Ayesha']/text()"));
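
    The rule that remove() only works on direct children can be checked with a small in-memory sketch. It assumes dom4j is on the classpath and avoids XPath, so jaxen is not needed. The class RemoveNodeSketch and its methods are hypothetical names for illustration; detach() is shown as an alternative that does not require looking up the parent first.

```java
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

// Hypothetical demo class; remove, getParent and detach are dom4j calls.
public class RemoveNodeSketch {

    // catalog > article > author > (firstname, lastname)
    static Document sample() {
        Document doc = DocumentHelper.createDocument();
        Element author = doc.addElement("catalog")
                .addElement("article")
                .addElement("author");
        author.addElement("firstname").setText("Marcello");
        author.addElement("lastname").setText("Vitaletti");
        return doc;
    }

    static Element author(Document doc) {
        return doc.getRootElement().element("article").element("author");
    }

    // Try to remove lastname from article, which is its grandparent.
    // remove() reports whether anything was actually removed.
    static boolean removeFromGrandparent(Document doc) {
        Element article = doc.getRootElement().element("article");
        Element lastname = author(doc).element("lastname");
        return article.remove(lastname);
    }

    // Remove lastname through its own parent element.
    static boolean removeFromParent(Document doc) {
        Element lastname = author(doc).element("lastname");
        return lastname.getParent().remove(lastname);
    }

    // detach() asks the node to remove itself from whatever parent it has.
    static void detachFirstname(Document doc) {
        author(doc).element("firstname").detach();
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println("removed via grandparent: " + removeFromGrandparent(doc));
        System.out.println("removed via parent: " + removeFromParent(doc));
        detachFirstname(doc);
        System.out.println(doc.getRootElement().asXML());
    }
}
```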
  • Original article: https://www.cnblogs.com/XQiu/p/5127048.html