ch2 XML
SAX parser
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser(); // create the SAX parser object
parser.parse(source, handler); // source can be a File, a URL string, or an InputStream; handler is a DefaultHandler subclass (defined below)
DefaultHandler handler = new DefaultHandler() {
    public void startElement(String namespaceURI, String lname, String qname, Attributes attrs) throws SAXException {
        // print the href attribute of every <a> element
        if (lname.equalsIgnoreCase("a") && attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                String aname = attrs.getLocalName(i);
                if (aname.equalsIgnoreCase("href")) System.out.println(attrs.getValue(i));
            }
        }
    }
};
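The qname parameter has the form prefix:localname. If namespace processing is turned on, namespaceURI and lname report the namespace URI and the local (unqualified) name.
As with DOM, namespace processing is off by default; call the factory's setNamespaceAware method to turn it on:
SAXParserFactory factory = SAXParserFactory.newInstance();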
factory.setNamespaceAware(true);
SAXParser saxParser = factory.newSAXParser();
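Note: an XHTML file always starts with a DTD reference (the DOCTYPE declaration), and the parser will try to download that DTD. W3C is unwilling to serve the file an enormous number of times, so if you do not need to validate the document, just call: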
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd",false);
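To make the namespaceURI / lname / qname distinction above concrete, here is a minimal sketch that parses a small in-memory document with a namespace-aware SAX parser and records what startElement receives. The class name SaxNamespaceSketch, the method describeElements, and the sample XML are made up for illustration; the qname value in particular is what the JDK's built-in parser reports, which I am assuming here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxNamespaceSketch {
    // Returns one "namespaceURI|lname|qname" entry per start tag (hypothetical helper).
    public static List<String> describeElements(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String namespaceURI, String lname, String qname,
                                     Attributes attrs) {
                seen.add(namespaceURI + "|" + lname + "|" + qname);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespaceURI and lname are not filled in
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Sample document, invented for this sketch.
        String xml = "<h:html xmlns:h=\"http://www.w3.org/1999/xhtml\">"
                   + "<h:a href=\"x\"/></h:html>";
        for (String line : describeElements(xml)) System.out.println(line);
    }
}
```

With the prefix h bound to the XHTML namespace, lname comes back as plain "a" while qname keeps the prefix, which is why the handler in these notes compares lname rather than qname.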
StAX parser
import java.io.*;
import java.net.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

/**
 * This program demonstrates how to use a SAX parser. The program prints all hyperlinks of an
 * XHTML web page.<br>
 * Usage: java SAXTest url
 * @version 1.00 2001-09-29
 * @author Cay Horstmann
 */
public class SAXTest
{
   public static void main(String[] args) throws Exception
   {
      String url;
      if (args.length == 0)
      {
         url = "http://www.w3c.org";
         System.out.println("Using " + url);
      }
      else url = args[0];

      DefaultHandler handler = new DefaultHandler()
         {
            public void startElement(String namespaceURI, String lname, String qname,
                  Attributes attrs)
            {
               if (lname.equals("a") && attrs != null)
               {
                  for (int i = 0; i < attrs.getLength(); i++)
                  {
                     String aname = attrs.getLocalName(i);
                     if (aname.equals("href")) System.out.println(attrs.getValue(i));
                  }
               }
            }
         };

      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd",
            false);
      SAXParser saxParser = factory.newSAXParser();
      InputStream in = new URL(url).openStream();
      saxParser.parse(in, handler);
   }
}
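Generating XML
Document without namespaces:
Document doc = builder.newDocument(); // create an empty document
Element rootElement = doc.createElement(rootName); // create the document (root) element
Element childElement = doc.createElement(childName);
Text textNode = doc.createTextNode(textContents); // create a text node
doc.appendChild(rootElement); // attach the root element to the document
rootElement.appendChild(childElement); // attach the child element to the root
childElement.appendChild(textNode); // attach the text to the child element
rootElement.setAttribute(name,value); // set an attribute on the element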
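The calls above can be put together into a small runnable sketch. The element names (font, name), the attribute (size) and the text value are placeholders I picked for illustration; they are not from the notes.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

public class DomBuildSketch {
    // Builds <font size="36"><name>Helvetica</name></font> in memory (placeholder content).
    public static Document build() throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();              // empty document
        Element rootElement = doc.createElement("font");   // nodes are created by the document...
        Element childElement = doc.createElement("name");
        Text textNode = doc.createTextNode("Helvetica");
        doc.appendChild(rootElement);                      // ...and only become part of the tree once appended
        rootElement.appendChild(childElement);
        childElement.appendChild(textNode);
        rootElement.setAttribute("size", "36");
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = build();
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName() + " size=" + root.getAttribute("size")
            + " text=" + root.getTextContent());
    }
}
```

Creating a node and attaching it are two separate steps: a node returned by createElement or createTextNode belongs to the document but is not in the tree until appendChild is called.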
Document with namespaces:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
builder = factory.newDocumentBuilder();
String namespace = "http://www.w3.org/2000/svg";
Element rootElement = doc.createElementNS(namespace,"svg"); // create the document element
Element svgElement = doc.createElementNS(namespace,"svg:svg"); // same call, with a namespace prefix in the qualified name
rootElement.setAttributeNS(namespace,qualifiedName,value);
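A minimal sketch of the namespace-aware variant, showing how the qualified name passed to createElementNS is split into prefix and local name. The xlink attribute and its value are assumptions chosen only to have a namespaced attribute to set; the class name is made up.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomNamespaceSketch {
    static final String SVG_NS = "http://www.w3.org/2000/svg";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink"; // assumed, for the attribute example

    public static Document build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.newDocument();
        // The qualified name "svg:svg" carries the prefix; the namespace URI is passed separately.
        Element root = doc.createElementNS(SVG_NS, "svg:svg");
        doc.appendChild(root);
        // Attributes take a namespace URI and a qualified name in the same way.
        root.setAttributeNS(XLINK_NS, "xlink:href", "#target");
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Element root = build().getDocumentElement();
        System.out.println(root.getNamespaceURI() + " " + root.getPrefix() + " " + root.getLocalName());
    }
}
```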
----------------------------------------------------------------
Writing out an XML DOM tree
Approach 1: Transformer
Transformer t = TransformerFactory.newInstance().newTransformer(); // what is this for? It is the API that transforms a Source into a Result
t.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,systemIdentifier);
t.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,publicIdentifier);
t.setOutputProperty(OutputKeys.INDENT,"yes");
t.setOutputProperty(OutputKeys.METHOD,"xml");
t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount","2");
t.transform(new DOMSource(doc),new StreamResult(new FileOutputStream(file))); // apply the settings above and write the result to the file
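As a rough sketch of approach 1 end to end, the following builds a tiny document and runs it through an identity Transformer into a string instead of a file. The class name, the helper methods and the sample element names are made up; the exact whitespace produced by INDENT varies between JDK versions, so I do not show an expected output.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TransformerOutputSketch {
    // Builds <root><child>text</child></root> (placeholder content).
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        Element child = doc.createElement("child");
        child.appendChild(doc.createTextNode("text"));
        root.appendChild(child);
        doc.appendChild(root);
        return doc;
    }

    // Serializes the DOM tree with an identity transformation (no stylesheet).
    public static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        // Implementation-specific property from the notes above.
        t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

A StreamResult accepts a Writer, an OutputStream or a File, so switching from a string to a file only changes the last argument of transform.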
Approach 2: LSSerializer
DOMImplementation impl = doc.getImplementation();
DOMImplementationLS implLS = (DOMImplementationLS) impl.getFeature("LS","3.0");
LSSerializer ser = implLS.createLSSerializer();
ser.getDomConfig().setParameter("format-pretty-print",true); // turn on indentation and line breaks
String str = ser.writeToString(doc); // serialize the document to a string
LSOutput out = implLS.createLSOutput();
out.setEncoding("UTF-8");
out.setByteStream(Files.newOutputStream(path));
ser.write(doc,out); // write the output to the file
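A small sketch of approach 2 that serializes to a string. The class and method names and the sample document are invented. The sketch switches off the XML declaration through the "xml-declaration" DOM configuration parameter, because writeToString otherwise emits a declaration naming UTF-16; leaving pretty-printing as a flag is my choice, since its exact whitespace differs between JDK versions.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public class LsSerializerSketch {
    // Builds <root><child>text</child></root> (placeholder content).
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("root");
        Element child = doc.createElement("child");
        child.appendChild(doc.createTextNode("text"));
        root.appendChild(child);
        doc.appendChild(root);
        return doc;
    }

    public static String serialize(Document doc, boolean prettyPrint) {
        DOMImplementation impl = doc.getImplementation();
        DOMImplementationLS implLS = (DOMImplementationLS) impl.getFeature("LS", "3.0");
        LSSerializer ser = implLS.createLSSerializer();
        ser.getDomConfig().setParameter("format-pretty-print", prettyPrint);
        ser.getDomConfig().setParameter("xml-declaration", false); // skip the <?xml ...?> line
        return ser.writeToString(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument(), true));
    }
}
```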
Approach 3: StAX
XMLOutputFactory factory = XMLOutputFactory.newInstance();
XMLStreamWriter writer = factory.createXMLStreamWriter(out);
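Then issue a sequence of calls in document order, from top to bottom, for example:
Write the XML declaration: writer.writeStartDocument();
Open an element (a child of the currently open one): writer.writeStartElement(name);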
...
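For reference, here is a minimal sketch of a complete StAX write sequence, with every opened element closed again. The element names, the attribute and the text are placeholders of my own, and the output goes to a StringWriter rather than a file.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class StaxWriteSketch {
    public static String write() throws Exception {
        StringWriter out = new StringWriter();
        XMLOutputFactory factory = XMLOutputFactory.newInstance();
        XMLStreamWriter writer = factory.createXMLStreamWriter(out);
        writer.writeStartDocument();          // XML declaration
        writer.writeStartElement("font");     // <font ...>
        writer.writeAttribute("size", "36");  // attributes must follow the start tag immediately
        writer.writeStartElement("name");     // <name>
        writer.writeCharacters("Helvetica");  // text content
        writer.writeEndElement();             // </name>
        writer.writeEndElement();             // </font>
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write());
    }
}
```

Unlike the DOM approaches, nothing is held in memory here: each call writes straight to the output, so the calls have to arrive in exactly the order the markup should appear.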