  • The Java XML Validation API (repost)

    Validation is a powerful tool. It enables you to quickly check that input is
    roughly in the form you expect and quickly reject any document that is too far
    away from what your process can handle. If there's a problem with the data, it's
    better to find out earlier than later.


    In the context of Extensible Markup Language (XML), validation normally
    involves writing a detailed specification for the document's contents in any of
    several schema languages such as the World Wide Web Consortium (W3C) XML Schema
    Language (XSD), RELAX NG, Document Type Definitions (DTDs), and Schematron.
    Sometimes validation is performed while parsing, sometimes immediately after.
    However, it's usually done before any further processing of the input takes
    place. (This description is painted with broad strokes -- there are
    exceptions.)


    Until recently, the exact Application Programming Interface (API) by which
    programs requested validation varied with the schema language and parser. DTDs
    and XSD were normally accessed as configuration options in Simple API for XML
    (SAX), Document Object Model (DOM), and Java™ API for XML Processing (JAXP).
    RELAX NG required a custom library and API. Schematron might use the
    Transformations API for XML (TrAX), and still other schema languages required
    programmers to learn still more APIs, even though they were performing
    essentially the same operation.


    Java 5 introduced the javax.xml.validation package to provide a
    schema-language-independent interface to validation services. This package is
    also available in Java 1.3 and later when you install JAXP 1.3 separately. Among
    other products, an implementation of this library is included with Xerces 2.8.


    Validation


    The javax.xml.validation API uses three classes to validate
    documents: SchemaFactory, Schema, and
    Validator. It also makes extensive use of the
    javax.xml.transform.Source interface from TrAX to represent the XML
    documents. In brief, a SchemaFactory reads the schema document
    (often an XML file) from which it creates a Schema object. The
    Schema object creates a Validator object. Finally, the
    Validator object validates an XML document represented as a
    Source.


    Listing 1 shows a simple
    program that validates a document, whose URL is given on the command line,
    against the DocBook XSD schema.


    Listing 1. Validating a DocBook document against the DocBook XSD schema





    import java.io.*;
    import javax.xml.transform.Source;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.*;
    import org.xml.sax.SAXException;
    
    public class DocbookXSDCheck {
    
        public static void main(String[] args) throws SAXException, IOException {
    
            // 1. Look up a factory for the W3C XML Schema language
            SchemaFactory factory = 
                SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
            
            // 2. Compile the schema. 
            // Here the schema is loaded from a java.io.File, but you could use 
            // a java.net.URL or a javax.xml.transform.Source instead.
            File schemaLocation = new File("/opt/xml/docbook/xsd/docbook.xsd");
            Schema schema = factory.newSchema(schemaLocation);
        
            // 3. Get a validator from the schema.
            Validator validator = schema.newValidator();
            
            // 4. Wrap the document you want to check in a Source.
            Source source = new StreamSource(args[0]);
            
            // 5. Check the document
            try {
                validator.validate(source);
                System.out.println(args[0] + " is valid.");
            }
            catch (SAXException ex) {
                System.out.println(args[0] + " is not valid because ");
                System.out.println(ex.getMessage());
            }  
            
        }
    
    }


    Here's some typical output when checking an invalid document using the
    version of Xerces bundled with Java 2 Software Development Kit (JDK) 5.0:


    file:///Users/elharo/CS905/Course_Notes.xml is not valid because
    cvc-complex-type.2.3: Element 'legalnotice' cannot have character [children],
    because the type's content type is element-only.


    You can easily change the schema to validate against, the document to
    validate, and even the schema language. However, in all cases, validation
    follows these five steps:



    1. Load a schema factory for the language the schema is written in.
    2. Compile the schema from its source.
    3. Create a validator from the compiled schema.
    4. Create a Source object for the document you want to validate. A
      StreamSource is usually simplest.
    5. Validate the input source. If the document is invalid, the
      validate() method throws a SAXException. Otherwise, it
      returns quietly.
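
    The five steps can also be tried without any files on disk. The following
    is a minimal sketch, not part of the original article: the class name
    InMemoryCheck and the one-element toy schema are assumptions made for
    illustration, and both the schema and the document are read from strings.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class InMemoryCheck {

    // Hypothetical toy schema: a single <note> element holding text.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note' type='xs:string'/>"
        + "</xs:schema>";

    /** Returns null if the document is valid, otherwise the validator's message. */
    static String check(String xml) {
        try {
            // 1. Load a schema factory for the schema language.
            SchemaFactory factory =
                SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
            // 2. Compile the schema from its source.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // 3. Create a validator from the compiled schema.
            Validator validator = schema.newValidator();
            // 4. Create a Source for the document.
            StreamSource source = new StreamSource(new StringReader(xml));
            // 5. Validate; an invalid document surfaces as a SAXException.
            try {
                validator.validate(source);
                return null;
            } catch (SAXException ex) {
                return ex.getMessage();
            }
        } catch (SAXException | IOException ex) {
            // Reached only if the schema fails to compile or reading fails.
            throw new IllegalStateException("could not set up validation", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(check("<note>hello</note>"));
        System.out.println(check("<memo>hello</memo>"));
    }
}
```

    The first call reports no problem; the second returns whatever message the
    installed validator produces for an undeclared root element.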

    You can reuse the same validator and the same schema multiple times in
    series. However, only the schema is thread safe. Validators and schema factories
    are not. If you validate in multiple threads simultaneously, make sure each one
    has its own Validator and SchemaFactory objects.
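
    One way to respect these threading rules is to compile the schema once and
    give every task its own validator. The sketch below assumes Java 8 or later
    (for lambdas); the class name SharedSchemaCheck, the toy schema, and the
    pool size of four are illustrative assumptions, not anything the API
    prescribes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SharedSchemaCheck {

    // Hypothetical toy schema: a single <note> element holding text.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='note' type='xs:string'/>"
        + "</xs:schema>";

    /** Compiles the schema once, on the calling thread only. */
    static Schema compile() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException ex) {
            throw new IllegalStateException("schema did not compile", ex);
        }
    }

    /** Creates a fresh Validator for each call, so none is ever shared. */
    static boolean isValid(Schema schema, String xml) throws IOException {
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException ex) {
            return false;
        }
    }

    /** Validates the documents on worker threads; one flag per document, in order. */
    static List<Boolean> validateAll(List<String> documents) {
        Schema schema = compile(); // the shared, thread-safe object
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> pending = new ArrayList<>();
            for (String xml : documents) {
                Callable<Boolean> task = () -> isValid(schema, xml);
                pending.add(pool.submit(task));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> result : pending) {
                results.add(result.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("validation task failed", ex);
        } finally {
            pool.shutdown();
        }
    }
}
```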


    Validate against a document-specified schema


    Some documents specify the schema they expect to be validated against,
    typically using xsi:noNamespaceSchemaLocation and/or
    xsi:schemaLocation attributes like this:





    <document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="http://www.example.com/document.xsd">
      ...


    If you create a schema without specifying a URL, file, or source, then the
    factory creates a Schema that looks in the document being validated to find
    the schema it should use. For example:





    SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
    Schema schema = factory.newSchema();
    


    However, normally this isn't what you want. Usually the document consumer
    should choose the schema, not the document producer. Furthermore, this approach
    works only for XSD. All other schema languages require an explicitly specified
    schema location.







    Abstract factories


    SchemaFactory is an abstract factory. The abstract factory
    design pattern enables this one API to support many different schema languages
    and object models. A single implementation usually supports only a subset of the
    numerous languages and models. However, once you learn the API for validating
    DOM documents against RELAX NG schemas (for instance), you can use the same API
    to validate JDOM documents against W3C schemas.


    For example, Listing 2
    shows a program that validates DocBook documents against DocBook's RELAX NG
    schema. It's almost identical to Listing 1. The only
    things that have changed are the location of the schema and the URL that
    identifies the schema language.


    Listing 2. Validating a DocBook document using RELAX NG





    import java.io.*;
    import javax.xml.transform.Source;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.*;
    import org.xml.sax.SAXException;
    
    public class DocbookRELAXNGCheck {
    
        public static void main(String[] args) throws SAXException, IOException {
    
            // 1. Specify you want a factory for RELAX NG
            SchemaFactory factory 
             = SchemaFactory.newInstance("http://relaxng.org/ns/structure/1.0");
            
            // 2. Locate the specific schema you want. 
            // Here it is loaded from a java.io.File, but you could also use a 
            // java.net.URL or a javax.xml.transform.Source.
            File schemaLocation = new File("/opt/xml/docbook/rng/docbook.rng");
            
            // 3. Compile the schema.
            Schema schema = factory.newSchema(schemaLocation);
        
            // 4. Get a validator from the schema.
            Validator validator = schema.newValidator();
            
            // 5. Wrap the document you want to check in a Source.
            String input 
             = "file:///Users/elharo/Projects/workspace/CS905/build/Java_Course_Notes.xml";
            Source source = new StreamSource(input);
            
            // 6. Check the document
            try {
                validator.validate(source);
                System.out.println(input + " is valid.");
            }
            catch (SAXException ex) {
                System.out.println(input + " is not valid because ");
                System.out.println(ex.getMessage());
            }  
            
        }
    
    }


    If you run this program with the stock Sun JDK and no extra libraries, you'll
    probably see something like this:





    Exception in thread "main" java.lang.IllegalArgumentException: 
    http://relaxng.org/ns/structure/1.0
    	at javax.xml.validation.SchemaFactory.newInstance(SchemaFactory.java:186)
    	at DocbookRELAXNGCheck.main(DocbookRELAXNGCheck.java:14)


    This is because, out of the box, the JDK doesn't include a RELAX NG
    validator. When the schema language isn't recognized,
    SchemaFactory.newInstance() throws an
    IllegalArgumentException. However, if you install a RELAX NG
    library such as Jing and a JAXP 1.3 adapter, then it should produce the same
    answer the W3C schema does.
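
    Because an unsupported schema language surfaces as an
    IllegalArgumentException, a program can probe for support before
    committing to a schema language. This is a small sketch; the class name
    SchemaLanguageProbe is made up for illustration.

```java
import javax.xml.validation.SchemaFactory;

class SchemaLanguageProbe {

    /** Returns true if some installed SchemaFactory handles the given language URI. */
    static boolean isSupported(String schemaLanguage) {
        try {
            SchemaFactory.newInstance(schemaLanguage);
            return true;
        } catch (IllegalArgumentException ex) {
            // Thrown when no implementation of the schema language is available.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("XSD:      "
            + isSupported("http://www.w3.org/2001/XMLSchema"));
        System.out.println("RELAX NG: "
            + isSupported("http://relaxng.org/ns/structure/1.0"));
    }
}
```

    What the second line prints depends on whether a RELAX NG implementation
    is on the classpath.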


    Identify the schema language


    The javax.xml.XMLConstants class defines several constants to
    identify schema languages:



    • XMLConstants.W3C_XML_SCHEMA_NS_URI:
      http://www.w3.org/2001/XMLSchema
    • XMLConstants.RELAXNG_NS_URI:
      http://relaxng.org/ns/structure/1.0
    • XMLConstants.XML_DTD_NS_URI:
      http://www.w3.org/TR/REC-xml

    This isn't a closed list. Implementations are free to add other URLs to this
    list to identify other schema languages. Typically, the URL is the namespace
    Uniform Resource Identifier (URI) for the schema language. For example, the URL
    http://www.ascc.net/xml/schematron identifies Schematron schemas.
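
    Using the constants instead of string literals avoids typos in the URIs.
    A brief sketch follows; the class name SchemaLanguageConstants is an
    illustrative assumption.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

class SchemaLanguageConstants {

    /** Requests a factory for XSD through the constant rather than a literal. */
    static SchemaFactory xsdFactory() {
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    }

    public static void main(String[] args) {
        // Print the URI behind each constant.
        System.out.println(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        System.out.println(XMLConstants.RELAXNG_NS_URI);
        System.out.println(XMLConstants.XML_DTD_NS_URI);
        System.out.println(xsdFactory().getClass().getName());
    }
}
```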


    Sun's JDK 5 supports only XSD schemas through this API. DTD validation is
    supported too, but it isn't accessible through the javax.xml.validation API;
    for DTDs, you have to use the regular SAX XMLReader interface. However, you
    can install additional libraries that add javax.xml.validation support for
    other schema languages.
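
    For comparison, DTD validation through SAX might look like the following
    sketch. The class name DtdCheck is an assumption, and the document is
    expected to carry its DTD in an internal subset so that nothing has to be
    fetched.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DtdCheck {

    /** Returns the problems reported while parsing; an empty list means valid. */
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException ex) {
                    // Validity errors arrive here; record them and keep parsing.
                    errors.add(ex.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException ex) {
            // Fatal errors (for example, malformed markup) end the parse.
            errors.add(ex.getMessage());
        } catch (ParserConfigurationException | IOException ex) {
            throw new IllegalStateException("could not run the parser", ex);
        }
        return errors;
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]>";
        System.out.println(validate(dtd + "<note>hello</note>"));
        System.out.println(validate(dtd + "<note><b/></note>"));
    }
}
```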


    How schema factories are located


    The Java programming language isn't limited to a single schema factory. When
    you pass a URI identifying a particular schema language to
    SchemaFactory.newInstance(), it searches the following locations in
    this order to find a matching factory:



    1. The class named by the
      "javax.xml.validation.SchemaFactory:schemaURL" system
      property, where schemaURL is the URI of the requested schema language
    2. The class named by the
      "javax.xml.validation.SchemaFactory:schemaURL" property
      found in the $java.home/lib/jaxp.properties file
    3. javax.xml.validation.SchemaFactory service providers found in
      the META-INF/services directories of any available Java Archive (JAR) files
    4. A platform default SchemaFactory,
      com.sun.org.apache.xerces.internal.jaxp.validation.xs.SchemaFactoryImpl
      in JDK 5
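
    The first lookup location can be exercised from code by setting a system
    property before requesting the factory. The sketch below only builds and
    sets the property; the class name SchemaFactoryProperty and the factory
    class com.example.rng.MySchemaFactory are hypothetical placeholders for
    whatever implementation you actually install.

```java
class SchemaFactoryProperty {

    /** Builds the property name consulted for the given schema language URI. */
    static String propertyKey(String schemaLanguage) {
        return "javax.xml.validation.SchemaFactory:" + schemaLanguage;
    }

    public static void main(String[] args) {
        String key = propertyKey("http://relaxng.org/ns/structure/1.0");
        // Hypothetical class name; substitute the factory class of a real library.
        System.setProperty(key, "com.example.rng.MySchemaFactory");
        System.out.println(key + " = " + System.getProperty(key));
    }
}
```

    The same name/value pair could instead be passed on the command line with
    the java launcher's -D option.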

    To add support for your own custom schema language and corresponding
    validator, all you have to do is write subclasses of SchemaFactory,
    Schema, and Validator that know how to process your
    schema language. Then, register your factory class through one of the first
    three mechanisms above. This is useful for adding constraints that are more
    easily checked in a Turing-complete language like Java than in a declarative
    language like the W3C XML Schema language. You can define a mini-schema
    language, write a quick implementation, and plug it into the validation layer.







    Error handlers


    The default response from a validator is to throw a
    SAXException if there's a problem and do nothing if there isn't.
    However, you can provide a SAX ErrorHandler to receive more
    detailed information about the document's problems. For example, suppose you
    want to log all validation errors, but you don't want to stop processing when
    you encounter one. You can install an error handler such as that in Listing 3.


    Listing 3. An error handler that merely logs non-fatal validity errors





    import org.xml.sax.ErrorHandler;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;
    
    public class ForgivingErrorHandler implements ErrorHandler {
    
        public void warning(SAXParseException ex) {
            System.err.println(ex.getMessage());
        }
    
        public void error(SAXParseException ex) {
            System.err.println(ex.getMessage());
        }
    
        public void fatalError(SAXParseException ex) throws SAXException {
            throw ex;
        }
    
    }


    To install this error handler, you create an instance of it and pass that
    instance to the Validator's setErrorHandler() method:





      ErrorHandler lenient = new ForgivingErrorHandler();
      validator.setErrorHandler(lenient);
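
    To see the effect end to end, the following sketch installs a handler that
    collects problems in a list instead of printing them, so that a single
    pass can report several errors. The class name CollectingCheck and the toy
    schema (a list of integers) are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class CollectingCheck {

    // Hypothetical toy schema: <list> holding one or more integer <n> elements.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='list'><xs:complexType><xs:sequence>"
        + "<xs:element name='n' type='xs:int' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Validates the document and returns every problem that was reported. */
    static List<String> problems(String xml) {
        final List<String> found = new ArrayList<>();
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException ex) {
                    found.add("warning, line " + ex.getLineNumber() + ": " + ex.getMessage());
                }
                public void error(SAXParseException ex) {
                    // Record the validity error and let validation continue.
                    found.add("error, line " + ex.getLineNumber() + ": " + ex.getMessage());
                }
                public void fatalError(SAXParseException ex) throws SAXException {
                    throw ex; // the document cannot be processed any further
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException ex) {
            found.add("fatal: " + ex.getMessage());
        } catch (IOException ex) {
            throw new IllegalStateException("could not read the document", ex);
        }
        return found;
    }

    public static void main(String[] args) {
        for (String problem : problems("<list><n>1</n><n>x</n><n>y</n></list>")) {
            System.out.println(problem);
        }
    }
}
```

    Because error() returns normally, validate() keeps going after the first
    bad value and reports the later one as well.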

    From http://www.ibm.com/developerworks/xml/library/x-javaxmlvalidapi/index.html
  • Original post: https://www.cnblogs.com/wufengtinghai/p/2137691.html