  • Common Python packages for processing XML (the standard library xml package, ElementTree, lxml)

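    Three common mechanisms for processing XML in Python

    • DOM (random-access mechanism: the whole document is held in memory as a tree)
    • SAX (Simple API for XML, an event-driven mechanism)
    • etree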
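
    As a rough illustration of how the three mechanisms differ, the sketch below extracts the same data three times using only standard-library modules. The sample document and the function names are made up for this example.

```python
import xml.dom.minidom
import xml.sax
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
XML = "<books><book id='1'>First title</book><book id='2'>Second title</book></books>"


# DOM: the whole document is loaded into a tree that can be visited in any order.
def titles_dom(text):
    doc = xml.dom.minidom.parseString(text)
    return [node.firstChild.data for node in doc.getElementsByTagName("book")]


# SAX: the parser pushes events to a handler; no tree is kept in memory.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # collects text while inside a <book> element

    def startElement(self, name, attrs):
        if name == "book":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buffer))
            self._buffer = None


def titles_sax(text):
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


# etree: a lightweight tree of Element objects with a Pythonic API.
def titles_etree(text):
    root = ET.fromstring(text)
    return [book.text for book in root.findall("book")]
```

    All three functions return the same list. The SAX version needs the most code but never holds the full document in memory, which is the usual reason to choose it for very large inputs.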

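    Three packages for processing XML in Python

    • The xml package in the standard library
    • Fredrik Lundh's ElementTree
    • Stefan Behnel's lxml

    Introduction to and comparison of the three packages

    Excerpted from: http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/index.html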

    With the continued growth of both Python and XML, there is a plethora of packages out there that help you read, generate, and modify XML files from Python scripts. Compared to most of them, the lxml package has two big advantages:

    • Performance. Reading and writing even fairly large XML files takes an almost imperceptible amount of time.
    • Ease of programming. The lxml package is based on ElementTree, which Fredrik Lundh invented to simplify and streamline XML processing.

    lxml is similar in many ways to two other, earlier packages:

    • Fredrik Lundh continues to maintain his original version of ElementTree.
    • xml.etree.ElementTree is now an official part of the Python library. There is a C-language version called cElementTree which may be even faster than lxml for some applications.

    However, the author prefers lxml for providing a number of additional features that make life easier. In particular, support for XPath makes it considerably easier to manage more complex XML structures.
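
    To give a sense of what XPath support buys you, here is a minimal sketch using lxml's xpath() method. It assumes lxml is installed (it is a third-party package), and the document content is invented for the example.

```python
from lxml import etree  # third-party package: pip install lxml

# Hypothetical order data, used only for illustration.
XML = b"""<orders>
  <order id="1" year="2009"><customer>alice</customer><total>35</total></order>
  <order id="2" year="2015"><customer>bob</customer><total>50</total></order>
  <order id="3" year="2019"><customer>carol</customer><total>45</total></order>
</orders>"""

root = etree.fromstring(XML)

# Predicate on an attribute: customers of orders placed after 2010.
recent_customers = root.xpath("order[@year > 2010]/customer/text()")

# XPath function: numeric expressions come back as a Python float.
grand_total = root.xpath("sum(order/total)")

# Predicate on child content, then selecting an attribute of the match.
large_order_ids = root.xpath("order[total >= 45]/@id")
```

    Numeric comparisons in predicates and functions such as sum() are not available in the limited path syntax of xml.etree.ElementTree's find() and findall(), so the same queries would need explicit Python loops there.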

    The xml package in the standard library

    Excerpted from: http://docs.python.org/library/xml.html

    The XML handling submodules are:

    • xml.etree.ElementTree: the ElementTree API, a simple and lightweight XML processor
    • xml.dom: the DOM API definition
    • xml.dom.minidom: a minimal DOM implementation
    • xml.dom.pulldom: support for building partial DOM trees
    • xml.sax: SAX2 base classes and convenience functions
    • xml.parsers.expat: the Expat parser binding
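
    Of the submodules listed above, xml.dom.pulldom is probably the least familiar. The sketch below shows the "partial DOM tree" idea: events are streamed, and a DOM subtree is built only for the elements you ask for. The sample log document and the function name are assumptions made for this example.

```python
from xml.dom import pulldom

# Hypothetical sample document, used only for illustration.
XML = "<log><entry level='info'>started</entry><entry level='error'>disk full</entry></log>"


def error_entries(text):
    """Return the serialized <entry> elements whose level is 'error'."""
    events = pulldom.parseString(text)
    found = []
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == "entry":
            if node.getAttribute("level") == "error":
                # Only now is the DOM subtree for this one element built.
                events.expandNode(node)
                found.append(node.toxml())
    return found
```

    Entries that do not match are never expanded into DOM nodes, so memory use stays close to that of a pure event-driven parser.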

    The ElementTree package

    PyPI description: https://pypi.python.org/pypi/elementtree/

    The Element type is a flexible container object, designed to store hierarchical data structures in memory. Element structures can be converted to and from XML.
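
    A minimal sketch of that round trip, using the copy of ElementTree that ships in the standard library (the element and attribute names are hypothetical):

```python
import xml.etree.ElementTree as ET

# Build a small hierarchy in memory.
root = ET.Element("config")
db = ET.SubElement(root, "database", {"host": "localhost"})
ET.SubElement(db, "port").text = "5432"

# Element -> XML text (encoding="unicode" returns a str instead of bytes).
xml_text = ET.tostring(root, encoding="unicode")

# XML text -> Element, then read the values back.
parsed = ET.fromstring(xml_text)
port = parsed.find("database/port").text
host = parsed.find("database").get("host")
```

    Each Element behaves like a small container: it has a tag, a dictionary of attributes, optional text, and a list of child elements.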

    Its author's recommendation of lxml: http://effbot.org/zone/element-index.htm
    There’s also an independent implementation, lxml.etree, based on the well-known libxml2/libxslt libraries. This adds full support for XSLT, XPath, and more.

    An introductory article from the IBM documentation library: XML Matters: Process XML in Python with ElementTree

    Introduction to lxml

    Excerpted from: http://lxml.de/

    lxml - XML and HTML with Python

    lxml is the most feature-rich and easy-to-use library for processing XML and HTML in the Python language.

    The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible but superior to the well-known ElementTree API. The latest release works with all CPython versions from 2.6 to 3.6.
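
    Because the two APIs are mostly compatible, a common pattern is to prefer lxml when it is installed and fall back to the standard library otherwise. The following is a sketch of that pattern; the helper function names are invented for the example.

```python
try:
    from lxml import etree  # third-party; used when installed
    BACKEND = "lxml"
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback
    BACKEND = "xml.etree.ElementTree"


def count_items(xml_text):
    """Count the direct <item> children of the root element."""
    root = etree.fromstring(xml_text)
    return len(root.findall("item"))


def append_item(xml_text, name):
    """Return the document with one more <item name="..."/> child."""
    root = etree.fromstring(xml_text)
    etree.SubElement(root, "item", name=name)
    return etree.tostring(root, encoding="unicode")
```

    This only works while the code stays within the shared subset (fromstring, findall, SubElement, tostring and so on). Serialized output can also differ in small details between the two backends, so it is safer to compare parsed trees than raw strings.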

    Summary

    In most cases, lxml offers both high performance and ease of use.

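    Further reading

    XML processing options in Python:
    Python XML parsing
    XML options in Java:
    Several ways to parse XML in Java
    lxml tutorial:
    Python XML processing with lxml
    On namespaces:
    Parsing XML with lxml and elementtree
    XML namespaces

    This article was originally published at http://www.cnblogs.com/qijj. Please retain this notice when reposting.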
  • Original article: https://www.cnblogs.com/qijj/p/6265308.html