  • Python Basics: Reading XML

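    For a detailed introduction to working with XML files in Python, see: https://www.jb51.net/article/50812.htm

    Structurally, XML closely resembles HTML. However, HTML was designed to display data, so its focus is on how the data looks; XML was designed to transport and store data, so its focus is on what the data is.

    Features:

    1. It is made up of tag pairs: <TEST></TEST>

    2. Tags can have attributes: <TEST Loop="1"></TEST>

    3. Tags can contain data: <TEST>CPU</TEST>

    4. Tags can contain child tags (forming a hierarchy)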
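
    The four features above can be seen together in a minimal sketch. The SAMPLE string and the CHILD tag are made up for illustration; parseString is the in-memory counterpart of parse, so no file is needed.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
SAMPLE = '<ROOT><TEST Loop="1">CPU</TEST><TEST><CHILD>GPU</CHILD></TEST></ROOT>'

dom = parseString(SAMPLE)
tests = dom.getElementsByTagName("TEST")

first = tests[0]
loop_value = first.getAttribute("Loop")  # feature 2: attribute on a tag
text_value = first.firstChild.data       # feature 3: data inside a tag

# Feature 4: a tag nested inside another tag.
child = tests[1].getElementsByTagName("CHILD")[0]
child_text = child.firstChild.data

print(loop_value, text_value, child.tagName, child_text)
```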

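    Reading XML with Python

    import xml.dom.minidom

    Open an XML file with xml.dom.minidom.parse()

    Every node has nodeName, nodeValue, and nodeType. nodeName is the node's name. nodeValue is the node's value; it is set for text-like nodes (text, comment, CDATA, attribute) and is None for element nodes. In the example further down, catalog is an ELEMENT_NODE.

    The following node types are defined: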

    'ATTRIBUTE_NODE'
    'CDATA_SECTION_NODE'
    'COMMENT_NODE'
    'DOCUMENT_FRAGMENT_NODE'
    'DOCUMENT_NODE'
    'DOCUMENT_TYPE_NODE'
    'ELEMENT_NODE'
    'ENTITY_NODE'
    'ENTITY_REFERENCE_NODE'
    'NOTATION_NODE'
    'PROCESSING_INSTRUCTION_NODE'
    'TEXT_NODE'
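
    As a small sketch of how nodeName, nodeValue, and nodeType differ between node kinds, the snippet below parses a one-element string (an assumption for illustration, not a file from this article) and inspects the document, the element, and its text child.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

dom = parseString("<maxid>4</maxid>")
element = dom.documentElement  # the <maxid> element
text = element.firstChild      # the text node holding "4"

# An element has a tag name but no nodeValue of its own.
print(element.nodeName, element.nodeValue, element.nodeType == Node.ELEMENT_NODE)

# The text node carries the value; its nodeName is the fixed string "#text".
print(text.nodeName, text.nodeValue, text.nodeType == Node.TEXT_NODE)

# The document object itself is also a node.
print(dom.nodeName, dom.nodeType == Node.DOCUMENT_NODE)
```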

    As an example, take the following XML file:

    abc.xml

    <?xml version="1.0" encoding="utf-8"?>
    <catalog>
        <maxid>4</maxid>
        <login username="pytest" passwd='123456'>
            <caption>Python</caption>
            <item id="4">
                <caption>测试</caption>
            </item>
        </login>
        <item id="2">
            <caption>Zope</caption>
        </item>
    </catalog>

    Reading the root node:

    from xml.dom.minidom import parse
    
    
    def read_xml_root_node(xml_path):
        dom = parse(xml_path)
        root = dom.documentElement
        return root
    
    
    if __name__ == "__main__":
        root_node = read_xml_root_node("abc.xml")
        print(root_node.nodeName)
        print(root_node.nodeType)

    Output:

    catalog
    1

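    Why is the printed type 1, and what does 1 mean? It is the numeric value of the ELEMENT_NODE constant; see the nodeType list above.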
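
    To turn a numeric nodeType back into a readable name, one option is a reverse lookup built from the constants on xml.dom.Node. This is only a sketch; NODE_TYPE_NAMES and node_type_name are hypothetical helper names, not part of the standard library.

```python
from xml.dom import Node

# Hypothetical helper table: maps each numeric constant to its name,
# built from the *_NODE class attributes of xml.dom.Node.
NODE_TYPE_NAMES = {
    value: name
    for name, value in vars(Node).items()
    if name.endswith("_NODE")
}


def node_type_name(node_type):
    """Return the constant name for a numeric nodeType, or "UNKNOWN"."""
    return NODE_TYPE_NAMES.get(node_type, "UNKNOWN")


print(node_type_name(1))
print(node_type_name(3))
```

    With this helper, the value 1 printed for catalog maps to ELEMENT_NODE.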

    Getting child nodes and their values:

    from xml.dom.minidom import parse
    
    
    def read_xml_root_node(xml_path):
        dom = parse(xml_path)
        root = dom.documentElement
        return root
    
    
    def read_child_label(node, label_name):
        child = node.getElementsByTagName(label_name)
        return child
    
    
    if __name__ == "__main__":
        root_node = read_xml_root_node("abc.xml")
        print(root_node.nodeName)
        print(root_node.nodeType)
        child_nodes = read_child_label(root_node, "maxid")
        for child_node in child_nodes:
            print(child_node.nodeName)
            print(child_node.nodeType)
            print(child_node.childNodes[0].nodeValue)

    Output:

    catalog
    1
    maxid
    1
    4
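
    One point worth keeping in mind: getElementsByTagName searches all descendants, not only direct children. The sketch below uses a trimmed inline copy of abc.xml (captions omitted) to compare it with a filter over childNodes; direct_children is a hypothetical helper written for this illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Trimmed inline copy of abc.xml so the snippet runs without a file.
XML_TEXT = """<catalog>
    <maxid>4</maxid>
    <login username="pytest" passwd="123456">
        <item id="4"></item>
    </login>
    <item id="2"></item>
</catalog>"""

root = parseString(XML_TEXT).documentElement

# Finds every <item> below root, including the one nested in <login>.
all_item_ids = [item.getAttribute("id") for item in root.getElementsByTagName("item")]


def direct_children(node, tag_name):
    """Hypothetical helper: only element children one level below node."""
    return [
        child
        for child in node.childNodes
        # childNodes also contains whitespace text nodes, so filter by type.
        if child.nodeType == Node.ELEMENT_NODE and child.tagName == tag_name
    ]


direct_item_ids = [item.getAttribute("id") for item in direct_children(root, "item")]

print(all_item_ids)
print(direct_item_ids)
```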

    Getting tag attributes:

    from xml.dom.minidom import parse
    
    
    def read_xml_root_node(xml_path):
        dom = parse(xml_path)
        root = dom.documentElement
        return root
    
    
    def read_child_label(node, label_name):
        child = node.getElementsByTagName(label_name)
        return child
    
    
    def read_attribute(node, attr_name):
        attribute = node.getAttribute(attr_name)
        return attribute
    
    
    if __name__ == "__main__":
        root_node = read_xml_root_node("abc.xml")
        print(root_node.nodeName)
        print(root_node.nodeType)
        child_nodes_login = read_child_label(root_node, "login")
        for child_node in child_nodes_login:
            attr_username = read_attribute(child_node, "username")
            print(attr_username)

    Output:

    catalog
    1
    pytest
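
    A related sketch, assuming the same login element as in abc.xml: getAttribute returns an empty string for an attribute that does not exist, so hasAttribute is the way to tell "missing" from "empty". The attribute name email is made up here to show the missing case.

```python
from xml.dom.minidom import parseString

login = parseString('<login username="pytest" passwd="123456"/>').documentElement

has_username = login.hasAttribute("username")
# "email" is a hypothetical attribute that is not present on the element.
has_email = login.hasAttribute("email")
missing_value = login.getAttribute("email")  # empty string, not an error

# All attributes at once, as a plain dict of name -> value.
all_attrs = dict(login.attributes.items())

print(has_username, has_email, repr(missing_value))
print(all_attrs)
```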

    Another module for reading XML, xml.etree.ElementTree, can iterate over the child tags under a given tag:

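    from xml.etree import ElementTree as ET
    
    
    tree = ET.parse("abc.xml")
    
    # <item> elements nested inside <login>
    login_items = tree.findall("./login/item")
    
    for item in login_items:
        # Iterate over the element directly; Element.getchildren()
        # was removed in Python 3.9.
        for child in item:
            print(child.tag, ":", child.text)
    
    
    # <item> elements directly under the root
    root_items = tree.findall("./item")
    
    for item in root_items:
        for child in item:
            print(child.tag, ":", child.text)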

    Output:

    caption : 测试
    caption : Zope
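
    ElementTree can also read attributes and element text directly. The following is a sketch against an inline copy of abc.xml (loaded with ET.fromstring instead of a file), using Element.get, findtext, and iter.

```python
from xml.etree import ElementTree as ET

# Inline copy of abc.xml so the snippet runs without a file.
XML_TEXT = """<catalog>
    <maxid>4</maxid>
    <login username="pytest" passwd="123456">
        <caption>Python</caption>
        <item id="4">
            <caption>测试</caption>
        </item>
    </login>
    <item id="2">
        <caption>Zope</caption>
    </item>
</catalog>"""

root = ET.fromstring(XML_TEXT)

# Attribute of a direct child element.
username = root.find("login").get("username")

# Text of a direct child element.
maxid = root.findtext("maxid")

# iter() walks all descendants in document order.
item_ids = [item.get("id") for item in root.iter("item")]
captions = [caption.text for caption in root.iter("caption")]

print(username, maxid, item_ids, captions)
```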
  • Original article: https://www.cnblogs.com/smart-zihan/p/11982892.html