  • Python XML overview

    Import the ElementTree module:

    import xml.etree.ElementTree as ET

    To create an element instance, use the Element constructor or the SubElement() factory function.
      ET.Element(): usually used to create the root node
      ET.SubElement(): used to create child nodes
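
    A minimal sketch of the two calls described above. The element names (library, book, title) are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names, used only to illustrate the two calls.
root = ET.Element("library")                             # root node
book = ET.SubElement(root, "book", attrib={"id": "1"})   # child of root
title = ET.SubElement(book, "title")                     # child of book
title.text = "Example"

# Serialize to a str (encoding="unicode") so the result can be inspected.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
# <library><book id="1"><title>Example</title></book></library>
```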

    The ElementTree class wraps an element structure and converts it to and from XML.

    Typical usage:
      Use ElementTree to traverse the whole document.
      Use Element to traverse a single node or its child nodes.
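
    The difference can be sketched with iter(), which exists on both classes. The sample document here is made up for the example.

```python
import xml.etree.ElementTree as ET

# A small made-up document wrapped in an ElementTree.
tree = ET.ElementTree(ET.fromstring("<root><a><b/></a><c/></root>"))

# ElementTree.iter() walks the whole document, starting at the root.
all_tags = [elem.tag for elem in tree.iter()]

# Element.iter() walks only that element and its descendants.
a = tree.getroot().find("a")
a_tags = [elem.tag for elem in a.iter()]

print(all_tags)  # ['root', 'a', 'b', 'c']
print(a_tags)    # ['a', 'b']
```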

    Element: attributes and methods
        tag = None
        attrib = None
        text = None
        tail = None
        def append(self, subelement):
        def extend(self, elements):
        def insert(self, index, subelement):
        def remove(self, subelement):
        def getchildren(self):  # removed in Python 3.9; use list(elem) instead
        def find(self, path, namespaces=None):
        def findtext(self, path, default=None, namespaces=None):
        def findall(self, path, namespaces=None):
        def iterfind(self, path, namespaces=None):
        def clear(self):
        def get(self, key, default=None):
        def set(self, key, value):
        def keys(self):
        def items(self):
        def iter(self, tag=None):
        def itertext(self):
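
    A short sketch exercising several of the Element methods listed above. The sample XML and its names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample data.
root = ET.fromstring(
    '<school><class name="A"><student>Li</student><student>Wang</student></class>'
    '<class name="B"/></school>'
)

first_class = root.find("class")                        # first matching direct child
names = [c.get("name") for c in root.findall("class")]  # attribute of every match
first_student = root.findtext("class/student")          # text of the first match
missing = root.findtext("teacher", default="n/a")       # default when nothing matches

first_class.set("room", "101")                          # add or overwrite an attribute
attrs = sorted(first_class.items())                     # (key, value) pairs

children = list(root)                # direct children; replaces getchildren()
all_text = "".join(root.itertext())  # all text content in document order
```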
    
    ElementTree: commonly used methods
        def getroot(self):
        def parse(self, source, parser=None):  # load an XML file into this tree
        def iter(self, tag=None):
        def getiterator(self, tag=None):  # removed in Python 3.9; use iter() instead
        def find(self, path, namespaces=None):
        def findtext(self, path, default=None, namespaces=None):
        def findall(self, path, namespaces=None):
        def iterfind(self, path, namespaces=None):
        def write(self, file_or_filename,
                  encoding=None,
                  xml_declaration=None,
                  default_namespace=None,
                  method=None, *,
                  short_empty_elements=True):
    

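    I. Parsing (obtaining the root node [Element])
    1. From a string:
      node = ET.XML(str_xml)  # returns the root node (Element)
      node = ET.fromstring(str_xml)  # equivalent: XML() and fromstring() do the same thing
    2. From a file:
      result = ET.parse("file.xml")  # open the file, returns an ElementTree
      root = result.getroot()  # get the root node (Element)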
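
    Both routes can be tried without a file on disk, since ET.parse() also accepts a file-like object. The sample string is an assumption for the example; io.StringIO stands in for a real file.

```python
import io
import xml.etree.ElementTree as ET

str_xml = "<data><item>1</item><item>2</item></data>"  # made-up sample

# String route: both calls return the root Element directly.
root_a = ET.fromstring(str_xml)
root_b = ET.XML(str_xml)

# File route: parse() returns an ElementTree; getroot() gives the Element.
# io.StringIO stands in for an open file here.
tree = ET.parse(io.StringIO(str_xml))
root_c = tree.getroot()
```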

    II. Modifying (using Element objects)
      tag, attrib, text, find, iter, remove, set, ...
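
    A sketch of a typical modification pass using the members named above. The config/option document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document to modify.
root = ET.fromstring(
    '<config><option name="debug">false</option>'
    '<option name="old">x</option></config>'
)

# Change text and attributes in place.
for option in root.iter("option"):
    if option.get("name") == "debug":
        option.text = "true"
        option.set("changed", "yes")

# remove() only removes direct children of the element it is called on.
# findall() returns a list, so removing while looping over it is safe.
for option in root.findall("option"):
    if option.get("name") == "old":
        root.remove(option)

result = ET.tostring(root, encoding="unicode")
```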

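    III. Writing back to a file

    write() is a method of ElementTree, so writing goes through an ElementTree object.
      1. Writing after parsing from a string

        node = ET.XML(str_xml)  # get the root node
        # ... modify node ...
        et = ET.ElementTree(node)  # wrap the root node in a tree
        et.write("file.xml", encoding="utf-8", xml_declaration=True)  # write to file

      2. Writing back after parsing from a file

      result = ET.parse("file.xml")  # get the tree
      root = result.getroot()  # get the root node
      # ... modify root ...
      result.write("file.xml", encoding="utf-8", xml_declaration=True)  # write back to file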
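
    A round-trip sketch of the write step. It writes to a temporary directory so no existing file is touched; the path and the sample XML are assumptions for the example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><item>1</item></root>")  # made-up sample
root.find("item").text = "2"                         # modify before writing

# Write to a temporary location (hypothetical path for this example).
path = os.path.join(tempfile.mkdtemp(), "file.xml")
ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)

# Read it back to confirm the change was saved.
reloaded = ET.parse(path).getroot()
with open(path, "rb") as f:
    first_line = f.readline()
```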
    

    IV. Creating an XML file

      Element  # create the root node
      SubElement  # create child nodes
      ElementTree  # create a tree, used for writing to a file

    V. Indentation

    Import the minidom module:
      from xml.dom import minidom
    When writing the file, do not use the tree; use the code below instead (root is the root Element):

      c = minidom.parseString(ET.tostring(root, encoding="utf-8")).toprettyxml(indent="\t")
      with open("file.xml", "w", encoding="utf-8") as f:
          f.write(c)

    These steps can be wrapped in a function:

      def wrap(root):
          a = ET.tostring(root, encoding="utf-8")
          b = minidom.parseString(a)
          c = b.toprettyxml(indent="\t")
          return c
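
    As an alternative that is not part of the original article: on Python 3.9 and later, ElementTree ships its own indent() helper, which modifies the tree in place so it can still be written with ElementTree.write(). A minimal sketch, assuming Python 3.9+ and a made-up sample document:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><a><b>text</b></a></root>")  # made-up sample

# Requires Python 3.9+. Adds whitespace to text/tail in place.
ET.indent(root, space="\t")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

    Unlike the minidom route, this does not re-parse the document, and the indented tree can be passed straight to ET.ElementTree(root).write(...).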
    

    VI. Namespaces

      1. Register the namespace:

        ET.register_namespace("com", "http://www.ehaomiao.com")

      2. Use it (prefix each tag that belongs to the namespace with the namespace URI in braces, i.e. {uri}tag):

        School = ET.Element("{http://www.ehaomiao.com}school")
        University = ET.SubElement(School, "{http://www.ehaomiao.com}University", attrib={"time": "4"})

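      3. Result
        The root node is serialized as follows (note the added xmlns:com="http://www.ehaomiao.com" declaration):
          <com:school xmlns:com="http://www.ehaomiao.com">
        Every node that uses the namespace is serialized as follows (note the added com: prefix):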
          <com:University time="4">
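
    The same namespace also matters when searching. A sketch of looking up a namespaced child, either with the full {uri}tag form or with a prefix map passed as the namespaces argument of find(). The NS and ns names are introduced here for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.ehaomiao.com"
ET.register_namespace("com", NS)  # affects serialization only

school = ET.Element(f"{{{NS}}}school")
ET.SubElement(school, f"{{{NS}}}University", attrib={"time": "4"})

xml_text = ET.tostring(school, encoding="unicode")

# Search with the fully qualified tag ...
uni_full = school.find(f"{{{NS}}}University")

# ... or with a prefix map; the prefix here is local to this lookup.
ns = {"com": NS}
uni_prefixed = school.find("com:University", ns)
```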

    VII. Important

    If you run into problems related to object types while working, use type() to check what kind of object you have.
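
    A small sketch of that check. The most common confusion is between the ElementTree returned by parse() and the Element returned by getroot(); io.StringIO stands in for a real file here.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.parse(io.StringIO("<root/>"))  # ElementTree
root = tree.getroot()                    # Element

print(type(tree))
print(type(root))

# Only the tree has write(); only the element has attributes such as tag.
tree_can_write = hasattr(tree, "write")
root_can_write = hasattr(root, "write")
```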

    VIII. Exercise: creating an XML file

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # @Time    : 2017/12/1 0001 14:07
    # @Author  : ming
    import xml.etree.ElementTree as ET
    from xml.dom import minidom
    
    ET.register_namespace("com", "http://www.ehaomiao.com")  # register the namespace
    
    School = ET.Element("{http://www.ehaomiao.com}school")  # use the namespace
    University = ET.SubElement(School, "{http://www.ehaomiao.com}University", attrib={"time": "4"})
    d1 = ET.SubElement(University, "d1")  # University is the parent node
    d1.text = "大一"
    d2 = ET.SubElement(University, "d2")
    d2.text = "大二"
    d3 = ET.SubElement(University, "d3")
    d3.text = "大三"
    d4 = ET.SubElement(University, "d4")
    d4.text = "大四"
    
    High_school = ET.SubElement(School, "{http://www.ehaomiao.com}High_school", attrib={"time": "3"})
    g1 = ET.SubElement(High_school, "g1")
    g1.text = "高一"
    g2 = ET.SubElement(High_school, "g2")
    g2.text = "高二"
    g3 = ET.SubElement(High_school, "g3")
    g3.text = "高三"
    
    middle_school = ET.SubElement(School, "{http://www.ehaomiao.com}middle_school", attrib={"time": "3"})
    c1 = ET.SubElement(middle_school, "c1")
    c1.text = "初一"
    c2 = ET.SubElement(middle_school, "c2")
    c2.text = "初一"
    c3 = ET.SubElement(middle_school, "c3")
    c3.text = "初一"
    
    # Write without indentation to file1.xml
    et = ET.ElementTree(School)
    et.write("file1.xml", encoding="utf-8", xml_declaration=True)
    
    
    def wrap(root):
        """
        Pretty-print an XML tree with line breaks and indentation.
        :param root: root node [Element]
        :return: the indented XML as a string
        """
        a = ET.tostring(root, encoding="utf-8")
        b = minidom.parseString(a)
        c = b.toprettyxml(indent="\t")
        return c
    
    
    # Write with indentation to file2.xml
    a = wrap(School)
    with open("file2.xml", "w", encoding="utf-8") as f:
        f.write(a)

    file2.xml:
    <?xml version="1.0" ?>
    <com:school xmlns:com="http://www.ehaomiao.com">
        <com:University time="4">
            <d1>大一</d1>
            <d2>大二</d2>
            <d3>大三</d3>
            <d4>大四</d4>
        </com:University>
        <com:High_school time="3">
            <g1>高一</g1>
            <g2>高二</g2>
            <g3>高三</g3>
        </com:High_school>
        <com:middle_school time="3">
            <c1>初一</c1>
            <c2>初一</c2>
            <c3>初一</c3>
        </com:middle_school>
    </com:school>

    file1.xml:
    <?xml version='1.0' encoding='utf-8'?>
    <com:school xmlns:com="http://www.ehaomiao.com"><com:University time="4"><d1>大一</d1><d2>大二</d2><d3>大三</d3><d4>大四</d4></com:University><com:High_school time="3"><g1>高一</g1><g2>高二</g2><g3>高三</g3></com:High_school><com:middle_school time="3"><c1>初一</c1><c2>初一</c2><c3>初一</c3></com:middle_school></com:school>
  • Original article: https://www.cnblogs.com/ming5218/p/7955081.html