zoukankan      html  css  js  c++  java
  • python处理xml文件

    参考:https://docs.python.org/2/library/xml.etree.elementtree.html

    例子:

    <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>

    1、解析xml文件

    >>> os.getcwd()
    'D:\workspace\testpython'
    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.parse('test.xml')
    >>> root = tree.getroot()
    >>> print root
    <Element 'data' at 0x1d2a8b0>
    >>> print tree
    <xml.etree.ElementTree.ElementTree object at 0x01D2A9D0>
    >>> root.tag
    'data'
    >>> root.attrib
    {}
    >>> #遍历子节点
    >>> for child in root:
        print child.tag,child.attrib
    
        
    country {'name': 'Liechtenstein'}
    country {'name': 'Singapore'}
    country {'name': 'Panama'}
    >>> root[0].text
    '
            '
    >>> root[0][1].text
    '2008'
    >>> root[1][3].text
    >>> root[1][2].text
    '59900'

    2、查找元素:root.iter()迭代,element.findall()element.find()element.get(),element.text

    >>> #查询元素
    >>> for neighbor in root.iter('neighbor'):
        print neighbor.attrib
    
        
    {'direction': 'E', 'name': 'Austria'}
    {'direction': 'W', 'name': 'Switzerland'}
    {'direction': 'N', 'name': 'Malaysia'}
    {'direction': 'W', 'name': 'Costa Rica'}
    {'direction': 'E', 'name': 'Colombia'}
    >>> root.iter('neighbor')
    <generator object iter at 0x01D3CF30>
    >>> root.findall('country')
    [<Element 'country' at 0x1d2aa90>, <Element 'country' at 0x1d2ad30>, <Element 'country' at 0x1d2af10>]
    >>> for country in root.findall('country'): #element.findall()查询当前元素的子元素
        rank=country.find('rank').text  #element.find()查询指定标签的第一个子元素,element.text获取元素的内容
        name=country.get('name')  #element.get()获取元素的属性值
        print name,rank
    
        
    Liechtenstein 1
    Singapore 4
    Panama 68

    3、修改xml文件:element.set()修改属性,element.append()增加子元素,element.remove()删除元素

    修改元素属性:

    >>> for rank in root.iter('rank'):
        new_rank = int(rank.text) + 1
        rank.text=str(new_rank)
        rank.set('updated','yes')
    
        
    >>> tree.write('test.xml')

     删除元素:

    >>> for country in root.findall('country'):
        rank = int(country.find('rank').text)
        if rank>50:
            root.remove(country)
    
            
    >>> tree.write('test.xml')
    <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank updated="yes">2</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank updated="yes">5</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
    </data>

    4、构建xml文件:SubElement()

    >>> a = ET.Element('a')
    >>> b = ET.SubElement(a, 'b')
    >>> c = ET.SubElement(a, 'c')
    >>> d = ET.SubElement(c, 'd')
    >>> ET.dump(a)
    <a><b /><c><d /></c></a>
    >>> 

    5、用命名空间来解析xml文档

    If the XML input has namespaces, tags and attributes with prefixes in the form prefix:sometag get expanded to {uri}sometag where the prefix is replaced by the full URI. Also, if there is a default namespace, that full URI gets prepended to all of the non-prefixed tags.

    Here is an XML example that incorporates two namespaces, one with the prefix “fictional” and the other serving as the default namespace:

    <?xml version="1.0"?>
    <actors xmlns:fictional="http://characters.example.com"
            xmlns="http://people.example.com">
        <actor>
            <name>John Cleese</name>
            <fictional:character>Lancelot</fictional:character>
            <fictional:character>Archie Leach</fictional:character>
        </actor>
        <actor>
            <name>Eric Idle</name>
            <fictional:character>Sir Robin</fictional:character>
            <fictional:character>Gunther</fictional:character>
            <fictional:character>Commander Clement</fictional:character>
        </actor>
    </actors>
    >>> import xml.etree.ElementTree as ET
    >>> tree=ET.parse('namespace.xml')
    >>> root=tree.getroot()
    >>> for actor in root.findall('real_person:actor', ns):
        name = actor.find('real_person:name', ns)
        print name.text
        for char in actor.findall('role:character', ns):
            print ' |-->', char.text
    
            
    John Cleese
     |--> Lancelot
     |--> Archie Leach
    Eric Idle
     |--> Sir Robin
     |--> Gunther
     |--> Commander Clement
    >>> 

     定位、编辑、保存元素属性:

    >>> import xml.etree.ElementTree as ET
    >>> tree=ET.parse('spring-subtract.xml')
    >>> root=tree.getroot()
    >>> print root
    <Element '{http://www.springframework.org/schema/beans}beans' at 0x1d423f0>
    >>> print root.getchildren()
    [<Element '{http://www.springframework.org/schema/beans}bean' at 0x1d422f0>, <Element '{http://www.springframework.org/schema/beans}bean' at 0x1d422d0>, <Element '{http://www.springframework.org/schema/beans}bean' at 0x1d425d0>]
    >>> print root.getchildren()[2]
    <Element '{http://www.springframework.org/schema/beans}bean' at 0x1d425d0>
    >>> print root.getchildren()[2].getchildren()
    [<Element '{http://www.springframework.org/schema/beans}property' at 0x1d42610>, <Element '{http://www.springframework.org/schema/beans}property' at 0x1d42670>]
    >>> print root.getchildren()[2].getchildren()[1].attrib
    {'name': 'cronExpression', 'value': '0000'}
    >>> print root.getchildren()[2].getchildren()[1].attrib['value']
    0000
    >>> #编辑属性值
    >>> root.getchildren()[2].getchildren()[1].set('value','222')
    >>> tree.write('spring-subtract.xml')  #保存文件
    >>> print root.getchildren()[2].getchildren()[1].attrib
    {'name': 'cronExpression', 'value': '222'}
    >>> 
  • 相关阅读:
    VS2013 update4+Cocos2d-x 3.7 Win8下安装方法及配置
    它处资料:二分图最大匹配的匈牙利算法
    DataGuard备库ORA-01196故障恢复一则
    Leetcode41: Remove Duplicates from Sorted List
    BEGINNING SHAREPOINT&#174; 2013 DEVELOPMENT 第3章节--SharePoint 2013 开发者工具 使用Napa开发SharePoint应用程序
    关于OC的内存管理-01
    P2002 消息扩散
    P1726 上白泽慧音
    2594 解药还是毒药
    P3385 【模板】负环
  • 原文地址:https://www.cnblogs.com/apple2016/p/6207962.html
Copyright © 2011-2022 走看看