  • Parsing XML files with Python and modifying the content of specific tags

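    1 Background

    The images were annotated with labelImg in order to train YOLOv3 model parameters. The images now need to be resized, so the corresponding XML annotation files must be adjusted to match.

    2 Topics covered

    Parsing XML files with Python and modifying the content of specified tags.

    2.1 Parsing an XML file with Python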

    # encoding: utf-8
    import os
    import xml.etree.ElementTree as ET
    
    nowDir = os.getcwd()  # current working directory of the process
    fileList = os.listdir(nowDir)  # names of all entries in the current working directory
    for fileName in fileList:  # walk through the entries one by one
        if fileName.endswith(".xml"):  # only look at XML files
            print(fileName)
            tree = ET.parse(fileName)
            root = tree.getroot()
            # level 1: the root element
            print("root-tag:", root.tag, ',root-attrib:', root.attrib, ',root-text:', root.text)
            # level 2: children of the root
            for child in root:
                print('child-tag:', child.tag, ',child.attrib:', child.attrib, ',child.text:', child.text)
                # level 3: grandchildren
                for sub in child:
                    print('sub-tag:', sub.tag, ',sub.attrib:', sub.attrib, ',sub.text:', sub.text)
                    # level 4: great-grandchildren
                    for subchild in sub:
                        print('subchild-tag:', subchild.tag, ',subchild.attrib:', subchild.attrib, ',subchild.text:', subchild.text)
    

    The original XML file contains:

    <annotation>
    	<folder>VedioPicture</folder>
    	<filename>0000001.jpg</filename>
    	<path>J:4_forPapersickfecesVedioPicture000001.jpg</path>
    	<source>
    		<database>Unknown</database>
    	</source>
    	<size>
    		<width>2560</width>
    		<height>1440</height>
    		<depth>1</depth>
    	</size>
    	<segmented>0</segmented>
    	<object>
    		<name>sickfeces</name>
    		<pose>Unspecified</pose>
    		<truncated>0</truncated>
    		<difficult>0</difficult>
    		<bndbox>
    			<xmin>2071</xmin>
    			<ymin>235</ymin>
    			<xmax>2154</xmax>
    			<ymax>288</ymax>
    		</bndbox>
    	</object>
    </annotation>
    

    Running the script in 2.1 prints:

    000001.xml
    root-tag: annotation ,root-attrib: {} ,root-text: 
    	
    child-tag: folder ,child.attrib: {} ,child.text: VedioPicture
    child-tag: filename ,child.attrib: {} ,child.text: 0000001.jpg
    child-tag: path ,child.attrib: {} ,child.text: J:4_forPapersickfecesVedioPicture000001.jpg
    child-tag: source ,child.attrib: {} ,child.text: 
    		
    sub-tag: database ,sub.attrib: {} ,sub.text: Unknown
    child-tag: size ,child.attrib: {} ,child.text: 
    		
    sub-tag: width ,sub.attrib: {} ,sub.text: 2560
    sub-tag: height ,sub.attrib: {} ,sub.text: 1440
    sub-tag: depth ,sub.attrib: {} ,sub.text: 1
    child-tag: segmented ,child.attrib: {} ,child.text: 0
    child-tag: object ,child.attrib: {} ,child.text: 
    		
    sub-tag: name ,sub.attrib: {} ,sub.text: sickfeces
    sub-tag: pose ,sub.attrib: {} ,sub.text: Unspecified
    sub-tag: truncated ,sub.attrib: {} ,sub.text: 0
    sub-tag: difficult ,sub.attrib: {} ,sub.text: 0
    sub-tag: bndbox ,sub.attrib: {} ,sub.text: 
    			
    subchild-tag: xmin ,subchild.attrib: {} ,subchild.text: 2071
    subchild-tag: ymin ,subchild.attrib: {} ,subchild.text: 235
    subchild-tag: xmax ,subchild.attrib: {} ,subchild.text: 2154
    subchild-tag: ymax ,subchild.attrib: {} ,subchild.text: 288
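
    The nested loops in 2.1 reach exactly four levels, which happens to match this annotation format. As a minimal sketch (not part of the original article), a small recursive generator can walk a tree of any depth. The names SAMPLE and walk are hypothetical, and SAMPLE is a shortened copy of the annotation shown earlier.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotation from this article (hypothetical sample data).
SAMPLE = """<annotation>
    <size>
        <width>2560</width>
        <height>1440</height>
    </size>
    <object>
        <name>sickfeces</name>
        <bndbox>
            <xmin>2071</xmin>
            <ymin>235</ymin>
        </bndbox>
    </object>
</annotation>"""


def walk(element, depth=0):
    """Yield (depth, tag, stripped text) for element and all of its descendants."""
    text = (element.text or "").strip()  # container elements only hold whitespace
    yield depth, element.tag, text
    for child in element:
        yield from walk(child, depth + 1)


root = ET.fromstring(SAMPLE)
rows = list(walk(root))
for depth, tag, text in rows:
    # Indent each tag by its depth so the tree structure stays visible.
    print("  " * depth + tag, text)
```

    Stripping the text also hides the whitespace-only values that show up as blank lines in the output above.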
    

    2.2 Modifying the content of specified tags

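    # encoding: utf-8
    import os
    import xml.etree.ElementTree as ET
    
    # Shrink factors, applied one after another. The original listing never defines
    # shuinkList; [16] is an assumption that is consistent with the modified XML shown below.
    shuinkList = [16]
    
    nowDir = os.getcwd()  # current working directory of the process
    fileList = os.listdir(nowDir)  # names of all entries in the current working directory
    for fileName in fileList:  # walk through the entries one by one
        if fileName.endswith(".xml"):  # only look at XML files
            print(fileName)
            tree = ET.parse(fileName)
            root = tree.getroot()
    
            for shuink in shuinkList:
                for child in root:
                    for sub in child:
                        if sub.tag in ("width", "height"):
                            # // keeps the result an integer (plain / yields a float in Python 3)
                            sub.text = str(int(sub.text) // shuink)
                        for subchild in sub:
                            if subchild.tag in ("xmin", "xmax", "ymin", "ymax"):
                                subchild.text = str(int(subchild.text) // shuink)
            tree.write(fileName)  # save the modified XML, overwriting the original file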
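
    The same edit can be written without depending on how deep each tag sits. The sketch below is an illustration rather than the article's own method: it uses Element.iter() to visit every element and rewrites the ones whose tag is in a set. SCALED_TAGS and shrink_annotation are hypothetical names, and the factor 16 is an assumption, since the article does not state the value it used.

```python
import xml.etree.ElementTree as ET

# Tags whose integer text should be scaled (taken from the annotation format above).
SCALED_TAGS = {"width", "height", "xmin", "ymin", "xmax", "ymax"}


def shrink_annotation(root, factor):
    """Floor-divide every size and bounding-box value under root by factor, in place."""
    for element in root.iter():  # root itself plus all descendants, at any depth
        if element.tag in SCALED_TAGS:
            element.text = str(int(element.text) // factor)
    return root


sample = ET.fromstring(
    "<annotation>"
    "<size><width>2560</width><height>1440</height><depth>1</depth></size>"
    "<object><bndbox><xmin>2071</xmin><ymin>235</ymin>"
    "<xmax>2154</xmax><ymax>288</ymax></bndbox></object>"
    "</annotation>"
)
shrink_annotation(sample, 16)  # 16 is an assumed factor, not stated in the article
print(ET.tostring(sample, encoding="unicode"))
```

    Because the function works on an in-memory element, it can be tried on a copy first. Writing the result to a different path with tree.write() would keep the original annotations intact.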
    

    After running the script in 2.2, the XML file contains:

    <annotation>
    	<folder>VedioPicture</folder>
    	<filename>0000001.jpg</filename>
    	<path>J:4_forPapersickfecesVedioPicture000001.jpg</path>
    	<source>
    		<database>Unknown</database>
    	</source>
    	<size>
    		<width>160</width>
    		<height>90</height>
    		<depth>1</depth>
    	</size>
    	<segmented>0</segmented>
    	<object>
    		<name>sickfeces</name>
    		<pose>Unspecified</pose>
    		<truncated>0</truncated>
    		<difficult>0</difficult>
    		<bndbox>
    			<xmin>129</xmin>
    			<ymin>14</ymin>
    			<xmax>134</xmax>
    			<ymax>18</ymax>
    		</bndbox>
    	</object>
    </annotation>
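
    One way to sanity-check a rewritten file is to read the size and boxes back with findtext() and compare them with the originals. The sketch below does this for the two listings in this article. ORIGINAL and MODIFIED are shortened copies, read_values is a hypothetical helper, and FACTOR = 16 is inferred from the numbers rather than stated in the article.

```python
import xml.etree.ElementTree as ET

# Shortened copies of the two listings in this article (only the numeric parts).
ORIGINAL = """<annotation>
    <size><width>2560</width><height>1440</height><depth>1</depth></size>
    <object>
        <name>sickfeces</name>
        <bndbox><xmin>2071</xmin><ymin>235</ymin><xmax>2154</xmax><ymax>288</ymax></bndbox>
    </object>
</annotation>"""

MODIFIED = """<annotation>
    <size><width>160</width><height>90</height><depth>1</depth></size>
    <object>
        <name>sickfeces</name>
        <bndbox><xmin>129</xmin><ymin>14</ymin><xmax>134</xmax><ymax>18</ymax></bndbox>
    </object>
</annotation>"""


def read_values(xml_text):
    """Return ((width, height), [(name, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    size = (int(root.findtext("size/width")), int(root.findtext("size/height")))
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), coords))
    return size, boxes


FACTOR = 16  # assumed: inferred from the listings, not stated in the article
old_size, old_boxes = read_values(ORIGINAL)
new_size, new_boxes = read_values(MODIFIED)

# Recompute what floor division by FACTOR should give and compare.
expected_size = tuple(value // FACTOR for value in old_size)
expected_boxes = [
    (name, tuple(value // FACTOR for value in coords)) for name, coords in old_boxes
]
print(new_size == expected_size, new_boxes == expected_boxes)
```

    Floor division discards the fractional part of every coordinate, so each box edge can move by up to one pixel in the resized image.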
    
  • Original article: https://www.cnblogs.com/dindin1995/p/13059138.html