
Parsing XML files with Python and modifying the content of specific tags

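1 Background

The images were annotated with labelImg to train a YOLOv3 model. The images now need to be resized, so the size and bounding-box values in the corresponding XML annotation files must be adjusted as well.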
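
The arithmetic behind that adjustment can be sketched as follows. This is only an illustration, not code from the original post: the helper name scale_box and its arguments are hypothetical, and the sizes are the ones that appear in the example annotation later in this article (2560x1440 resized to 160x90).

```python
def scale_box(box, old_size, new_size):
    """Scale (xmin, ymin, xmax, ymax) from old_size to new_size.

    Both sizes are (width, height) tuples. Results are truncated to integers,
    because the annotation stores pixel coordinates as whole numbers.
    """
    sx = new_size[0] / old_size[0]  # horizontal scale factor
    sy = new_size[1] / old_size[1]  # vertical scale factor
    xmin, ymin, xmax, ymax = box
    return (int(xmin * sx), int(ymin * sy), int(xmax * sx), int(ymax * sy))


# Box taken from the example annotation in this article.
print(scale_box((2071, 235, 2154, 288), (2560, 1440), (160, 90)))
# prints (129, 14, 134, 18)
```

x coordinates scale with the width ratio and y coordinates with the height ratio; when the aspect ratio is unchanged, as here, both ratios are equal.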

2 Topics covered

Parse an XML file with Python and modify the content of specific tags.

2.1 Parsing an XML file with Python

import os
import xml.etree.ElementTree as ET

nowDir = os.getcwd()  # current working directory of the process
fileList = os.listdir(nowDir)  # names of all entries in that directory
for fileName in fileList:  # walk through the entries
    if fileName.endswith(".xml"):  # only look at XML files
        print(fileName)
        tree = ET.parse(fileName)
        root = tree.getroot()
        # level 1: the root element
        print("root-tag:", root.tag, ',root-attrib:', root.attrib, ',root-text:', root.text)
        # level 2: children of the root
        for child in root:
            print('child-tag:', child.tag, ',child.attrib:', child.attrib, ',child.text:', child.text)
            # level 3: e.g. <width> inside <size>
            for sub in child:
                print('sub-tag:', sub.tag, ',sub.attrib:', sub.attrib, ',sub.text:', sub.text)
                # level 4: e.g. <xmin> inside <bndbox>
                for subchild in sub:
                    print('subchild-tag:', subchild.tag, ',subchild.attrib:', subchild.attrib, ',subchild.text:', subchild.text)

The original XML file looks like this:

<annotation>
	<folder>VedioPicture</folder>
	<filename>0000001.jpg</filename>
	<path>J:4_forPapersickfecesVedioPicture000001.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>2560</width>
		<height>1440</height>
		<depth>1</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>sickfeces</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>2071</xmin>
			<ymin>235</ymin>
			<xmax>2154</xmax>
			<ymax>288</ymax>
		</bndbox>
	</object>
</annotation>

The printed output is:

000001.xml
root-tag: annotation ,root-attrib: {} ,root-text: 
	
child-tag: folder ,child.attrib: {} ,child.text: VedioPicture
child-tag: filename ,child.attrib: {} ,child.text: 0000001.jpg
child-tag: path ,child.attrib: {} ,child.text: J:4_forPapersickfecesVedioPicture000001.jpg
child-tag: source ,child.attrib: {} ,child.text: 
		
sub-tag: database ,sub.attrib: {} ,sub.text: Unknown
child-tag: size ,child.attrib: {} ,child.text: 
		
sub-tag: width ,sub.attrib: {} ,sub.text: 2560
sub-tag: height ,sub.attrib: {} ,sub.text: 1440
sub-tag: depth ,sub.attrib: {} ,sub.text: 1
child-tag: segmented ,child.attrib: {} ,child.text: 0
child-tag: object ,child.attrib: {} ,child.text: 
		
sub-tag: name ,sub.attrib: {} ,sub.text: sickfeces
sub-tag: pose ,sub.attrib: {} ,sub.text: Unspecified
sub-tag: truncated ,sub.attrib: {} ,sub.text: 0
sub-tag: difficult ,sub.attrib: {} ,sub.text: 0
sub-tag: bndbox ,sub.attrib: {} ,sub.text: 
			
subchild-tag: xmin ,subchild.attrib: {} ,subchild.text: 2071
subchild-tag: ymin ,subchild.attrib: {} ,subchild.text: 235
subchild-tag: xmax ,subchild.attrib: {} ,subchild.text: 2154
subchild-tag: ymax ,subchild.attrib: {} ,subchild.text: 288
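
The nested loops above visit every element level by level. When only a few known tags matter, ElementTree's find(), findtext() and iter() can reach them directly. The following is a minimal sketch rather than part of the original post; SAMPLE is a trimmed copy of the annotation shown above, inlined here so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Assumption: a trimmed copy of the example annotation, inlined for illustration.
SAMPLE = """<annotation>
    <size><width>2560</width><height>1440</height><depth>1</depth></size>
    <object>
        <name>sickfeces</name>
        <bndbox><xmin>2071</xmin><ymin>235</ymin><xmax>2154</xmax><ymax>288</ymax></bndbox>
    </object>
</annotation>"""

root = ET.fromstring(SAMPLE)

# findtext() follows a path of child tags and returns the text of the first match.
width = int(root.findtext("size/width"))
height = int(root.findtext("size/height"))

# iter() walks the whole subtree, so it yields every <object> element.
boxes = []
for obj in root.iter("object"):
    bndbox = obj.find("bndbox")
    coords = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
    boxes.append((obj.findtext("name"), coords))

print(width, height)  # prints 2560 1440
print(boxes)          # prints [('sickfeces', (2071, 235, 2154, 288))]
```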

2.2 Modifying the content of specific tags

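import os
import xml.etree.ElementTree as ET

# Shrink factors, applied one after another. Assumed value: the original listing
# never defines shuinkList; [16] reproduces the result shown below (2560 -> 160).
shuinkList = [16]

nowDir = os.getcwd()  # current working directory of the process
fileList = os.listdir(nowDir)  # names of all entries in that directory
for fileName in fileList:  # walk through the entries
    if fileName.endswith(".xml"):
        print(fileName)
        tree = ET.parse(fileName)
        root = tree.getroot()

        for shuink in shuinkList:
            for child in root:
                for sub in child:
                    # <size>: image width and height
                    if sub.tag in ("width", "height"):
                        # // is integer (floor) division, so the values stay whole pixels
                        sub.text = str(int(sub.text) // shuink)
                    # <object>/<bndbox>: bounding-box corners
                    for subchild in sub:
                        if subchild.tag in ("xmin", "xmax", "ymin", "ymax"):
                            subchild.text = str(int(subchild.text) // shuink)
        tree.write(fileName)  # save the modified XML, overwriting the original file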

The modified XML looks like this:

<annotation>
	<folder>VedioPicture</folder>
	<filename>0000001.jpg</filename>
	<path>J:4_forPapersickfecesVedioPicture000001.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>160</width>
		<height>90</height>
		<depth>1</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>sickfeces</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>129</xmin>
			<ymin>14</ymin>
			<xmax>134</xmax>
			<ymax>18</ymax>
		</bndbox>
	</object>
</annotation>
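
The script in this section overwrites each annotation file in place, so running it twice shrinks the values twice. One possible variation, sketched below under stated assumptions, separates the in-memory update from the file handling and writes the result to a different path. The names SCALED_TAGS, shrink_annotation and shrink_file are hypothetical and do not come from the original post.

```python
import xml.etree.ElementTree as ET

# Tags whose text is a pixel value that must shrink with the image.
SCALED_TAGS = {"width", "height", "xmin", "ymin", "xmax", "ymax"}


def shrink_annotation(root, factor):
    """Divide every size and bounding-box value under root by factor, in place."""
    for elem in root.iter():  # visits root and all of its descendants
        if elem.tag in SCALED_TAGS:
            elem.text = str(int(elem.text) // factor)
    return root


def shrink_file(src_path, dst_path, factor):
    """Read src_path, shrink its annotation, and write the result to dst_path."""
    tree = ET.parse(src_path)
    shrink_annotation(tree.getroot(), factor)
    tree.write(dst_path, encoding="utf-8")


# In-memory demonstration using values from the example annotation.
demo = ET.fromstring(
    "<annotation><size><width>2560</width><height>1440</height></size>"
    "<object><bndbox><xmin>2071</xmin><ymin>235</ymin>"
    "<xmax>2154</xmax><ymax>288</ymax></bndbox></object></annotation>"
)
shrink_annotation(demo, 16)
print(demo.findtext("size/width"), demo.findtext("object/bndbox/xmax"))
# prints 160 134
```

Because iter() matches on tag name alone, this sketch assumes that width, height, xmin, ymin, xmax and ymax appear only inside <size> and <bndbox>, as they do in the labelImg file shown here.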


Original article: https://www.cnblogs.com/dindin1995/p/13059138.html
