Question
I am trying to extract the name, xmin, ymin, xmax and ymax values of every <object> tag in the file.
XML
<annotation>
<folder>Plates_Number</folder>
<filename>1.png</filename>
<source>
<database>Unknown</database>
</source>
<size>
<width>294</width>
<height>60</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>2</name>
<pose>Unspecified</pose>
<truncated>1</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>40</xmin>
<ymin>1</ymin>
<xmax>69</xmax>
<ymax>42</ymax>
</bndbox>
</object>
<object>
<name>10</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>67</xmin>
<ymin>3</ymin>
<xmax>101</xmax>
<ymax>43</ymax>
</bndbox>
</object>
<object>
<name>1</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>122</xmin>
<ymin>2</ymin>
<xmax>153</xmax>
<ymax>45</ymax>
</bndbox>
</object>
<object>
<name>10</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>151</xmin>
<ymin>3</ymin>
<xmax>183</xmax>
<ymax>44</ymax>
</bndbox>
</object>
<object>
<name>2</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>186</xmin>
<ymin>4</ymin>
<xmax>216</xmax>
<ymax>47</ymax>
</bndbox>
</object>
<object>
<name>5</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>214</xmin>
<ymin>5</ymin>
<xmax>245</xmax>
<ymax>46</ymax>
</bndbox>
</object>
</annotation>
This is what I tried, but it did not produce the expected result:
python
import xml.etree.ElementTree as ET
import csv

tree = ET.parse("1.xml")
root = tree.getroot()

# open a file for writing
data = open('test.csv', 'r+')

# create the csv writer object
csvwriter = csv.writer(data)
data_head = []

count = 0
for member in root.findall('object'):
    obj = []
    bndbox_list = []
    if count == 0:
        name = member.find('name').tag
        data_head.append(name)
        bndbox = member[4].tag
        data_head.append(bndbox)
        csvwriter.writerow(data_head)
        count = count + 1

    name = member.find('name').text
    obj.append(name)
    bndbox = member[4][0].text
    bndbox_list.append(bndbox)
    xmin = member[4][1].text
    bndbox_list.append(xmin)
    ymin = member[4][2].text
    bndbox_list.append(ymin)
    xmax = member[4][3].text
    bndbox_list.append(xmax)
    ymax = member[4][4].text
    bndbox_list.append(ymax)
    obj.append(bndbox)
    csvwriter.writerow(data)
data.close()
I expect
Name  xmin  ymin  xmax  ymax
2     40    1     69    42
10    67    3     101   43
1     122   2     153   45
10    151   3     183   44
2     186   4     216   47
5     214   5     245   46
However, I only get these two headers
Name bndbox
and no values.
Answer 1:
If you can use BeautifulSoup, you could use
from bs4 import BeautifulSoup
soup = BeautifulSoup(input_xml_string, 'xml')
tgs = soup.find_all('object')
l = [(i.find('name').string, i.xmin.string, i.ymin.string, i.xmax.string, i.ymax.string) for i in tgs]
where input_xml_string is the input XML in string form and soup is a BeautifulSoup object representing the XML tree. The 'xml' argument selects an XML parser, which requires the lxml package to be installed.
Then find_all() is used to find all the <object> tags in the XML. The result is stored in tgs.
From each element of tgs, which is an <object> tag, we select the child tags we need, which are Tag objects, and get their values through their string attribute.
The <name> child cannot be accessed as i.name, because name is already an attribute of the Tag class (it holds the tag's own name). So we first use find() to get the <name> child of <object> and then read its content.
Now if we print the values in l,
for i in l:
    print(i)
we would get,
('2', '40', '1', '69', '42')
('10', '67', '3', '101', '43')
('1', '122', '2', '153', '45')
('10', '151', '3', '183', '44')
('2', '186', '4', '216', '47')
('5', '214', '5', '245', '46')
Answer 2:
Code:
import xml.etree.ElementTree as ET

root = ET.parse('file.xml').getroot()

for type_tag in root.findall('object'):
    name = type_tag.find('name').text
    xmin = type_tag.find('bndbox/xmin').text
    ymin = type_tag.find('bndbox/ymin').text
    xmax = type_tag.find('bndbox/xmax').text
    ymax = type_tag.find('bndbox/ymax').text
    print([name, xmin, ymin, xmax, ymax])
Output:
['2', '40', '1', '69', '42']
['10', '67', '3', '101', '43']
['1', '122', '2', '153', '45']
['10', '151', '3', '183', '44']
['2', '186', '4', '216', '47']
['5', '214', '5', '245', '46']
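The code above only prints the rows, while the question asks for a CSV file. A minimal sketch of that last step, using only the standard library, might look like the following. The helper names annotation_rows and write_csv, the FIELDS list, and the shortened SAMPLE document are assumptions made for illustration and are not part of the original answer. The sketch writes to an in-memory buffer so that it runs without any files on disk.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Child tags of <bndbox>, in the column order the question expects.
FIELDS = ["xmin", "ymin", "xmax", "ymax"]


def annotation_rows(root):
    """Yield [name, xmin, ymin, xmax, ymax] for each <object> under root."""
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        # Look children up by tag name rather than by position.
        yield [name] + [box.findtext(field) for field in FIELDS]


def write_csv(root, out_file):
    """Write a header row followed by one row per <object> to out_file."""
    writer = csv.writer(out_file)
    writer.writerow(["name"] + FIELDS)
    writer.writerows(annotation_rows(root))


# Shortened sample with the first two objects from the question (illustrative only).
SAMPLE = """<annotation>
  <object><name>2</name><bndbox><xmin>40</xmin><ymin>1</ymin><xmax>69</xmax><ymax>42</ymax></bndbox></object>
  <object><name>10</name><bndbox><xmin>67</xmin><ymin>3</ymin><xmax>101</xmax><ymax>43</ymax></bndbox></object>
</annotation>"""

buffer = io.StringIO()
write_csv(ET.fromstring(SAMPLE), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file instead, one could pass a file opened with open('test.csv', 'w', newline='') together with ET.parse('1.xml').getroot(). Opening in 'w' mode creates the file if it is missing, and newline='' is what the csv module documentation recommends to avoid blank lines between rows on some platforms.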
Source: https://stackoverflow.com/questions/56020248/how-to-get-specific-values-from-a-xml-file-into-csv-file-using-python