Convert XML to CSV file
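You don't need findall here. With a plain tag name it only matches direct children, so att.findall('attval') in your inner loop returns the parent's own attval again, which is why each name is printed twice. Instead, iterate the tree from top to bottom and pick out the relevant elements at each level: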
    from xml.etree import ElementTree

    tree = ElementTree.parse('input.xml')
    root = tree.getroot()

    # Each direct child of the root is a top-level <att>.
    for att in root:
        first = att.find('attval').text
        # The nested <att> elements live under <children>.
        for subatt in att.find('children'):
            second = subatt.find('attval').text
            print('{},{}'.format(first, second))
Which gives:
$ python process.py
Data,Studyval
Data,Site
Info,age
Info,gender
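Since the goal is a CSV file rather than printed lines, the same traversal can feed the standard csv module. This is only a minimal sketch: the helper names att_rows and xml_to_csv and the output path are my own, and it assumes every top-level att has an attval and that an att without a children element should simply be skipped.

```python
import csv
from xml.etree import ElementTree


def att_rows(root):
    """Yield (parent attval, child attval) pairs, one per nested <att>."""
    for att in root:
        first = att.find('attval').text
        children = att.find('children')
        if children is None:
            # Assumption: a top-level <att> without <children> produces no rows.
            continue
        for subatt in children:
            yield first, subatt.find('attval').text


def xml_to_csv(xml_path, csv_path):
    """Parse xml_path and write one CSV row per (parent, child) pair."""
    root = ElementTree.parse(xml_path).getroot()
    # newline='' lets the csv module control line endings itself.
    with open(csv_path, 'w', newline='') as f:
        csv.writer(f).writerows(att_rows(root))
```

Calling xml_to_csv('input.xml', 'output.csv') should then write the four rows shown above. Using csv.writer instead of string formatting also takes care of quoting if a value ever contains a comma.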
Question from pam:
I have an XML file like this:
    <hierachy>
      <att>
        <Order>1</Order>
        <attval>Data</attval>
        <children>
          <att>
            <Order>1</Order>
            <attval>Studyval</attval>
          </att>
          <att>
            <Order>2</Order>
            <attval>Site</attval>
          </att>
        </children>
      </att>
      <att>
        <Order>2</Order>
        <attval>Info</attval>
        <children>
          <att>
            <Order>1</Order>
            <attval>age</attval>
          </att>
          <att>
            <Order>2</Order>
            <attval>gender</attval>
          </att>
        </children>
      </att>
    </hierachy>
I'm trying to convert it to a CSV file like this:
    Data,Studyval
    Data,Site
    Info,age
    Info,gender
My problem is that the parent and child elements use the same tag names, 'att' and 'attval'. How do I tell Python to distinguish between them and give me this output? I tried this:
    import xml.etree.cElementTree as ET

    tree = ET.parse('input.xml')
    rebase = tree.getroot()
    list = []
    for att in rebase.findall('att'):
        name = att.find('attval').text
        for each_att in att.findall('attval'):
            try:
                val = att.find('attval').text
                print name, val
            except AttributeError:
                print name
and it printed the same things twice.
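A note on why the attempt repeats the name: with a plain tag name, find and findall only look at direct children of the element they are called on. The sketch below uses a trimmed copy of the XML above (the SAMPLE string and the variable names are mine, for illustration only) to show which attval elements each kind of path reaches from a top-level att.

```python
from xml.etree import ElementTree

# Trimmed copy of the question's XML: one top-level <att> with two nested ones.
SAMPLE = """
<hierachy>
  <att>
    <Order>1</Order>
    <attval>Data</attval>
    <children>
      <att><Order>1</Order><attval>Studyval</attval></att>
      <att><Order>2</Order><attval>Site</attval></att>
    </children>
  </att>
</hierachy>
"""

root = ElementTree.fromstring(SAMPLE)
top = root.find('att')

# Plain tag name: direct children only, so just the parent's own attval.
direct = [e.text for e in top.findall('attval')]

# Explicit path down through <children> to the nested attval elements.
nested = [e.text for e in top.findall('children/att/attval')]

# './/' searches all descendants, so parent and child values are mixed.
everything = [e.text for e in top.findall('.//attval')]

print(direct)      # ['Data']
print(nested)      # ['Studyval', 'Site']
print(everything)  # ['Data', 'Studyval', 'Site']
```

So the inner loop in the attempt runs once per top-level att and reads that same att's attval again, rather than reaching the nested values under children.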
Comment from pam: That is perfect! Thanks a ton!