ElementTree - findall to recursively select all child elements
Solution 1
Quoting the findall documentation, Element.findall() finds only elements with a tag which are direct children of the current element.
Since it finds only the direct children, we need to recurse into each match to find the remaining ones, like this:
>>> import xml.etree.ElementTree as ET
>>>
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
...
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
Even better, make it a generator function, like this:
>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
...
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
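One caveat: both versions above recurse only into elements that themselves matched, so a saybye nested inside a differently named element would be missed. A minimal sketch of a variant that walks every child is shown here. The function name find_all_descendants and the inline XML with its extra wrapper element are made up for illustration and are not part of the original answer; yield from assumes Python 3.3 or later.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the last <saybye> sits inside a non-matching <wrapper>.
XML = "<hello><saybye><saybye/></saybye><wrapper><saybye/></wrapper></hello>"


def find_all_descendants(node, tag):
    """Yield every descendant of node whose tag equals tag, in document order."""
    for child in node:  # iterating an Element yields its direct children
        if child.tag == tag:
            yield child
        # Recurse into every child, whether or not it matched.
        yield from find_all_descendants(child, tag)


root = ET.fromstring(XML)
matches = list(find_all_descendants(root, "saybye"))
print(len(matches))  # 3
```

For the h.xml in the question this makes no difference, because every saybye there is nested only inside other saybye elements.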
Solution 2
From version 2.7 on, you can use xml.etree.ElementTree.Element.iter:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
# iter() returns an iterator, so wrap it in list() to print the elements
print(list(root.iter('saybye')))
See 19.7. xml.etree.ElementTree — The ElementTree XML API
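As a self-contained sketch of how iter behaves, the snippet below parses an inline copy of the question's h.xml with ET.fromstring instead of reading a file (that substitution is an assumption made so the example runs anywhere). It also shows that the element you start from is included when its own tag matches.

```python
import xml.etree.ElementTree as ET

# Inline copy of h.xml from the question, so no file is needed.
XML = "<hello><saybye><saybye></saybye></saybye><saybye></saybye></hello>"

root = ET.fromstring(XML)

# iter() is lazy; list() materialises the matches in document order.
found = list(root.iter("saybye"))
print(len(found))  # 3

# Starting from a matching element, iter() yields that element too.
first = found[0]
nested = list(first.iter("saybye"))
print(len(nested))  # 2: first itself plus its one nested child
```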
Solution 3
If you aren't afraid of a little XPath, you can use the // syntax, which means "find any descendant node":
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print(root.findall('.//saybye'))
Full XPath isn't supported, but here's the list of what is: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
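To make the difference between the two path forms concrete, here is a small sketch that runs both against an inline copy of the question's h.xml (parsed with ET.fromstring so that no file is required, which is an assumption for the sake of a runnable example).

```python
import xml.etree.ElementTree as ET

# Inline copy of h.xml from the question.
XML = "<hello><saybye><saybye></saybye></saybye><saybye></saybye></hello>"
root = ET.fromstring(XML)

# A bare tag name matches direct children only.
direct = root.findall("saybye")

# ".//" matches descendants at any depth below the current element.
anywhere = root.findall(".//saybye")

print(len(direct), len(anywhere))  # 2 3

# Unlike iter(), ".//" never returns the element it is called on.
below_first = direct[0].findall(".//saybye")
print(len(below_first))  # 1
```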
Updated on April 17, 2021
Python code:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print(root.findall('saybye'))
h.xml code:
<hello>
    <saybye>
        <saybye>
        </saybye>
    </saybye>
    <saybye>
    </saybye>
</hello>
The code outputs:
[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]
The saybye element that is a child of another saybye is not selected here. So, how can I instruct findall to recursively walk down the DOM tree and collect all three saybye elements?