python xml query get parent

14,100

Solution 1

This XPath:

/Node/Node[Val[@name='number']/f='265456']/@name

Outputs:

1764466544

Solution 2

Unfortunately, when using the ElementTree API, each Element object has no reference back to its parent, so you cannot go up the tree from a known point. Instead, you have to find the possible parent objects and filter the ones you want.

This is commonly done with XPath expressions. However, ElementTree only supports a subset of XPath (see the docs), the most useful parts of which were only added in ElementTree 1.3, which only comes with Python 2.7+ or 3.2+.

And even, ElementTree's XPath it cannot work with your file as is - there is no way to select based on the text of a node, only its attributes (or attribute values).

My experimentation has only found two ways you can proceed with ElementTree. If you are using Python 2.7+ (or are able to download and install a newer version of ElementTree to work with older Python versions), and you can modify the format of the XML file to put the numbers as attributes, like so

<Val name="number"><f val="265456" /></Val>

then the following Python code will pull out the nodes of interest:

import xml.etree.ElementTree as ETree
tree = ETree.ElementTree(file='sample.xml')
nodes = tree.findall(".//Node/Val[@name='number']/f[@val='265456']....")

For older Pythons, or if you cannot modify the XML format, you will have to filter the invalid nodes manually. The following worked for me:

import xml.etree.ElementTree as ETree
tree = ETree.ElementTree(file='sample.xml')
all = tree.findall(".//Node")
nodes = []

# Filter matching nodes and put them in the nodes variable.
for node in all:
    for val in node.getchildren():
        if val.attrib['name'] == 'number' and val.getchildren()[0].text =='265456':
            nodes.append(node)

Neither of these solutions is what I would call ideal, but they're the only ones I have been able to make work with the ElementTree library (since that is what you mentioned using). You might be better off using a third-party library rather than using the built-in ones; see the Python wiki entry on XML for a list of options. lxml is the Python bindings for the widely-used libxml2 library, and would be the one I would suggest looking at first. It has XPath support so you should be able to use the queries from the other answers.

Share:
14,100
itwb
Author by

itwb

Updated on October 14, 2022

Comments

  • itwb
    itwb over 1 year

    I have a big xml document that looks like this:

    <Node name="foo">
        <Node name="16764764625">
            <Val name="type"><s>3</s></Val>
            <Val name="owner"><s>1</s></Val>
            <Val name="location"><s>4</s></Val>
            <Val name="brb"><n/></Val>
            <Val name="number"><f>24856</f></Val>
            <Val name="number2"><f>97000.0</f></Val>
        </Node>
        <Node name="1764466544">
            <Val name="type"><s>1</s></Val>
            <Val name="owner"><s>2</s></Val>
            <Val name="location"><s>6</s></Val>
            <Val name="brb"><n/></Val>
            <Val name="number"><f>265456</f></Val>
            <Val name="number2"><f>99000.0</f></Val>
        </Node>
        ...
    </Node>
    

    My mission is to get the value of the parent node: 1764466544 (value of name in 2nd Node) by doing a search to find if the subelement of the node Val name="number" contains 265456

    I've been doing a heap of reading on XPath, and ElementTree, but I am still not sure where to start actually query this. Looking for examples... I can't find any that reference a parent node as a result.

    Still new to python.. any suggestions would be appreciated.

    Thanks

  • Wayne
    Wayne about 13 years
    @itwb - I have never attempted XPath in Python, so that part is up to you, but the XPath above works in the abstract. Test it here, for example: xmlme.com/XpathTool.aspx
  • itwb
    itwb about 13 years
    Yeah, thanks for that. Now I'm getting this error: SyntaxError: cannot use absolute path on element.
  • Wayne
    Wayne about 13 years
    I'm in unfamiliar territory here, but this link shows the following code for XPath expressions with a leading /: raise SyntaxError("cannot use absolute path on element"). Maybe try a relative expression? This Node/Node[Val[@name='number']/f='265456']/@name or this //Node/Node[Val[@name='number']/f='265456']/@name
  • gmo
    gmo almost 10 years
    OP has more than 3 years old... It's good idea to clarify if your answer actually works now, with current version, use to work before, with old versions, or anithing you find relevant knowing this.
  • Samuel
    Samuel over 9 years
    Really annoying that Python added some XPath support but I can't use the ".." syntax to go up from the current node. It should be stated in the Python documentation. Actually the documentation states that this syntax is supported. Maybe it's supported as long as you don't go above the current element, e.g. "person/.."? I've spent about an hour trying to figure out why this wasn't working.
  • Stabledog
    Stabledog over 8 years
    This does not work with ElementTree, there's no such attribute in any version of the library.