Python ElementTree default namespace?

Solution 1

NOTE: for Python 3.8+, see Solution 2 below.


There is no straightforward way to handle default namespaces transparently. Assigning the empty namespace a non-empty prefix is a common workaround, as you've already mentioned:

import xml.etree.ElementTree
# Map a prefix of your choosing to the default namespace URI
ns = {"mvn":"http://maven.apache.org/POM/4.0.0"}
pom = xml.etree.ElementTree.parse("pom.xml")
print(pom.findall("mvn:version", ns))

Note that lxml.etree does not allow an empty namespace prefix to be used explicitly. You would get:

ValueError: empty namespace prefix is not supported in ElementPath


You can, however, simplify things by removing the default namespace declaration while loading the XML input data:

import xml.etree.ElementTree as ET
import re
 
with open("pom.xml") as f:
    xmlstring = f.read()
 
# Remove the default namespace definition (xmlns="http://some/namespace")
xmlstring = re.sub(r'\sxmlns="[^"]+"', '', xmlstring, count=1)
 
pom = ET.fromstring(xmlstring) 
print(pom.findall("version"))
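The regex above only matches a double-quoted xmlns declaration and removes just the first one. As an alternative sketch (a different technique, not the one shown above), you can parse first and then strip the "{uri}" part from each tag. The helper name strip_namespaces and the inline POM are made up for illustration, and namespaced attributes are left untouched:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal POM, inlined so the example is self-contained.
POM_XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <version>1.2.3</version>
</project>"""


def strip_namespaces(root):
    """Remove the '{uri}' part from every element tag, in place."""
    for el in root.iter():
        # Only string tags of the form '{uri}local' are rewritten.
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(POM_XML))
versions = [el.text for el in root.findall("version")]
print(versions)
```

Note that this discards namespace information for the whole tree, so it is only appropriate when you do not need to write the document back out with its namespaces intact.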

Solution 2

Since Python 3.8, ElementTree accepts the empty string as a prefix for the default namespace, so you can declare:

ns = {'': 'http://maven.apache.org/POM/4.0.0'}

and pass it as the namespaces argument (the second argument) to the find* methods.

Source: https://docs.python.org/3.8/library/xml.etree.elementtree.html?highlight=xml#xml.etree.ElementTree.Element.find
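A minimal sketch of this approach, assuming Python 3.8 or later. The inline POM and its version value are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal POM, inlined so the example is self-contained.
POM_XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <version>1.2.3</version>
</project>"""

# The empty-string key maps unprefixed tag names in the path
# to the given namespace (supported since Python 3.8).
ns = {"": "http://maven.apache.org/POM/4.0.0"}

root = ET.fromstring(POM_XML)
versions = [el.text for el in root.findall("version", ns)]
print(versions)
```

On Python 3.7 and earlier, the same call is expected to find nothing, which matches the behaviour described in the question.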

Solution 3

You can retrieve the default namespace with:

namespace = pom.getroot().tag.split("}")[0]+"}"

Then when you search for elements you add it to your search path:

print(pom.findall(namespace+"version"))

Not an elegant solution, but it works.
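One caveat: if the root element has no namespace, the split above yields a bogus prefix such as "project}". A slightly more defensive sketch is shown here; the helper name default_namespace and the inline POM are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal POM, inlined so the example is self-contained.
POM_XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <version>1.2.3</version>
</project>"""


def default_namespace(element):
    """Return '{uri}' for a namespaced tag, or '' if the tag has no namespace."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[: tag.index("}") + 1]
    return ""


root = ET.fromstring(POM_XML)
namespace = default_namespace(root)

# Prepend the '{uri}' string to each tag name in the search path.
versions = [el.text for el in root.findall(namespace + "version")]
print(versions)
```

Because the helper returns an empty string for an un-namespaced root, the same findall call works for documents with and without a default namespace.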

Author by

Robert Fraser

Updated on September 16, 2022

Comments

  • Robert Fraser
    Robert Fraser over 1 year

    Is there a way to define the default/unprefixed namespace in python ElementTree? This doesn't seem to work...

    ns = {"":"http://maven.apache.org/POM/4.0.0"}
    pom = xml.etree.ElementTree.parse("pom.xml")
    print(pom.findall("version", ns))
    

    Nor does this:

    ns = {None:"http://maven.apache.org/POM/4.0.0"}
    pom = xml.etree.ElementTree.parse("pom.xml")
    print(pom.findall("version", ns))
    

    This does, but then I have to prefix every element:

    ns = {"mvn":"http://maven.apache.org/POM/4.0.0"}
    pom = xml.etree.ElementTree.parse("pom.xml")
    print(pom.findall("mvn:version", ns))
    

    Using Python 3.5 on OSX.

    EDIT: if the answer is "no", you can still get the bounty :-). I just want a definitive "no" from someone who's spent a lot of time using it.