Namespace handling in Groovy's XmlSlurper

Solution 1

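By default, GPath lookups on an XmlSlurper result do not distinguish namespaces: xml.bar matches every bar element, whatever namespace it belongs to. Namespace-qualified lookups can be turned on by declaring the namespaces with the declareNamespace method.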

def str = """ 
<foo xmlns:weird="http://localhost/">
  <bar>sudo </bar>
  <weird:bar>make me a sandwich!</weird:bar>
</foo>
""" 
def xml = new XmlSlurper().parseText(str).declareNamespace('weird':'http://localhost/')
println xml.bar // without namespace awareness, will print "sudo make me a sandwich!"
println xml.':bar' // will only print "sudo"
println xml.'weird:bar' // will only print "make me a sandwich!"

The output is:

sudo make me a sandwich!
sudo
make me a sandwich!

The first println is still not namespace aware, so it matches bar in every namespace. The second println only prints the bar element that has no namespace. If you qualify the element with the prefix, as in the third println, you only get the namespaced element.
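If the namespaces are not known up front, one possible alternative to declareNamespace is to filter the matched nodes on namespaceURI(). This is only a minimal sketch, reusing the sample document from above and assuming Groovy 3 or later (where XmlSlurper lives in groovy.xml); the variable names plainBar and weirdBar are made up for illustration.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default)

def str = '''
<foo xmlns:weird="http://localhost/">
  <bar>sudo </bar>
  <weird:bar>make me a sandwich!</weird:bar>
</foo>
'''

def xml = new XmlSlurper().parseText(str)

// xml.bar matches bar in any namespace; the closure narrows the match by namespace URI.
// Elements without a namespace report an empty string.
def plainBar = xml.bar.findAll { it.namespaceURI() == '' }
def weirdBar = xml.bar.findAll { it.namespaceURI() == 'http://localhost/' }

println plainBar.text()   // text of the un-namespaced <bar> only
println weirdBar.text()   // text of <weird:bar> only
```

The filter compares namespace URIs rather than prefixes, so it does not depend on which prefix a given document happens to use.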

Solution 2

I know this was answered a while ago, but here's an alternative for anyone else facing the same issue. The XmlSlurper class has several constructors, some of which let you specify that you want it to be namespace-aware.

public XmlSlurper(boolean validating, boolean namespaceAware)

Create the slurper by calling new XmlSlurper(false, true). I hope this is useful to others.
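A minimal sketch of that constructor in use, assuming Groovy 3 or later (where the class lives in groovy.xml). The declareNamespace call is kept on purpose, because I am not certain that prefixed lookups work without it on every Groovy version; the variable names are only illustrative.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default)

def str = '''
<foo xmlns:weird="http://localhost/">
  <bar>sudo </bar>
  <weird:bar>make me a sandwich!</weird:bar>
</foo>
'''

// First argument: validating = false; second argument: namespaceAware = true
def slurper = new XmlSlurper(false, true)

// Map the prefix used in the GPath expressions to the namespace URI
def xml = slurper.parseText(str).declareNamespace(weird: 'http://localhost/')

def unqualified = xml.':bar'.text()      // bar without a namespace
def qualified   = xml.'weird:bar'.text() // bar in http://localhost/

println unqualified
println qualified
```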

Author by

codeporn

Updated on June 11, 2022

Comments

  • codeporn
    codeporn almost 2 years

    The situation:

    def str = """
      <foo xmlns:weird="http://localhost/">
        <bar>sudo </bar>
        <weird:bar>make me a sandwich!</weird:bar>
      </foo>
    """
    def xml = new XmlSlurper().parseText(str)
    println xml.bar
    

    The output of this snippet is

    # sudo make me a sandwich!
    

    It seems like the parser merges the contents of <bar> and <weird:bar>.

    Is this behavior desired and if yes, how can I avoid this and select only <bar> or <weird:bar>?

  • codeporn
    codeporn over 12 years
    Thank you, that explains a few things :) I'm parsing different xmls and I do not know which namespaces they are using; is there any possibility to achieve the output of println xml.':bar' apart from parsing the root element for namespaces and declaring them?
  • jalopaba
    jalopaba over 8 years
    I think that XmlSlurper is namespace aware by default, since new XmlSlurper() -> new XmlSlurper(false, true): stackoverflow.com/questions/33418826/…
  • Umopepisdn
    Umopepisdn almost 8 years
    Ugh. For some reason, this isn't working for me. I'm parsing an RSS document with namespaces declared in the headers. I can access rss.channel.link but rss.channel.'atom:link'.text() and rss.channel.'atom:link'[email protected]() all return blank values. :(
  • Umopepisdn
    Umopepisdn almost 8 years
    I fixed it with .declareNamespace('atom':'http://www.w3.org/2005/Atom'). But argh, why should I have to do this when the header is in the xmlns element in the response?
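
The RSS case from the last two comments can be sketched as follows. The feed below is a made-up example, not the commenter's actual document, and the sketch assumes Groovy 3 or later. The URI passed to declareNamespace has to match the feed's xmlns:atom value exactly, which for Atom is http://www.w3.org/2005/Atom.

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default)

// Hypothetical feed, for illustration only
def rssText = '''<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <link>http://example.com/</link>
    <atom:link href="http://example.com/feed.xml" rel="self"/>
  </channel>
</rss>'''

def rss = new XmlSlurper().parseText(rssText)
        .declareNamespace(atom: 'http://www.w3.org/2005/Atom')

def plainLink = rss.channel.':link'.text()           // <link> without a namespace
def atomHref  = rss.channel.'atom:link'.@href.text() // href attribute of <atom:link>

println plainLink
println atomHref
```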