Ruby XPath to find Attribute
Solution 1
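Your starting point would be REXML.
The "challenge" here is treating an attribute node as a child node, so that a relative path such as ../@value can be resolved from it. This can be done by defining a singleton method, parent, on each attribute node that returns its owning element; everything else then follows naturally: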
require "rexml/document"
include REXML # so that we don't have to prefix everything with REXML::...

def get_pair(xml_doc, key, value)
  XPath.each(xml_doc, key) do |node|
    if node.is_a?(Attribute)
      # Give the attribute node a parent so that "../@value" can resolve from it.
      def node.parent
        self.element
      end
    end
    puts "\"#{node}\" \"#{XPath.first(node, value)}\""
  end
end
xml_doc = Document.new <<EOF
<root>
<add key="A" value="B" />
<add key="C" value="D" />
<add foo="E" bar="F" />
</root>
EOF
get_pair xml_doc, "//*/@key", "../@value"
get_pair xml_doc, "//*/@foo", "../@bar"
produces:
"A" "B"
"C" "D"
"E" "F"
Solution 2
Nokogiri is reportedly the fastest Ruby XML parser; see http://www.rubyinside.com/nokogiri-ruby-html-parser-and-xml-parser-1288.html
I was using it today and it's great.
For your example:
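require "nokogiri"
doc = Nokogiri::XML(your_xml) # your_xml: a String containing the XML
# Select only the <add> elements that actually have a key attribute.
doc.xpath("/root/add[@key]").each do |add|
  puts "\"#{add['key']}\" \"#{add['value']}\""
end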
Edit: Unsurprisingly, it turns out that the claim that Nokogiri is faster is not uncontroversial.
However, we have found it more stable than libxml in our production environment (libxml was occasionally crashing; just swapping in Nokogiri solved the issue).
Solution 3
If you will be parsing a decent amount of data anywhere performance matters, you will want libxml-ruby. REXML and Hpricot are good, but I recently switched to libxml-ruby on my own server for some parsing work because it was about 1200% faster.
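Speed-ups like this depend heavily on the documents and queries involved, so it may be worth timing your own workload before switching parsers. The sketch below is only a starting point under stated assumptions: it times REXML alone using the standard benchmark library, the helper names build_sample_xml and time_rexml are hypothetical, and the synthetic document is not representative of real data. To compare another parser, you would wrap the equivalent parse-and-query code in the same Benchmark.realtime block.

```ruby
require "benchmark"
require "rexml/document"

# Hypothetical helper: builds a synthetic document with `count` <add> elements.
def build_sample_xml(count)
  rows = (1..count).map { |i| %(<add key="k#{i}" value="v#{i}" />) }
  "<root>\n#{rows.join("\n")}\n</root>"
end

# Hypothetical helper: parses `xml` with REXML, counts the <add> elements that
# have a key attribute, and returns [pair_count, elapsed_seconds].
def time_rexml(xml)
  pair_count = 0
  seconds = Benchmark.realtime do
    doc = REXML::Document.new(xml)
    REXML::XPath.each(doc, "/root/add") do |element|
      pair_count += 1 if element.attributes["key"]
    end
  end
  [pair_count, seconds]
end

count, seconds = time_rexml(build_sample_xml(1_000))
puts format("REXML: %d pairs in %.3f s", count, seconds)
```

Run it several times and on realistic input; a single timing on a small synthetic file says little about production behavior.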
Comments
-
Hadeel Fouad about 2 years
What Ruby library can be used to select an attribute using XPath and then use it as the starting point for other XPath queries?
Example:
<root>
  <add key="A" value="B" />
  <add key="C" value="D" />
  <add foo="E" bar="F" />
</root>
Desired code:
get_pair "//*/@key", "../@value"
get_pair "//*/@foo", "../@bar"
Expected output:
"A" "B"
"C" "D"
"E" "F"
Pseudo implementation:
def get_pair(key, value)
  xml_doc.select[key].each do |a|
    puts [a, a.select[value]]
  end
end
-
oligan over 15 years
It's described as "slightly slower than libxml-ruby" in the tenderlovemaking.com/2008/10/30/nokogiri-is-released comments section.