Perl, how to parse XML file, xpath
Solution 1
This review points out that XML::XPath hasn't been updated since 2003, and recommends XML::LibXML instead:
use 5.010;
use strict;
use warnings;
use XML::LibXML;
my $dom = XML::LibXML->new->parse_file('data.xml');
for my $node ($dom->findnodes('/category/event/@name')) {
    say $node->toString;
}
See XML::LibXML::Parser and XML::LibXML::Node.
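If only the attribute values are wanted rather than the serialized name="value" form, to_literal (the method settled on in the comments) can replace toString. The following is a minimal sketch under stated assumptions: the inline sample document and the helper name event_names are made up for illustration and are not part of the original answer.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Sample document assumed for illustration; it mirrors the structure
# that the XPath expression in the answer expects.
my $xml = <<'XML';
<category name="a">
  <event name="cat1" />
  <event name="cat2" />
</category>
XML

# Hypothetical helper: return the value of every /category/event/@name
# attribute, one list element per attribute.
sub event_names {
    my ($source) = @_;
    my $dom = XML::LibXML->load_xml(string => $source);
    return map { $_->to_literal } $dom->findnodes('/category/event/@name');
}

# Each value is now a separate item that can be tested on its own.
say for event_names($xml);
```

Because findnodes returns a list in list context, each value arrives as a separate element and can be tested individually.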
Solution 2
The find method returns an XML::XPath::NodeSet object which is a collection of all the nodes found. I can't imagine what you can have done to see one long string with all of the attribute values.
Having retrieved the set of nodes, you work on its contents with methods like size, get_node and get_nodelist (see the docs I've linked above). get_nodelist will return a Perl list of, in this case, XML::XPath::Node::Attribute objects which also have their own methods. This program should get you started:
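use strict;
use warnings;
use feature 'say';
use XML::XPath;
# Read the XML from the __DATA__ section at the end of this file
my $xp = XML::XPath->new(ioref => \*DATA);
my $names = $xp->find('/category/event/@name');
# get_nodelist returns a Perl list of attribute nodes
for my $node ($names->get_nodelist) {
    say $node->getNodeValue;
}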
__DATA__
<category name="a">
<event name="cat1" />
<event name="cat2" />
<event name="cat3" />
<event name="cat4" />
<event name="cat5" />
</category>
OUTPUT
cat1
cat2
cat3
cat4
cat5
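The other two methods mentioned, size and get_node, can be used to walk the node set by position and collect the values into an ordinary array for testing. This is only a sketch: the helper name attribute_values, the inline sample XML, and the regex test are assumptions for illustration. Printing the node set directly, as in the question, appears to stringify it by joining every node's string value, which would explain the single long string.

```perl
use strict;
use warnings;
use feature 'say';
use XML::XPath;

# Sample document assumed for illustration.
my $xml = <<'XML';
<category name="a">
<event name="cat1" />
<event name="cat2" />
<event name="cat3" />
</category>
XML

# Hypothetical helper: collect each matched attribute value into a Perl list.
sub attribute_values {
    my ($source, $xpath) = @_;
    my $xp      = XML::XPath->new(xml => $source);
    my $nodeset = $xp->find($xpath);
    my @values;
    # get_node counts from 1, following XPath position numbering.
    for my $i (1 .. $nodeset->size) {
        push @values, $nodeset->get_node($i)->getNodeValue;
    }
    return @values;
}

my @names = attribute_values($xml, '/category/event/@name');
for my $name (@names) {
    # Hypothetical per-value test; replace with whatever check is needed.
    say "$name matches" if $name =~ /^cat[12]$/;
}
```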
liverpaul
Updated on February 03, 2020
Comments
-
liverpaul over 4 years
I want to parse an XML file using Perl. I was able to do it using the XML::Simple module, but now I want to start using the XML::XPath module instead because it uses XPath expressions. From my limited knowledge I think XPaths will make future parsing easier, right? Here's the Perl code I have so far:
use strict;
use warnings;
use XML::XPath;
my $file = "data.xml";
my $path = XML::XPath->new(filename => $file);
my $name = $path->find('/category/event/@name');
print $name."\n";
My question is how do I separate each name attribute (category/event/@name) so that I can perform tests on each value I parse. At the moment I'm just getting a big string full of the parsed data, whereas I want several small strings that I can test. How can I do this? Thanks :-)
-
Borodin over 12 years
Are you recommending XML::LibXML because you know it better, or because you think it has a genuine advantage over XML::XPath? As far as I know the latter works fine. It is also pure Perl, which makes it slower than LibXML but usable without the help of an external library.
-
daxim over 12 years
That's a hyperlink up there. Do follow it.
-
liverpaul over 12 years
Thanks for the reply. After reading the link posted by daxim I've decided to use XML::LibXML instead. It seems to be the best one out there, so as a beginner I think it would be better for me to learn a module that is better documented. I appreciate the introduction info you wrote; it helped me understand things a bit better :-)
-
liverpaul over 12 years
@daxim Thanks for the reply. I tried that and it worked, but not 100% the way I wanted. My output is name="attribute_value", but I want just attribute_value. Is there a way to output just the attribute_value without the name=""?
-
liverpaul over 12 years
After a bit more research I found that changing the line $node->toString to $node->to_literal gives me output of just the attribute_value with no name="". This is what I wanted. If this is a bad way to do things, please tell me, otherwise my question is answered. Thanks again for the help :-)
-
daxim over 12 years
No, calling the documented method to_literal is not a bad thing. Please mark the answer as accepted.
-
Venkatesh almost 9 years
Using XML::XPath, can we use ^ or * inside a path? For example: my $names = $xp->find('/category/eve*'); that is, inside category, search for tags whose names start with eve.