Perl script to parse XML using XML::LibXML


Solution 1

You have the right method for getting the tag names; you just need an extra loop to run through the tags inside each <Sample>:

#!/usr/bin/perl

use strict;
use warnings;

use XML::LibXML;

my $filename = "data.xml";

my $parser = XML::LibXML->new();
my $xmldoc = $parser->parse_file($filename);

for my $sample ($xmldoc->findnodes('/Statistics/Stats/Sample')) {
    for my $property ($sample->findnodes('./*')) {
        print $property->nodeName(), ": ", $property->textContent(), "\n";
    }
    print "\n";
}

Edit: I have now created a tutorial site called Perl XML::LibXML by Example which answers exactly this type of question.
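
Since the stated goal is a CSV file and the tags differ between samples, one option is to collect the union of tag names first and then emit one row per sample, leaving missing fields empty. The following is only a sketch under some assumptions: the helper name samples_to_csv is made up for illustration, the XML is a shortened copy of the question's data embedded as a string, and the comma join does no quoting, so a module such as Text::CSV would be the safer choice for real data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical helper: returns CSV lines (header first) for every <Sample>.
sub samples_to_csv {
    my ($xmldoc) = @_;

    my (@columns, %seen, @rows);
    for my $sample ($xmldoc->findnodes('/Statistics/Stats/Sample')) {
        my %row;
        for my $property ($sample->findnodes('./*')) {
            my $name = $property->nodeName();
            # Remember each tag name once, in order of first appearance.
            push @columns, $name unless $seen{$name}++;
            $row{$name} = $property->textContent();
        }
        push @rows, \%row;
    }

    # Naive join: assumes values contain no commas, quotes, or newlines.
    my @lines = (join(',', @columns));
    for my $row (@rows) {
        push @lines,
            join(',', map { defined $row->{$_} ? $row->{$_} : '' } @columns);
    }
    return @lines;
}

# Shortened sample data (assumption: same layout as the question's file).
my $xml = <<'XML';
<Statistics>
  <Stats>
    <Sample>
      <Name>System1</Name>
      <Type>IBM</Type>
      <Memory>2GB</Memory>
    </Sample>
    <Sample>
      <Name>System2</Name>
      <Type>Intel</Type>
      <Disks>2</Disks>
    </Sample>
  </Stats>
</Statistics>
XML

my $xmldoc = XML::LibXML->load_xml(string => $xml);
print "$_\n" for samples_to_csv($xmldoc);
```

With this input the script prints the header Name,Type,Memory,Disks followed by System1,IBM,2GB, and System2,Intel,,2. The column order simply follows the order in which each tag is first seen.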

Solution 2

You need to iterate over the children of each Sample node:

for my $sample ( $xmldoc->findnodes('/Statistics/Stats/Sample') ) {
    print $sample->nodeName(), "\n";
    foreach my $child ( $sample->getChildnodes ) {
        if ( $child->nodeType() == XML_ELEMENT_NODE ) {
            print "\t", $child->nodeName(), ":", $child->textContent(), "\n";
        }
    }
}

This prints:

Sample
        Name:System1
        Type:IBM
        Memory:2GB
        StartTime:2012-04-26T14:30:01Z
        EndTime:2012-04-26T14:45:01Z
Sample
        Name:System2
        Type:Intel
        Disks:2
        StartTime:2012-04-26T15:30:01Z
        EndTime:2012-04-26T15:45:01Z
        Video:1
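
The nodeType check in the loop above matters because the whitespace between tags is itself stored as text nodes among the children. A small sketch of the difference, assuming a cut-down <Sample> embedded as a string; the helper name child_names is hypothetical and only there for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical helper: returns the names of all child nodes, then only
# those of element children, as two array references.
sub child_names {
    my ($node) = @_;
    my @all      = map { $_->nodeName() } $node->childNodes();
    my @elements = map { $_->nodeName() }
                   grep { $_->nodeType() == XML_ELEMENT_NODE }
                   $node->childNodes();
    return (\@all, \@elements);
}

my $doc = XML::LibXML->load_xml(string => <<'XML');
<Sample>
  <Name>System1</Name>
  <Type>IBM</Type>
</Sample>
XML

my ($all, $elements) = child_names($doc->documentElement());
print "all children:  @$all\n";
print "elements only: @$elements\n";

# An XPath child step of '*' selects element children only.
my @via_xpath = map { $_->nodeName() } $doc->documentElement()->findnodes('*');
print "via XPath '*': @via_xpath\n";
```

Here the unfiltered list is #text Name #text Type #text, while both the nodeType filter and the XPath '*' step yield just Name Type.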
Author by

Admin

Updated on July 09, 2022

Comments

  • Admin
    Admin almost 2 years

    I think this is a very simple issue, but I cannot figure it out despite many searches.

    I am trying to parse the following XML to print something similar to TAG=VALUE, so that I can write this to a CSV file. The problem is the tags are not always the same for each sample. I cannot seem to figure out how to get the actual tag names. Any help appreciated!!!

    XML File -

    <Statistics>
      <Stats>
        <Sample>
            <Name>System1</Name>
            <Type>IBM</Type>
            <Memory>2GB</Memory>
            <StartTime>2012-04-26T14:30:01Z</StartTime>
            <EndTime>2012-04-26T14:45:01Z</EndTime>
        </Sample>
    
        <Sample>
            <Name>System2</Name>
            <Type>Intel</Type>
            <Disks>2</Disks>
            <StartTime>2012-04-26T15:30:01Z</StartTime>
            <EndTime>2012-04-26T15:45:01Z</EndTime>
            <Video>1</Video>
        </Sample>
      </Stats>
    </Statistics>
    

    Script -

    #!/usr/bin/perl
    use XML::LibXML;
    
    $filename = "data.xml";
    
    my $parser = XML::LibXML->new();
    my $xmldoc = $parser->parse_file($filename);
    
    for my $sample ($xmldoc->findnodes('/Statistics/Stats/Sample')) {
    
    print $sample->nodeName(), ": ", $sample->textContent(), "\n";
    
    }
    
  • ikegami
    ikegami almost 12 years
    Lines 3, 4 and 6 (the getChildnodes loop, the nodeType check, and that check's closing brace) can be replaced with: foreach my $child ($sample->findnodes('*')) {