Split XML file into multiple files
First off - I'll say I think it's quite a bad idea to do XML parsing with anything other than an XML parser. Regular expressions may look like they're going to work, but they're a really good way to end up with brittle code - XML that's semantically equivalent can be written in ways a regex treats differently (different indentation or linefeeds, or self-closing tags versus separate open and close tags).
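As a small illustration of that brittleness, here is a minimal sketch in plain Perl. The helper name `naive_has_empty_mm` and both sample strings are made up for this example; the pattern is deliberately written against one spelling of an empty `<mm>` element only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: a naive pattern written against one spelling only.
sub naive_has_empty_mm {
    my ($xml) = @_;
    return $xml =~ m{<unix><mm /></unix>} ? 1 : 0;
}

# Two spellings of the same document: an empty <mm> inside <unix>.
my $self_closing = "<unix><mm /></unix>";
my $open_close   = "<unix>\n  <mm></mm>\n</unix>";

printf "self-closing: %s\n", naive_has_empty_mm($self_closing) ? "matched" : "missed";
printf "open/close:   %s\n", naive_has_empty_mm($open_close)   ? "matched" : "missed";
```

The first string matches and the second does not, even though an XML parser would report the same element structure for both. You can keep patching the pattern for whitespace and tag style, but each patch is another special case to maintain.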
So with that in mind - I would use Perl and the XML::Twig library. This is a pretty standard thing - prebuilt packages are widely available.
However, perhaps most importantly of all - the XML you have posted is NOT valid: it has no single root element, and the <mm> and <nn> tags are never closed. I'm going to assume that's because it's a sample, and not the real XML, and so you've missed a bit off. I'm using as my sample:
<root>
<unix>
<mm />
</unix>
<osx>
<nn />
</osx>
</root>
And using this code will do what you ask for:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new( 'pretty_print' => 'indented' );
$twig->parsefile("your_xml.xml");
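foreach my $element ( $twig->root->children ) {
    my $tag = $element->tag;
    print "Processing $tag\n";

    # print to STDOUT for debugging
    print $element->sprint;

    # write the element to its own file, named after its tag;
    # stop here if the file cannot be opened rather than printing to a bad handle
    open( my $output, ">", "$tag.xml" ) or die "Cannot open $tag.xml: $!";
    print {$output} $element->sprint;
    close($output) or warn "Cannot close $tag.xml: $!";
}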
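One caveat with naming files after the tag: if the root has two children with the same tag, the second will overwrite the first. Here is a rough sketch of one way around that, assuming a numbered suffix is acceptable. The helper `output_name`, the `_2` suffix scheme, and the inline sample with a second `<unix>` element are all made up for illustration and are not part of the original question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: first <unix> -> unix.xml, second -> unix_2.xml, ...
sub output_name {
    my ( $tag, $seen ) = @_;
    my $count = ++$seen->{$tag};
    return $count == 1 ? "$tag.xml" : "${tag}_$count.xml";
}

my $twig = XML::Twig->new( pretty_print => 'indented' );

# Inline sample (an assumption for this sketch) with a repeated <unix> child.
$twig->parse('<root><unix><mm /></unix><osx><nn /></osx><unix><kk /></unix></root>');

my %seen;    # how many times each tag has been written so far
foreach my $element ( $twig->root->children ) {
    my $file = output_name( $element->tag, \%seen );
    open( my $output, ">", $file ) or die "Cannot open $file: $!";
    print {$output} $element->sprint;
    close($output) or die "Cannot close $file: $!";
    print "Wrote $file\n";
}
```

With this sample it should write unix.xml, osx.xml and unix_2.xml into the current directory. If your real file never repeats a top-level tag, the simpler version above is all you need.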
If, of course, the XML you posted is literally what you have, then it is broken XML and you should ideally go and hit whoever gave it to you with a rolled-up copy of the spec document. If that is impractical due to it being real life, then I would offer you this answer on Stack Overflow: https://stackoverflow.com/a/28913945/2566198
DisplayName
Updated on September 18, 2022
Comments
-
DisplayName over 1 year
I have an XML file that has different nodes, and I want to split it into files like this:
<unix> <mm> </unix> <osx> <nn> </osx>
When I run the script I want it to make one XML file called unix.xml, which contains this:
<unix> <mm> </unix>
And then another XML file called osx.xml, which contains this:
<osx> <nn> </osx>
-
minorcaseDev over 9 years
This is not valid XML. An XML file has one root tag.