Scripting: what is the easiest way to extract a value in a tag of an XML file?
Solution 1
xml2 can convert xml to/from line-oriented format:
xml2 < pom.xml | grep /project/version= | sed 's/.*=//'
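For readers without xml2 installed, here is a rough Python sketch of the same idea: flatten the document into path=value lines, then filter on the exact path. The flatten helper and SAMPLE string are hypothetical names made up for illustration, and the output only approximates what xml2 prints (attributes and namespaces are ignored).

```python
import xml.etree.ElementTree as ET

def flatten(elem, prefix=""):
    """Yield 'path=text' lines, loosely imitating xml2's line-oriented output."""
    name = elem.tag.split("}")[-1]  # drop the "{namespace}" part of the tag
    path = prefix + "/" + name
    text = (elem.text or "").strip()
    if text:
        yield path + "=" + text
    for child in elem:
        yield from flatten(child, path)

# Hypothetical cut-down pom.xml, inlined so the sketch runs on its own.
SAMPLE = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <version>1.0.74-SNAPSHOT</version>
  <dependencies>
    <dependency><version>6.05-26023</version></dependency>
  </dependencies>
</project>"""

lines = list(flatten(ET.fromstring(SAMPLE)))
# Equivalent of: grep /project/version= | sed 's/.*=//'
versions = [l.split("=", 1)[1] for l in lines if l.startswith("/project/version=")]
print(versions)  # ['1.0.74-SNAPSHOT']
```

Because the filter matches the full path, the version nested under dependencies is not picked up.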
Solution 2
Another way: xmlgrep with XPath:
xmlgrep --text_only '/project/version' pom.xml
Disadvantage: slow
Solution 3
Using Python
$ python -c 'from xml.etree.ElementTree import ElementTree; print(ElementTree(file="pom.xml").findtext("{http://maven.apache.org/POM/4.0.0}version"))'
1.0.74-SNAPSHOT
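The same lookup as a small script may be easier to read than the one-liner. This is a minimal sketch: project_version and SAMPLE are hypothetical names, and the XML is a cut-down stand-in for the real pom.xml.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"

def project_version(pom_text):
    """Return the text of /project/version, ignoring nested <version> tags."""
    root = ET.fromstring(pom_text)
    # A bare tag name in findtext() matches only direct children of root,
    # so <version> elements inside <dependencies> are never considered.
    return root.findtext("{%s}version" % POM_NS)

# Hypothetical cut-down pom.xml, inlined so the sketch runs on its own.
SAMPLE = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <version>1.0.74-SNAPSHOT</version>
  <dependencies>
    <dependency><version>6.05-26023</version></dependency>
  </dependencies>
</project>"""

print(project_version(SAMPLE))  # 1.0.74-SNAPSHOT
```

For a file on disk, ET.parse("pom.xml").getroot() gives the same kind of root element.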
Using xmlstarlet
$ xml sel -N x="http://maven.apache.org/POM/4.0.0" -t -m 'x:project/x:version' -v . pom.xml
1.0.74-SNAPSHOT
Using xmllint
$ echo -e 'setns x=http://maven.apache.org/POM/4.0.0\ncat /x:project/x:version/text()' | xmllint --shell pom.xml | grep -v /
1.0.74-SNAPSHOT
Solution 4
Clojure way. Requires only a JVM and the Clojure jar file:
java -cp clojure.jar clojure.main -e "(use 'clojure.xml) (->> (java.io.File. \"pom.xml\") (clojure.xml/parse) (:content) (filter #(= (:tag %) :version)) (first) (:content) (first) (println))"
Scala way:
java -Xbootclasspath/a:scala-library.jar -cp scala-compiler.jar scala.tools.nsc.MainGenericRunner -e 'import scala.xml._; println((XML.load(new java.io.FileInputStream("pom.xml")) match { case <project>{children @ _*}</project> => for (i <- children if (i match { case <version>{children @ _*}</version> => true; case _ => false; })) yield i })(0) match { case <version>{Text(x)}</version> => x })'
Groovy way:
java -classpath groovy-all.jar groovy.ui.GroovyMain -e 'println (new XmlParser().parse(new File("pom.xml")).value().findAll({ it.name().getLocalPart()=="version" }).first().value().first())'
Solution 5
Here's an alternative in Perl:
$ perl -MXML::Simple -e'print XMLin("pom.xml")->{version}."\n"'
1.0.74-SNAPSHOT
It works with the revised/extended example in the question, which has multiple "version" elements at different depths.
Comments
-
Anthony Kong over 1 year
I want to read a pom.xml ('Project Object Model' of Maven) and extract the version information. Here is an example:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany</groupId>
  <artifactId>project-parent</artifactId>
  <name>project-parent</name>
  <version>1.0.74-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>com.sybase.jconnect</groupId>
      <artifactId>jconnect</artifactId>
      <version>6.05-26023</version>
    </dependency>
    <dependency>
      <groupId>joda-time</groupId>
      <artifactId>joda-time</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>com.sun.jdmk</groupId>
      <artifactId>jmxtools</artifactId>
      <version>1.2.1</version>
    </dependency>
    <dependency>
      <groupId>org.easymock</groupId>
      <artifactId>easymock</artifactId>
      <version>2.4</version>
    </dependency>
  </dependencies>
</project>
How can I extract the version '1.0.74-SNAPSHOT' from the above?
I would love to be able to do this with simple bash scripting (sed or awk). Otherwise, a simple Python script is preferred.
EDIT
Constraint
The Linux box is in a corporate environment, so I can only use tools that are already installed (I can request a utility such as xml2, but I would have to go through a lot of red tape). Some of the solutions are very good (I have learned a few new tricks already), but they may not be applicable in this restricted environment.
Updated XML listing
I added the dependencies tag to the original listing. This shows that some hacky solutions may not work in this case.
Distro
The distro I am using is RHEL4
-
bbaja42 over 12 years
Is this stackoverflow.com/questions/29004/… sufficient?
-
Anthony Kong over 12 years
Not really. There are a lot of version tags in the XML (e.g. under the dependencies tag). I only want '/project/version'.
-
Vi. over 12 years
Which XML-related tools and libraries are available? Are JVM-based solutions OK?
-
Anthony Kong over 12 years
As far as I can tell, xml2, xmlgrep and the Perl XML module are not present. Most Unix command-line utilities are present. The distro is Red Hat EL 4.
-
JStrahl over 11 years
(I couldn't add a comment, so I have to reply as an answer, which is somewhat overkill.) Some great answers can be found here: stackoverflow.com/questions/2735548/…
-
Ciro Santilli Путлер Капут 六四事 over 8 years
-
Vi. over 12 years
Slow (although faster than xmlgrep).
-
Anthony Kong over 12 years
Thanks for the suggestion, but unfortunately it will not return what I want. Please see the updated pom model.
-
Vi. over 12 years
Returns "1.0.74-SNAPSHOT". Note that I changed the script after reading about the multiple <version> elements.
-
Vi. over 12 years
Note: this solution is provided "just for fun" and is not intended for use in an actual product. Better to use the xml2, xmlgrep or XML::Simple solution.
-
Anthony Kong over 12 years
Thanks! Even though it is 'just for fun', it is probably the 'most suitable' solution so far because it has the minimum number of dependencies: it only requires Perl ;-)
-
Vi. over 12 years
What about doing it from Java? Using pom files implies having a JVM installed.
-
Anthony Kong over 12 years
The background is that I am building a SIT (system integration test) script around the existing Maven process. Part of it requires knowing the version of the Maven project. I really want to keep it simple, and scripting is the way to go.
-
Anthony Kong over 12 years
This is awesome! Great idea!
-
David H over 12 years
If xsltproc is on your system, and it probably is since libxslt is on RHEL4, then you can use it with the above stylesheet to output the tag, i.e. xsltproc x.xsl pom.xml.
-
kev over 12 years
cat (//x:version)[1]/text() also works when using xmllint!
-
Vi. over 12 years
Relies on the absence of parameters in elements and on extra <version> elements appearing only inside dependencies.
-
Simon Sheehan over 12 years
What exactly does this script do?
-
Samus_ over 12 years
It loads the XML as a DOM structure using Python's minidom implementation: docs.python.org/library/xml.dom.minidom.html. The idea is to grab the <project> tag, which is unique, and then iterate over its child nodes (direct children only) to find the <version> tag we're looking for, rather than other tags with the same name in other places.
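The script being discussed is not reproduced on this page, so the following is only a sketch of the approach as described in the comment above, not the original answer. The function name direct_child_version and the SAMPLE string are made up for illustration.

```python
from xml.dom import minidom

def direct_child_version(pom_text):
    """Find <version> among the direct children of the unique <project> tag."""
    doc = minidom.parseString(pom_text)
    project = doc.getElementsByTagName("project")[0]
    for node in project.childNodes:
        # childNodes also contains whitespace text nodes, so keep elements only.
        if node.nodeType == node.ELEMENT_NODE and node.tagName == "version":
            return node.firstChild.data
    return None

# Hypothetical cut-down pom.xml, inlined so the sketch runs on its own.
SAMPLE = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency><version>6.05-26023</version></dependency>
  </dependencies>
  <version>1.0.74-SNAPSHOT</version>
</project>"""

print(direct_child_version(SAMPLE))  # 1.0.74-SNAPSHOT
```

The dependency's version comes first in this sample on purpose: because only direct children of <project> are inspected, document order does not matter.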
-
fixer1234 over 8 years
Can you expand your answer to explain this? Thanks.
-
GAD3R about 5 years
The command has been updated to xml_grep.
-
Charlweed almost 5 years
PowerShell is now open source and runs on Linux and other platforms. We use it for building in preference to bash, cygwin and ming64.
-
SMerrill8 about 4 years
This does appear to work, but beware: what it does is set the field separator (FS) to the set of characters < and >; then it finds all lines with the word "packaging" in them and gives you the third field.
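The awk command this comment refers to is not shown on this page. As an illustration of the behaviour it describes (and of why to beware), here is a Python sketch of the same splitting logic; the awk command in the docstring is an assumed reconstruction, and awk_third_field and SAMPLE_LINES are hypothetical names.

```python
import re

def awk_third_field(lines, word="packaging"):
    """Mimic: awk -F'[<>]' '/packaging/ {print $3}' (an assumed command)."""
    results = []
    for line in lines:
        if word in line:
            # Split on every < or >, as awk does with FS set to [<>].
            fields = re.split(r"[<>]", line)
            # awk's $3 is the third field, i.e. index 2; empty if missing.
            results.append(fields[2] if len(fields) > 2 else "")
    return results

SAMPLE_LINES = [
    "<project>",
    "  <packaging>jar</packaging>",
    "  <name>my packaging tool</name>",
]
print(awk_third_field(SAMPLE_LINES))  # ['jar', 'my packaging tool']
```

The second result shows the pitfall: any line that merely contains the word is matched, whether or not the tag is the one you wanted.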
-
user5249203 over 3 years
xml2 can be found at github.com/clone/xml2; its original website etc. has disappeared.