XPath with regex match on an attribute value
Solution 1
I'm trying to get the total number of event nodes that contain the text ' doubles ' in the value of the description attribute.
matches() is a standard XPath 2.0 function. It is not available in XPath 1.0.
In XPath 1.0 you can use contains() instead:
count(/*/*/event[contains(@description, ' doubles ')])
To verify this, here is a small XSLT transformation which just outputs the result of evaluating the above XPath expression on the provided XML document:
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select=
    "count(/*/*/event[contains(@description, ' doubles ')])"/>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<game id="2009/05/02/arimlb-milmlb-1" pk="244539">
  <team id="109" name="Arizona" home_team="false">
    <event number="9" inning="1" description="Felipe Lopez doubles to left fielder Chris Duffy. "/>
    <event number="15" inning="1" description="Augie Ojeda flies out to center fielder Mike Cameron. "/>
    <event number="23" inning="1" description="Chad Tracy doubles to right fielder Joe Sanchez. "/>
    <event number="52" inning="2" description="Mark Reynolds lines out to left fielder Chris Duffy. "/>
    <!-- more data here -->
  </team>
</game>
the wanted, correct result is produced:
2
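Since the asker mentions Java 1.6, the same XPath 1.0 expression can also be evaluated with the standard javax.xml.xpath API instead of XSLT. The following is only a minimal sketch: the class name DoublesCount, the method countDoubles and the inlined SAMPLE_XML constant are illustrative names, not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class DoublesCount {
    // The four events shown in the question, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<game id=\"2009/05/02/arimlb-milmlb-1\" pk=\"244539\">"
      + "<team id=\"109\" name=\"Arizona\" home_team=\"false\">"
      + "<event number=\"9\" inning=\"1\" description=\"Felipe Lopez doubles to left fielder Chris Duffy. \"/>"
      + "<event number=\"15\" inning=\"1\" description=\"Augie Ojeda flies out to center fielder Mike Cameron. \"/>"
      + "<event number=\"23\" inning=\"1\" description=\"Chad Tracy doubles to right fielder Joe Sanchez. \"/>"
      + "<event number=\"52\" inning=\"2\" description=\"Mark Reynolds lines out to left fielder Chris Duffy. \"/>"
      + "</team></game>";

    // Evaluates the XPath 1.0 count() expression and returns the result as an int.
    static int countDoubles(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "count(/*/*/event[contains(@description, ' doubles ')])";
        // count() yields a number, so the result is requested as XPathConstants.NUMBER.
        Double result = (Double) xpath.evaluate(
            expr, new InputSource(new StringReader(xml)), XPathConstants.NUMBER);
        return result.intValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countDoubles(SAMPLE_XML));
    }
}
```

For the sample data this should print 2, matching the XSLT result above.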
Solution 2
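Try the following variants. The first two use matches() and therefore need an XPath 2.0 engine; the third uses contains() and also works in XPath 1.0: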
/game/team/event[matches(@description, ' doubles ')]/@description
/game/team/event[matches(@description, '^.*?doubles.*$')]/@description
/game/team/event[contains(@description, ' doubles ')]/@description
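Note that these expressions select attribute nodes, not a number. With an XPath 1.0 API such as Java's javax.xml.xpath, that probably means the result has to be requested as a node-set; asking for a number converts the first selected attribute's text to a number, which is a plausible source of a NaN result. A minimal sketch of the difference, assuming the standard JAXP classes (DescriptionNodes, selectDescriptions and evaluateAsNumber are hypothetical names chosen for illustration):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescriptionNodes {
    // The contains() variant, which is valid XPath 1.0.
    static final String EXPR =
        "/game/team/event[contains(@description, ' doubles ')]/@description";

    // The four events shown in the question, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<game id=\"2009/05/02/arimlb-milmlb-1\" pk=\"244539\">"
      + "<team id=\"109\" name=\"Arizona\" home_team=\"false\">"
      + "<event number=\"9\" inning=\"1\" description=\"Felipe Lopez doubles to left fielder Chris Duffy. \"/>"
      + "<event number=\"15\" inning=\"1\" description=\"Augie Ojeda flies out to center fielder Mike Cameron. \"/>"
      + "<event number=\"23\" inning=\"1\" description=\"Chad Tracy doubles to right fielder Joe Sanchez. \"/>"
      + "<event number=\"52\" inning=\"2\" description=\"Mark Reynolds lines out to left fielder Chris Duffy. \"/>"
      + "</team></game>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Requests the result as a node-set: one attribute node per matching event.
    static NodeList selectDescriptions(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(EXPR, doc, XPathConstants.NODESET);
    }

    // Requests the same node-set as a number: the first attribute's text is
    // converted to a number, and non-numeric text converts to NaN.
    static double evaluateAsNumber(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Double) xpath.evaluate(EXPR, doc, XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE_XML);
        NodeList attrs = selectDescriptions(doc);
        for (int i = 0; i < attrs.getLength(); i++) {
            System.out.println(attrs.item(i).getNodeValue());
        }
        System.out.println("matches: " + attrs.getLength());
        System.out.println("as number: " + evaluateAsNumber(doc));
    }
}
```

If only the count is needed, wrapping the expression in count() and requesting a number is simpler than taking the length of the node list.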
Solution 3
Since I'm just trying to match a fragment of the value of the description attribute, it's possible to use the XPath 2.0 function 'matches', right?
Yes, as long as you are using an XPath 2.0 engine to evaluate the XPath expression.
If you were to execute that XPath using an XPath 2.0 engine, it would select the appropriate @description attributes.
If so, what am I doing wrong?
If you are using an XPath 2.0 engine, your issue may be that you have selected a sequence of nodes, but are expecting the count.
If you want to return the count of those attributes, you could use the count() function:
count(/game/team/event/@description[matches(.,' doubles ')])
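If no XPath 2.0 engine is at hand, one possible workaround is to select the attributes with plain XPath 1.0 and apply the regular expression in the host language. The sketch below does this in Java with java.util.regex; it is a substitute technique, not an implementation of matches() itself, and the names RegexFilter and countMatching are made up for the example. Matcher.find() is used because, like matches() in XPath 2.0, it succeeds when the pattern matches any substring.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RegexFilter {
    // The four events shown in the question, inlined so the sketch is self-contained.
    static final String SAMPLE_XML =
        "<game id=\"2009/05/02/arimlb-milmlb-1\" pk=\"244539\">"
      + "<team id=\"109\" name=\"Arizona\" home_team=\"false\">"
      + "<event number=\"9\" inning=\"1\" description=\"Felipe Lopez doubles to left fielder Chris Duffy. \"/>"
      + "<event number=\"15\" inning=\"1\" description=\"Augie Ojeda flies out to center fielder Mike Cameron. \"/>"
      + "<event number=\"23\" inning=\"1\" description=\"Chad Tracy doubles to right fielder Joe Sanchez. \"/>"
      + "<event number=\"52\" inning=\"2\" description=\"Mark Reynolds lines out to left fielder Chris Duffy. \"/>"
      + "</team></game>";

    // Counts description attributes whose value contains a match for the regex.
    static int countMatching(String xml, String regex) throws Exception {
        Pattern pattern = Pattern.compile(regex);
        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPath 1.0 only selects the attributes; the regex test happens in Java.
        NodeList attrs = (NodeList) xpath.evaluate(
            "/game/team/event/@description",
            new InputSource(new StringReader(xml)),
            XPathConstants.NODESET);
        int count = 0;
        for (int i = 0; i < attrs.getLength(); i++) {
            if (pattern.matcher(attrs.item(i).getNodeValue()).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countMatching(SAMPLE_XML, " doubles "));
    }
}
```

For a fixed substring such as ' doubles ', contains() in XPath 1.0 remains the simpler choice; this approach only pays off when a real pattern is needed.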
Gabe
Updated on January 23, 2020
Comments
-
Gabe over 4 years
All -
I've searched and tinkered around for hours in an effort to crack this one, but I'm still having problems. I have the XML data below:
<game id="2009/05/02/arimlb-milmlb-1" pk="244539">
  <team id="109" name="Arizona" home_team="false">
    <event number="9" inning="1" description="Felipe Lopez doubles to left fielder Chris Duffy. "/>
    <event number="15" inning="1" description="Augie Ojeda flies out to center fielder Mike Cameron. "/>
    <event number="23" inning="1" description="Chad Tracy doubles to right fielder Joe Sanchez. "/>
    <event number="52" inning="2" description="Mark Reynolds lines out to left fielder Chris Duffy. "/>
    <!-- more data here -->
  </team>
</game>
I'm trying to get the total number of event nodes that contain the text ' doubles ' in the value of the description attribute. This is what I've been trying so far, to no avail (irb throws an error):
"/game/team/event/@description[matches(.,' doubles ')]"
Since I'm just trying to match a fragment of the value of the description attribute, it's possible to use the XPath 2.0 function 'matches', right? If so, what am I doing wrong?
Thanks in advance for any help!
-
Dimitre Novatchev about 13 years
Good question, +1. See my answer for a complete, short and easy one-liner XPath-expression solution :)
-
Gabe about 13 years
Mikecito - I'm using Java 1.6, but I also use Ruby irb for development and debugging purposes with stuff like XPath. That is, I was trying to scan the file using XPath in irb first, and once I had an XPath that got what I wanted, I would transfer it over to my Java code.
-
Gabe about 13 years
Dimitre - The XPath you listed ended up working great. I'll leave a separate comment below your posting.
-
Gabe about 13 years
The first two didn't work when I tried them in Java 1.6, I suspect because XPath 2.0 apparently isn't available. The third one didn't cause any exception, but it didn't appear to map to the desired nodes -- the answer that came back was NaN. But thanks very much anyway.
-
Gabe about 13 years
The matches() function doesn't appear to be available in Java 1.6 (not that I really knew that beforehand) -- it throws an XPathExpressionException. But thanks anyway! I appreciate it.
-
Mads Hansen about 13 years
The standard Java 1.6 libraries don't support XPath 2.0, but Saxon and PsychoPath do.
-
Gabe about 13 years
Dimitre - As I mentioned above, the XPath in your posting worked great. I ended up making it more specific with the following:
"count(/game/team[@home_team='false']/event[contains(@description, ' doubles ')])"
Thanks very much for your help! I really appreciate it. And Java 1.6 doesn't speak XPath 2.0, apparently.