XQuery looking for text with 'single' quote


Solution 1

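Here's a hackaround (Thanks Dimitre Novatchev) that will allow me to search for any text in XPath expressions, whether it contains single quotes, double quotes, or both. It splits the string at every quote character, wraps each piece in the opposite kind of quote, and joins the pieces with concat(). Implemented in JS, but it could easily be translated to other languages.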

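function cleanStringForXpath(str)  {
    // Split into runs without quotes and single quote characters.
    // match() returns null for an empty string, so fall back to [].
    var parts = str.match(/[^'"]+|['"]/g) || [];
    parts = parts.map(function(part){
        if (part === "'")  {
            return '"\'"'; // output "'"
        }

        if (part === '"') {
            return "'\"'"; // output '"'
        }
        return "'" + part + "'";
    });
    // concat() requires at least two arguments, so zero or one part
    // is returned as a plain string literal instead.
    if (parts.length === 0) {
        return "''";
    }
    if (parts.length === 1) {
        return parts[0];
    }
    return "concat(" + parts.join(",") + ")";
}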

If I'm looking for the text I'm reading "Harry Potter", I could do the following:

var xpathString = cleanStringForXpath( "I'm reading \"Harry Potter\"" );
$x("//*[text()="+ xpathString +"]");
// The xpath created becomes 
// //*[text()=concat('I',"'",'m reading ','"','Harry Potter','"')]
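Note that $x() is a console utility that only exists in the browser's developer tools. In page or test code, the standard DOM method document.evaluate() can run the same expression. The following is a minimal sketch under that assumption; the names toXPathString, buildExactTextXPath and findByExactText are hypothetical, and the escaping here uses the simpler "always concat, with a trailing ''" form rather than the function above.

```javascript
// Hypothetical helper: always emits concat(...). The trailing '' guarantees
// that concat() receives at least two arguments, even without any quotes.
function toXPathString(str) {
    return "concat('" + str.replace(/'/g, "',\"'\",'") + "','')";
}

// Builds the expression only, so it can be logged or inspected separately.
function buildExactTextXPath(text) {
    return "//*[text()=" + toXPathString(text) + "]";
}

// Assumes `doc` is a browser Document (for example window.document).
function findByExactText(doc, text) {
    var snapshot = doc.evaluate(
        buildExactTextXPath(text),
        doc,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    );
    var nodes = [];
    for (var i = 0; i < snapshot.snapshotLength; i++) {
        nodes.push(snapshot.snapshotItem(i));
    }
    return nodes;
}
```

In a browser, findByExactText(document, "I'm reading \"Harry Potter\"") would return an array of matching elements, comparable to what $x() returns in the console.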

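Here's a (much shorter) Java version, thanks to https://stackoverflow.com/users/1850609/acdcjunior. Only single quotes need special handling here, because every piece is wrapped in single quotes, and the trailing '' guarantees that concat() gets at least two arguments. The same expression works in JavaScript if you remove the type information, but note that JavaScript's replace() with a string pattern replaces only the first match, so use a global regular expression there.

String escapedText = "concat('"+originalText.replace("'", "', \"'\", '") + "', '')";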

Solution 2

In XPath 2.0 and XQuery 1.0, the delimiter of a string literal can be included in the string literal by doubling it:

let $a := "He said ""I won't"""

or

let $a := 'He said "I can''t"'

The convention is borrowed from SQL.
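As a small sketch of how this doubling rule could be applied programmatically, the hypothetical helper below (toXPath2Literal is not a library function) wraps a string in single quotes and doubles every single quote inside it. It only applies to XPath 2.0 and later processors; the result is not valid XPath 1.0, which is what browsers implement.

```javascript
// Hypothetical helper: builds an XPath 2.0 string literal delimited by
// single quotes, escaping embedded single quotes by doubling them.
function toXPath2Literal(str) {
    return "'" + str.replace(/'/g, "''") + "'";
}

// Example: produces the second literal shown above.
var literal = toXPath2Literal('He said "I can\'t"');
console.log(literal); // 'He said "I can''t"'
```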

Solution 3

This is an example:

/*/*[contains(., "'") and contains(., '"') ]/text()

When this XPath expression is applied on the following XML document:

<text>
    <t>I'm reading "Harry Potter"</t>
    <t>I am reading "Harry Potter"</t>
    <t>I am reading 'Harry Potter'</t>
</text>

the wanted, correct result (a single text node) is selected:

I'm reading "Harry Potter"

Here is verification using the XPath Visualizer (A free and open source tool I created 12 years ago, that has taught XPath the fun way to thousands of people):

[Image: XPath Visualizer screenshot]

Your problem may be that you are not able to specify this XPath expression as a string in the programming language that you are using -- this isn't an XPath problem but a problem in your knowledge of your programming language.
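For instance, in JavaScript the expression above could be written as a string in either of the following ways. This is only an illustration of host-language escaping; the variable names are made up for the example.

```javascript
// 1) Double-quoted literal: the inner double quotes are backslash-escaped.
var escaped = "/*/*[contains(., \"'\") and contains(., '\"') ]/text()";

// 2) Template literal: neither quote character needs escaping.
var template = `/*/*[contains(., "'") and contains(., '"') ]/text()`;

// Both variables hold exactly the same XPath expression.
console.log(escaped === template); // true
```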

Solution 4

Additionally, if you were using XQuery instead of XPath, as the title says, you could also use the XML entities:

   "&quot; for double and &apos; for single quotes"

They also work within single-quoted strings.
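A minimal sketch of generating such an XQuery string literal is shown below. The helper name toXQueryLiteral is hypothetical. Because XQuery string literals recognise entity references, a literal & in the text must itself be written as &amp;, and it has to be replaced first so that the entities added afterwards are not escaped twice.

```javascript
// Hypothetical helper: builds a double-quoted XQuery string literal using
// the predefined entities. Not valid for XPath 1.0, which has no entities.
function toXQueryLiteral(str) {
    var escaped = str
        .replace(/&/g, "&amp;")   // must come first
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
    return '"' + escaped + '"';
}

console.log(toXQueryLiteral("I'm reading \"Harry Potter\""));
// "I&apos;m reading &quot;Harry Potter&quot;"
```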

Solution 5

You can do this using a regular expression. For example (as TypeScript code):

export function escapeXPathString(str: string): string {
    str = str.replace(/'/g, `', "'", '`);

    return `concat('${str}', '')`;
}

This replaces all ' in the input string by ', "'", '.

The final , '' is important because concat() requires at least two arguments, so concat('string') is an error.
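A common variation, sketched here in plain JavaScript, is to fall back to concat() only when the string really contains both kinds of quote, which keeps the generated XPath 1.0 expression readable in the simple cases. The function name quoteForXPath is hypothetical.

```javascript
// Hypothetical helper: picks the simplest XPath 1.0 form that can hold str.
function quoteForXPath(str) {
    // No single quote: a single-quoted literal is enough (covers "" too).
    if (str.indexOf("'") === -1) {
        return "'" + str + "'";
    }
    // Single quotes but no double quotes: use a double-quoted literal.
    if (str.indexOf('"') === -1) {
        return '"' + str + '"';
    }
    // Both kinds present: split at each single quote. There is at least one
    // single quote here, so concat() always gets three or more arguments.
    return "concat('" + str.replace(/'/g, "', \"'\", '") + "')";
}

console.log(quoteForXPath("plain"));       // 'plain'
console.log(quoteForXPath("I'm"));         // "I'm"
console.log(quoteForXPath("I'm \"x\""));   // concat('I', "'", 'm "x"')
```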

Author: Ruan Mendes


Updated on October 20, 2022

Comments

  • Ruan Mendes
    Ruan Mendes over 1 year

    I can't figure out how to search for text containing single quotes using XPATHs.

    For example, I've added a quote to the title of this question. The following line

    $x("//*[text()='XQuery looking for text with &#39;single&#39; quote']")
    

    Returns an empty array.

    However, if I try the following

    $x("//*[text()=\"XQuery looking for text with 'single' quote\"]")
    

    It does return the link for the title of the page, but I would like to be able to accept both single and double quotes in there, so I can't just tailor it for the single/double quote.

    You can try it in Chrome's or Firebug's console on this page.

  • Ruan Mendes
    Ruan Mendes over 11 years
    Your answer tells you how to find text nodes that contain both a single and a double quote, by hardcoding single quotes inside double quotes ("'") and double quotes inside single quotes ('"'). However, what I need is a query that would search for the specific text with a query like //div[text()="I'm reading "Harry Potter""]... obviously, my example is not properly escaping the quotes. I would expect //*[text()='I&#39;m reading &#34;Harry Potter&#34;'] to work.
  • Ruan Mendes
    Ruan Mendes over 11 years
    No, the problem is not in "my knowledge of my programming language". The question is about how to escape quotes inside quoted content in XPath
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @JuanMendes: Use: /*/*[.=concat("I'm reading ", '"Harry Potter"')]/text() . In addition to this, in XPath 2.0 (and this means also in XQuery and XSLT 2.0) a quote is escaped simply by doubling it.
  • Ruan Mendes
    Ruan Mendes over 11 years
    To do that, it would require knowing which part of the string contains double or single quotes. I can't do that, the text to search for is beyond my control, it's handed to a method and I have to create an XPATH for it. Again, I cannot just use the reverse quoting, because that requires knowing the string to search for in advance
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @JuanMendes, your method can find every occurrence of a quote and apostrophe -- therefore it can generate the XPath expression that contains the concat() function. In case you are generating an XPath 2.0 expression, simply double every quote in the string -- this is a simple replace() function invocation.
  • Ruan Mendes
    Ruan Mendes over 11 years
    Double quoting does not work, not sure if XPath 2.0 is supported in browsers. The following does not yield any results: $x("//*[text()=\"XQuery looking for text with ''single'' quote\"]")
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    XPath 2.0 is not supported in any browser. I was surprised by your use of the term XQuery together with Chrome and Firebug.
  • Ruan Mendes
    Ruan Mendes over 11 years
    If I can't find another way, I will break up the string into all the necessary parts and combine them with the required concat("'") and concat('"') that you suggested. I'm using Selenium, by the way. I use $x() to test it without having to run Selenium
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    @JuanMendes, Yes, this can be done in C# -- not entirely trivial, but possible. I did something similar many years ago using C++.
  • Ruan Mendes
    Ruan Mendes over 11 years
    +1 Though it's not exactly what I was looking for, there's something I can fall back to in the comments
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    Juan, I am glad that my answer to your comments led to the solution. Please, consider accepting my answer.
  • Ruan Mendes
    Ruan Mendes over 11 years
    @DimitreNovatchev Your answer doesn't have the information I put in my answer; the real trick is hidden in the comments. I based my answer off your answer, but your answer is not specifically answering my question. If you improve your answer to address my question specifically, I will accept it.
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    This is the best one could do in XPath 1.0. If the expression was part of an XML document, it would be possible to use the two entities &quot; and &apos; -- but this isn't your case. I could provide a C# solution, but you seem to be using JavaScript, in which I am not too fluent.
  • Ruan Mendes
    Ruan Mendes over 11 years
    @DimitreNovatchev I'm just saying that for your answer to actually answer the question, it would have to at least specify that you need to wrap your entire string with concat() and then you can run a replace to change ' and " into "'" and '"'. You wouldn't have to come up with code as I did. JS was just the quickest way to explain what it takes. I'm actually writing a Java version which is what I'm really going to use. Right now the answer looks like you misunderstood the question
  • Dimitre Novatchev
    Dimitre Novatchev over 11 years
    Juan Mendes, sure -- this is not very complex C# work. Another way is to replace ' and " with two corresponding strings (preferably single characters, so that the XPath translate() function can be used on one of the two ends) that are guaranteed not to occur in the text nodes. This would require two chained calls to translate() in the XPath expression and two calls to a replace() function on the programming-language end.
  • Ruan Mendes
    Ruan Mendes over 11 years
    I'm not sure what you mean by using XQuery instead of XPath, can you expand on that? I'm writing automation tests using Selenium
  • BeniBela
    BeniBela over 11 years
    Well, you mentioned XQuery in the title. I don't know if Selenium supports XQuery. Anyways, the strings there supports basic xml entities, while those of XPath do not. (compare the XQuery and XPath standards)
  • Nicolas Barbulesco
    Nicolas Barbulesco over 10 years
    This answer does not answer the question. I want to write a ' in 'this string'. @Juan, I use Selenium too.
  • Nicolas Barbulesco
    Nicolas Barbulesco over 10 years
    This answer is interesting. But I use Selenium with Firefox, and alas they seem to support XPath but not XPath 2. I say they seem, this is very scarcely documented.
  • Dimitre Novatchev
    Dimitre Novatchev over 10 years
    @NicolasBarbulesco, I would recommend that you ask a separate question. It isn't clear from your comment what exactly you need to find and in what. As for whether this answer answers the question, please read the final solution by the submitter of the question, where he says: "Here's a hackaround (Thanks Dimitre Novatchev)".
  • acdcjunior
    acdcjunior about 10 years
    This rationale is very useful! Helped a lot! I used this in Java: String escapedText = "concat('"+originalText.replace("'", "', \"'\", '") + "', '')";!
  • casper
    casper over 8 years
    and what will happen if ' is the first character?
  • lfurini
    lfurini over 6 years
    @ktxmatrix such an extended edit of an existing answer would better be a separate answer altogether.
  • Seb D.
    Seb D. over 6 years
    Good idea, but does not work when the quote is the first or the last character.
  • tokland
    tokland about 6 years
    I think this snippet needs some extra work. I see two cases where it fails: 1) str = "", 2) any str without quotes, since concat requires at least two arguments.