Encoding XPath Expressions with both single and double quotes


Solution 1

Wow, you all sure are making this complicated. Why not just do this?

public static string XpathExpression(string value)
{
    if (!value.Contains("'"))
        return '\'' + value + '\'';

    else if (!value.Contains("\""))
        return '"' + value + '"';

    else
        return "concat('" + value.Replace("'", "',\"'\",'") + "')";
}

.NET Fiddle & test
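
For anyone who wants to check the result against the question's example, here is a minimal sketch. The function body is the same logic as above, repeated only so the snippet compiles on its own; the class name and the sample XML document are made up for illustration.

```csharp
using System;
using System.Xml;

public static class XpathQuoteDemo // hypothetical name, for illustration only
{
    // Same logic as the answer above, repeated so the example is self-contained.
    public static string XpathExpression(string value)
    {
        if (!value.Contains("'"))
            return "'" + value + "'";
        if (!value.Contains("\""))
            return "\"" + value + "\"";
        // Both quote types present: split at each apostrophe and rejoin with "'" pieces.
        return "concat('" + value.Replace("'", "',\"'\",'") + "')";
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        // Made-up sample document; &quot; here is ordinary XML escaping, not XPath.
        doc.LoadXml("<reviews><review name=\"Fred's &quot;Fancy Pizza&quot;\"/></reviews>");

        string literal = XpathExpression("Fred's \"Fancy Pizza\"");
        Console.WriteLine(literal);
        // concat('Fred',"'",'s "Fancy Pizza"')

        XmlNode node = doc.SelectSingleNode("//review[@name=" + literal + "]");
        Console.WriteLine(node != null); // True
    }
}
```

The concat() call gets one extra pair of arguments per apostrophe in the value, so it stays short for typical data.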

Solution 2

Though it certainly won't work in all circumstances, here's a way to sidestep the problem:

doc.DocumentElement.SetAttribute("searchName", name);
XmlNodeList n = doc.SelectNodes("//review[@name=/*/@searchName]");
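
A slightly fuller sketch of the same trick, assuming it is acceptable to modify the document temporarily. The helper name CountMatches, the class name, and the sample XML are hypothetical; the attribute name searchName is the one used above.

```csharp
using System;
using System.Xml;

public static class SearchAttributeDemo // hypothetical name, for illustration only
{
    // Counts <review> elements whose name attribute equals the given value,
    // without ever placing the value inside the XPath text.
    public static int CountMatches(XmlDocument doc, string name)
    {
        // Park the search value on the root element.
        doc.DocumentElement.SetAttribute("searchName", name);
        try
        {
            // /*/@searchName reads the attribute back from the root element.
            return doc.SelectNodes("//review[@name=/*/@searchName]").Count;
        }
        finally
        {
            // Remove the temporary attribute so the document is left as it was.
            doc.DocumentElement.RemoveAttribute("searchName");
        }
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<reviews><review name=\"Fred's &quot;Fancy Pizza&quot;\"/></reviews>");
        Console.WriteLine(CountMatches(doc, "Fred's \"Fancy Pizza\"")); // 1
        Console.WriteLine(CountMatches(doc, "Bob's Pizza"));            // 0
    }
}
```

Because the value never appears in the XPath string, no quoting is needed at all. The cost is that the document is mutated for the duration of the query, which is likely one of the circumstances where this approach does not fit (shared or read-only documents, for instance).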

Solution 3

This is what I've come up with

public static string EncaseXpathString(string input)
{         
    // If we don't have any " then encase string in "
    if (!input.Contains("\""))
        return String.Format("\"{0}\"", input);

    // If we have some " but no ' then encase in '
    if (!input.Contains("'"))
        return String.Format("'{0}'", input);

    // If we get here we have both " and ' in the string so must use Concat
    StringBuilder sb = new StringBuilder("concat(");           

    // Going to look for " as they are LESS likely than ' in our data so will minimise
    // number of arguments to concat.
    int lastPos = 0;
    int nextPos = input.IndexOf("\"");
    while (nextPos != -1)
    {
        // If this is not the first time through the loop then separate arguments with ,
        if (lastPos != 0)
            sb.Append(",");

        sb.AppendFormat("\"{0}\",'\"'", input.Substring(lastPos, nextPos - lastPos));
        lastPos = ++nextPos;

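        // Find next occurrence
        nextPos = input.IndexOf("\"", lastPos);
    }

    // Append whatever follows the last double quote; without this the tail is dropped
    if (lastPos < input.Length)
        sb.AppendFormat(",\"{0}\"", input.Substring(lastPos));

    sb.Append(")");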
    return sb.ToString();
}

Called using something like

XmlNode node = doc.SelectSingleNode("//review[@name=" + EncaseXpathString("Fred's \"Fancy Pizza\"") + "]");

So we get the following results

EncaseXpathString("Pizza Shed") == "\"Pizza Shed\"";
EncaseXpathString("Bob's Pizza") == "\"Bob's Pizza\"";
EncaseXpathString("\"Pizza\" Pam") == "'\"Pizza\" Pam'";
EncaseXpathString("Fred's \"Fancy Pizza\"") == "concat(\"Fred's \",'\"',\"Fancy Pizza\",'\"')";

So it only uses concat when it's necessary (both " and ' in the string).

The last result shows the concat operation is not as short as it could be (see question), but it's close enough, and anything more optimal would be very complex as you would have to look for matching pairs of " or '.

Solution 4

I've had problems with all solutions so far. One has extra text sections (e.g. '"' or "'") which breaks what you're looking for. One drops all text after the last quote/dblquote which breaks as well.

This is a dumb and quick solution from a dumb vb developer:

Function ParseXpathString(ByVal input As String) As String
    input = Replace(input, "'", Chr(1))
    input = Replace(input, """", Chr(2))
    input = Replace(input, Chr(1), "',""'"",'")
    input = Replace(input, Chr(2), "','""','")
    input = "concat('','" + input + "')"
    Return input
End Function

Usage (same as previous examples):

x.SelectNodes("/path[@attr=" & ParseXpathString(attrvalue) & "]")
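
Since the question asks for C#, here is a rough single-pass port of the same idea. This is a sketch, not the author's code: it walks the string once instead of using the Chr(1)/Chr(2) placeholders, so input that already contains those control characters is left alone. The class name is hypothetical.

```csharp
using System;
using System.Text;

public static class ConcatAlwaysDemo // hypothetical name, for illustration only
{
    // Always emits concat(); every quote character becomes its own argument.
    public static string ParseXpathString(string input)
    {
        // Start with an empty first argument so concat() always has at least two.
        var sb = new StringBuilder("concat('','");
        foreach (char c in input)
        {
            if (c == '\'')
                sb.Append("',\"'\",'");   // close piece, add "'" argument, reopen
            else if (c == '"')
                sb.Append("','\"','");    // close piece, add '"' argument, reopen
            else
                sb.Append(c);
        }
        return sb.Append("')").ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ParseXpathString("Bob's"));
        // concat('','Bob',"'",'s')
        Console.WriteLine(ParseXpathString("a\"b"));
        // concat('','a','"','b')
    }
}
```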

Solution 5

Another variation...my concat() part is a little lazy, but at least it uses the whole value.

    /// <summary>
    /// Returns an XPath string literal to use for searching attribute values (wrapped in apostrophes, quotes, or as a concat function).
    /// </summary>
    /// <param name="attributeValue">Attribute value to encode and wrap.</param>
    public static string CreateXpathLiteral(string attributeValue)
    {
        if (!attributeValue.Contains("\""))
        {
            // if we don't have any quotes, then wrap string in quotes...
            return string.Format("\"{0}\"", attributeValue);
        }
        else if (!attributeValue.Contains("'"))
        {
            // if we have some quotes, but no apostrophes, then wrap in apostrophes...
            return string.Format("'{0}'", attributeValue);
        }
        else
        {
            // must use concat so the literal in the XPath will find a match...
            return string.Format("concat(\"{0}\")", attributeValue.Replace("\"", "\",'\"',\""));
        }
    }
Author by

Ryan

Developer working primarily with SharePoint

Updated on November 12, 2020

Comments

  • Ryan
    Ryan over 3 years

    XPath (v1) contains no way to encode expressions.

    If you only have single OR double quotes then you can use expressions such as

    //review[@name="Bob's Pizza"]
    //review[@name='"Pizza" Pam']
    

    But if you have BOTH, e.g. [Fred's "Fancy Pizza"], then you have to use something like the approach in "Escaping Strings in XPath (C++)" to generate

    //review[@name=concat("Fred's ",'"Fancy Pizza"')]
    

    Anyone have a function in c# to do this?

    Some links that are close

    EDIT: A few answers have suggested escaping ' with &apos; and " with &quot; but although this makes sense it does not work; try it using the XML fragment:

    <review name="Bob's Pizza"/>
    

    and the xpath

    //review[@name='Bob&apos;s Pizza']
    
  • Jeff Olson
    Jeff Olson almost 13 years
    This worked like a charm for me in Java as well, with the exception being I had to add another "/" in the query: doc.getDocumentElement().setAttribute("searchAttribute", name); String type = xpath.evaluate("//review[@name=//@searchAttribute]/@type", doc);
  • mhenry1384
    mhenry1384 almost 13 years
    Seems like a slick solution... except as far as I can tell it doesn't work under any circumstances. var n1 = xmlDoc.SelectNodes("//*/*/Credential[@Name='zzz']"); //returns 1 node xmlDoc.DocumentElement.SetAttribute("searchName", "zzz"); var n2 = xmlDoc.SelectNodes("//*/*/Credential[@Name=/@searchName]"); // returns 0 nodes
  • Robert Rossney
    Robert Rossney almost 13 years
    I think that /@foo is wrong: to get an attribute on the top-level element, you need to do /*/@foo. So the predicate would be [@Name=/*/@searchName].
  • herzbube
    herzbube almost 11 years
    Your code assumes that the last double quote is at the end of the input string. To cover the case where the last double quote is somewhere else, the line sb.Append(")"); should be written as sb.AppendFormat(",\"{0}\")", input.Substring(lastPos));. Otherwise, good job, thanks for a working example!
  • justinf
    justinf almost 11 years
    Hi I tried your code it did not worK I am looking for /Math/Show[Quiz = 'This 'wont hurt'] I have tried calling it the following ways but I still get an error saying has an invalid token var node = xd.XPathSelectElements("/Math/Show[Quiz = " + EncaseXpathString("'This 'wont hurt'"+ "]")); var node = xd.XPathSelectElements("/Math/Show[Quiz = " + EncaseXpathString("This 'wont hurt"+ "]"));
  • Gigo
    Gigo about 9 years
    Something is broken here. EncaseXpathString( "'foo\"bar\"baz") returns concat("'foo",'"',"bar",'"')
  • Shawn
    Shawn almost 8 years
    Can you leave some comments in the code describing how it works?
  • Ryan
    Ryan almost 8 years
    Nice! Funny how sometimes you can't see the wood for the trees eh!
  • Kaleb
    Kaleb almost 8 years
    In all fairness, I did notice after posting this that tiwahu had posted an extremely similar answer, but with the more expensive string.Format() method. And his answer even has comments! ;-)
  • binki
    binki over 4 years
    @RobertRossney That’s kind of bad. That means that it’ll match that attribute regardless of where it is in the document instead of only getting the one you want. Don’t you want something like @name=/path/to/@searchName? Seems to me that the answer was better prior to your edit…
  • Plasmabubble
    Plasmabubble almost 4 years
    Just a heads-up to mention that the "else" statements are not needed.
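
The question's point that &apos; does not work inside an XPath string can be checked with a small sketch like the one below. It uses the XML fragment and XPath from the question; the class and helper names are hypothetical.

```csharp
using System;
using System.Xml;

public static class EntityInXPathDemo // hypothetical name, for illustration only
{
    // Loads the fragment from the question and counts the nodes an XPath selects.
    public static int Count(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<review name=\"Bob's Pizza\"/>");
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // XPath does not decode XML entities inside string literals, so this
        // compares the attribute against the literal text Bob&apos;s Pizza.
        Console.WriteLine(Count("//review[@name='Bob&apos;s Pizza']")); // 0

        // Switching the delimiter to double quotes matches.
        Console.WriteLine(Count("//review[@name=\"Bob's Pizza\"]"));     // 1
    }
}
```

Entity references are only resolved when the XPath itself is embedded in an XML document (an XSLT stylesheet, for example), where the XML parser decodes them before the XPath engine sees the string.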