Which special characters need escaping in a solr query?

Solution 1

You need to follow the Lucene query parser syntax that Solr uses, which documents how to escape special characters: http://lucene.apache.org/core/6_5_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters

Solution 2

It might be a good idea to look at SolrJ's ClientUtils.escapeQueryChars: http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars(java.lang.String)
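
For anyone calling knife from a shell rather than from Java, the same idea can be roughly sketched with sed. This is only an illustration: the function name escape_query is made up here, and the character set is the list quoted in the question, which may not match exactly what SolrJ escapes. It puts a single backslash in front of each special character in a value; it does not guarantee that the Chef server will accept the resulting query.

```shell
# Hypothetical helper (not part of knife, Chef, or Solr): put a backslash
# in front of every character from the question's special-character list.
# Pass only the value to search for, not the "field:" prefix, because the
# colon is escaped as well.
escape_query() {
  # Bracket expression: "]" comes first and "-" last so both are literal.
  printf '%s' "$1" | sed -e 's#[][+&|!(){}^"~*?:\\-]#\\&#g'
}

# Example: escape a value, then build the query string around it.
value='web(1)'
query="tags:$(escape_query "$value")"
printf '%s\n' "$query"   # prints tags:web\(1\)
```

Because && and || are made of the single characters & and |, escaping each character individually covers them too.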

hairyhenderson
Author by

hairyhenderson

Just some guy.

Updated on July 09, 2022

Comments

  • hairyhenderson
    hairyhenderson almost 2 years

    Update: I think this question has to do with solr syntax in general, and not Chef in particular. So while I ran into this working with Chef, I presume that anyone working with Solr will also experience this...


    I'm working on an application that communicates with the Chef server's search API to find particular nodes.

    Based on this http://docs.opscode.com/essentials_search.html#special-characters, it seems that a number of special characters need to be escaped.

    Note: I'm only concerned with exact-matching patterns, not wildcards. I realize that some of these characters are special because they are wildcards.

    Here's the list at the time of this writing, as copied from the URL above:

    +  -  &&  ||  !  ( )  { }  [ ]  ^  "  ~  *  ?  :  \
    

    When I try various knife search commands with these characters, however, I see inconsistent behaviour.

    For the following examples, I set up a node that is tagged with +&|!(){}[]^\"~*?:\\"

    These commands were run from a Linux box, in a bash shell:

    $ knife search node 'tags:+&|!(){}[]^"~*?:\'
    ERROR: knife search failed: invalid search query: 'tags:+&|!(){}[]^"~*?:\'
    

    That behaved as expected, since nothing was escaped. Now, I escape everything with a single \ as the docs suggest:

    $ knife search node 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'
    ERROR: knife search failed: invalid search query: 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'
    

    Strange.

    Can anyone shed some light on this, and maybe suggest a query that's capable of matching that tag?

    It's obviously unlikely that anyone will ever have an attribute containing all those special characters, but I'd like to understand better how the special characters should be escaped.

    Thanks!

    • StephenKing
      StephenKing about 10 years
      Maybe you'll find more information if you search for the same thing for Solr instead of Chef? That's what is used for search.
    • Display Name is missing
      Display Name is missing about 10 years
      ! ( ) { } [ ] ^ " ~ * ? : \ all work for me, but + - && || all fail.
    • hairyhenderson
      hairyhenderson about 10 years
      @better_use_mkstemp: thanks. That partially helps. I'm also a little confused why && and || are considered special characters.
    • hairyhenderson
      hairyhenderson about 10 years
      After reading the URL posted by @sethvargo below, I now understand why +, -, &&, and || are interpreted specially. They're considered boolean operators. However it's still not clear how to properly escape these.
  • hairyhenderson
    hairyhenderson about 10 years
    Thanks. It looks like the Chef docs simply copied the Lucene docs at this URL: lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping Special Characters, which isn't any more helpful...