Finding exact match using Lucene search API
Solution 1
You can use KeywordAnalyzer to index and search this field. KeywordAnalyzer emits the entire string as a single token, so a query matches only when it equals the whole field value.
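To make the single-token idea concrete, here is a minimal plain-Java sketch. It is not Lucene code: the class `ExactMatchSketch` and its methods are hypothetical names for illustration, and the `normalize` step (trim plus lowercase) is an added assumption, since KeywordAnalyzer itself passes the input through unchanged. Each company name is stored as one key, so only a query equal to the full name finds it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java illustration of single-token indexing; this is not Lucene's API.
class ExactMatchSketch {
    // One token per field value: the whole (normalized) string maps to document ids.
    private final Map<String, List<Integer>> index = new HashMap<>();

    // Assumption for this sketch: trim and lowercase so matching ignores case.
    static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    // Index the full company name as a single token.
    void add(int docId, String companyName) {
        index.computeIfAbsent(normalize(companyName), k -> new ArrayList<>()).add(docId);
    }

    // A query matches only when it equals the whole indexed value.
    List<Integer> search(String query) {
        List<Integer> hits = index.get(normalize(query));
        return hits == null ? Collections.<Integer>emptyList() : hits;
    }

    public static void main(String[] args) {
        ExactMatchSketch idx = new ExactMatchSketch();
        idx.add(1, "Abigail Adams National Bancorp, Inc.");
        idx.add(2, "National Bancorp");
        System.out.println(idx.search("National Bancorp")); // prints [2]
    }
}
```

The trade-off is that partial queries such as Bancorp find nothing, so this approach suits a field used only for exact lookups.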
Solution 2
I googled a lot for the same problem without finding any help. After scratching my head for a while, I found the solution: put the search string in double quotes.
National Bancorp will return both #1 and #2 but "National Bancorp" will return only #2.
Solution 3
This may warrant the use of the shingle filter, which groups adjacent words into multi-word tokens. For example, Abigail Adams National Bancorp with a ShingleFilter producing shingles of up to 3 tokens would yield (assuming a simple WhitespaceAnalyzer) [Abigail], [Abigail Adams], [Abigail Adams National], [Adams], [Adams National], [Adams National Bancorp], [National], [National Bancorp] and [Bancorp].
If a user then queries for National Bancorp, you will get an exact match on National Bancorp itself, and a lower-scored exact match on Abigail Adams National Bancorp (lower scored because this field has many more tokens, which lowers its length normalization factor). I think it makes sense to return both documents on such a query.
You may want to apply the shingle filter at query time as well, depending on the use case.
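The grouping described in this solution can be sketched in plain Java. This is not Lucene's ShingleFilter implementation: `ShingleSketch` and `shingles` are hypothetical names, and the sketch assumes whitespace tokenization with single words emitted alongside the multi-word shingles.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of shingling; this is not Lucene's ShingleFilter.
class ShingleSketch {
    // Returns every run of 1..maxSize consecutive whitespace-separated words,
    // grouped by starting word.
    static List<String> shingles(String text, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out; // no words, no shingles
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            // Grow the shingle one word at a time, up to maxSize words.
            for (int len = 1; len <= maxSize && start + len <= words.length; len++) {
                if (len > 1) {
                    sb.append(' ');
                }
                sb.append(words[start + len - 1]);
                out.add(sb.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : shingles("Abigail Adams National Bancorp", 3)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Calling `shingles("Abigail Adams National Bancorp", 3)` yields nine tokens, one of which is National Bancorp, so a query shingled the same way can match it as a single term.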
Steve Chapman
Updated on June 08, 2022
Comments
-
Steve Chapman almost 2 years
I'm working on a company search API using Lucene. My Lucene company index contains two companies: 1. Abigail Adams National Bancorp, Inc. 2. National Bancorp
If the user types in National Bancorp, then only company #2 (i.e. National Bancorp) should be returned and not #1; that is, only exact matches should be returned. How do I achieve this functionality?
Thanks for reading.
-
Steve Chapman almost 15 years
Can you please answer this one? stackoverflow.com/questions/899542/…