Combine LIKE and CONTAINS in SAP HANA
Solution 1
select name from(
SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.2)) --Part I
UNION ALL
SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, '%apple%') --Part II
)group by name
This query basically joins the search results from contains clause with fuzzy search and normal search. You can also replace the Part II of query with like instead.
Solution 2
The fuzzy algorithm matches against the complete content of the column in your example. So it compares "apple" with "red apple" and "appel yellow"
What you want is a matching against the tokens in your columns. You can achieve this by creating a fulltext index, which will tokenize the content in the columns into a fulltext index. contains() will automatically use the fulltext index.
drop table MYTABLE1;
create column table MYTABLE1
(
name nvarchar(100)
);
insert into MYTABLE1 (name) values ('red apple');
insert into MYTABLE1 (name) values ('yellow apple');
insert into MYTABLE1 (name) values ('apple green');
insert into MYTABLE1 (name) values ('red aple');
insert into MYTABLE1 (name) values ('appel yellow');
CREATE FULLTEXT INDEX i_MYTABLE1 ON MYTABLE1(name) FUZZY SEARCH INDEX ON SYNC;
SELECT name FROM "MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.5))
Evgenij Reznik
Updated on July 09, 2022Comments
-
Evgenij Reznik almost 2 years
Consider the following entries in my table:
- red apple
- yellow apple
- apple green
- red aple
- appel yellow
Several people have populated this table using a non-consistend notation (the color before or after 'apple'), also entering some spelling errors. Now I want to query all entries with the word
apple
, regardless of color or spelling.With
FUZZY()
:SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.5))
I only get:
- red apple
- red aple
When adding wildcards:
SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, '%apple%', FUZZY(0.5))
I only get all entries, where
apple
was spelled right:- red apple
- yellow apple
- apple green
Why I can't combine both operators LIKE and CONTAINS in one query?
I need to find:
- entries, where
apple
is surrounded by other words (in my case colors) - all forms of
apple
(regardless of the spelling)