Combine LIKE and CONTAINS in SAP HANA

35,793

Solution 1

select name from(
         SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.2)) --Part I
         UNION ALL
         SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, '%apple%') --Part II
)group by name

This query basically joins the search results from contains clause with fuzzy search and normal search. You can also replace the Part II of query with like instead.

Solution 2

The fuzzy algorithm matches against the complete content of the column in your example. So it compares "apple" with "red apple" and "appel yellow"

What you want is a matching against the tokens in your columns. You can achieve this by creating a fulltext index, which will tokenize the content in the columns into a fulltext index. contains() will automatically use the fulltext index.

drop table MYTABLE1;
create column table MYTABLE1 
(
  name nvarchar(100)
);

insert into MYTABLE1 (name) values ('red apple');
insert into MYTABLE1 (name) values ('yellow apple');
insert into MYTABLE1 (name) values ('apple green');
insert into MYTABLE1 (name) values ('red aple');
insert into MYTABLE1 (name) values ('appel yellow');


CREATE FULLTEXT INDEX i_MYTABLE1 ON MYTABLE1(name) FUZZY SEARCH INDEX ON SYNC;

SELECT name FROM "MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.5)) 
Share:
35,793
Evgenij Reznik
Author by

Evgenij Reznik

Updated on July 09, 2022

Comments

  • Evgenij Reznik
    Evgenij Reznik almost 2 years

    Consider the following entries in my table:

    • red apple
    • yellow apple
    • apple green
    • red aple
    • appel yellow

    Several people have populated this table using a non-consistend notation (the color before or after 'apple'), also entering some spelling errors. Now I want to query all entries with the word apple, regardless of color or spelling.

    With FUZZY():

    SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, 'apple', FUZZY(0.5))  
    

    I only get:

    • red apple
    • red aple

    When adding wildcards:

    SELECT name FROM "NEO_123456789ABCDE"."MYTABLE1" WHERE contains(name, '%apple%', FUZZY(0.5)) 
    

    I only get all entries, where apple was spelled right:

    • red apple
    • yellow apple
    • apple green

    Why I can't combine both operators LIKE and CONTAINS in one query?

    I need to find:

    • entries, where apple is surrounded by other words (in my case colors)
    • all forms of apple (regardless of the spelling)