How to make a multiple column mysql fulltext search where partial words are matched
Solution 1
In boolean mode, requiring strings to be present (instead of just scoring higher), is done with +
. prefix matching is done with an ending *
. This seems to be what you want, so search for:
+John* +S*
+John* +Smith*
+Smith* +J*
+Jo* +S*
Note that Full Text indexes cannot help you searching 'anywhere in a word'. so something like *mith*
is bound to fail: they're meant to match from character 1 in an index.
If you also want to order them by match values, and for instance, need John Smith
before Johnny Smithson
, you'd do this:
SELECT * FROM user
WHERE MATCH(..fields..) AGAINST ('match' IN BOOLEAN MODE)
ORDER BY MATCH(..fields..) AGAINST ('match' IN BOOLEAN MODE) DESC;
Which you will see will get you nowhere unless you add all the words >= ft_min_word_len
again separately:
+John* +S* John
+John* +Smith* John Smith
+Smith* +J* Smith
+Jo* +S*
For the last one, both are < the default 4 characters, so we can't add sorting params for that in default mysql, but you could set ft_min_world_len
differently is desired.
Solution 2
IN BOOLEAN MODE
you can use the +
-modifier to force AND
or the -
-modifier to force NOT
. No operator, your case, mean optional.
And you need to check the minimal word length in your mysql configuration to make the FULLTEXT INDEX index words smaller than a certain length.
I had to set
ft_min_word_len = 2
in my.cnf and had to rebuild the index to make this effective. By default it is 3.
To find out your min_word_len check (and upvote) this question
Solution 3
See http://dev.mysql.com/doc/refman/5.5/en//fulltext-boolean.html
You may place a "+", "-", or no operator before a word to make it search for "AND contains this word", "NOT contains this word", and no operator is "OR contains this word"
If I type in "John S", only the first result shows which is the desired behavior.
There's only one John, so this works, S is below the minimum word length and is discarded
If I type in "John Smith", only the first result shows which is the desired behavior.
There's only one John so this works
If I type "Smith J", both results show even though Bob is not a match.
J is below the minimum word length, so its only matching smith which is both rows
If I type "Smith John", both results show even though Bob is not a match.
Since you're in BOOLEAN MODE MySQL interprets this as Smith OR John... Smith matches both.
Lastly, if I type "Jo S", no results are returned despite the partial match on "Jo" and "S".
Jo and S are below the minimum word length - I believe MySQL treats this as searching for nothing
You'll want to add a "+" before your search parameters to turn them into an AND search... +Smith +John
Max
Updated on July 19, 2022Comments
-
Max almost 2 years
I currently have a single search field searching against multiple columns using this code:
$searchArray = explode(" ", $searchVal); $query="SELECT * FROM users WHERE "; $i=0; foreach ($searchArray as $word) { if ($i != 0) $query .= " OR "; $query .= " MATCH (`first_name`, `last_name`, `email`) AGAINST ('".$word."*' IN BOOLEAN MODE)"; $i++; }
Lets say I have these two rows in the table:
id | last_name | first_name | email 1 | Smith | John | [email protected] 2 | Smith | Bob | [email protected]
If I type in "John S", only the first result shows which is the desired behavior.
If I type in "John Smith", only the first result shows which is the desired behavior.
If I type "Smith J", both results show even though Bob is not a match.
If I type "Smith John", both results show even though Bob is not a match.
Lastly, if I type "Jo S", no results are returned despite the partial match on "Jo" and "S".
Can anyone help me fix my query to deal with the desired functionality of the order not being important and partial results matching? If it can be sorted by the best matches (i.e. the longest part of the word, starting from the first letter only not a section in the middle, in the highest number of columns), that would be a huge help also.
UPDATE:
Just wanted to post the final code that worked based on the solution. My loop creating multiple match statements was incorrect as was my ft_min_word_len.
My code is now:
$searchArray = explode(" ", $searchVal); $query="SELECT * FROM users WHERE MATCH (`first_name`, `last_name`, `email`) AGAINST ('"; $i=0; foreach ($searchArray as $word) { $query .= "+".$word."* "; } $query .= "' IN BOOLEAN MODE)";
-
Patrick B. about 11 yearsThe field
ft_min_word_len
is not for the MATCH-query, it is for the index which has been created and the query is now matching to. So, someone called Jo Smith would not match a+Jo*
match. -
Wrikken about 11 yearsAlready deleted my first comment (as it was false, boolean mode does use min word length, my apologies for that).
-
Max about 11 yearsThere still seems to be an issue with the order of the terms though. I added Pablo Picasso to the DB to get a longer name for testing. The term "Pablo Pica" returns a result. "Pica Pablo" does not. Should I be passing the full set of terms with a + and a * into a single MATCH statement or doing multiple MATCH statements with an OR as I have above?
-
Max about 11 yearsThere still seems to be an issue with the order of the terms. I added Pablo Picasso to the DB. The term "Pablo Pica" returns a result. "Pica Pablo" does not. Should I be passing the full set of terms with a + and a * into a single MATCH statement or doing multiple MATCH statements with an OR as I have above?
-
Wrikken about 11 years@Max: I cannot reproduce this, both
Pica Pablo
asPablo Pica
(or+Pica* +Pablo*
/+Pablo* +Pica*
) return the same user for me. Yes, in a singleMATCH() AGAINST()
statement. Every single term in theMATCH()
would most likely require the+term*
format. Do read @PatrickB's comment though: names < 4 characters will never match. -
Max about 11 yearsThanks! The combination of the ft_min_word_len and changing it to a single MATCH statement fixed it. I will update the question with my final code.
-
Alex Lacayo almost 9 yearsSo i have a users table with a
name
column. I have a row with the name "James DeQuan" and ... SELECT * FROM
users` WHERE MATCH(name) AGAINST('+james* +de*' in BOOLEAN MODE) returns nothing. Works fine for all other entries without an apostrophe. Any clue why?