MySQL full text search with partial words

Solution 1

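My understanding is that MySQL FULLTEXT indexes support partial-word matching for prefixes only, using the trailing * operator in boolean mode: MATCH (a.article_name) AGAINST ('MySQL*' IN BOOLEAN MODE). The wildcard can only go at the end of the word, so there is no way to match on the middle or the end of a word.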
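As a rough sketch of what prefix matching looks like against the articles table from the question (the search term 'Optim*' is only an illustration and does not come from the original post), the boolean-mode * operator is appended to the word:

```sql
-- Prefix search: matches indexed words that begin with "Optim",
-- such as "Optimizing". The * operator is appended to the word;
-- it cannot be used to match the middle or the end of a word.
SELECT a.article_id, a.article_name
FROM articles a
WHERE MATCH (a.article_name, a.article_desc, a.article_link)
      AGAINST ('Optim*' IN BOOLEAN MODE);
```

Because the match is anchored at the start of the word, a term like 'SQL*' would not be expected to find "MySQL".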

Solution 2

Quoting from the MySQL manual: "The search result is empty because the word “MySQL” is present in at least 50% of the rows." http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
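One way around the threshold, assuming the table stays on MyISAM as in the question, is boolean mode, which the manual documents as not applying the 50% threshold. A minimal sketch of the count query follows; the GROUP BY from the question is dropped here so that a single total comes back rather than one row per article, and the matching_articles alias is just an illustrative name:

```sql
-- Boolean-mode searches do not apply the 50% threshold,
-- so a word that is present in every row can still match.
SELECT COUNT(*) AS matching_articles
FROM articles a
WHERE MATCH (a.article_name, a.article_desc, a.article_link)
      AGAINST ('mysql' IN BOOLEAN MODE);
```

Against the six sample rows this would be expected to count all of them. Note that boolean mode does not automatically sort rows by relevance, so add an explicit ORDER BY if ordering matters.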

Updated on July 19, 2022

Comments

  • Admin
    Admin almost 2 years

    MySQL full-text search appears to be a great way to search in SQL. However, I'm stuck on the fact that it won't match partial words. For instance, if I have an article titled "MySQL Tutorial" and search for "MySQL", it won't find it.

    Having done some searching, I found various references to support for this coming in MySQL 4 (I'm using 5.1.40). I've tried using "*MySQL*" and "%MySQL%", but neither works (one link I found suggested the wildcard was an asterisk, but that you could only put it at the end or the beginning, not both).

    Here's my table structure and my query. If someone could tell me where I'm going wrong, that would be great. I'm assuming partial word matching is built in somehow.

    CREATE TABLE IF NOT EXISTS `articles` (
      `article_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
      `article_name` varchar(64) NOT NULL,
      `article_desc` text NOT NULL,
      `article_link` varchar(128) NOT NULL,
      `article_hits` int(11) NOT NULL,
      `article_user_hits` int(7) unsigned NOT NULL DEFAULT '0',
      `article_guest_hits` int(10) unsigned NOT NULL DEFAULT '0',
      `article_rating` decimal(4,2) NOT NULL DEFAULT '0.00',
      `article_site_id` smallint(5) unsigned NOT NULL DEFAULT '0',
      `article_time_added` int(10) unsigned NOT NULL,
      `article_discussion_id` smallint(5) unsigned NOT NULL DEFAULT '0',
      `article_source_type` varchar(12) NOT NULL,
      `article_source_value` varchar(12) NOT NULL,
      PRIMARY KEY (`article_id`),
      FULLTEXT KEY `article_name` (`article_name`,`article_desc`,`article_link`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=7 ;
    
    INSERT INTO `articles` VALUES
    (1, 'MySQL Tutorial', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 6, 3, 1, '1.50', 1, 1269702050, 1, '0', '0'),
    (2, 'How To Use MySQL Well', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 1, 2, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (3, 'Optimizing MySQL', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 1, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (4, '1001 MySQL Tricks', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 1, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (5, 'MySQL vs. YourSQL', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 2, 0, '3.00', 1, 1269702050, 1, '0', '0'),
    (6, 'MySQL Security', 'Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry''s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.', 'http://www.domain.com/', 0, 2, 0, '3.00', 1, 1269702050, 1, '0', '0');
    
    SELECT count(a.article_id) FROM articles a
    WHERE MATCH (a.article_name, a.article_desc, a.article_link) AGAINST ('mysql')
    GROUP BY a.article_id
    ORDER BY a.article_time_added ASC
    

    The "a." table prefix is used because the query comes from a function that sometimes adds additional joins.

    As you can see, a search for MySQL should return a count of 6, but unfortunately it doesn't.

    Update

    No results were returned because the search term appears in every single row.

    http://dev.mysql.com/doc/refman/5.1/en/fulltext-natural-language.html

    "The search result is empty because the word “MySQL” is present in at least 50% of the rows. As such, it is effectively treated as a stopword. For large data sets, this is the most desirable behavior: A natural language query should not return every second row from a 1GB table. For small data sets, it may be less desirable."