MySQL subquery and temporary table is slow

11,809

These aren't terribly difficult queries. You're doing the right thing by using SQL_NO_CACHE, and by timing them. But you also need to look at the results of EXPLAIN.

Use JOIN syntax instead of comma-separated table lists. The queries should be equivalent, but the old style syntax is harder to understand.

SELECT SQL_NO_CACHE 
       t.topic_id
  FROM bb_topics  AS t
  JOIN bb_posters AS ps ON t.topic_id = ps.topic_id
 WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
 ORDER BY ps.timestamp desc
 LIMIT 20

Try using some composite (multicolumn) covering indexes to get your performance to the next level.

You need to order the bb_posters table by timestamp, and you need the topic_id. So try this index: (timestamp, topic_id) If you can use a statement like

    WHERE ps.timestamp >= DATE(NOW()) - INTERVAL 7 DAY

to limit the timeframe of the search, it will help performance even more.

You need topic_id and forum_id from your bb_topics table. So try this index (topic_id, forum_id)

You can use similar composite covering indexes for the other tables you're trying to join.

If your tables are well-indexed, queries on them should be just as efficient as queries on temporary tables. Creating temporary tables tends to do things to the server like flush out table data cached in RAM, which has an unintended negative effect on performance.

Share:
11,809
amq
Author by

amq

Updated on June 29, 2022

Comments

  • amq
    amq almost 2 years

    I wanted to optimize the following query:

    SELECT SQL_NO_CACHE t.topic_id
    FROM bb_topics t, bb_posters ps
    WHERE t.topic_id = ps.topic_id
    AND forum_id IN (2, 6, 7, 10, 15, 20)
    ORDER BY ps.timestamp desc
    LIMIT 20
    
    Query took 0.1475 sec
    

    So at first I replaced WHERE IN with an INNER JOIN subquery:

    SELECT SQL_NO_CACHE t.topic_id
    FROM ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) ) t
    INNER JOIN bb_posters ps ON t.topic_id = ps.topic_id
    ORDER BY ps.timestamp desc
    LIMIT 20
    
    Query took 0.1541 sec
    

    Then I tried to create a temporary table:

    CREATE TEMPORARY TABLE IF NOT EXISTS bb_topics_tmp ( INDEX(topic_id) )
    ENGINE=MEMORY
    AS ( SELECT * FROM bb_topics WHERE forum_id IN (2, 6, 7, 10, 15, 20) );
    
    SELECT SQL_NO_CACHE t.topic_id
    FROM bb_topics_tmp t, bb_posters ps
    AND t.topic_id = ps.topic_id
    ORDER BY ps.timestamp desc
    LIMIT 20
    
    Query took 0.1467 sec
    

    I don't understand why selecting from a complete table with 38,522 rows is much faster than from a temporary one with 9,943 rows:

    SELECT SQL_NO_CACHE t.topic_id
    FROM bb_topics t, bb_posters ps
    WHERE t.topic_id = ps.topic_id
    ORDER BY ps.timestamp desc
    LIMIT 20
    
    Query took 0.0006 sec
    

    Both topic_id and timestamp have indexes.

    Funny thing is that even using something like this is much faster than the list of forums:

    AND pt.post_text LIKE '%searchterm%'
    

    UPD:

    Here is the output of EXPLAIN:

    SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
    u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
    a.attach_id, pt.bbcode_uid, pt.post_text
    FROM bb_topics t
    JOIN bb_posters ps ON ps.topic_id = t.topic_id
    LEFT JOIN bb_users u ON u.user_id = t.topic_poster
    LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
    LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
    WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
    ORDER BY ps.timestamp desc
    LIMIT 1, 20
    
    id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
    1   SIMPLE  t   range   PRIMARY,forum_id    forum_id    2   NULL    8379    Using where; Using temporary; Using filesort
    1   SIMPLE  ps  eq_ref  topic_id    topic_id    3   DB.t.topic_id       1    
    1   SIMPLE  u   eq_ref  PRIMARY     PRIMARY     3   DB.t.topic_poster   1   Using index
    1   SIMPLE  pt  eq_ref  PRIMARY     PRIMARY     3   DB.bt.post_id       1   Using index
    1   SIMPLE  a   eq_ref  PRIMARY     PRIMARY     3   DB.bt.attach_id     1   Using index
    
    Query took 0.8527 sec
    

    The same query without WHERE t.forum_id IN:

    id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
    1   SIMPLE  ps  index   topic_id    timestamp   4   NULL                21   
    1   SIMPLE  t   eq_ref  PRIMARY     PRIMARY     3   DB.bt.topic_id      1    
    1   SIMPLE  u   eq_ref  PRIMARY     PRIMARY     3   DB.t.topic_poster   1    
    1   SIMPLE  pt  eq_ref  PRIMARY     PRIMARY     3   DB.bt.post_id       1    
    1   SIMPLE  a   eq_ref  PRIMARY     PRIMARY     3   DB.bt.attach_id     1    
    
    Query took 0.0022 sec
    

    UPD 2:

    Adding USE INDEX (timestamp) solved the problem:

    SELECT SQL_NO_CACHE t.topic_id, t.topic_title, ps.timestamp, u.username,
    u.user_id, ps.size, ps.downloaded, ROUND(a.rating_sum/a.rating_count) AS Rating,
    a.attach_id, pt.bbcode_uid, pt.post_text
    FROM bb_topics t
    JOIN bb_posters ps USE INDEX (timestamp) ON ps.topic_id = t.topic_id
    LEFT JOIN bb_users u ON u.user_id = t.topic_poster
    LEFT JOIN bb_posts_text pt ON pt.post_id = bt.post_id
    LEFT JOIN bb_attachments_desc a ON bt.attach_id = a.attach_id
    WHERE t.forum_id IN (2, 6, 7, 10, 15, 20)
    ORDER BY ps.timestamp desc
    LIMIT 1, 20
    
    Query took 0.0023 sec