MySQL - Views - Super slow query

13,366

Solution 1

Try re-creating your view using this:

CREATE ALGORITHM = MERGE VIEW `V1a_sentiment_AI_current` AS    
SELECT CAST(classifier_results.msgDate as DATE) AS mdate
   ,classifier_results.objClass AS objClass
   ,COUNT(classifier_results.objClass) AS obj
   ,classifier_results.subjClass AS subjClass
   ,COUNT(classifier_results.subjClass) AS subj 
FROM classifier_results 
WHERE (classifier_results.msgDate >= (curdate() - 20)) 
GROUP BY 
  CAST(classifier_results.msgDate as DATE)
  ,classifier_results.objClass
  ,classifier_results.subjClass 
ORDER BY classifier_results.msgDate DESC

More information on MySQL's view processing algorithms can be found here.

Solution 2

This is a really common problem. It can be very hard to write DRY, re-usable SQL. There is a workaround I've found though.

Firstly, as others have pointed out, you can and should use VIEWs to do this wherever possible using the set ALGORITHM = MERGE, so that any queries using them are optimised on the merged SQL statement's where clause rather than having the VIEW evaluated for the entire view which can be catastrophically large.

In this case, since you cannot use MERGE because of the group/count aspect, you might want to try using a stored procedure that creates a temporary session table as a workaround.

This technique allows you to write reusable queries that can be accessed from middleware / framework code and called from inside other stored procedures, so you can keep code contained, maintainable and reusable.

I.e. if you know in advance that the query will be filtered on certain conditions, put those in a stored procedure. (It may be more efficient to post-filter the data set, or a combination - it depends how you use the data and what common sets are needed).

CREATE PROCEDURE sp_create_tmp_V1a_sentiment_AI_current(parm1, parm2 etc)
BEGIN

  drop temporary table if exists tmp_V1a_sentiment_AI_current;

  create temporary table tmp_V1a_sentiment_AI_current
  as
  SELECT CAST(classifier_results.msgDate as DATE) AS mdate
         ,classifier_results.objClass AS objClass
         ,COUNT(classifier_results.objClass) AS obj
         ,classifier_results.subjClass AS subjClass
         ,COUNT(classifier_results.subjClass) AS subj 
  FROM classifier_results 
  WHERE (classifier_results.msgDate >= (curdate() - 20)) 
  -- and/or other filters on parm1, parm2 passed in
  GROUP BY 
    CAST(classifier_results.msgDate as DATE)
    ,classifier_results.objClass
    ,classifier_results.subjClass 
  ORDER BY classifier_results.msgDate DESC;

END;

Now, any time you need to work with this data, you call the procedure and then either select the result (possibly with additional where clause parameters) or join with it in any other query.

The table is a session temporary table so it will persist beyond the call to the procedure. The calling code can either drop it once it's finished with the data or it'll go automatically when the session goes or a subsequent call to the sproc is made.

Hope that's helpful.

Share:
13,366
NightWolf
Author by

NightWolf

Updated on July 20, 2022

Comments

  • NightWolf
    NightWolf almost 2 years

    This is a weird one. I am trying to use Views in MySQL (I'm reasonably new to MySQL with more experience with Sybase and SQL Server). Any way this new project we are using MySQL as it seems to have good performance. However to make querying for a web front end simpler we decided to create a few views, all work well, but they take forever to run.

    The views are very simple, just select statements (these tables do have a few million rows in them). Say for example this query:

    SELECT CAST(classifier_results.msgDate as DATE) AS mdate
           ,classifier_results.objClass AS objClass
           ,COUNT(classifier_results.objClass) AS obj
           ,classifier_results.subjClass AS subjClass
           ,COUNT(classifier_results.subjClass) AS subj 
    FROM classifier_results 
    WHERE (classifier_results.msgDate >= (curdate() - 20)) 
    GROUP BY 
      CAST(classifier_results.msgDate as DATE)
      ,classifier_results.objClass
      ,classifier_results.subjClass 
    ORDER BY classifier_results.msgDate DESC
    

    When run as a normal select takes around 1.5 seconds to return a result.

    However when this query is put into a view (as is) - i.e.

    CREATE VIEW  V1a_sentiment_AI_current AS    
    SELECT CAST(classifier_results.msgDate as DATE) AS mdate
           ,classifier_results.objClass AS objClass
           ,COUNT(classifier_results.objClass) AS obj
           ,classifier_results.subjClass AS subjClass
           ,COUNT(classifier_results.subjClass) AS subj 
    FROM classifier_results 
    WHERE (classifier_results.msgDate >= (curdate() - 20)) 
    GROUP BY 
      CAST(classifier_results.msgDate as DATE)
      ,classifier_results.objClass
      ,classifier_results.subjClass 
    ORDER BY classifier_results.msgDate DESC
    

    The query takes about 10 times longer (22-30 seconds). So I'm thinking maybe there is some optimization or query caching that doesnt work with Views or maybe there is some setting we've missed in the MySQL config. But is there any way to speed up this view so its just a nice placeholder for this query?

    Running EXPLAIN on the two queries: The normal select gives:

    1, SIMPLE, classifier_results, ALL, idx_date, , , , 594845, Using where; Using temporary; Using filesort

    The view select gives:

    1, PRIMARY, , ALL, , , , , 100,
    2, DERIVED, classifier_results, ALL, idx_date, , , , 594845, Using where; Using temporary; Using filesort

  • NightWolf
    NightWolf over 12 years
    Ok thanks for this link. The view was UNDEFINED before. By trying to change the algo to MERGE gives: 0 row(s) affected, 1 warning(s): 1354 View merge algorithm can't be used here for now (assumed undefined algorithm). Trying to create the view with the TEMPTABLE algo works fine. So with UNDEFINED im guessing it might be using TempTable as undefined seems to make a choice between merge and temptable. So this may be the issue, as the manual says merge is more efficient...
  • Cocowalla
    Cocowalla over 12 years
    Try removing the ORDER BY clause from the view and see if it will work with the MERGE algorithm
  • NightWolf
    NightWolf over 12 years
    Removing the ORDER BY still doesnt want to work as merge. 1354 View merge algorithm can't be used here for now (assumed undefined algorithm)
  • Cocowalla
    Cocowalla over 12 years
    Sorry, I'm out of ideas :( MySQL is saying it can't use the MERGE algo, but I'm not sure why - hopefully someone else can help out.
  • miherrma
    miherrma over 12 years
    Learned a lot from both question and answer/comments, thanks - in the link provided in the answer it says that the MERGE algo cannot work with count in the select part of the view.
  • NightWolf
    NightWolf over 12 years
    It would appear that thats the case. SO is there any work around to have a view query using count run fast? There is this but they dont have a work around for the count issue. mysqlperformanceblog.com/2010/05/19/…
  • adamlamar
    adamlamar over 11 years
    I found that removing all aggregation functions and the GROUP BY from the view will allow performance to return to normal. This means each of your queries using the view will require GROUP BY, but at least it helps manage some complexity (especially if you have JOINS in the view).
  • funder7
    funder7 over 3 years
    MERGE algorithm is not allowed when using COUNT() SUM() and so on!