Multiple Column Index vs Multiple Indexes

11,425

Solution 1

You should use a multi-column index on (primaryId, imgDate) so that MySQL is able to use it for selecting the rows and sorting.

If all the columns used for sorting are not in the index used for selection, MySQL uses the "filesort" strategy, which consists of sorting all rows (in memory if there is not too much rows; on disk else).

If all columns used for sorting are in the index, MySQL uses the index to get the rows order (with some restrictions).

MySQL uses a tree structure for the indexes. This allows to access keys in order directly without sorting.

A multi-column index is basically an index of the concatenation of the columns. This allows MySQL to find the first row matching primaryId={$imgId}, and then access all the other rows directly in the right order.

With a single-row index on primaryId, MySQL can find all the rows matching primaryId={$imgId}, but it will find the rows in no particular order; so it will have to sort them after that.

See EXPLAIN and ORDER BY Optimization.

Solution 2

Your explain looks like this:

[id] => 1 
[select_type] => SIMPLE 
[table] => secondary_images 
[type] => ref 
[possible_keys] => primaryId 
[key] => primaryId 
[key_len] => 5 
[ref] => const 
[rows] => 1 
[Extra] => Using where; Using filesort 

Let's walk through it.

[id] => 1 

Means we're talking about the first table. You're only calling one table in your statement.

[select_type] => SIMPLE 

We're doing a simple SELECT.

[table] => secondary_images 

The table name in question.

[type] => ref 

The select type, most important for joins.

[possible_keys] => primaryId 

This is an important field: it shows which keys can possibly be used to aid the query in executing faster. In this case, only your primary key is deemed useful.

[key] => primaryId 

This is an important field: it shows which key(s) finally were used. In this case, the primary key.

[key_len] => 5 
[ref] => const 
[rows] => 1 

Guesssing the number of rows examined by the query.

[Extra] => Using where; Using filesort 

The most important field imho. - Using where: You are using a where-statement. Quite ok. - Using filesort: the result of your query is so big, it can't be sorted in memory. MySQL has to write it to a file, sort the file, and then output. This means disk access and will slow down everything. Adding an index that can aid the sorting often helps, but solving "using filesort" is a chapter on its own.

Share:
11,425

Related videos on Youtube

stefmikhail
Author by

stefmikhail

Updated on June 14, 2022

Comments

  • stefmikhail
    stefmikhail almost 2 years

    I have the following table in a MySQL database:

    CREATE TABLE `secondary_images` (
      `imgId` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `primaryId` int(10) unsigned DEFAULT NULL,
      `view` varchar(255) DEFAULT NULL,
      `imgURL` varchar(255) DEFAULT NULL,
      `imgDate` datetime DEFAULT NULL,
      PRIMARY KEY (`imgId`),
      KEY `primaryId` (`primaryId`),
      KEY `imgDate` (`imgDate`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=latin1 ;
    

    The SQL will be the following:

    SELECT imgURL, view FROM secondary_images 
    WHERE primaryId={$imgId} ORDER BY imgDate DESC
    

    As you can see I made both the primaryId and imgDate, Index Keys. My thinking behind that was because the WHERE clause queries results using the primaryId, and the ORDER clause uses imgDate.

    My question is, would it be better to use Multiple Indexes as I am right now? Or should I a Multiple Column Index (something I don't understand all too well at the moment)?

    This is what I get from EXPLAIN:

    id = 1   
    select_type = simple      
    table = secondary_images         
    type = ref
    possible_keys = primaryId
    key = primaryId
    key_len = 5
    ref = const
    rows = 1
    extra = Using where; Using filesort
    

    NOTE: This is not using a Multiple Column Index, it is the result from using the above table description.

    • Konerak
      Konerak over 12 years
      Can you post the EXPLAIN for the select? :)
    • Marvo
      Marvo over 12 years
      Keep in mind that indeces are not free. If you have multiple indeces, that means on each insert or update, each index needs to be updated. You need to weigh the performance hit of those updates to the performance improvement you'll see on retrieval.
    • Poodlehat
      Poodlehat over 12 years
      Also remember that if you have a column that has the same value in many rows, indexes might actually make performance worse.
    • stefmikhail
      stefmikhail over 12 years
      @Marvo - This database is used purely for retrieval. I will be the only person placing data in the database. I think that solves any problems concerning what you mentioned?
    • stefmikhail
      stefmikhail over 12 years
      @Poodlehat - I do actually have what you mention. The primaryId is not unique to each row. In fact, there will often be the same primaryId for up to 10 entries. That will cause a problem you think?
    • stefmikhail
      stefmikhail over 12 years
      @Konerak - I'm sorry, what are you asking me to post?
    • Konerak
      Konerak over 12 years
      @stef: aha! You're about to learn something very useful. Do the SELECT query, but add the explain keyword before it: explain SELECT imgURL, view FROM secondary_images WHERE primaryId={$imgId} ORDER BY imgDate DESC. Read Optimizing Queries With Explain.
    • stefmikhail
      stefmikhail over 12 years
      @Konerak - ooooh I've never heard of that! I'll give it a go!
    • Marvo
      Marvo over 12 years
      @stefmikhail Wasn't implying there were problems with multiple indexes. Just something you need to consider.
    • Poodlehat
      Poodlehat over 12 years
      It really depends on the size of your table. Something to look out for is to run the EXPLAIN and see if MySQL is actually using your index or not. It might decide to not use it if there are many duplicates. I have a book that calls this the "scan vs seek" problem. Basically, you would do something like SELECT COUNT (DISTINCT primaryId) FROM secondary_images and SELECT COUNT(*) FROM secondary_images and calculate the ratio. The larger the number, the higher the "selectivity". If the number is too low, it won't make a useful index.
    • stefmikhail
      stefmikhail over 12 years
      @Marvo - Good to know for the future. Thanks.
    • stefmikhail
      stefmikhail over 12 years
      @Konerak - And what, do I echo something after the query with explain at the beginning?
    • stefmikhail
      stefmikhail over 12 years
      @Poodlehat - I'm only slightly following what you're saying. Have a useful link that might explain more? I would ask you to write me a detailed email on the subject, but I think a link would be easier on you ;)
    • Konerak
      Konerak over 12 years
      @stefmikhail: eh, just execute the explain. Use a MySQL client (phpmyadmin, console, mysql query browser, toad...) and see?
    • stefmikhail
      stefmikhail over 12 years
      @Konerak - I found a way to print it on my page in an array. But I have no idea what it means. Any chance you could take a quick look and explain? I'll post it in my question above.
    • Poodlehat
      Poodlehat over 12 years
    • stefmikhail
      stefmikhail over 12 years
      @Poodlehat - Thanks for all the information, and I will, indeed read it. But in your opinion, should I, or should I not, switch from having 2 indexes, to having 1 multiple column index?
    • Poodlehat
      Poodlehat over 12 years
      I think I've been convinced to make the switch.
    • Markus Winand
      Markus Winand over 12 years
    • stefmikhail
      stefmikhail over 12 years
      @Poodlehat - And lastly (I promise), could you please post the sql code to make the switch? I'm still not that apt when it comes to sql. :)
    • Poodlehat
      Poodlehat over 12 years
      something like ALTER TABLE secondary_images DROP INDEX primaryId, DROP INDEX imgDate, ADD INDEX IdNDate (imgId DESC, primaryId ASC) ; should do the trick.
    • stefmikhail
      stefmikhail over 12 years
      @Poodlehat - Thanks a bunch. Wish you posted an answer so I could give you props!
  • Poodlehat
    Poodlehat over 12 years
    I don't understand why - it is comparing one column to another, not two columns simultaneously to two other columns. Can you explain?
  • stefmikhail
    stefmikhail over 12 years
    @arnaud576875 - Thanks very much! Any chance you could give me a brief explanation of why that is best choice in this situation? Also, how would I go about changing my current table using SQL to use a multi-column index on (primaryId, imgDate)?
  • stefmikhail
    stefmikhail over 12 years
    @arnaud576875 - Excellent explanation. Curious however; the primaryId is not unique. Up to ten different entires may have the same primaryId. Does this change anything?
  • stefmikhail
    stefmikhail over 12 years
    Thanks again! I will change it.
  • stefmikhail
    stefmikhail over 12 years
    Wow, wow wow. Thank you very much for that. So much easier to understand. So how can I use this information to decide if a multiple index key is the way to go? I won't ask you to get into the filesort issue as you yourself said that's another problem entirely, but would switching to a multiple index key help with this?
  • Konerak
    Konerak over 12 years
    You should first read up about explain. The MySQL site is a great place to start, and 'High Performance MySQL' is the best MySQL book I ever read. Then, understand that it very much depends on your table, the storage engine, your configuration (cache sizes and so), and the data in the table. So, the best way to test: copy the table, and add the desired index on the copy. Then, compare the explains. This is why you need to understand explain :)
  • stefmikhail
    stefmikhail over 12 years
    Thanks again. Much appreciated. I'll pick up that book.
  • Konerak
    Konerak over 12 years
    I usually prefer (free) internet resources (googleable, ctrl-f, bookmarks) over dead trees, but that book is great. A few of the authors have very interesting blogs too, their site is percona.com
  • stefmikhail
    stefmikhail over 12 years
    I've never checked out those internet resources. Thanks for the links. I do have several apps for my iPhone which are O'Reilly Media books on MySQL. I should read more of them.
  • timetofly
    timetofly almost 10 years
    Important correction: "Using filesort" does NOT necessarily mean it has to write to disk. It can also do it in memory if it doesn't exceed the sort buffer size.
  • Martin
    Martin over 8 years
    @Blossoming_Flower does the appearance of using filesort imply query inefficiency? I have queries that return a few ~16Kb from a table of ~30 rows but they always use filesort....
  • timetofly
    timetofly over 8 years
    In general, yes, using filesort means it's using more resources. However, for such a small data set as yours, you won't see any difference. For differences in performance you'd need a lot more data involved.