Multiple Column Index vs Multiple Indexes
Solution 1
You should use a multi-column index on (primaryId, imgDate) so that MySQL is able to use it for selecting the rows and sorting.
If all the columns used for sorting are not in the index used for selection, MySQL uses the "filesort" strategy, which consists of sorting all rows (in memory if there is not too much rows; on disk else).
If all columns used for sorting are in the index, MySQL uses the index to get the rows order (with some restrictions).
MySQL uses a tree structure for the indexes. This allows to access keys in order directly without sorting.
A multi-column index is basically an index of the concatenation of the columns. This allows MySQL to find the first row matching primaryId={$imgId}
, and then access all the other rows directly in the right order.
With a single-row index on primaryId
, MySQL can find all the rows matching primaryId={$imgId}
, but it will find the rows in no particular order; so it will have to sort them after that.
See EXPLAIN and ORDER BY Optimization.
Solution 2
Your explain looks like this:
[id] => 1
[select_type] => SIMPLE
[table] => secondary_images
[type] => ref
[possible_keys] => primaryId
[key] => primaryId
[key_len] => 5
[ref] => const
[rows] => 1
[Extra] => Using where; Using filesort
Let's walk through it.
[id] => 1
Means we're talking about the first table. You're only calling one table in your statement.
[select_type] => SIMPLE
We're doing a simple SELECT.
[table] => secondary_images
The table name in question.
[type] => ref
The select type, most important for joins.
[possible_keys] => primaryId
This is an important field: it shows which keys can possibly be used to aid the query in executing faster. In this case, only your primary key is deemed useful.
[key] => primaryId
This is an important field: it shows which key(s) finally were used. In this case, the primary key.
[key_len] => 5
[ref] => const
[rows] => 1
Guesssing the number of rows examined by the query.
[Extra] => Using where; Using filesort
The most important field imho. - Using where: You are using a where-statement. Quite ok. - Using filesort: the result of your query is so big, it can't be sorted in memory. MySQL has to write it to a file, sort the file, and then output. This means disk access and will slow down everything. Adding an index that can aid the sorting often helps, but solving "using filesort" is a chapter on its own.
Related videos on Youtube
stefmikhail
Updated on June 14, 2022Comments
-
stefmikhail almost 2 years
I have the following table in a MySQL database:
CREATE TABLE `secondary_images` ( `imgId` int(10) unsigned NOT NULL AUTO_INCREMENT, `primaryId` int(10) unsigned DEFAULT NULL, `view` varchar(255) DEFAULT NULL, `imgURL` varchar(255) DEFAULT NULL, `imgDate` datetime DEFAULT NULL, PRIMARY KEY (`imgId`), KEY `primaryId` (`primaryId`), KEY `imgDate` (`imgDate`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 ;
The SQL will be the following:
SELECT imgURL, view FROM secondary_images WHERE primaryId={$imgId} ORDER BY imgDate DESC
As you can see I made both the
primaryId
andimgDate
, Index Keys. My thinking behind that was because theWHERE
clause queries results using theprimaryId
, and theORDER
clause usesimgDate
.My question is, would it be better to use Multiple Indexes as I am right now? Or should I a Multiple Column Index (something I don't understand all too well at the moment)?
This is what I get from EXPLAIN:
id = 1 select_type = simple table = secondary_images type = ref possible_keys = primaryId key = primaryId key_len = 5 ref = const rows = 1 extra = Using where; Using filesort
NOTE: This is not using a Multiple Column Index, it is the result from using the above table description.
-
Konerak over 12 yearsCan you post the EXPLAIN for the select? :)
-
Marvo over 12 yearsKeep in mind that indeces are not free. If you have multiple indeces, that means on each insert or update, each index needs to be updated. You need to weigh the performance hit of those updates to the performance improvement you'll see on retrieval.
-
Poodlehat over 12 yearsAlso remember that if you have a column that has the same value in many rows, indexes might actually make performance worse.
-
stefmikhail over 12 years@Marvo - This database is used purely for retrieval. I will be the only person placing data in the database. I think that solves any problems concerning what you mentioned?
-
stefmikhail over 12 years@Poodlehat - I do actually have what you mention. The
primaryId
is not unique to each row. In fact, there will often be the sameprimaryId
for up to 10 entries. That will cause a problem you think? -
stefmikhail over 12 years@Konerak - I'm sorry, what are you asking me to post?
-
Konerak over 12 years@stef: aha! You're about to learn something very useful. Do the SELECT query, but add the
explain
keyword before it:explain SELECT imgURL, view FROM secondary_images WHERE primaryId={$imgId} ORDER BY imgDate DESC
. Read Optimizing Queries With Explain. -
stefmikhail over 12 years@Konerak - ooooh I've never heard of that! I'll give it a go!
-
Marvo over 12 years@stefmikhail Wasn't implying there were problems with multiple indexes. Just something you need to consider.
-
Poodlehat over 12 yearsIt really depends on the size of your table. Something to look out for is to run the EXPLAIN and see if MySQL is actually using your index or not. It might decide to not use it if there are many duplicates. I have a book that calls this the "scan vs seek" problem. Basically, you would do something like
SELECT COUNT (DISTINCT primaryId) FROM secondary_images
andSELECT COUNT(*) FROM secondary_images
and calculate the ratio. The larger the number, the higher the "selectivity". If the number is too low, it won't make a useful index. -
stefmikhail over 12 years@Marvo - Good to know for the future. Thanks.
-
stefmikhail over 12 years@Konerak - And what, do I
echo
something after the query with explain at the beginning? -
stefmikhail over 12 years@Poodlehat - I'm only slightly following what you're saying. Have a useful link that might explain more? I would ask you to write me a detailed email on the subject, but I think a link would be easier on you ;)
-
Konerak over 12 years@stefmikhail: eh, just execute the explain. Use a MySQL client (phpmyadmin, console, mysql query browser, toad...) and see?
-
stefmikhail over 12 years@Konerak - I found a way to print it on my page in an array. But I have no idea what it means. Any chance you could take a quick look and explain? I'll post it in my question above.
-
Poodlehat over 12 years@stefmikhail look at slide #12 joinfu.com/presentations/index_coding_optimization.pdf
-
stefmikhail over 12 years@Poodlehat - Thanks for all the information, and I will, indeed read it. But in your opinion, should I, or should I not, switch from having 2 indexes, to having 1 multiple column index?
-
Poodlehat over 12 yearsI think I've been convinced to make the switch.
-
Markus Winand over 12 years
-
stefmikhail over 12 years@Poodlehat - And lastly (I promise), could you please post the sql code to make the switch? I'm still not that apt when it comes to sql. :)
-
Poodlehat over 12 yearssomething like
ALTER TABLE secondary_images DROP INDEX primaryId, DROP INDEX imgDate, ADD INDEX IdNDate (imgId DESC, primaryId ASC) ;
should do the trick. -
stefmikhail over 12 years@Poodlehat - Thanks a bunch. Wish you posted an answer so I could give you props!
-
-
Poodlehat over 12 yearsI don't understand why - it is comparing one column to another, not two columns simultaneously to two other columns. Can you explain?
-
stefmikhail over 12 years@arnaud576875 - Thanks very much! Any chance you could give me a brief explanation of why that is best choice in this situation? Also, how would I go about changing my current table using
SQL
to use a multi-column index on(primaryId, imgDate)
? -
stefmikhail over 12 years@arnaud576875 - Excellent explanation. Curious however; the
primaryId
is not unique. Up to ten different entires may have the sameprimaryId
. Does this change anything? -
stefmikhail over 12 yearsThanks again! I will change it.
-
stefmikhail over 12 yearsWow, wow wow. Thank you very much for that. So much easier to understand. So how can I use this information to decide if a multiple index key is the way to go? I won't ask you to get into the filesort issue as you yourself said that's another problem entirely, but would switching to a multiple index key help with this?
-
Konerak over 12 yearsYou should first read up about explain. The MySQL site is a great place to start, and 'High Performance MySQL' is the best MySQL book I ever read. Then, understand that it very much depends on your table, the storage engine, your configuration (cache sizes and so), and the data in the table. So, the best way to test: copy the table, and add the desired index on the copy. Then, compare the explains. This is why you need to understand explain :)
-
stefmikhail over 12 yearsThanks again. Much appreciated. I'll pick up that book.
-
Konerak over 12 yearsI usually prefer (free) internet resources (googleable, ctrl-f, bookmarks) over dead trees, but that book is great. A few of the authors have very interesting blogs too, their site is percona.com
-
stefmikhail over 12 yearsI've never checked out those internet resources. Thanks for the links. I do have several apps for my iPhone which are O'Reilly Media books on MySQL. I should read more of them.
-
timetofly almost 10 yearsImportant correction: "Using filesort" does NOT necessarily mean it has to write to disk. It can also do it in memory if it doesn't exceed the sort buffer size.
-
Martin over 8 years@Blossoming_Flower does the appearance of
using filesort
imply query inefficiency? I have queries that return a few ~16Kb from a table of ~30 rows but they always use filesort.... -
timetofly over 8 yearsIn general, yes, using filesort means it's using more resources. However, for such a small data set as yours, you won't see any difference. For differences in performance you'd need a lot more data involved.