MySQL Index is bigger than the data stored

10,378

Solution 1

You can remove index A, because, as you have noted, it is a subset of another index. And it's possible to do this without disrupting normal processing.

The size of the index files is not alarming in itself and it can easily be true that the net benefit is positive. In other words, the usefulness and value of an index shouldn't be discounted because it results in a large file.

Index design is a complex and subtle art involving a deep understanding of the query optimizer explanations and extensive testing. But one common mistake is to include too few fields in an index in order to make it smaller. Another is to test indexes with insufficient, or insufficiently representative data.

Solution 2

I may be wrong, but the first index (idx_customer_invoice) is UNIQUE, the second (idx_customer_invoice_order) is not, so you'll probably lose the uniqueness constraint when you remove it. No?

Solution 3

Is there a way to find unused indexes in MySQL?

The database engine optimizer will select a proper index when attempting to optimize your query. Depending on when you collected statistics on your indexes last, the index which is chosen will vary. Unused indexes could suddenly become used because of new data repartition.

Can indexA safely be removed?

I would say yes, if indexA and indexB are B-Tree indexes. This is because an index that starts with the same columns in the same order will have the same structure.

Solution 4

use

show indexes from table;

to define what indexes do you have in a particular table. Cardinality would tell how useful your index is.

You can remove your indexes safely (it will not break a table), but beware: some queries might execute slower. First you should analyze your queries to decide whether you need a certain index or not.

I don't think you can find out data length of a particular index, though.

BUT, I think you probably think that if indexes length is greater than data length twice is something abnormal... Well, you are wrong. All of your indexes might be useful ;) If you have a table that provides a lot of information and you have to search on it upon a large number of column, it can easily be that indexes of this table will 2 times bigger in size that the tables data.

Share:
10,378
Peter Lindqvist
Author by

Peter Lindqvist

Updated on June 03, 2022

Comments

  • Peter Lindqvist
    Peter Lindqvist almost 2 years

    I have a database with the following stats

    Tables     Data   Index   Total
    11     579,6 MB  0,9 GB  1,5 GB
    

    So as you can see the Index is close to 2x bigger. And there is one table with ~7 million rows that takes up at least 99% of this.

    I also have two indexes that are very similar

    a) UNIQUE KEY `idx_customer_invoice` (`customer_id`,`invoice_no`),
    b) KEY `idx_customer_invoice_order` (`customer_id`,`invoice_no`,`order_no`)
    

    Update: Here is the table definition (at least structurally) of the largest table

    CREATE TABLE `invoices` (
      `id` int(10) unsigned NOT NULL auto_increment,
      `customer_id` int(10) unsigned NOT NULL,
      `order_no` varchar(10) default NULL,
      `invoice_no` varchar(20) default NULL,
      `customer_no` varchar(20) default NULL,
      `name` varchar(45) NOT NULL default '',
      `archived` tinyint(4) default NULL,
      `invoiced` tinyint(4) default NULL,
      `time` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
      `group` int(11) default NULL,
      `customer_group` int(11) default NULL,
      PRIMARY KEY  (`id`),
      UNIQUE KEY `idx_customer_invoice` (`customer_id`,`invoice_no`),
      KEY `idx_time` (`time`),
      KEY `idx_order` (`order_no`),
      KEY `idx_customer_invoice_order` (`customer_id`,`invoice_no`,`order_no`)
    ) ENGINE=InnoDB AUTO_INCREMENT=9146048 DEFAULT CHARSET=latin1 |
    

    Update 2:

    mysql> show indexes from invoices;
    +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    | Table    | Non_unique | Key_name                   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
    +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    | invoices |          0 | PRIMARY                    |            1 | id          | A         |     7578066 |     NULL | NULL   |      | BTREE      |         |
    | invoices |          0 | idx_customer_invoice       |            1 | customer_id | A         |          17 |     NULL | NULL   |      | BTREE      |         |
    | invoices |          0 | idx_customer_invoice       |            2 | invoice_no  | A         |     7578066 |     NULL | NULL   | YES  | BTREE      |         |
    | invoices |          1 | idx_time                   |            1 | time        | A         |      541290 |     NULL | NULL   |      | BTREE      |         |
    | invoices |          1 | idx_order                  |            1 | order_no    | A         |        6091 |     NULL | NULL   | YES  | BTREE      |         |
    | invoices |          1 | idx_customer_invoice_order |            1 | customer_id | A         |          17 |     NULL | NULL   |      | BTREE      |         |
    | invoices |          1 | idx_customer_invoice_order |            2 | invoice_no  | A         |     7578066 |     NULL | NULL   | YES  | BTREE      |         |
    | invoices |          1 | idx_customer_invoice_order |            3 | order_no    | A         |     7578066 |     NULL | NULL   | YES  | BTREE      |         |
    +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
    

    My questions are:

    1. Is there a way to find unused indexes in MySQL?
    2. Are there any common mistakes that impact the size of the index?
    3. Can indexA safely be removed?
    4. How can you measure the size of each index? All I get is the total of all indexes.
  • Benoit
    Benoit about 13 years
    For the two other questions I am not sure I can answer properly.
  • Peter Lindqvist
    Peter Lindqvist about 13 years
    Can you specify how cardinality corresponds with usefulness? And the indexes are there for a reason, the performance of the queries against this table is critical to the application.
  • Peter Lindqvist
    Peter Lindqvist about 13 years
    But, I'm glad to hear that it's not abnormal at least... :P
  • Nemoden
    Nemoden about 13 years
    It is said in official documentation: "The higher the cardinality, the greater the chance that MySQL uses the index when doing joins". Practically it means this column will be used in joins more times then other with low cardinality. Cardinality evaluation is based on statistics. How exactly? Well... I don't know :) It is also true that if cardinality is high, this index consumes more volume since "Cardinality is an estimate of the number of unique values in the index".
  • Peter Lindqvist
    Peter Lindqvist about 13 years
    3. This gives me a number, 1003831296, what does it mean?
  • Nemoden
    Nemoden about 13 years
    @Peter Lindqvist, absolutely not. In fact, if we did not have many indexes on main table at site I work on (600k+ every day users), our MySQL server would always be down. Index length/Data length in this tables is 2 (index is 2 times bigger). And storage engine is InnoDB. Just like in your case.
  • Peter Lindqvist
    Peter Lindqvist about 13 years
    I can agree on the 'subtle art'.
  • neocanable
    neocanable about 13 years
    @Peter Lindqvist index length is 1003831296B ; you also can use show table status like 'your_table_name'
  • Nemoden
    Nemoden about 13 years
    And... yes. I agree with the colleagues: indexA may be removed.
  • Peter Lindqvist
    Peter Lindqvist about 13 years
    Hmm, I would like to see size of individual indexes.