MySQL Index is bigger than the data stored
Solution 1
You can remove index A, because, as you have noted, it is a subset of another index. And it's possible to do this without disrupting normal processing.
The size of the index files is not alarming in itself and it can easily be true that the net benefit is positive. In other words, the usefulness and value of an index shouldn't be discounted because it results in a large file.
Index design is a complex and subtle art involving a deep understanding of the query optimizer explanations and extensive testing. But one common mistake is to include too few fields in an index in order to make it smaller. Another is to test indexes with insufficient, or insufficiently representative data.
Solution 2
I may be wrong, but the first index (idx_customer_invoice
) is UNIQUE, the second (idx_customer_invoice_order
) is not, so you'll probably lose the uniqueness constraint when you remove it. No?
Solution 3
Is there a way to find unused indexes in MySQL?
The database engine optimizer will select a proper index when attempting to optimize your query. Depending on when you collected statistics on your indexes last, the index which is chosen will vary. Unused indexes could suddenly become used because of new data repartition.
Can indexA safely be removed?
I would say yes, if indexA and indexB are B-Tree indexes. This is because an index that starts with the same columns in the same order will have the same structure.
Solution 4
use
show indexes from table;
to define what indexes do you have in a particular table. Cardinality would tell how useful your index is.
You can remove your indexes safely (it will not break a table), but beware: some queries might execute slower. First you should analyze your queries to decide whether you need a certain index or not.
I don't think you can find out data length of a particular index, though.
BUT, I think you probably think that if indexes length is greater than data length twice is something abnormal... Well, you are wrong. All of your indexes might be useful ;) If you have a table that provides a lot of information and you have to search on it upon a large number of column, it can easily be that indexes of this table will 2 times bigger in size that the tables data.
Peter Lindqvist
Updated on June 03, 2022Comments
-
Peter Lindqvist almost 2 years
I have a database with the following stats
Tables Data Index Total 11 579,6 MB 0,9 GB 1,5 GB
So as you can see the Index is close to 2x bigger. And there is one table with ~7 million rows that takes up at least 99% of this.
I also have two indexes that are very similar
a) UNIQUE KEY `idx_customer_invoice` (`customer_id`,`invoice_no`), b) KEY `idx_customer_invoice_order` (`customer_id`,`invoice_no`,`order_no`)
Update: Here is the table definition (at least structurally) of the largest table
CREATE TABLE `invoices` ( `id` int(10) unsigned NOT NULL auto_increment, `customer_id` int(10) unsigned NOT NULL, `order_no` varchar(10) default NULL, `invoice_no` varchar(20) default NULL, `customer_no` varchar(20) default NULL, `name` varchar(45) NOT NULL default '', `archived` tinyint(4) default NULL, `invoiced` tinyint(4) default NULL, `time` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP, `group` int(11) default NULL, `customer_group` int(11) default NULL, PRIMARY KEY (`id`), UNIQUE KEY `idx_customer_invoice` (`customer_id`,`invoice_no`), KEY `idx_time` (`time`), KEY `idx_order` (`order_no`), KEY `idx_customer_invoice_order` (`customer_id`,`invoice_no`,`order_no`) ) ENGINE=InnoDB AUTO_INCREMENT=9146048 DEFAULT CHARSET=latin1 |
Update 2:
mysql> show indexes from invoices; +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | invoices | 0 | PRIMARY | 1 | id | A | 7578066 | NULL | NULL | | BTREE | | | invoices | 0 | idx_customer_invoice | 1 | customer_id | A | 17 | NULL | NULL | | BTREE | | | invoices | 0 | idx_customer_invoice | 2 | invoice_no | A | 7578066 | NULL | NULL | YES | BTREE | | | invoices | 1 | idx_time | 1 | time | A | 541290 | NULL | NULL | | BTREE | | | invoices | 1 | idx_order | 1 | order_no | A | 6091 | NULL | NULL | YES | BTREE | | | invoices | 1 | idx_customer_invoice_order | 1 | customer_id | A | 17 | NULL | NULL | | BTREE | | | invoices | 1 | idx_customer_invoice_order | 2 | invoice_no | A | 7578066 | NULL | NULL | YES | BTREE | | | invoices | 1 | idx_customer_invoice_order | 3 | order_no | A | 7578066 | NULL | NULL | YES | BTREE | | +----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
My questions are:
- Is there a way to find unused indexes in MySQL?
- Are there any common mistakes that impact the size of the index?
- Can indexA safely be removed?
- How can you measure the size of each index? All I get is the total of all indexes.
-
Benoit about 13 yearsFor the two other questions I am not sure I can answer properly.
-
Peter Lindqvist about 13 yearsCan you specify how cardinality corresponds with usefulness? And the indexes are there for a reason, the performance of the queries against this table is critical to the application.
-
Peter Lindqvist about 13 yearsBut, I'm glad to hear that it's not abnormal at least... :P
-
Nemoden about 13 yearsIt is said in official documentation: "The higher the cardinality, the greater the chance that MySQL uses the index when doing joins". Practically it means this column will be used in joins more times then other with low cardinality. Cardinality evaluation is based on statistics. How exactly? Well... I don't know :) It is also true that if cardinality is high, this index consumes more volume since "Cardinality is an estimate of the number of unique values in the index".
-
Peter Lindqvist about 13 years3. This gives me a number, 1003831296, what does it mean?
-
Nemoden about 13 years@Peter Lindqvist, absolutely not. In fact, if we did not have many indexes on main table at site I work on (600k+ every day users), our MySQL server would always be down. Index length/Data length in this tables is 2 (index is 2 times bigger). And storage engine is InnoDB. Just like in your case.
-
Peter Lindqvist about 13 yearsI can agree on the 'subtle art'.
-
neocanable about 13 years@Peter Lindqvist index length is 1003831296B ; you also can use show table status like 'your_table_name'
-
Nemoden about 13 yearsAnd... yes. I agree with the colleagues: indexA may be removed.
-
Peter Lindqvist about 13 yearsHmm, I would like to see size of individual indexes.