MySQL optimization of huge table


Here are some InnoDB examples that work on large tables of approx. 60 to 500 million rows, demonstrating the advantages of a well-designed InnoDB table and how best to use clustered indexes (only available with InnoDB); a minimal design sketch follows these links:

MySQL and NoSQL: Help me to choose the right one

60 million entries, select entries from a certain month. How to optimize database?

Rewriting mysql select to reduce time and writing tmp to disk
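
The common thread in those examples is making InnoDB's clustered (primary) key match the dominant read pattern. A minimal sketch, with hypothetical table and column names:

    -- Hypothetical many-to-many table, clustered on (parent_id, child_id):
    -- rows sharing a parent_id are stored physically together, so a join or
    -- range scan on parent_id reads contiguous pages instead of doing one
    -- random lookup per row.
    CREATE TABLE example_link (
      parent_id INT UNSIGNED NOT NULL,
      child_id  INT UNSIGNED NOT NULL,
      weight    DOUBLE DEFAULT NULL,
      PRIMARY KEY (parent_id, child_id)
    ) ENGINE=InnoDB;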

You will also want to read the following:

http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html

http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/

Once you've sorted out your table designs, optimise your InnoDB config (an illustrative snippet follows these links):

http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/

http://www.mysqlperformanceblog.com/2007/11/03/choosing-innodb_buffer_pool_size/
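
As a rough illustration of the kind of settings those posts discuss (the values here are placeholders; size them to your hardware and working set):

    [mysqld]
    # Illustrative only: the buffer pool should hold your hot working set;
    # on a dedicated database server this is often 50-80% of RAM.
    innodb_buffer_pool_size = 4G
    # Larger redo logs help sustained write loads; on older MySQL versions
    # resizing them requires a clean shutdown first.
    innodb_log_file_size = 256M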

To load the target table in primary-key order, so that InnoDB appends to the end of the clustered index rather than splitting pages, you can try something like:

    start transaction;
    insert into target_table (x,y) select x,y from source_table order by x,y;
    commit;
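
If a single insert-select is too slow or holds locks for too long, the same ordering idea works as a dump and reload; a sketch, with the file path and column names as placeholders (note that secure_file_priv may restrict where the server can write):

    -- Dump the source in clustered-key order
    SELECT x, y
    FROM source_table
    ORDER BY x, y
    INTO OUTFILE '/tmp/source_sorted.csv'
    FIELDS TERMINATED BY ',';

    -- Reload into the target in the same order
    LOAD DATA INFILE '/tmp/source_sorted.csv'
    INTO TABLE target_table
    FIELDS TERMINATED BY ','
    (x, y);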

Hope this helps.


Comments

  • lordstyx over 1 year

    I've been trying to get some speed improvements on a certain SELECT query. The situation is as follows: There is a (in my eyes) huge crossing table. It currently has about 20 million rows, but I expect this to grow a lot. Based on this crossing table I need to create another table. To do so, I need to execute the following query:

    SELECT hugeCrossingTable.field3, otherTable.field1, hugeCrossingTable.field2 * otherTable.field3 AS someName
    FROM hugeCrossingTable 
    INNER JOIN otherTable ON hugeCrossingTable.field1 = otherTable.field2
    

    Now this currently results in about a million rows. I already have indexes on field1 in both tables, but the query still takes 18 minutes to finish. I thought about splitting the table, but then I'd need to find a way to split the data, and since it's just a crossing table nothing comes to mind on how to do this.

    Any ideas on how this can be optimized?

    Thanks.

    On request, here are the CREATE TABLE statements:

    CREATE TABLE  `hugeCrossingTable` (
      `field` int(11) NOT NULL,
      `field1` int(11) NOT NULL,
      `field2` double(10,5) DEFAULT NULL,
      `field3` int(4) DEFAULT NULL,
      KEY `field1` (`field1`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    CREATE TABLE  `otherTable` (
      `field` int(10) unsigned NOT NULL AUTO_INCREMENT,
      `field1` int(10) unsigned NOT NULL,
      `field2` int(10) unsigned NOT NULL,
      `field3` decimal(5,2) NOT NULL,
      PRIMARY KEY (`field`)
    ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1;
    

    And the EXPLAIN output:

    id, select_type, table              , type , possible_keys, key     , key_len, ref                , rows, Extra
    1 , 'SIMPLE'   , 'otherTable'       , 'ALL', ''           , ''      , ''     , ''                 , 294 , ''
    1 , 'SIMPLE'   , 'hugeCrossingTable', 'ref', 'field1'     , 'field1', '4'    , 'otherTable.field2', 69  , 'Using where'
    
    • Wrikken almost 13 years
      Show us the CREATE TABLE statements, and the EXPLAIN output of the query. If the indexes are correct, but you have very specific ranges, PARTITIONING the table might help.
    • Joseph Lust almost 13 years
      JOIN is the devil for large tables. You pretty much only want to access a table that size by an index or the primary key. Can you change the table format to suit the query, or even use a temp memory table? This can take you from minutes to ms in query time. Remember, normalization does not imply speed.
  • lordstyx almost 13 years
    Wow, this really looks promising! Thanks a lot already. Updating the indexes of the 20 million row table is going to take a while; I'll get back to this once it's done.
  • Jon Black almost 13 years
    probably quicker to select into outfile in primary key order then import back using load data infile - 10 mins vs... hours :P
  • lordstyx almost 13 years
    Alright, so I removed KEY field1 (field1) from hugeCrossingTable and added PRIMARY KEY (field1,field3), and to otherTable I added KEY theKey (field1,field2); that brought the query time down to 4 seconds. (Maybe it would have been faster to do it that way, but I ran it overnight so that's no problem!) The corresponding DDL is sketched after these comments.
  • lordstyx almost 13 years
    Oh, that should be 4 milliseconds. Sorry!
  • lordstyx almost 13 years
    This optimization is great for reading, but what about writing? I have a table about a quarter the size of this one (but growing, of course) that's written to quite a lot, but also read from frequently (writes > reads). It takes about 30 seconds to read from, and writing isn't fast either. Any ideas on how to get the time down for both? Adding more indexes would only make reading faster.
  • haneulkim about 4 years
    What if your engine is MyISAM?
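
For reference, the index changes lordstyx describes above would look something like the following sketch (table and column names as in the question; not necessarily the exact statements used):

    -- Replace the secondary key with a composite primary (clustered) key.
    -- Note: this requires (field1, field3) pairs to be unique and field3 to
    -- contain no NULLs (MySQL will make the column NOT NULL).
    ALTER TABLE hugeCrossingTable
      DROP KEY field1,
      ADD PRIMARY KEY (field1, field3);

    -- Composite index on otherTable covering the joined and selected columns
    ALTER TABLE otherTable
      ADD KEY theKey (field1, field2);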
