How-to manage a MySQL table with over 2 million records

11,963

Solution 1

From what I can see there the table seems fairly self contained (i.e. you don't need to do any LOJ's to pull out normalised data) so MyISAM could certainly have a positive effect on the access speed.

Secondly, and most importantly, do you have the correct indexes for your queries? 2 million rows is a few, but it's not really that many. You need to carefully go through all your SELECT queries and make sure that you have an appropriate index for each one. This will consume a bit of disk space, but the tradeoff is incredibly fast query times.

Thirdly, and this is just a personal preference and not really much to do with your specific problem I don't think, but NDATA_INVOICE_USER_ELEMENT_ATTRIBUTE1 to NDATA_INVOICE_USER_ELEMENT_ATTRIBUTE50 - this could be designed a lot smarter by moving them into a table called DATA_INVOICE_USER_ELEMENT_ATTRIBUTES with a PK of INVID,ATTRIBUTEID and having them stored vertically in there, and immediately you've saved yourself 6.25kb of space per row.

Solution 2

make sure at least your indexes fit in memory. set innodb_buffer_pool_size large enough. if you need transactions or have lot of concurrent write access - stick to innodb.

if it's read only mostly and little updates - myisam perheps + tuned memory allocation for it.

try mysqltuner.pl for some generic advice and dive into mysqlperformanceblog for more in-depth details.

Share:
11,963

Related videos on Youtube

kschneider
Author by

kschneider

Just a dude.

Updated on September 17, 2022

Comments

  • kschneider
    kschneider over 1 year

    I have an InnoDB table running on MySQL 5.0.45 in CentOS. To make matters worse it is all running in a virtual machine. There are multiple tables that have the probability of exceeding 2 million records very easily. I have noticed that starting around the 900K to 1M record mark DB performance starts to nosedive.

    I have all the power necessary to make any and all changes for the future to keep this thing standing up as well as possible under the circumstances. Should I be using MyIsam? There are only a few indices and I am most worried about getting good write performance. The program that writes data to the tables batches them in groups of about 250 requests per table and performs them on a per table basis to help things out.

    I've included the create table statement for one of the larger tables, and yes, it is a very wide table- I understand. I have tried to make the columns as narrow as I can, while still being able to reliably accommodate the data coming in.

    EDIT:

    The program does employ transactions for rolling back the changes if something hits the fan during the run, but it's basically just pumping data into the database firehose style. One 8 hour run can easily put 400K lines in EACH table like this one. This table is one of 25 that are of similar size and all have the same indices. They are all joined for querying by LINE and RUN_ID. The read performance- I am not too particularly worried about. I am trying to make the write as fast as possible.

    CREATE TABLE IF NOT EXISTS `TMD_INDATA_INVOICE` (
      `ID` int(11) NOT NULL auto_increment,
      `LINE` int(11) NOT NULL,
      `RUN_ID` int(11) NOT NULL,
      `INDATA_INVOICE_ALLOCATION_GROUP_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_ALLOCATION_GROUP_OWNER` varchar(128) default NULL,
      `INDATA_INVOICE_ALLOCATION_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_IS_AUDITED` varchar(5) default NULL,
      `INDATA_INVOICE_BASIS_PERCENT` varchar(32) default NULL,
      `INDATA_INVOICE_COUNTRY_OF_ORIGIN` varchar(64) default NULL,
      `INDATA_INVOICE_CUSTOMER_GROUP_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_CUSTOMER_GROUP_OWNER` varchar(128) default NULL,
      `INDATA_INVOICE_CUSTOMER_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_CUSTOMER_TAX_CATEGORY` varchar(128) default NULL,
      `INDATA_INVOICE_DELIVERY_TERMS` varchar(128) default NULL,
      `INDATA_INVOICE_DEPARTMENT_OF_CONSIGN` varchar(128) default NULL,
      `INDATA_INVOICE_DOCUMENT_TYPE` varchar(128) default NULL,
      `INDATA_INVOICE_END_USE` varchar(128) default NULL,
      `INDATA_INVOICE_END_USER_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_FILTER_GROUP_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_FILTER_GROUP_OWNER` varchar(128) default NULL,
      `INDATA_INVOICE_FISCAL_DATE` varchar(32) default NULL,
      `INDATA_INVOICE_INPUT_RECOVERY_TYPE` varchar(50) default NULL,
      `INDATA_INVOICE_INVOICE_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_IS_AUDITING_MESSAGES` varchar(5) default NULL,
      `INDATA_INVOICE_IS_AUDIT_UPDATE` varchar(5) default NULL,
      `INDATA_INVOICE_IS_BUSINESS_SUPPLY` varchar(5) default NULL,
      `INDATA_INVOICE_IS_CREDIT` varchar(5) default NULL,
      `INDATA_INVOICE_IS_EXEMPT` varchar(5) default NULL,
      `INDATA_INVOICE_IS_NO_TAX` varchar(5) default NULL,
      `INDATA_INVOICE_IS_REPORTED` varchar(5) default NULL,
      `INDATA_INVOICE_IS_REVERSED` varchar(5) default NULL,
      `INDATA_INVOICE_IS_ROUNDING` varchar(5) default NULL,
      `INDATA_INVOICE_IS_SIMPLIFICATION` varchar(5) default NULL,
      `INDATA_INVOICE_MODE_OF_TRANSPORT` varchar(128) default NULL,
      `INDATA_INVOICE_MOVEMENT_DATE` varchar(32) default NULL,
      `INDATA_INVOICE_MOVEMENT_TYPE` varchar(128) default NULL,
      `INDATA_INVOICE_NATURE_OF_TRANSACTION_CODE` varchar(128) default NULL,
      `INDATA_INVOICE_OVERRIDE_AMOUNT` varchar(128) default NULL,
      `INDATA_INVOICE_OVERRIDE_RATE` varchar(32) default NULL,
      `INDATA_INVOICE_PORT_OF_ENTRY` varchar(128) default NULL,
      `INDATA_INVOICE_PORT_OF_LOADING` varchar(128) default NULL,
      `INDATA_INVOICE_PRODUCT_MAPPING_GROUP_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_PRODUCT_MAPPING_GROUP_OWNER` varchar(128) default NULL,
      `INDATA_INVOICE_REGIME` varchar(128) default NULL,
      `INDATA_INVOICE_SUPPLY_EXEMPT_PERCENT` varchar(32) default NULL,
      `INDATA_INVOICE_SUPPLY_TYPE` varchar(128) default NULL,
      `INDATA_INVOICE_TITLE_TRANSFER_LOCATION` varchar(128) default NULL,
      `INDATA_INVOICE_VENDOR_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_VENDOR_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_VENDOR_TAX` varchar(128) default NULL,
      `INDATA_INVOICE_VERSION` varchar(5) default NULL,
      `INDATA_INVOICE_CALCULATION_DIRECTION` varchar(5) default NULL,
      `INDATA_INVOICE_CALLING_SYSTEM_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_COMPANY_NAME` varchar(128) default NULL,
      `INDATA_INVOICE_COMPANY_ROLE` varchar(20) default NULL,
      `INDATA_INVOICE_CUSTOMER_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_CURRENCY_CODE` varchar(32) default NULL,
      `INDATA_INVOICE_EXTERNAL_COMPANY_ID` varchar(128) default NULL,
      `INDATA_INVOICE_HOST_SYSTEM` varchar(128) default NULL,
      `INDATA_INVOICE_INVOICE_DATE` varchar(32) default NULL,
      `INDATA_INVOICE_POINT_OF_TITLE_TRANSFER` varchar(32) default NULL,
      `INDATA_INVOICE_REGISTRATIONS_BUYER_ROLE` varchar(32) default NULL,
      `INDATA_INVOICE_REGISTRATIONS_MIDDLEMAN_ROLE` varchar(32) default NULL,
      `INDATA_INVOICE_REGISTRATIONS_SELLER_ROLE` varchar(32) default NULL,
      `INDATA_INVOICE_VAT_GROUP_REGISTRATION` varchar(32) default NULL,
      `INDATA_INVOICE_TRANSACTION_TYPE` varchar(5) default NULL,
      `INDATA_INVOICE_UNIQUE_INVOICE_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE1` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE2` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE3` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE4` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE5` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE6` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE7` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE8` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE9` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE10` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE11` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE12` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE13` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE14` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE15` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE16` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE17` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE18` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE19` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE20` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE21` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE22` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE23` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE24` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE25` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE26` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE27` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE28` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE29` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE30` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE31` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE32` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE33` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE34` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE35` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE36` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE37` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE38` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE39` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE40` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE41` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE42` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE43` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE44` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE45` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE46` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE47` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE48` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE49` varchar(128) default NULL,
      `INDATA_INVOICE_USER_ELEMENT_ATTRIBUTE50` varchar(128) default NULL,
      `INDATA_INVOICE_ORIGINAL_DOCUMENT_ID` varchar(128) default NULL,
      `INDATA_INVOICE_ORIGINAL_DOCUMENT_ITEM` varchar(128) default NULL,
      `INDATA_INVOICE_ORIGINAL_DOCUMENT_TYPE` varchar(128) default NULL,
      `INDATA_INVOICE_ORIGINAL_INVOICE_DATE` varchar(32) default NULL,
      `INDATA_INVOICE_ORIGINAL_INVOICE_NUMBER` varchar(128) default NULL,
      `INDATA_INVOICE_ORIGINAL_MOVEMENT_DATE` varchar(32) default NULL,
      PRIMARY KEY  (`ID`),
      KEY `RUN_ID` USING BTREE (`RUN_ID`),
      KEY `LINE` (`LINE`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=4011 ;
    
    --
    -- Constraints for dumped tables
    --
    
    --
    -- Constraints for table `TMD_INDATA_INVOICE`
    --
    ALTER TABLE `TMD_INDATA_INVOICE`
      ADD CONSTRAINT `TMD_INDATA_INVOICE_ibfk_1` FOREIGN KEY (`RUN_ID`) REFERENCES `RunHistory` (`id`) ON DELETE CASCADE;
    
  • Nick Kavadias
    Nick Kavadias over 13 years
    normalizing those attributes is not just a personal preference. IT WILL improve performance once they become normalized. Less table scanning (assuming you create some effective indexes) and smaller row size
  • kschneider
    kschneider over 13 years
    The reason the USER_ELEMENT_ATTRIBUTE# are denormalized and wide in the table, is because they all pretty much get used. If they were used sparingly, I would probably have put them in their own lookup table. The application that uses them uses them the same way I have them laid out as generic information dumps based on user defined settings, but they are almost always utilized to about 40-45 columns, so its not quite as retarded as I feel it looks.