mysqlimport --use-threads

12,857

Solution 1

Are you using MySQL 5.1.7 or later?

If you want to test whether things are actually going through as expected, why not use a test schema and only a sample of data so that it runs faster?
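One way to build such a sample is to keep just the first N rows of each `--tab` dump file and import those into the test schema. This is an illustrative sketch (the paths and the helper name are hypothetical, not from the original answer):

```python
from pathlib import Path

def sample_tab_files(src_dir, dst_dir, rows=10000):
    """Copy the first `rows` lines of each mysqldump --tab .txt file
    into dst_dir, preserving file names for mysqlimport."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for src in Path(src_dir).glob("*.txt"):
        with open(src) as fin, open(dst / src.name, "w") as fout:
            for i, line in enumerate(fin):
                if i >= rows:
                    break
                fout.write(line)

# afterwards, something like:
#   mysqlimport --use-threads=4 test_schema /tmp/sample/*.txt
```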

Update: With regard to whether --use-threads is working, I'm not sure of a way to check definitively. However, I can't see any real difference in some tests I just ran with ~130 MB of data:

mysqlimport --lock-tables --debug-info --use-threads=2 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.02, System time 0.08
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 737, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1340, Involuntary context switches 17

----

mysqlimport --lock-tables --debug-info --use-threads=4 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.03, System time 0.09
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 738, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1343, Involuntary context switches 41

----

mysqlimport --lock-tables --debug-info --use-threads=8 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.02, System time 0.09
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 738, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1341, Involuntary context switches 30

Solution 2

In fact, it appears that the threading only kicks in when you specify multiple files to be imported (into multiple tables); it doesn't help with a single large file.

The way to tell whether it's actually threading is to look at the SHOW PROCESSLIST output. Here you can see it actually working, though, as mentioned, on different tables:

mysql> show processlist;
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
| Id    | User   | Host             | db   | Command | Time | State     | Info                                                                                                 |
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
|  4097 | root   | 127.0.0.1:33372  | test | Query   |    0 | executing | show processlist                                                                                     | 
|  6145 | root   | 10.2.13.44:44182 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls2.tsv' INTO TABLE `qpcrecpls2` IGNORE 0 LINES                           | 
|  7169 | root   | 10.2.13.44:44186 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls1.tsv' INTO TABLE `qpcrecpls1` IGNORE 0 LINES                           | 
|  8193 | root   | 10.2.13.44:44184 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls3.tsv' INTO TABLE `qpcrecpls3` IGNORE 0 LINES                           | 
|  9217 | root   | 10.2.13.44:44188 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls4.tsv' INTO TABLE `qpcrecpls4` IGNORE 0 LINES                           | 
| 10241 | root   | 10.2.13.44:44183 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls5.tsv' INTO TABLE `qpcrecpls5` IGNORE 0 LINES                           | 
| 11265 | root   | 10.2.13.44:44185 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls.tsv' INTO TABLE `qpcrecpls` IGNORE 0 LINES                             | 
| 12289 | root   | 10.2.13.44:44189 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls6.tsv' INTO TABLE `qpcrecpls6` IGNORE 0 LINES                           | 
| 13313 | root   | 10.2.13.44:44190 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls7.tsv' INTO TABLE `qpcrecpls7` IGNORE 0 LINES                           | 
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+

The --verbose output is also illuminating.

Solution 3

Like @bill-karwin, I also use mydumper/myloader.

On Ubuntu:

sudo apt-get install mydumper

On anything else, follow these instructions: http://centminmod.com/mydumper.html

Then, to backup:

mydumper -h [host] -u [user] -p [pass] -P [port] -B [database] -c -C --regex '^(?!excluded_table_1|excluded_table_2)' -v 3

Then, to restore:

myloader -d [dir_created_by_mydumper] -h [host] -u [user] -p [pass] -P [port] -o -C -v 3

Notes:

  • The -C flag compresses the MySQL protocol; only use it on externally hosted DBs.
  • -c uses gzip compression; it's a good idea to leave it on all the time.
  • -v 3 gives verbose output (so you can see what is going on); the script will run faster if you leave it off.
  • --regex can be any valid regex.
  • Create a ~/.my.cnf file and you no longer need to pass the host, user, pass, and port options.
  • mydumper --help and myloader --help will give you the full list of options.
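A minimal ~/.my.cnf along those lines might look like the following (all values are placeholders, and this assumes the tools read the standard [client] section):

```ini
[client]
host     = db.example.com
user     = backup_user
password = secret
port     = 3306
```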

Solution 4

From skimming the source code, I think you should disable the --lock-tables option: the threaded code path is only entered when opt_use_threads is set and lock_tables is not, see below:

 #ifdef HAVE_LIBPTHREAD
  if (opt_use_threads && !lock_tables)
  {
    pthread_t mainthread;            /* Thread descriptor */
    pthread_attr_t attr;          /* Thread attributes */
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr,
                                PTHREAD_CREATE_DETACHED);

    VOID(pthread_mutex_init(&counter_mutex, NULL));
    VOID(pthread_cond_init(&count_threshhold, NULL));

    for (counter= 0; *argv != NULL; argv++) /* Loop through tables */
    {
      pthread_mutex_lock(&counter_mutex);
      while (counter == opt_use_threads)
      {
        struct timespec abstime;

        set_timespec(abstime, 3);
        pthread_cond_timedwait(&count_threshhold, &counter_mutex, &abstime);
      }
      /* Before exiting the lock we set ourselves up for the next thread */
      counter++;
      pthread_mutex_unlock(&counter_mutex);
      /* now create the thread */
      if (pthread_create(&mainthread, &attr, worker_thread, 
                         (void *)*argv) != 0)
      {
        pthread_mutex_lock(&counter_mutex);
        counter--;
        pthread_mutex_unlock(&counter_mutex);
        fprintf(stderr,"%s: Could not create thread\n",
                my_progname);
      }
    }

    /*
      We loop until we know that all children have cleaned up.
    */
    pthread_mutex_lock(&counter_mutex);
    while (counter)
    {
      struct timespec abstime;

      set_timespec(abstime, 3);
      pthread_cond_timedwait(&count_threshhold, &counter_mutex, &abstime);
    }
    pthread_mutex_unlock(&counter_mutex);
    VOID(pthread_mutex_destroy(&counter_mutex));
    VOID(pthread_cond_destroy(&count_threshhold));
    pthread_attr_destroy(&attr);
  }
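For reference, the throttling pattern in that excerpt (a shared counter plus a condition variable capping live workers at --use-threads, then waiting for the counter to drain) can be sketched in Python. This is an illustration of the pattern only, not mysqlimport itself:

```python
import threading

def run_parallel(jobs, max_threads):
    """Run callables with at most max_threads concurrent workers,
    mirroring mysqlimport's counter + condition-variable throttle."""
    cond = threading.Condition()
    counter = 0                       # number of live worker threads

    def worker(job):
        nonlocal counter
        job()
        with cond:
            counter -= 1              # like the worker_thread cleanup
            cond.notify()

    for job in jobs:                  # loop through "tables"
        with cond:
            while counter == max_threads:   # throttle, like the C while loop
                cond.wait()
            counter += 1
        # fire-and-forget, like the detached pthreads in the excerpt
        threading.Thread(target=worker, args=(job,)).start()

    with cond:                        # wait until all children clean up
        while counter:
            cond.wait()
```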
Author: Kevin

Updated on June 04, 2022

Comments

  • Kevin
    Kevin almost 2 years

    I have a large database I'm copying to a slave server. Trying to import it (about 15GB) via a regular mysqldump took 2 days and failed. So I'm trying the mysqldump --tab trick.

    I also want to import using --use-threads - but it doesn't seem to be doing multiple tables at once. Is there any way to tell if it's even working?

    mysqldump --single-transaction --quick --hex-blob --master-data=1 --tab=/tmp/backup/ apm
    

    on slave:

    cat *.sql | mysql apm
    mysqlimport --lock-tables --use-threads=4 apm /tmp/backup/*.txt
    

Also: any idea how to disable the binlog without editing the conf file and restarting the server? It seems rather dumb and slow that MySQL is copying all the data into the binlog again too.

  • Kevin
    Kevin almost 13 years
    I am. But how would I tell with a small sample that it was doing it in parallel still?
  • Kevin
    Kevin almost 13 years
Yes, I'm on 5.5.8. There's surprisingly little mention of --use-threads online; does no one do backups/restores of large databases?! It seems like anything over a few gigs would take several days with InnoDB. Percona has a data copy tool, but it copies all databases and would require more setup for me.
  • cEz
    cEz almost 13 years
    Sorry, I read "Is there any way to tell if it's even working?" as part of the whole question, not just for the use of threads. The sample suggestion was to quickly check if your process was working, as you hadn't indicated that it was known to work.
  • Kevin
    Kevin almost 11 years
    How interesting, thanks :) I've actually been using the percona mysql's hot backup scripts for the last year or so. They make a full 'bootable' mysql data directory, so there's nothing to import.
  • tiomno
    tiomno over 5 years
    Yep, from mysqlimport --help -> -l, --lock-tables Lock all tables for write (this disables threads).
  • John
    John over 3 years
I've had segmentation faults with that tool; it's not stable.