Loading utf-8 encoded text into MySQL table

Solution 1

As stated in http://dev.mysql.com/doc/refman/5.1/en/load-data.html, you can specify the character set used by your CSV file with the optional CHARACTER SET clause of LOAD DATA LOCAL INFILE.

Solution 2

Try

LOAD DATA INFILE 'file'
IGNORE INTO TABLE table
CHARACTER SET UTF8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
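
From Python, the same statement could be issued roughly as follows. This is only a sketch: build_load_statement and load_csv are hypothetical helper names, conn is assumed to be an open MySQLdb connection that permits LOCAL INFILE, and the delimiters are simply copied from the statement above.

```python
import re


def build_load_statement(table, charset="utf8"):
    """Return a LOAD DATA LOCAL INFILE statement with an explicit CHARACTER SET.

    The file path is left as a %s placeholder so the driver can quote it.
    """
    # Identifiers cannot be passed as query parameters, so validate them here.
    for name in (table, charset):
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            raise ValueError("unexpected identifier: %r" % name)
    return (
        "LOAD DATA LOCAL INFILE %s "
        "IGNORE INTO TABLE `{table}` "
        "CHARACTER SET {charset} "
        "FIELDS TERMINATED BY ';' "
        "OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    ).format(table=table, charset=charset)


def load_csv(conn, path, table, charset="utf8"):
    """Run the statement on an existing DB-API connection (assumed: MySQLdb)."""
    cursor = conn.cursor()
    try:
        cursor.execute(build_load_statement(table, charset), (path,))
        conn.commit()
    finally:
        cursor.close()
```

A call would then look like load_csv(conn, '/path/to/data.csv', 'my_table'), where both the path and the table name are placeholders.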

Solution 3

You do not need to encode the characters in your file yourself, but you do need to make sure that the file is encoded as UTF-8 before you load it into the database.
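
One way to check this before loading is to try decoding the file as UTF-8. A minimal sketch follows; is_valid_utf8 is a hypothetical helper name, and the file is read in chunks so that a large CSV does not have to fit in memory.

```python
import codecs


def is_valid_utf8(path, chunk_size=1 << 20):
    """Return True if the whole file at `path` decodes as UTF-8."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                # The incremental decoder copes with multi-byte
                # characters that are split across chunk boundaries.
                decoder.decode(chunk)
        # final=True raises if the file ends in the middle of a character.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True
```

A plain ASCII file also passes this check, since ASCII is a subset of UTF-8. If the check fails, the file is probably in another encoding (Latin-1, for example) and should be converted first, or loaded with the matching CHARACTER SET.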

Solution 4

You should pass

init_command = 'SET NAMES UTF8'
use_unicode = True
charset = 'utf8'

when calling MySQLdb.connect(), e.g.

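import MySQLdb

# Connection settings; fill in user, passwd and db for your own server.
dbconfig = {}
dbconfig['host']            = 'localhost'
dbconfig['user']            = ''
dbconfig['passwd']          = ''
dbconfig['db']              = ''
# Make the session talk UTF-8 and return unicode strings to Python.
dbconfig['init_command']    = 'SET NAMES UTF8'
dbconfig['use_unicode']     = True
dbconfig['charset']         = 'utf8'

conn = MySQLdb.connect(**dbconfig)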

edit: ah, sorry, I see you've added that you're using "LOAD DATA LOCAL INFILE" -- this wasn't clear from your initial question :)
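
Since the question uses LOAD DATA LOCAL INFILE, the connection usually also has to permit loading local files. The sketch below assumes the mysqlclient build of MySQLdb, whose connect() accepts a local_infile argument, and a server that has local_infile enabled; build_dbconfig and connect are hypothetical helper names. Note that, according to the MySQL manual, SET NAMES does not change how the server interprets the contents of the file, so the CHARACTER SET clause is still needed in the LOAD DATA statement itself.

```python
def build_dbconfig(host, user, passwd, db):
    """Return connect() keyword arguments for a UTF-8 session with LOCAL INFILE."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "init_command": "SET NAMES UTF8",
        "use_unicode": True,
        "charset": "utf8",
        # Assumption: mysqlclient's connect() option that enables
        # LOAD DATA LOCAL INFILE on the client side.
        "local_infile": 1,
    }


def connect(host="localhost", user="", passwd="", db=""):
    """Open a connection with the settings above."""
    import MySQLdb  # imported here so the config helper works without the driver

    return MySQLdb.connect(**build_dbconfig(host, user, passwd, db))
```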

Author: Hossein

Updated on July 11, 2020

Comments

  • Hossein
    Hossein almost 4 years

    I have a large CSV file that I want to load into a MySQL table. The data is UTF-8 encoded because it includes some non-English characters. I have already set the character set of the corresponding column in the table to utf-8. But when I load my file, the non-English characters turn into weird characters (when I do a select on my table rows). Do I need to encode my data before I load it into the table? If yes, how can I do this? I am using Python to load the data, with the LOAD DATA LOCAL INFILE command. Thanks.

  • nemnesic
    nemnesic almost 8 years
    adding "CHARACTER SET UTF8" was the key!
  • John
    John over 6 years
    Oh my, took me so long. Tried everything, it just kept converting utf8 to latin and importing it into a utf 8 table. The encoding option worked wonders.
  • John
    John over 5 years
    Basically it's an error of mysql up until the latest version, including MariaDb. If a table or column is UTF8 it needs to automatically take the correct values. Well it does not, you need to specify it and hope you've no mixed table.
  • John
    John over 5 years
    It does not say that mysql wrongly uses another charset, regardless what column charset you've set !
  • miyalys
    miyalys about 5 years
    This is programming language specific.
  • simon
    simon about 5 years
    @miyalys -- yes, it's python as specified in the question... did you downvote for that?!
  • miyalys
    miyalys about 5 years
    ...and yes. I tried to undo it but the site sadly prevents me from changing the vote before the answer is edited. So if you edit it in some fashion at some point let me know and then I'll undo it.
  • venkatesh .b
    venkatesh .b over 2 years
    CHARACTER SET UTF8 Works like magic. tried many things but this was the solution