Special characters won't work in MySQL (UTF-8)

29,428

Solution 1

This thread, although somewhat old, seems to result in the conclusion that cmd.exe and the mysql client don't handle UTF-8 encoding properly (with the blame being more aimed at cmd.exe).

Reading in SQL from a file is recommended, as is using an alternative client - or a flavour of UNIX. :)

Solution 2

The solution is to set the connection variables to whatever codepage your installation of windows uses (not latin1 like what a lot of pages out there recommend - cmd.exe's character encoding isn't latin1).

In my case the codepage is 850:

mysql> SET NAMES cp850;

Here's an example with the connection set to UTF-8:

mysql> show variables like '%char%';
+--------------------------+---------------------------------+
| Variable_name            | Value                           |
+--------------------------+---------------------------------+
| character_set_client     | utf8                            |
| character_set_connection | utf8                            |
| character_set_database   | utf8                            |
| character_set_filesystem | binary                          |
| character_set_results    | utf8                            |
| character_set_server     | utf8                            |
| character_set_system     | utf8                            |
| character_sets_dir       | C:\xampp\mysql\share\charsets\  |
+--------------------------+---------------------------------+
8 rows in set (0.00 sec)

This is what happens to accented characters:

mysql> select nom from assignatura where nom like '%prob%';
+---------------------------------------+
| nom                                   |
+---------------------------------------+
| Probabilitat i Processos Estocàstics |
| Probabilitat i Processos Estocàstics |
+---------------------------------------+
2 rows in set (0.03 sec)

Notice the extraneous character just before the á. Also the accent is the wrong direction, it should be à.

After executing SET NAMES cp850;:

mysql> show variables like '%char%';
+--------------------------+--------------------------------+
| Variable_name            | Value                          |
+--------------------------+--------------------------------+
| character_set_client     | cp850                          |
| character_set_connection | cp850                          |
| character_set_database   | utf8                           |
| character_set_filesystem | binary                         |
| character_set_results    | cp850                          |
| character_set_server     | utf8                           |
| character_set_system     | utf8                           |
| character_sets_dir       | C:\xampp\mysql\share\charsets\ |
+--------------------------+--------------------------------+
8 rows in set (0.00 sec)

We finally get the correct accented character:

mysql> select nom from assignatura where nom like '%prob%';
+--------------------------------------+
| nom                                  |
+--------------------------------------+
| Probabilitat i Processos Estocàstics |
| Probabilitat i Processos Estocàstics |
+--------------------------------------+
2 rows in set (0.00 sec)
Share:
29,428
Nisto
Author by

Nisto

Updated on March 15, 2020

Comments

  • Nisto
    Nisto about 4 years

    So, I've had some issues while trying to come over from Latin1 encoded databases, tables as well as columns, and now that everything is finally in UTF-8, I can't seem to update a row in a column. I am trying to replace an "e" with an e with acute (é). But it gives me this:

    ERROR 1366 (HY000): Incorrect string value: '\x82m ...' for column 'Name' at row 1

    when running this:

    UPDATE access SET Name='ém' WHERE id="2";

    All databases gives me this when running the status command (except the current database part of course):


    Connection id:          1  
    Current database:       access  
    Current user:           root@localhost  
    SSL:                    Not in use  
    Using delimiter:        ;  
    Server version:         5.1.47-community MySQL Community Server (GPL)  
    Protocol version:       10  
    Connection:             localhost via TCP/IP  
    Server characterset:    utf8  
    Db     characterset:    utf8  
    Client characterset:    utf8  
    Conn.  characterset:    utf8  
    TCP port:               3306  
    Uptime:                 20 min 16 sec  
    
    Threads: 1 Questions: 110 Slow queries: 0 Opens: 18 Flush tables: 1 Open tables: 11  Queries per second avg: 0.90
    

    And running the chcp command in cmd gives me 850. Oh, and at some points I got this:

    ERROR 1300 (HY000): Invalid utf8 character string: 'ém' WHERE id="2"

    I've looked everywhere for a solution, but I couldn't seem to find anything anywhere, and since I've always had good responses on Stackoverflow, I thought I'd ask here.

    Thanks for any help!