COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'?

21,024

Solution 1

The COLLATE in my example sets the collation of the return value of LOCATE, the result of which is of type binary.

To set the collation of the arguments:

mysql> SELECT LOCATE(_utf8"n" COLLATE utf8_general_ci,
                     _utf8"München" COLLATE utf8_general_ci) AS locate;
+--------+
| locate |
+--------+
|      3 |
+--------+
1 row in set (0.00 sec)

My motivation actually was finding out whether MySQL takes the collation into account when searching for the substring. Unfortunately it does not. See the result of the second command:

mysql> SELECT LOCATE(_utf8"ü" COLLATE utf8_general_ci,
                     _utf8"München" COLLATE utf8_general_ci) AS locate;
+--------+
| locate |
+--------+
|      2 |
+--------+
1 row in set (0.00 sec)

mysql> SELECT LOCATE(_utf8"u" COLLATE utf8_general_ci,
                     _utf8"München" COLLATE utf8_general_ci) AS locate;
+--------+
| locate |
+--------+
|      0 |
+--------+
1 row in set (0.00 sec)

Test with a temporary table (collation taken into account in WHERE clause, but not in LOCATE):

mysql> CREATE TEMPORARY TABLE test
       (text VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_general_ci);
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO test VALUES("München");
Query OK, 1 row affected (0.00 sec)

mysql> SELECT text FROM test WHERE text LIKE "%u%";
+---------+
| text    |
+---------+
| München |
+---------+
1 row in set (0.00 sec)

mysql> SELECT LOCATE("u", text) AS locate FROM test WHERE text LIKE "%u%"; 
+--------+
| locate |
+--------+
|      0 |
+--------+
1 row in set (0.01 sec)

Solution 2

Possibly the server has been compiled with a default character set of binary, so that string literals are being interpreted as such, or the client is set to use a binary mode when communicating with the server. You can change the client and connection character set by calling SET NAMES utf8 (though this is not recommended if your SQL statements are being issued from PHP, for example, as PHP will have its own commands for setting the connection character set). See Connection Character Sets and Collations in the MySQL reference manual.

Alternatively you can use "introducers" to specify explicitly the charset used for the string literals in your LOCATE function, for instance:

LOCATE(_utf8"n", _utf8"München")

See the reference manual page Character String Literal Character Set and Collation for more details.

Share:
21,024
feklee
Author by

feklee

XMPP: [email protected] Freenode IRC: feklee PGP 0x5EF8B6017F668171259945D6BEF6EFD38FE8DCA0

Updated on October 24, 2020

Comments

  • feklee
    feklee over 3 years
    mysql> SELECT LOCATE("n", "München") COLLATE utf8_general_ci;
    ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
    

    How do I get rid of this error?

    What I already tried (copy&paste):

    $ mysql -u admin -p $DATABASE
    Enter password:
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 2
    Server version: 5.1.69 Source distribution
    
    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> SELECT LOCATE("n", "München") COLLATE utf8_general_ci;
    ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
    mysql> SET NAMES utf8;
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT LOCATE("n", "München") COLLATE utf8_general_ci;
    ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
    mysql> SELECT LOCATE(_utf8"n", _utf8"München") COLLATE utf8_general_ci;
    ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'binary'
    mysql> SHOW VARIABLES LIKE "character_set_database";
    +------------------------+-------+
    | Variable_name          | Value |
    +------------------------+-------+
    | character_set_database | utf8  |
    +------------------------+-------+
    1 row in set (0.00 sec)
    
  • feklee
    feklee almost 11 years
    Which table? My query does not reference a table.
  • Bobulous
    Bobulous almost 11 years
    Sorry, half asleep here. I've replaced the answer completely.
  • feklee
    feklee almost 11 years
    Already before asking the question, I tried SET NAMES utf8, and now again, without success. The introducers look like a nice solution, but they dont't work: The error message persists. I tried: SELECT LOCATE(_utf8"n", _utf8"München") COLLATE utf8_general_ci By the way, I am directly using the console, to play around. I started the console with: mysql -u admin -p $database I just ran show variables like "character_set_database". Result: utf8
  • Bobulous
    Bobulous almost 11 years
    I am wondering, why does the collation matter? LOCATE should return a numeric result, so you surely can't vary the order of a set of one integer?
  • feklee
    feklee almost 11 years
    I now added copy&pasted an example session into my question.