MYSQL case sensitive search for utf8_bin field


Solution 1

A string in MySQL has a character set and a collation. utf8 is the character set, and utf8_bin is one of its collations. To compare your string literal to a utf8 column, convert it to utf8 by prefixing it with the _charset notation:

_utf8 'Something'

A collation is only valid for the character set it belongs to. The binary, case-sensitive collation for utf8 is utf8_bin (utf8_general_ci is a case-insensitive one), and you can specify it like this:

_utf8 'Something' collate utf8_bin

With these conversions, the query should work:

select * from page where pageTitle = _utf8 'Something' collate utf8_bin
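As a minimal sketch of the difference, assume a throwaway table called page_demo (a hypothetical name, not the asker's real schema) with a utf8_bin column. The collation attached to the literal decides how the comparison behaves. Since the question asks for a case-insensitive match, the second query uses utf8_general_ci, which is also a valid collation for utf8. On recent MySQL versions utf8 is an alias for utf8mb3, so the server may report these names with a utf8mb3 prefix.

```sql
-- Hypothetical demo table; the table name, key name and column size are assumptions.
CREATE TABLE page_demo (
  pageTitle VARCHAR(100) CHARACTER SET utf8 COLLATE utf8_bin,
  UNIQUE KEY uq_pageTitle (pageTitle)
);

INSERT INTO page_demo (pageTitle) VALUES ('Something');

-- Binary collation: case-sensitive, so lowercase 'something' does not match 'Something'.
SELECT * FROM page_demo
WHERE pageTitle = _utf8 'something' COLLATE utf8_bin;

-- Case-insensitive collation on the literal: 'something' matches 'Something'.
SELECT * FROM page_demo
WHERE pageTitle = _utf8 'something' COLLATE utf8_general_ci;
```

The explicit COLLATE on the literal takes precedence over the column's implicit collation, which is why the second query compares case-insensitively even though the column itself stays utf8_bin.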

The _charset prefix works with string literals. To change the character set of a field, there is CONVERT ... USING. This is useful when you'd like to convert the pageTitle field to another character set, as in:

select * from page 
where convert(pageTitle using latin1) collate latin1_general_cs = 'Something'

To see the character set and collation for a column named 'col' in a table called 'TAB', try:

select distinct collation(col), charset(col) from TAB
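If the table is large, the same information can probably be read from the column metadata instead of the rows. This is a sketch that assumes the table lives in the currently selected database; 'TAB' and 'col' are the same placeholder names as above.

```sql
-- Reads column metadata from information_schema instead of scanning table rows.
-- 'TAB' and 'col' are placeholders; DATABASE() is the currently selected schema.
SELECT column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name = 'TAB'
  AND column_name = 'col';
```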

A list of all character sets and collations can be found with:

show character set;
show collation;

And all valid collations for utf8 can be found with:

show collation where charset = 'utf8'

Solution 2

Try this; it's working for me:

SELECT * FROM users WHERE UPPER(name) = UPPER('josé') COLLATE utf8_bin;
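To see why this works, here is a small sketch that uses only literals (plain ASCII here, so it does not depend on the client's encoding). The aliases raw_match and upper_match are made up for illustration.

```sql
-- Under the binary collation, strings that differ only in case compare as unequal.
SELECT _utf8 'jose' = _utf8 'JOSE' COLLATE utf8_bin AS raw_match;

-- Upper-casing both sides first makes the binary comparison succeed.
SELECT UPPER(_utf8 'jose') = UPPER(_utf8 'JOSE') COLLATE utf8_bin AS upper_match;
```

Keep in mind that wrapping the column in UPPER() generally stops MySQL from using an ordinary index on that column, so this is likely to scan the table.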

Solution 3

Also note that forcing a collation with "COLLATE utf8_general_ci" or "COLLATE latin1_general_ci" can prevent MySQL from using existing indexes on the column, because the comparison no longer uses the collation the index was built with. This could become a performance bottleneck later.
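One way to check this on your own data is to compare the plans with EXPLAIN. This is only a sketch: 'page' and 'pageTitle' are taken from the question, and it assumes an index exists on pageTitle.

```sql
-- Table and column names come from the question; an index on pageTitle is assumed.
-- Comparison in the column's own collation: the index is a candidate.
EXPLAIN SELECT * FROM page
WHERE pageTitle = _utf8 'Something';

-- Forced collation that differs from the column's: check the 'key' column of the output.
EXPLAIN SELECT * FROM page
WHERE pageTitle = _utf8 'Something' COLLATE utf8_general_ci;
```

If the 'key' column shows the index for the first query but NULL for the second, the forced collation is what disables it.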

Solution 4

May I ask why you need to change the collation explicitly in a SELECT? Why not define the column with the collation you want to use when comparing and sorting the records?

Your searches are case sensitive because the column has a binary collation. Try a general collation such as utf8_general_ci instead. For more information about case sensitivity and collations, see "Case Sensitivity in String Searches" in the MySQL manual.
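Changing the column's collation might look like the following sketch. The column type and length are assumptions, since the question does not show the table definition.

```sql
-- VARCHAR(255) is an assumption; keep whatever type the column really has.
-- MODIFY redefines the whole column, so restate attributes such as NOT NULL if it has them.
ALTER TABLE page
  MODIFY pageTitle VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci;
```

With a unique index on the column, this statement fails with a duplicate-key error if two existing values are equal under the new collation, for example values that differ only in case.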


Author: Admin

Updated on December 24, 2020

Comments

  • Admin (over 3 years ago)

    I created a table and set the collation to utf8 in order to be able to add a unique index to a field. Now I need to do case-insensitive searches, but when I ran some queries with the COLLATE keyword, I got:

    mysql> select * from page where pageTitle="Something" Collate utf8_general_ci;
    ERROR 1253 (42000): COLLATION 'utf8_general_ci' is not valid for CHARACTER SET 'latin1'

    mysql> select * from page where pageTitle="Something" Collate latin1_general_ci;
    ERROR 1267 (HY000): Illegal mix of collations (utf8_bin,IMPLICIT) and (latin1_general_ci,EXPLICIT) for operation '='

    I am pretty new to SQL, so I was wondering if anyone could help.

  • umpirsky (over 14 years ago)
    But what if I need a binary collation and I want a case-insensitive search? With a general collation, if you have a unique field, you'll get an error when trying to insert 'Čačak' if 'Cacak' already exists.
  • Crazy Joe Malloy (over 13 years ago)
    Awesome - I had a similar problem but I needed latin1 instead of utf8, _latin1 did the job for me.