Specified "COLLATE Latin1_General_CS_AS", query result still "case insensitive"

12,724

Solution 1

According the SQL Server Books Online, the characters included in range searches depend on the sorting rules of the collation. Collation Latin1_General_CS_AS uses dictionary order so both upper and lower case characters of the specified range are included.

Specify a binary collation to get the behavior you want (the 26 ASCII characters in the 97-122 code point range):

SELECT empid, lastname
FROM HR.Employees
WHERE lastname COLLATE Latin1_General_BIN LIKE '[a-z]%';

Note that the explicit collation results in a non-sargable expression so a full table/index scan will be required even if an index on lastname exists. Although it probably won't help here since most (if not all) last names start with a letter, one can generally add a sargable expression for the case-insensitive superset to facilitate an index seek. If the index is used, rows matching the seek predicate will be additionally filtered by the case-sensitive predicate.

SELECT empid, lastname
FROM HR.Employees
WHERE lastname COLLATE Latin1_General_BIN LIKE '[a-z]%'
    AND lastname LIKE '[a-z]%';

Solution 2

[a-c] will included a, A, b, B, c, so give each letter you want in this range like bellow will works.

select empid, lastname
from HR.Employees
where LastName LIKE '[a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]%' COLLATE Latin1_General_CS_AS;

Solution 3

Another approach might be to look at the ASCII-Code of the first character:

DECLARE @tbl TABLE(Name VARCHAR(100));
INSERT INTO @tbl VALUES('Adam'),('adam')
                      ,('John'),('john')
                      ,('Jane'),('jane')
                      ,('Zaccharias'),('zaccharias');
SELECT * 
FROM @tbl
WHERE ASCII(LEFT(Name,1)) BETWEEN ASCII('a') AND ASCII('z')
Share:
12,724

Related videos on Youtube

Thor
Author by

Thor

Updated on September 18, 2022

Comments

  • Thor
    Thor over 1 year

    I want to only selects rows from a table where the last name of the employees stands with a lowercase letter. However, when I run the query below

    SELECT empid, lastname
     FROM HR.Employees
     WHERE lastname COLLATE Latin1_General_CS_AS LIKE '[a-z]%';
    

    it returned all rows where the employees' last name starts with either lowercase letter or uppercase letter.

    I have already specified that the query should be case sensitive with COLLATE Latin1_General_CS_AS, how come it is not filtering out last name that starts with capital letter?

  • Thor
    Thor over 7 years
    Thank you very much for the answer. Judging from your answer, I assume that by specifying "BIN", the query result is still accent sensitive?
  • Dan Guzman
    Dan Guzman over 7 years
    @TonyStark, yes, binary collations use the underling code points so characters are sensitive in all regards.