SQL server SORT order does not correspond to ASCII code order

10,878

Solution 1

Thanks Ross et al! Found the following documentation which has an excellent explanation on MS SQL Server collation, thought I'd post it here so as to benefit those who comes across this question or related questions.


Collation A collation specifies the bit patterns that represent each character in a data set. Collations also determine the rules that sort and compare data. SQL Server supports storing objects that have different collations in a single database. For non-Unicode columns, the collation setting specifies the code page for the data and which characters can be represented. Data that is moved between non-Unicode columns must be converted from the source code page to the destination code page.

For more details read it here http://goo.gl/RpBGWN


From my code snippet, if I wanted to sort the value in a binary order, the query can be changed to the following: select c, ASCII(c) ascvalue from testtable order by c collate Latin1_General_BIN

Or change collation definition when creating the table create table testtable (c nvarchar(1) collate Latin1_General_BIN null)

Which yields the following result:

c | ascvalue
------------
8 | 56
9 | 57
: | 58
; | 59

Solution 2

The collation is not BINARY. It is CI case insensitive, so it will fold upper and lowercase characters to sort the same. It defines the sort order of punctuation and digits in a non-ASCII way as you have discovered. It is a SQL collation, doing what SQL collations do: define the sort order according to some particular rules.

Solution 3

If you want the sort to be in ASCII value of each character then you should mention that explicitly in Order by clause.

select c, ASCII(c) ascvalue from #testtable order by ASCII(c)

else SQL_Latin1_General_CP1_CI_AS

tells us that the supported language is English.

There is no BIN in the collation name that means it supports Dictionary sorting

in Dictionary Sorting; comparison of character data is based on dictionary order ('A' and 'a' < 'B' and 'b').

Dictionary order is default when no other ordering is defined explicitly

CI means character data is case insensitive (that mean 'ABC' = 'abc').

AS means character data is accent sensitive ('à' <> 'ä').

Share:
10,878
Leo Nix
Author by

Leo Nix

Updated on June 26, 2022

Comments

  • Leo Nix
    Leo Nix almost 2 years

    I'm using SQL Server 2012, and I have a database with a SQL_Latin1_General_CP1_CI_AS collation:

    create table testtable (c nvarchar(1) null)
    
    insert into testtable (c) values('8')
    insert into testtable (c) values('9')
    insert into testtable (c) values(':')
    insert into testtable (c) values(';')
    
    select c, ASCII(c) ascvalue 
    from testtable 
    order by c
    

    I get this output:

    c | ascvalue
    ------------
    : | 58
    ; | 59
    8 | 56
    9 | 57
    

    Why does SQL Server sort the order contrary to ascii code order?