Using distinct on a column and doing order by on another column gives an error

33,945

Solution 1

As far as i understood from your question .

distinct :- means select a distinct(all selected values should be unique). order By :- simply means to order the selected rows as per your requirement .

The problem in your first query is For example : I have a table

ID name
01 a
02 b
03 c
04 d 
04 a

now the query select distinct(ID) from table order by (name) is confused which record it should take for ID - 04 (since two values are there,d and a in Name column). So the problem for the DB engine is here when you say order by (name).........

Solution 2

You might think about using group by instead:

select n_num
from abc_test
group by n_num
order by min(k_str)

Solution 3

The first query is impossible. Lets explain this by example. we have this test:

n_num k_str
2     a
2     c
1     b

select distinct (n_num) from abc_test is

2
1

Select n_num from abc_test order by k_str is

2
1
2

What do you want to return

select distinct (n_num) from abc_test order by k_str?

it should return only 1 and 2, but how to order them?

Solution 4

How do extended sort key columns

The logical order of operations in SQL for your first query, is (simplified):

  • FROM abc_test
  • SELECT n_num, k_str i.e. add a so called extended sort key column
  • ORDER BY k_str DESC
  • SELECT n_num i.e. remove the extended sort key column again from the result.

Thanks to the SQL standard extended sort key column feature, it is possible to order by something that is not in the SELECT clause, because it is being temporarily added to it behind the scenes prior to ordering, and then removed again after ordering.

So, why doesn't this work with DISTINCT?

If we add the DISTINCT operation, it would need to be added between SELECT and ORDER BY:

  • FROM abc_test
  • SELECT n_num, k_str i.e. add a so called extended sort key column
  • DISTINCT
  • ORDER BY k_str DESC
  • SELECT n_num i.e. remove the extended sort key column again from the result.

But now, with the extended sort key column k_str, the semantics of the DISTINCT operation has been changed, so the result will no longer be the same. This is not what we want, so both the SQL standard, and all reasonable databases forbid this usage.

Workarounds

PostgreSQL has the DISTINCT ON syntax, which can be used here for precisely this job:

SELECT DISTINCT ON (k_str) n_num
FROM abc_test
ORDER BY k_str DESC

It can be emulated with standard syntax as follows, if you're not using PostgreSQL

SELECT n_num
FROM (
  SELECT n_num, MIN(k_str) AS k_str
  FROM abc_test
  GROUP BY n_num
) t
ORDER BY k_str

Or, just simply (in this case)

SELECT n_num, MIN(k_str) AS k_str
FROM abc_test
GROUP BY n_num
ORDER BY k_str

I have blogged about SQL DISTINCT and ORDER BY more in detail here.

Share:
33,945

Related videos on Youtube

prateek gupta
Author by

prateek gupta

Updated on July 09, 2022

Comments

  • prateek gupta
    prateek gupta almost 2 years

    I have a table: abc_test with columns n_num, k_str.

    This query doesnt work:

        select distinct(n_num) from abc_test order by(k_str)
    

    But this one works:

        select n_num from abc_test order by(k_str)
    

    How do DISTINCT and ORDER BY keywords work internally that output of both the queries is changed?

    • John Doyle
      John Doyle over 12 years
      Could you show the results if you run those two queries? The first query should return error ORA-01791: not a SELECTed expression as the k_str column is not being selected. Is this the actual query you ran? (It may be acceptable in 9i, but I can't be sure.)
    • onedaywhen
      onedaywhen over 12 years
      Same effect on SQL Server but with a more descriptive error message, "ORDER BY items must appear in the select list if SELECT DISTINCT is specified."
  • Florin Ghita
    Florin Ghita over 12 years
    see the example in my answer. What do you want to return your query?
  • Nils Magne Lunde
    Nils Magne Lunde over 12 years
    @FlorinGhita What do you mean by that? I just tried the query, and to me it looks like it is working.
  • John Woo
    John Woo over 12 years
    how about this? select distinct (n_num) as iresult from abc_test ORDER BY iresult ASC
  • Florin Ghita
    Florin Ghita over 12 years
    heeeei, you edited your question. order by min(k_str) is a different thing that order by k_str
  • Florin Ghita
    Florin Ghita over 12 years
    Sure he can order by first column. The OP was about why this query won't work. What you say can be simply: select distinct n_num from abc_test order by 1
  • Nils Magne Lunde
    Nils Magne Lunde over 12 years
    @FlorinGhita Yes, the min(..) part was missing from my initial answer. Apologize for that.
  • onedaywhen
    onedaywhen over 12 years
    "how to order them?" -- I can think of possible approaches: all arbitrarily; order the unambiguous values in order then the ambiguous ones in arbitrarily; etc. But I rather suspect you were making a point that is lost on me!
  • onedaywhen
    onedaywhen over 12 years
    If all values for n_num are distinct in the table, does the error still occur? If so, does this change your conclusions?
  • Florin Ghita
    Florin Ghita over 12 years
    it is not hard to understand: for 1 you have b. for 2 you have a an c. Wich is first? 1 or 2? :)
  • Florin Ghita
    Florin Ghita over 12 years
    @onedaywhen yes. the query is technicaly wrong. It defies the theory or relation.
  • Battlefury
    Battlefury almost 3 years
    This cannot be right. Are you sure it doesnt result in duplicates?