ROW_NUMBER Without ORDER BY

89,606

Solution 1

There is no need to worry about specifying constant in the ORDER BY expression. The following is quoted from the Microsoft SQL Server 2012 High-Performance T-SQL Using Window Functions written by Itzik Ben-Gan (it was available for free download from Microsoft free e-books site):

As mentioned, a window order clause is mandatory, and SQL Server doesn’t allow the ordering to be based on a constant—for example, ORDER BY NULL. But surprisingly, when passing an expression based on a subquery that returns a constant—for example, ORDER BY (SELECT NULL)—SQL Server will accept it. At the same time, the optimizer un-nests, or expands, the expression and realizes that the ordering is the same for all rows. Therefore, it removes the ordering requirement from the input data. Here’s a complete query demonstrating this technique:

SELECT actid, tranid, val,
 ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM dbo.Transactions;

enter image description here

Observe in the properties of the Index Scan iterator that the Ordered property is False, meaning that the iterator is not required to return the data in index key order


The above means that when you are using constant ordering is not performed. I will strongly recommend to read the book as Itzik Ben-Gan describes in depth how the window functions are working and how to optimize various of cases when they are used.

Solution 2

Try just order by 1. Read the error message. Then reinstate the order by (select 1). Realise that whoever wrote this has, at some point, read the error message and then decided that the right thing to do is to trick the system into not raising an error rather than realising the fundamental truth that the error was trying to alert them to.

Tables have no inherent order. If you want some form of ordering that you can rely upon, it's up to you to provide enough deterministic expression(s) to any ORDER BY clause such that each row is uniquely identified and ordered.

Anything else, including tricking the system into not emitting errors, is hoping that the system will do something sensible without using the tools provided to you to ensure that it does something sensible - a well specified ORDER BY clause.

Solution 3

You can use any literal value

ex

order by (select 0)

order by (select null)

order by (select 'test')

etc

Refer this for more information https://exploresql.com/2017/03/31/row_number-function-with-no-specific-order/

Solution 4

What is select 1 here?

In this scenario, the author of query does not really have any particular sorting in mind. ROW_NUMBER requires ORDER BY clause so providing it is a way to satisfy the parser.

Sorting by "constant" will create "undeterministic" order(query optimizer is able to choose whatever order it found suitable).

Easiest way to think about it is as:

ROW_NUMBER() OVER(ORDER BY 1)    -- error
ROW_NUMBER() OVER(ORDER BY NULL) -- error

There are few possible scenarios to provide constant expression to "trick" query optimizer:

ROW_NUMBER() OVER(ORDER BY (SELECT 1)) -- already presented

Other options:

ROW_NUMBER() OVER(ORDER BY 1/0)       -- should not be used
ROW_NUMBER() OVER(ORDER BY @@SPID)
ROW_NUMBER() OVER(ORDER BY DB_ID())
ROW_NUMBER() OVER(ORDER BY USER_ID())

db<>fiddle demo

Share:
89,606
lucy
Author by

lucy

Updated on July 05, 2022

Comments

  • lucy
    lucy almost 2 years

    I've to add row number in my existing query so that I can track how much data has been added into Redis. If my query failed so I can start from that row no which is updated in other table.

    Query to get data start after 1000 row from table

    SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select 1)) as rn ) as X where rn > 1000
    

    Query is working fine. If any way that I can get the row no without using order by.

    What is select 1 here?

    Is the query optimized or I can do it by other ways. Please provide the better solution.

  • lucy
    lucy about 7 years
    so I am doing right, select 1.? I did not find anything else for this problem
  • lucy
    lucy about 7 years
    gotqn said in comment that select 1 means random order. So if add condition in query where row no > 1000. So it may be give previous record because it is giving random order. Or it is correct to use
  • lucy
    lucy about 7 years
    every time select 0 will give the same order when I execute the query?
  • lucy
    lucy about 7 years
    so query, SELECT * FROM (SELECT *, ROW_NUMBER() OVER (Order by (select NULL)) as rn ) as X where rn > 1000 is always give the correct result. means order remains same for every exection of query
  • gotqn
    gotqn about 7 years
    @lucy No, the order does not remain the same for every execution of the query. Using (SELECT constant) means that there is NO ORDER. You are selecting a specific amount of the data - you can execute the query 1 million times and it to return the same data, but there is no guarantee for this. Without specific ordering, the result is not deterministic.
  • MikeTeeVee
    MikeTeeVee almost 6 years
    I partially disagree with the statement, "Tables have no inherent order." I've found (when selecting from a Table) it will return its Records using the Clustered-Index Order every time. Now start joining tables and making a more complicated query, then I think the Default-Order may be up to whatever route the Query-Optimizer takes. BTW: The Questioner is not asking about leaving the Order-By Clause out all together. You could still have one, she just wanted to know if it could be left unspecified in the Row_Number() expression, which it can: ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  • Damien_The_Unbeliever
    Damien_The_Unbeliever almost 6 years
    @MikeTeeVee - it's a topic that has been done to death. There is no inherent user observable order. Often a select from a single table will follow the clustered index order but SQL Server offers no guarantee that it's the case. Even in this case, though, you can end up with a carousel scan - your query piggy-backs on another query that has already started a table scan. You query starts getting the results from the middle of the table because that's where the scan was and then will start a second scan to finish off. Out of order results even with just the clustered index defined.
  • MikeTeeVee
    MikeTeeVee almost 6 years
    Wow, I had no clue. Thanks for the super-informative response! After all these years, I feel like a newb again.
  • Reversed Engineer
    Reversed Engineer about 5 years
    @lucy - maybe it will sometimes, but it is not guaranteed since there is then no order. To guarantee the same results, use some data that is unique for each record in the ORDER BY clause.
  • CervEd
    CervEd over 2 years
    ROW_NUMBER is still useful wo. order