Get sequential number of a row (rank) within a partition without using ROW_NUMBER() OVER function

29,011

Solution 1

If you can't do it with a correlated subquery, you can still do this with a join:

select t1.name, t1.price,
       coalesce(count(t2.name) + 1, 1)
from my_table t1 join
     my_table t2
     on t2.name = t1.name and
        t2.price < t1.price
order by t1.name, t1.price;

Note that this doesn't exactly do row_number() unless all the prices are distinct for a given name. This formulation is actually equivalent to rank().

For row_number(), you need a unique row identifier.

By the way, the following is equivalent to dense_rank():

select t1.name, t1.price,
       coalesce(count(distinct t2.name) + 1, 1)
from my_table t1 join
     my_table t2
     on t2.name = t1.name and
        t2.price < t1.price
order by t1.name, t1.price;

Solution 2

The usual workaround for systems not supporting window functions is something like this:

select name, 
       price,
       (select count(*) 
        from my_table t2 
        where t2.name = t1.name  -- this is the "partition by" replacement
        and t2.price < t1.price) as row_number
from my_table t1
order by name, price;

SQLFiddle example: http://sqlfiddle.com/#!2/3b027/2

Share:
29,011
Andrey Dmitriev
Author by

Andrey Dmitriev

Geek ;)

Updated on July 28, 2021

Comments

  • Andrey Dmitriev
    Andrey Dmitriev almost 3 years

    I need to rank rows by partition (or group), i.e. if my source table is:

    NAME PRICE
    ---- -----
    AAA  1.59
    AAA  2.00
    AAA  0.75
    BBB  3.48
    BBB  2.19
    BBB  0.99
    BBB  2.50
    

    I would like to get target table:

    RANK NAME PRICE
    ---- ---- -----
    1    AAA  0.75
    2    AAA  1.59
    3    AAA  2.00
    1    BBB  0.99
    2    BBB  2.19
    3    BBB  2.50
    4    BBB  3.48
    

    Normally I would use ROW_NUMBER() OVER function, so in Apache Hive it would be:

    select
      row_number() over (partition by NAME order by PRICE) as RANK,
      NAME,
      PRICE
    from
      MY_TABLE
    ;
    

    Unfortunately, Cloudera Impala does not support (at the moment) ROW_NUMBER() OVER function, so I'm looking for a workaround. Preferably not to use UDAF, as it will be politically difficult to convince to deploy it to the server.