Get sequential number of a row (rank) within a partition without using ROW_NUMBER() OVER function
Solution 1
If you can't do it with a correlated subquery, you can still do this with a join:
select t1.name, t1.price,
coalesce(count(t2.name) + 1, 1)
from my_table t1 join
my_table t2
on t2.name = t1.name and
t2.price < t1.price
order by t1.name, t1.price;
Note that this doesn't exactly do row_number()
unless all the prices are distinct for a given name
. This formulation is actually equivalent to rank()
.
For row_number()
, you need a unique row identifier.
By the way, the following is equivalent to dense_rank()
:
select t1.name, t1.price,
coalesce(count(distinct t2.name) + 1, 1)
from my_table t1 join
my_table t2
on t2.name = t1.name and
t2.price < t1.price
order by t1.name, t1.price;
Solution 2
The usual workaround for systems not supporting window functions is something like this:
select name,
price,
(select count(*)
from my_table t2
where t2.name = t1.name -- this is the "partition by" replacement
and t2.price < t1.price) as row_number
from my_table t1
order by name, price;
SQLFiddle example: http://sqlfiddle.com/#!2/3b027/2
Comments
-
Andrey Dmitriev almost 3 years
I need to rank rows by partition (or group), i.e. if my source table is:
NAME PRICE ---- ----- AAA 1.59 AAA 2.00 AAA 0.75 BBB 3.48 BBB 2.19 BBB 0.99 BBB 2.50
I would like to get target table:
RANK NAME PRICE ---- ---- ----- 1 AAA 0.75 2 AAA 1.59 3 AAA 2.00 1 BBB 0.99 2 BBB 2.19 3 BBB 2.50 4 BBB 3.48
Normally I would use
ROW_NUMBER() OVER
function, so in Apache Hive it would be:select row_number() over (partition by NAME order by PRICE) as RANK, NAME, PRICE from MY_TABLE ;
Unfortunately, Cloudera Impala does not support (at the moment)
ROW_NUMBER() OVER
function, so I'm looking for a workaround. Preferably not to use UDAF, as it will be politically difficult to convince to deploy it to the server.