Update top N values using PostgreSQL

23,370

Solution 1

WITH cte AS (
   SELECT id, row_number() OVER (ORDER BY account DESC NULLS LAST) AS rn
   FROM   accountrecords    
   ORDER  BY account DESC NULLS LAST
   LIMIT  10
   )
UPDATE accountrecords a
SET    accountrank = cte.rn
FROM   cte
WHERE  cte.id = a.id;

Joining in a table expression is typically faster than correlated subqueries. It is also shorter.

With the window function row_number() distinct numbers are guaranteed. Use rank() (or possibly dense_rank()) if you want rows with equal values for account to share the same number.

Only if there can be NULL values in account, you need to append NULLS LAST for descending sort order, or NULL values sort on top:

If there can be concurrent write access, the above query is subject to a race condition. Consider:

However, if that was the case, the whole concept of hard-coding the top ten would be a dubious approach to begin with.

Use a CTE instead of a plain subquery to enforce the LIMIT reliably. See links above.

Solution 2

Sure, you can use your select statement in a subquery. Generating the rank-order isn't trivial, but here's at least one way to do it. I haven't tested this, but off the top of my head:

update accountrecords
set accountrank =
    (select count(*) + 1 from accountrecords r where r.account > account)
where id in (select id from accountrecords order by account desc limit 10);

This has the quirk that if two records have the same value for account, then they will get the same rank. You could consider that a feature... :-)

Share:
23,370
djq
Author by

djq

Currently working for a large tech company in Dublin. Previously, a co-founder of a technology startup; there I worked with a small team using Python, Django, Django-Rest-Framework, Pandas (and more!). I've a background in urban analysis (geospatial data), stats (R) and data visualization (ggplot2, D3) Excited about how technology can be used for social good.

Updated on July 09, 2022

Comments

  • djq
    djq almost 2 years

    I want to update the top 10 values of a column in table. I have three columns; id, account and accountrank. To get the top 10 values I can use the following:

    SELECT  * FROM accountrecords    
    ORDER BY account DESC
    LIMIT 10;
    

    What I would like to do is to set the value in accountrank to be a series of 1 - 10, based on the magnitude of account. Is this possible to do in PostgreSQL?

    • wildplasser
      wildplasser over 11 years
      If your poatgres version is 8.4 or above, you could use windowing functions + rank() or row_number().