PostgreSQL - repeating rows from LIMIT OFFSET


Solution 1

Why does "foo" appear in both queries?

Because all the returned rows have the same value in the status column. When rows tie on the ORDER BY column, the database is free to return them in any order it wants.

If you want a reproducible ordering, you need to add a second column to your ORDER BY clause to break the ties, e.g. the ID column:

SELECT students.* 
FROM students 
ORDER BY students.status asc, 
         students.id asc

If two rows have the same value for the status column, they will be sorted by the id.
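
Applied to the paginated queries from the question, the tie-breaker might look like the sketch below. It assumes that id is unique (for example, the primary key), which the sample data suggests but the question does not state.

```sql
-- Page 1: first three rows in (status, id) order
SELECT students.*
FROM students
ORDER BY students.status ASC,
         students.id ASC      -- assumed unique, so ties on status are broken
LIMIT 3 OFFSET 0;

-- Page 2: next three rows in the same order
SELECT students.*
FROM students
ORDER BY students.status ASC,
         students.id ASC
LIMIT 3 OFFSET 3;
```

With a unique tie-breaker the sort order is fully determined, so the two pages cannot overlap as long as the table is not modified between the two queries.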

Solution 2

The PostgreSQL documentation (http://www.postgresql.org/docs/8.3/static/queries-limit.html) explains this in more detail:

When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY.

The query optimizer takes LIMIT into account when generating a query plan, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
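
One way to check which plan the optimizer picks for each page is to prefix the query with EXPLAIN. This is only a sketch using the students table from the question. The plans depend on table size, indexes and statistics, so they may or may not differ on a given system, and no output is shown here.

```sql
-- Show the plan chosen for the first page
EXPLAIN
SELECT students.*
FROM students
ORDER BY students.status ASC
LIMIT 3 OFFSET 0;

-- Show the plan chosen for the second page, to compare with the first
EXPLAIN
SELECT students.*
FROM students
ORDER BY students.status ASC
LIMIT 3 OFFSET 3;
```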


Author by

keewooi

Updated on June 06, 2022

Comments

  • keewooi
    keewooi about 2 years

    I noticed some repeating rows in a paginated recordset.

    When I run this query:

    SELECT "students".* 
    FROM "students" 
    ORDER BY "students"."status" asc 
    LIMIT 3 OFFSET 0
    

    I get:

        | id | name  | status |
        | 1  | foo   | active |
        | 12 | alice | active |
        | 4  | bob   | active |
    

    Next query:

    SELECT "students".* 
    FROM "students" 
    ORDER BY "students"."status" asc 
    LIMIT 3 OFFSET 3
    

    I get:

        | id | name  | status |
        | 1  | foo   | active |
        | 6  | cindy | active |
        | 2  | dylan | active |
    

    Why does "foo" appear in both queries?

  • keewooi
    keewooi over 11 years
    Thanks for the answer! Is this PostgreSQL only? I can't reproduce this kind of behavior in MySQL.
  • a_horse_with_no_name
    a_horse_with_no_name over 11 years
    @amazoom: I don't really know MySQL that well, but the database is free to return the rows in any order it sees fit. My guess(!!) is that MySQL uses the clustered index to return the rows when the values are identical, and since the clustered index basically sorts the rows in a table, this leads to the result you see. PostgreSQL will return them in the order they are retrieved. But you should not rely on that ordering in MySQL either. An index scan, a join or other things can change that.
  • Badman
    Badman about 6 years
    For future viewers: github.com/cakephp/cakephp/issues/1827. It is an issue with PostgreSQL.
  • a_horse_with_no_name
    a_horse_with_no_name about 6 years
    @Badman: that's not an "issue", that's well-documented behaviour. Anyone who relies on a specific sort order when no ORDER BY is specified puts a bug into her/his software.
  • Badman
    Badman about 6 years
    @a_horse_with_no_name I tried ordering by the primary key. It still returns results when I give a large offset, even though that many rows don't exist.
  • bo-oz
    bo-oz over 4 years
    almost 8 years later this saved me from another few hours of headaches :)
  • Pankaj Shinde
    Pankaj Shinde over 2 years
    Awesome. Works. Thanks