Retrieving the most recent records within a query

23,131

Solution 1

MySQL doesn't have ranking/analytical/windowing functionality.

SELECT tl.locationid, tl.timestamp, tp.name, X, Y, Z
  FROM tblPerson tp
  JOIN tblLocation tl ON tl.personid = tp.personid
  JOIN (SELECT t.personid,
               MAX(t.timestamp) AS max_date
          FROM tblLocation t
      GROUP BY t.personid) x ON x.personid = tl.personid
                            AND x.max_date = tl.timestamp

SQL Server 2005+ and Oracle 9i+ support analytics, so you could use:

SELECT x.locationid, x.timestamp, x.name, x.X, x.Y, x.Z
  FROM (SELECT tl.locationid, tl.timestamp, tp.name, X, Y, Z,
               ROW_NUMBER() OVER (PARTITION BY tp.name ORDER BY tl.timestamp DESC) AS rank
          FROM tblPerson tp
          JOIN tblLocation tl ON tl.personid = tp.personid) x
WHERE x.rank = 1

Using a variable to get same as ROW_NUMBER functionality on MySQL:

SELECT x.locationid, x.timestamp, x.name, x.X, x.Y, x.Z
  FROM (SELECT tl.locationid, tl.timestamp, tp.name, X, Y, Z,
               CASE
                 WHEN @name != t.name THEN
                   @rownum := 1
                 ELSE @rownum := @rownum + 1
               END AS rank,
               @name := tp.name
          FROM tblLocation tl
          JOIN tblPerson tp ON tp.personid = tl.personid
          JOIN (SELECT @rownum := NULL, @name := '') r
      ORDER BY tp.name, tl.timestamp DESC) x
WHERE x.rank = 1

Solution 2

As @Mark Byers mentions, this problem comes up frequently on Stack Overflow.

Here's the solution I most frequently recommend, given your tables:

SELECT p.*, l1.*
FROM tblPerson p
JOIN tblLocation l1 ON p.PersonID = l1.PersonID
LEFT OUTER JOIN tblLocation l2 ON p.PersonID = l2.PersonID AND 
  (l1.timestamp < l2.timestamp OR l1.timestamp = l2.timestamp AND l1.LocationId < l2.LocationId)
WHERE l2.LocationID IS NULL;

To see other examples, follow the tag greatest-n-per-group, which I added to your question.

Solution 3

This is a classic 'max per group' question that comes up on Stack Overflow almost every day. There are many ways to solve it and you can find example solutions by searching Stack Overflow. Here is one way that you can do it in MySQL:

SELECT
    location.LocationId,
    location.Timestamp,
    person.Name,
    location.X,
    location.Y,
    location.Z
FROM (
    SELECT
        LocationID,
        @rn := CASE WHEN @prev_PersonID = PersonID
                    THEN @rn + 1
                    ELSE 1
               END AS rn,
        @prev_PersonID := PersonID
    FROM (SELECT @prev_PersonID := NULL) vars, tblLocation
    ORDER BY PersonID, Timestamp DESC
) T1
JOIN tblLocation location ON location.LocationID = T1.LocationId
JOIN tblPerson person ON person.PersonID = location.PersonID
WHERE rn = 1
Share:
23,131
Onion-Knight
Author by

Onion-Knight

Work as a Software Developer for a small IT Company.

Updated on January 12, 2020

Comments

  • Onion-Knight
    Onion-Knight over 4 years

    I have the following tables:

    tblPerson:

    PersonID | Name
    ---------------------
       1     | John Smith
       2     | Jane Doe
       3     | David Hoshi
    

    tblLocation:

    LocationID | Timestamp | PersonID | X | Y | Z | More Columns...
    ---------------------------------------------------------------
        40     | Jan. 1st  |     3    | 0 | 0 | 0 | More Info...
        41     | Jan. 2nd  |     1    | 1 | 1 | 0 | More Info...
        42     | Jan. 2nd  |     3    | 2 | 2 | 2 | More Info...
        43     | Jan. 3rd  |     3    | 4 | 4 | 4 | More Info...
        44     | Jan. 5th  |     2    | 0 | 0 | 0 | More Info...
    

    I can produce an SQL query that gets the Location records for each Person like so:

    SELECT LocationID, Timestamp, Name, X, Y, Z 
    FROM tblLocation 
    JOIN tblPerson 
    ON tblLocation.PersonID = tblPerson.PersonID;
    

    to produce the following:

    LocationID | Timestamp |    Name     | X | Y | Z |
    --------------------------------------------------
        40     | Jan. 1st  | David Hoshi | 0 | 0 | 0 |
        41     | Jan. 2nd  | John Smith  | 1 | 1 | 0 |
        42     | Jan. 2nd  | David Hoshi | 2 | 2 | 2 |
        43     | Jan. 3rd  | David Hoshi | 4 | 4 | 4 |
        44     | Jan. 5th  | Jane Doe    | 0 | 0 | 0 |
    

    My issue is that we're only concerned with the most recent Location record. As such, we're only really interested in the following Rows: LocationID 41, 43, and 44.

    The question is: How can we query these tables to give us the most recent data on a per-person basis? What special grouping needs to happen to produce the desired result?