How to select the most recent set of dated records from a mysql table

102,021

Solution 1

Self-answered, but I'm not sure it will be efficient enough as the table grows:

SELECT timestamp, method, id, response FROM rpc_responses
INNER JOIN (
    -- Alias the aggregate as timestamp so USING can match it by name.
    SELECT MAX(timestamp) AS timestamp, method, id
    FROM rpc_responses GROUP BY method, id
) latest
USING (timestamp, method, id);
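
To check the join-against-aggregate approach on the question's sample data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query itself is plain SQL), and the names SAMPLE_ROWS and latest_per_method_id are made up for illustration.

```python
import sqlite3

# Sample data copied from the question; sqlite3 is only a stand-in for MySQL.
SAMPLE_ROWS = [
    ("2009-01-10", "getThud", "16", "....."),
    ("2009-01-10", "getFoo", "12", "....."),
    ("2009-01-10", "getBar", "12", "....."),
    ("2009-01-11", "getFoo", "12", "....."),
    ("2009-01-11", "getBar", "16", "....."),
]


def latest_per_method_id(rows):
    """Return the newest row for each (method, id) pair via a join on MAX(timestamp)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rpc_responses ("
        " timestamp TEXT, method TEXT, id TEXT, response TEXT,"
        " PRIMARY KEY (timestamp, method, id))"
    )
    conn.executemany("INSERT INTO rpc_responses VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT timestamp, method, id, response
        FROM rpc_responses
        INNER JOIN (
            -- The alias makes the aggregate column joinable by name.
            SELECT MAX(timestamp) AS timestamp, method, id
            FROM rpc_responses
            GROUP BY method, id
        ) AS latest USING (timestamp, method, id)
        ORDER BY timestamp, method, id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_method_id(SAMPLE_ROWS):
        print(row)
```

The older getFoo/12 row from 2009-01-10 is the only one filtered out, which matches the desired result in the question.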

Solution 2

This solution was updated recently.
Comments below may be outdated

This query may perform well, because there are no joins.

SELECT * FROM (
    SELECT *,if(@last_method=method,0,1) as new_method_group,@last_method:=method 
    FROM rpc_responses 
    ORDER BY method,timestamp DESC
) as t1
WHERE new_method_group=1;

Given that you want one resulting row per method, this solution should work, using MySQL user variables to avoid a JOIN.

FYI, PostgreSQL has a way of doing this built into the language:

SELECT DISTINCT ON (method) timestamp, method, id, response
FROM rpc_responses
WHERE true -- some where clause here
ORDER BY method, timestamp DESC
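
Where DISTINCT ON is not available, a window function can play a similar role. The sketch below is not from the original answers. It assumes a server with window-function support (MySQL 8.0 or later; SQLite 3.25 or later is used here through Python's sqlite3 as a stand-in), and it partitions by both method and id, as the question asks. The names SAMPLE_ROWS and latest_with_row_number are hypothetical.

```python
import sqlite3

# Sample data copied from the question; sqlite3 is only a stand-in for MySQL 8.0+.
SAMPLE_ROWS = [
    ("2009-01-10", "getThud", "16", "....."),
    ("2009-01-10", "getFoo", "12", "....."),
    ("2009-01-10", "getBar", "12", "....."),
    ("2009-01-11", "getFoo", "12", "....."),
    ("2009-01-11", "getBar", "16", "....."),
]


def latest_with_row_number(rows):
    """Return the newest row per (method, id) by ranking rows inside each group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rpc_responses ("
        " timestamp TEXT, method TEXT, id TEXT, response TEXT,"
        " PRIMARY KEY (timestamp, method, id))"
    )
    conn.executemany("INSERT INTO rpc_responses VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT timestamp, method, id, response
        FROM (
            SELECT timestamp, method, id, response,
                   -- rn = 1 marks the most recent row of each (method, id) group
                   ROW_NUMBER() OVER (
                       PARTITION BY method, id
                       ORDER BY timestamp DESC
                   ) AS rn
            FROM rpc_responses
        ) AS ranked
        WHERE rn = 1
        ORDER BY timestamp, method, id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_with_row_number(SAMPLE_ROWS):
        print(row)
```

Unlike the user-variable trick, the ranking here is defined by the OVER clause itself, so it does not rely on the order in which the server happens to read rows.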

Solution 3

Try this...

SELECT o1.id, o1.timestamp, o1.method, o1.response
FROM rpc_responses o1
WHERE o1.timestamp = ( SELECT max(o2.timestamp)
                       FROM rpc_responses o2
                       WHERE o1.id = o2.id
                         AND o1.method = o2.method )
ORDER BY o1.timestamp, o1.method, o1.response

...it even works in Access!
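
A minimal sketch of the correlated-subquery approach on the question's sample data, correlating on both method and id because the question asks for the latest row per combination. It assumes Python's sqlite3 module as a stand-in for MySQL, and the names SAMPLE_ROWS and latest_by_correlated_subquery are made up for illustration.

```python
import sqlite3

# Sample data copied from the question; sqlite3 is only a stand-in for MySQL.
SAMPLE_ROWS = [
    ("2009-01-10", "getThud", "16", "....."),
    ("2009-01-10", "getFoo", "12", "....."),
    ("2009-01-10", "getBar", "12", "....."),
    ("2009-01-11", "getFoo", "12", "....."),
    ("2009-01-11", "getBar", "16", "....."),
]


def latest_by_correlated_subquery(rows):
    """Keep each row whose timestamp equals the maximum for its (method, id) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rpc_responses ("
        " timestamp TEXT, method TEXT, id TEXT, response TEXT,"
        " PRIMARY KEY (timestamp, method, id))"
    )
    conn.executemany("INSERT INTO rpc_responses VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT o1.timestamp, o1.method, o1.id, o1.response
        FROM rpc_responses AS o1
        WHERE o1.timestamp = (
            -- Newest timestamp for the same method AND id as the outer row
            SELECT MAX(o2.timestamp)
            FROM rpc_responses AS o2
            WHERE o2.method = o1.method
              AND o2.id = o1.id
        )
        ORDER BY o1.timestamp, o1.method, o1.id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_by_correlated_subquery(SAMPLE_ROWS):
        print(row)
```

On a large table, an additional index on (method, id, timestamp) may help the inner lookup, since the primary key leads with timestamp. That is an assumption worth checking with EXPLAIN on the real data.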


Ken
Author by

Ken


Updated on December 22, 2021

Comments

  • Ken
    Ken over 2 years

    I am storing the response to various rpc calls in a mysql table with the following fields:

    Table: rpc_responses
    
    timestamp   (date)
    method      (varchar)
    id          (varchar)
    response    (mediumtext)
    
    PRIMARY KEY(timestamp,method,id)
    

    What is the best method of selecting the most recent responses for all existing combinations of method and id?

    • For each date there can only be one response for a given method/id.

    • Not all call combinations are necessarily present for a given date.

    • There are dozens of methods, thousands of ids and at least 365 different dates

    Sample data:

    timestamp  method  id response
    2009-01-10 getThud 16 "....."
    2009-01-10 getFoo  12 "....."
    2009-01-10 getBar  12 "....."
    2009-01-11 getFoo  12 "....."
    2009-01-11 getBar  16 "....."
    

    Desired result:

    2009-01-10 getThud 16 "....."
    2009-01-10 getBar  12 "....."
    2009-01-11 getFoo  12 "....."
    2009-01-11 getBar  16 "....."
    

    (I don't think this is the same question - it won't give me the most recent response)

  • Ken
    Ken over 15 years
    I want the most recent record for each combination of method/id. Not all combinations are changed with every timestamp so I can't just specify the latest timestamp.
  • Ken
    Ken over 15 years
    HAVING max(timestamp) = timestamp gives me an empty set
  • Adam Bellaire
    Adam Bellaire over 15 years
    As far as I know, you have to use a subquery to get what you want.
  • Ken
    Ken almost 13 years
    Unless I'm missing something you need USING(method) on your join?
  • Fred Haslam
    Fred Haslam over 12 years
    Add a 'limit 100' clause and you have the best answer.
  • mkoistinen
    mkoistinen over 11 years
    This method appears to depend on the fact that the GROUP BY will collapse the found rows in t1 to only the first. Is this guaranteed in MySQL?
  • velcrow
    velcrow about 11 years
    Not SQL standard, but yes, it is guaranteed in MySQL. What guarantees it is the "ORDER BY timestamp DESC". If someone enables 'ONLY_FULL_GROUP_BY' mode, it will cease to work though. see stackoverflow.com/a/9797138/461096 stackoverflow.com/a/1066504/461096 rpbouman.blogspot.com/2007/05/debunking-group-by-myths.html
  • Gunni
    Gunni over 9 years
    For me the grouping did not work until I added a "DISTINCT" in the inner query. I don't know why, and there is no logical reason for this behaviour, but it seems to work. Without the DISTINCT, the query did not always pick the first row of the inner query. But genius idea; I never would have thought of this on my own.
  • tumultous_rooster
    tumultous_rooster over 8 years
    Why am I made nervous to learn that ORDER BY, which I previously thought was just a tool to make reading output easier, actually plays a huge role in GROUP BY...
  • cgaldiolo
    cgaldiolo over 8 years
    This is wrong. From MySQL manual: "The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values within each group the server chooses."
  • mostafa.S
    mostafa.S over 8 years
    The sub-query solution with a join works with all types of DBMS, though. This one, however, works with MySQL defaults only, which is still good :D
  • zDaniels
    zDaniels about 8 years
    This method works best when creating views because MySQL views do not allow subqueries.
  • Jannes
    Jannes about 8 years
    @cgaldiolo is correct here! This is a terrible answer! There is no guarantee that this will work under all circumstances with current MySQL version, let alone any future versions.
  • Bastiaan
    Bastiaan about 8 years
    The question asked for the most recent response for each combination of id and method; this will just give you the most recent responses regardless of the id and method.
  • DiegoDD
    DiegoDD over 7 years
    Sorry for reviving this after so long, but shouldn't the max(timestamp) in the subquery have an alias called timestamp? Otherwise, MySQL gives an error: SQL Error (1054): Unknown column 'timestamp' in 'from clause', because USING() requires both tables to have the same column names (I tried it in MySQL versions 5.1 and 5.5). Adding the alias solves the issue.
  • Tarik
    Tarik about 7 years
    @Jannes cgaldiolo's answer would be entirely correct if there were only one query. In the answer, however, GROUP BY is used in the outer query. Yes, GROUP BY ignores ORDER BY within the same query, but here, being in the outer query, GROUP BY already receives ordered rows. Even so, the answer is too risky for production use; I would not use it.
  • Jannes
    Jannes about 7 years
    @Tarik That's totally implementation dependent and IF it works, it's purely coincidental and never future proof. For one, I'm pretty sure the inner ORDER BY is simply dropped completely by the optimizer in recent MySQL versions, so the resultset is actually never sorted at all (except by the GROUP BY eventually, but if I'm not mistaken they're dropping that too in an upcoming version). This really should not be the most upvoted answer at all.