Return the data from the rows with the most recent date of each distinct candidate_id

26,278

Solution 1

You must group by everything not using an aggregate function:

SELECT candidate_id, message, max(created_unix), jobpost_id, staffuserid 
    FROM messages 
       WHERE employer_id='$employerid' AND last='company' 
          GROUP BY candidate_id, message, jobpost_id, staffuserid 

If your message is different per row and you want to group by candidate_id, then you must not be using message. In that case, simply remove it from your select list and you won't need it in your group by list. The same goes for any other field you aren't using.

Remember, when using aggregate functions, you must contain each field in either an aggregate function or the group by. Otherwise, SQL won't know from which row to pull the data for the row returned.

Update:

After seeing what you're looking for, this will do the trick:

SELECT candidate_id, message, max(created_unix), jobpost_id, staffuserid 
    FROM messages 
       WHERE employer_id='$employerid' AND last='company' AND
       created_unix = (
           SELECT max(subm.created_unix)
           FROM messages subm
           WHERE subm.candidate_id = messages.candidate_id
       )

Solution 2

SELECT m1.*
FROM messages as m1
  LEFT OUTER JOIN messages AS m2
    ON    (m1.candidate_id = m2.candidate_id
      AND (m1.created_unix < m2.created_unix)
WHERE m2.created_unix is NULL
AND employer_id='$employerid' AND last='company'

This joins messages to itself on candidate_id and makes rows with picks all dates in m2 that are greater than each date in m1, substituting NULL if none are greater. So you get NULL precisely when there is no greater date within that candidate_id.

Share:
26,278
arrogantprick
Author by

arrogantprick

Updated on May 06, 2020

Comments

  • arrogantprick
    arrogantprick about 4 years

    I am attempting to return the data from the rows with the most recent date of each distinct candidate_id. It is correctly returning the most recent date (the created_unix column), but not the rest of the data from the corresponding row.

    SELECT candidate_id, message, max(created_unix), jobpost_id, staffuserid 
        FROM messages 
           WHERE employer_id='$employerid' AND last='company' 
              GROUP BY candidate_id
    
  • arrogantprick
    arrogantprick over 12 years
    Thanks for the feedback, Justin. Doing this will list out all rows (each row has a unique message). I only want one row per distinct candidate_id that meets there WHERE criteria. There isn't a need to group by jobpost_id or staffuserid as there is only 1 per candidate_id.
  • Devin Burke
    Devin Burke over 12 years
    But there is a need to do that. When you use aggregate functions, for each field you return, you must either use it in an aggregate function or the group by. It sounds like you're not using the message field anyway, so just remove it from your select statement and you won't need it in your group by.
  • arrogantprick
    arrogantprick over 12 years
    I am using the message field as I am printing it out in a table. My whole problem was that it was printing a message from the incorrect row (not the row containing the max of the created_unix) field.
  • Devin Burke
    Devin Burke over 12 years
    I have updated my answer. It will pull the entire row that contains the max of the created_unix for that particular candidate_id.
  • arrogantprick
    arrogantprick over 12 years
    Thank you very much. I'll test this shortly. I looks like what I am looking for.