Group by minimum value in one field while selecting distinct rows

Solution 1

How about something like:

SELECT mt.*     
FROM MyTable mt INNER JOIN
    (
        SELECT id, MIN(record_date) AS MinDate
        FROM MyTable
        GROUP BY id
    ) t ON mt.id = t.id AND mt.record_date = t.MinDate

This gets the minimum record_date for each id, then joins back to the table to fetch the full rows that match that date. You only get duplicates when several rows share the minimum record_date for the same id.
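
To see the tie case concretely, here is a minimal sketch that runs the join above against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for whatever DBMS you use, the function name is made up for illustration, and the sample rows are hypothetical: they follow the question's layout, plus one extra row (key_id 8) that ties with the minimum record_date for id 19.

```python
import sqlite3


def earliest_rows_via_join(rows):
    """Run the MIN()/INNER JOIN query on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable "
        "(key_id INTEGER, id INTEGER, record_date TEXT, other_cols TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT mt.key_id, mt.id, mt.record_date, mt.other_cols
        FROM MyTable mt INNER JOIN
            (
                SELECT id, MIN(record_date) AS MinDate
                FROM MyTable
                GROUP BY id
            ) t ON mt.id = t.id AND mt.record_date = t.MinDate
        ORDER BY mt.key_id
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data; ISO-formatted date strings sort correctly as text.
sample = [
    (1, 18, "2011-04-03", "x"),
    (2, 18, "2012-05-19", "y"),
    (4, 19, "2009-06-01", "a"),
    (5, 19, "2011-04-03", "b"),
    (8, 19, "2009-06-01", "e"),  # assumed tie on the minimum date for id 19
]

for row in earliest_rows_via_join(sample):
    print(row)
```

With this data, id 18 comes back once, while id 19 comes back twice (key_id 4 and key_id 8), because both rows carry the minimum date.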

Solution 2

I could get to your expected result just by doing this in MySQL:

 SELECT id, min(record_date), other_cols 
  FROM mytable
  GROUP BY id

Does this work for you?

Solution 3

To get the cheapest product in each category, you use the MIN() function in a correlated subquery as follows:

    SELECT categoryid,
       productid,
       productName,
       unitprice 
    FROM products a WHERE unitprice = (
                SELECT MIN(unitprice)
                FROM products b
                WHERE b.categoryid = a.categoryid)

The outer query scans all rows in the products table and returns the products whose unit price matches the lowest price in their category, as returned by the correlated subquery.
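
As a rough illustration, the sketch below runs the same correlated subquery on an in-memory SQLite database through Python's sqlite3 module. The table contents and the function name are invented for this example; only the query shape comes from the answer above. Note that two products tied at the lowest price in a category are both returned.

```python
import sqlite3


def cheapest_per_category(products):
    """Return the lowest-priced product(s) in each category."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products "
        "(categoryid INTEGER, productid INTEGER, productName TEXT, unitprice REAL)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", products)
    result = conn.execute(
        """
        SELECT categoryid, productid, productName, unitprice
        FROM products a
        WHERE unitprice = (
            -- evaluated per outer row: the lowest price in that row's category
            SELECT MIN(unitprice)
            FROM products b
            WHERE b.categoryid = a.categoryid)
        ORDER BY categoryid, productid
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical catalogue; category 2 has two products at the same lowest price.
catalogue = [
    (1, 10, "Tea", 4.5),
    (1, 11, "Coffee", 6.0),
    (2, 20, "Jam", 3.0),
    (2, 21, "Honey", 3.0),
]

for row in cheapest_per_category(catalogue):
    print(row)
```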

Solution 4

I would like to add to some of the other answers here: if you don't need the first row per group but, say, the second, you can use ROW_NUMBER() in a subquery and filter your result set on it.

SELECT * FROM
(
    SELECT
        -- number the rows within each Id, earliest record_date first
        ROW_NUMBER() OVER (PARTITION BY Id ORDER BY record_date, other_cols) AS rownum,
        P.*
    FROM products P
) ranked
WHERE rownum = 2

This also lets you order by multiple columns in the subquery, which helps break ties when two record_dates have identical values. You can also partition by multiple columns if needed, separating them with commas.
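
The following sketch shows the row-numbering approach with an explicit tie-breaker, again using an in-memory SQLite database through Python's sqlite3 module as a stand-in DBMS (window functions need SQLite 3.25 or later). Using key_id as the tie-breaker, the function name, and the sample rows (including the tied key_id 8) are assumptions made for this example.

```python
import sqlite3


def nth_row_per_id(rows, n):
    """Return the n-th earliest row for each id, breaking date ties by key_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable "
        "(key_id INTEGER, id INTEGER, record_date TEXT, other_cols TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT key_id, id, record_date, other_cols
        FROM (
            SELECT
                ROW_NUMBER() OVER (
                    PARTITION BY id
                    ORDER BY record_date, key_id  -- key_id makes the order deterministic
                ) AS rownum,
                t.*
            FROM MyTable t
        ) ranked
        WHERE rownum = ?
        ORDER BY id
        """,
        (n,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data with a tie on the minimum date for id 19.
sample = [
    (1, 18, "2011-04-03", "x"),
    (2, 18, "2012-05-19", "y"),
    (4, 19, "2009-06-01", "a"),
    (5, 19, "2011-04-03", "b"),
    (8, 19, "2009-06-01", "e"),
]

print(nth_row_per_id(sample, 1))  # one row per id, even with the tie
print(nth_row_per_id(sample, 2))  # the second row per id
```

Because ROW_NUMBER() assigns each row in a partition a distinct number, filtering on rownum = 1 yields exactly one row per id even when the minimum date is duplicated.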

Solution 5

This does it simply:

select t2.id, t2.record_date, t2.other_cols
from (
    select ROW_NUMBER() over (partition by id order by record_date) as rownum,
           id, record_date, other_cols
    from MyTable
) t2
where t2.rownum = 1
Author: user2765924

Updated on July 08, 2022

Comments

  • user2765924
    user2765924 almost 2 years

    Here's what I'm trying to do. Let's say I have this table t:

    key_id | id | record_date | other_cols
    1      | 18 | 2011-04-03  | x
    2      | 18 | 2012-05-19  | y
    3      | 18 | 2012-08-09  | z
    4      | 19 | 2009-06-01  | a
    5      | 19 | 2011-04-03  | b
    6      | 19 | 2011-10-25  | c
    7      | 19 | 2012-08-09  | d
    

    For each id, I want to select the row containing the minimum record_date. So I'd get:

    key_id | id | record_date | other_cols
    1      | 18 | 2011-04-03  | x
    4      | 19 | 2009-06-01  | a
    

    The only solutions I've seen to this problem assume that all record_date entries are distinct, but that is not the case in my data. Using a subquery and an inner join with two conditions would give me duplicate rows for some ids, which I don't want:

    key_id | id | record_date | other_cols
    1      | 18 | 2011-04-03  | x
    5      | 19 | 2011-04-03  | b
    4      | 19 | 2009-06-01  | a
    
    • Asclepius
      Asclepius over 2 years
      If there is a min_by function, consider it for this purpose. It saved me from writing something more complicated.
  • user2765924
    user2765924 over 10 years
    Ah, initially I was using an expression to output a date which was causing the 'and' condition on the inner join to not work properly. Changed it to an actual column and it works now (and had to modify some other things as a result), thanks!
  • user2765924
    user2765924 over 10 years
    For whatever reason, this appears to work in the contrived example (sqlfiddle.com/#!2/f8469/6/0), but in practice I get "Column 'database.table.col_name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." I was able to get it working with astander's answer anyway, thanks.
  • Pedro Braz
    Pedro Braz over 8 years
    Yeah I'm running into the same issue, I'd like a simple answer like this one on SQL Server
  • rajat
    rajat about 8 years
    This would not work when two records with the same id and date are present; it would give you multiple rows, right?
  • FluffyKitten
    FluffyKitten over 3 years
    Welcome to Stack Overflow. Code-only answers are discouraged on Stack Overflow because they don't explain how they solve the problem. Please edit your answer to explain what this code does and how it improves on the existing answers this question already has, so that it is useful to other users with similar issues.
  • jarlh
    jarlh about 2 years
    Newer MySQL versions will raise an error here. (Unless in compatibility mode. See dev.mysql.com/doc/refman/5.7/en/…)
  • wearego
    wearego about 2 years
    The reason this works or worked in older versions of MySQL is that MySQL would just give you arbitrarily the first values it encounters (maybe based on insertion order or whatever) for the non-grouped-by columns. This is really arbitrary and this is why other DBMSes don't do this, and apparently MySQL also chose to do the right thing.