SQL syntax: select only if more than X results

27,677

Solution 1

You were right to use HAVING, and to think about using the self-join... just had the order of the operations slightly off...

select m1.location, m1.value
from measures m1
join (
  select location
  from measures
  group by location
  having count(*) > 1
) m2 on m2.location = m1.location

The sub-select gets all the locations that have more than one entry... and then this is joined to the table again to get the full results.

SQL Fiddle

Solution 2

Use a nested select:

SELECT location,value,type,value_added
  FROM measures
  WHERE location IN
    (SELECT location FROM measures
      GROUP BY location HAVING COUNT(*)>1)

(Syntax is by memory, might be somewhat off)

Solution 3

The idea is to get the list of locations that have more than one value. The following uses in to fetch the records:

select m.*
from measures m
where m.location in (select location from measures group by location having count(*) > 1);

You can also formulate this with a join:

select m.*, mdup.numdups
from measures m join
     (select location, count(*) as numdups
      from measures
      group by location
      having count(*) > 1
     ) mdup
     on m.location = mdup.location;

One advantage to doing the query this way is that you can get the number of duplicates.

Solution 4

SELECT * FROM measures WHERE
(location) IN (
    SELECT
        location
    FROM
        measures
    GROUP BY
        location
    HAVING
        COUNT(location) > 1
) ORDER BY ASC
Share:
27,677
beta
Author by

beta

Updated on August 08, 2020

Comments

  • beta
    beta over 3 years

    I have a table with measurements called measures. The table has one column for the location and a second colum for a corresponding value (example is simplified).

    The table looks like (note 2 entries for loc1):

    location | value
    -----------------
    loc1     | value1
    loc1     | value2
    loc2     | value3
    loc3     | value4
    loc4     | value5
    

    i now want to formulate a SQL query (actually i use sqlite) which only returns the first two rows of the table (i.e. loc+value1 and loc1+value2), because this location has more than one entry in this table.

    the pseudotext formulation would be: show me the rows of the locations, which are present more than once in the whole table
    pseudcode:

    SELECT * from measures WHERE COUNT(location over the whole table) > 1
    

    the solution may be really simple, but somehow i seem not to crack the nut.

    what i have so far is a SELECT statement, which returns locations which have more than one entry. as a next step i would need exactly all rows which correspond to the locations returned from this query:

    SELECT location FROM measures GROUP BY location HAVING count(*) > 1
    

    so as a next step i tried to do a JOIN with the same table and incorporate above query, but the results are incorrect. i tried it like this, but this is wrong:

    select t1.location, t1.value
    from 
         measures as t1
         join 
         measures as t2 on t1.location = t2.location 
    group by
          t2.location 
    having count(*) > 1
    

    help is appreciated!

  • Chris
    Chris over 10 years
    Out of interest is there any difference in practical terms between this and the method of putting the subselect in a where clause rather than using join? I always think the where clause seems more readable and feels more natural to me but I don't know if there are performance differences or similar...
  • beta
    beta over 10 years
    thanks! this works smoothly :) i am happy now. i think this type of query has many use-cases. at least i need it :) and i also did not know SQL Fiddle. seems very good. and yes, i'd be also interested in which solution is better or more practicable. the JOIN-solution or the nested query-solution?
  • beta
    beta over 10 years
    this also works. seems a bit more intuitive for me than the JOIN solution. still, both yield the same results. thanks!
  • beta
    beta over 10 years
    and chris. see this answer (stackoverflow.com/a/18108159/973158) for explanation of differences of both solutions. the JOIN solution lets you also display the number of duplicates.
  • Tassos Bassoukos
    Tassos Bassoukos over 10 years
    They are essentially equivalent, and the query planner/optimizer should execute them the same way. I say 'should' because I don't know sqlite at all...