How to use the GROUP BY clause with aggregate functions in joins?

Solution 1

GROUP BY performs the aggregation (SUM, MIN, etc.) once for each unique combination of values in the specified columns. If a selected column is neither listed in the GROUP BY clause nor wrapped in an aggregate function, the SQL engine cannot know which of the group's values it should return for that column.
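A minimal sketch of that behaviour, assuming SQLite through Python's built-in sqlite3 module rather than SQL Server, and a hypothetical orders table that is not part of the question. Grouping by one column gives one row per customer; adding a second column gives one row per (customer, item) combination.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, item TEXT, quantity INTEGER);
    INSERT INTO orders VALUES
        ('alice', 'bolt', 2),
        ('alice', 'bolt', 3),
        ('alice', 'nut',  4),
        ('bob',   'bolt', 1);
""")

# One summary row per distinct customer.
by_customer = conn.execute(
    "SELECT customer, SUM(quantity) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# One summary row per distinct (customer, item) combination.
by_customer_item = conn.execute(
    "SELECT customer, item, SUM(quantity) FROM orders "
    "GROUP BY customer, item ORDER BY customer, item"
).fetchall()

print(by_customer)       # [('alice', 9), ('bob', 1)]
print(by_customer_item)  # [('alice', 'bolt', 5), ('alice', 'nut', 4), ('bob', 'bolt', 1)]
```

Removing a column from GROUP BY merges groups, which is why the sums change when a grouping column is dropped.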

Solution 2

GROUP BY (Transact-SQL) groups a selected set of rows into a set of summary rows by the values of one or more columns or expressions in SQL Server 2008 R2. One row is returned for each group. Aggregate functions in the SELECT clause list provide information about each group instead of individual rows.

SELECT a.City, COUNT(bea.AddressID) AS EmployeeCount
FROM Person.BusinessEntityAddress AS bea 
    INNER JOIN Person.Address AS a
        ON bea.AddressID = a.AddressID
GROUP BY a.City

The GROUP BY clause has an ISO-compliant syntax and a non-ISO-compliant syntax. Only one syntax style can be used in a single SELECT statement. Use the ISO-compliant syntax for all new work. The non-ISO-compliant syntax is provided for backward compatibility.

In ISO-compliant syntax, each table or view column in any nonaggregate expression in the select list must be included in the GROUP BY list.

select pub_id, type, avg(price), sum(total_sales)
from titles
group by pub_id, type
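With joins, this rule is why descriptive columns from lookup tables end up in the GROUP BY list. The sketch below assumes SQLite through Python's sqlite3 module and two hypothetical tables (division, stock). It shows the two forms that satisfy the rule: list the name column in GROUP BY, or group by the key alone and wrap the name in an aggregate such as MAX. SQLite is more permissive than SQL Server about ungrouped columns, so only the valid forms are shown.

```python
import sqlite3

# Hypothetical lookup and detail tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE division (division_id INTEGER PRIMARY KEY, division_name TEXT);
    CREATE TABLE stock (division_id INTEGER, quantity INTEGER);
    INSERT INTO division VALUES (1, 'North'), (2, 'South');
    INSERT INTO stock VALUES (1, 10), (1, 5), (2, 7);
""")

# Option 1: every non-aggregated select column appears in GROUP BY.
grouped = conn.execute("""
    SELECT s.division_id, d.division_name, SUM(s.quantity)
    FROM stock AS s
    INNER JOIN division AS d ON d.division_id = s.division_id
    GROUP BY s.division_id, d.division_name
    ORDER BY s.division_id
""").fetchall()

# Option 2: group by the key only and wrap the name in an aggregate.
wrapped = conn.execute("""
    SELECT s.division_id, MAX(d.division_name), SUM(s.quantity)
    FROM stock AS s
    INNER JOIN division AS d ON d.division_id = s.division_id
    GROUP BY s.division_id
    ORDER BY s.division_id
""").fetchall()

print(grouped)  # [(1, 'North', 15), (2, 'South', 7)]
print(wrapped)  # same rows as above
```

Both forms return the same rows here because each division_id maps to exactly one division_name, so adding the name to GROUP BY does not split any group.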

Referring to "Organizing query results into groups: the group by clause":

Sybase or non-ISO-compliant syntax lifts restrictions on what you can include or omit in the select list of a query that includes group by:

  • The columns in the select list are not limited to the grouping columns and columns used with the vector aggregates.

  • The columns specified by group by are not limited to those non-aggregate columns in the select list.

Example:

select type, title_id, avg(price), avg(advance) 
from titles 
group by type 

Solution 3

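To use aggregate functions like SUM without GROUP BY, use the OVER clause. It computes the aggregate per partition without collapsing the rows, which is why the example below needs DISTINCT.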

See: http://msdn.microsoft.com/en-us/library/ms189461.aspx

Example:

CREATE TABLE #a (ida int, name varchar(50))
CREATE TABLE #b  (ida int, number int)

INSERT INTO #a VALUES(1,'one')
INSERT INTO #a VALUES(2,'two')

INSERT INTO #b VALUES(1,2)
INSERT INTO #b VALUES(1,3)
INSERT INTO #b VALUES(2,1)

SELECT DISTINCT a.ida, sum(number) OVER (PARTITION BY a.ida) FROM #a a
INNER JOIN #b b on a.ida = b.ida
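
For comparison, the same totals can be produced with a plain GROUP BY, which collapses each partition into one row so DISTINCT is not needed. This is a sketch assuming SQLite through Python's sqlite3 module, with ordinary tables a and b standing in for the temp tables #a and #b.

```python
import sqlite3

# Ordinary tables a and b stand in for the temp tables #a and #b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ida INTEGER, name TEXT);
    CREATE TABLE b (ida INTEGER, number INTEGER);
    INSERT INTO a VALUES (1, 'one'), (2, 'two');
    INSERT INTO b VALUES (1, 2), (1, 3), (2, 1);
""")

# GROUP BY returns one row per ida, so no DISTINCT is required.
totals = conn.execute("""
    SELECT a.ida, SUM(b.number)
    FROM a
    INNER JOIN b ON a.ida = b.ida
    GROUP BY a.ida
    ORDER BY a.ida
""").fetchall()

print(totals)  # [(1, 5), (2, 1)]
```

The OVER form is mainly useful when the detail rows should be kept alongside the per-group total; when only the totals are wanted, GROUP BY is the more direct choice.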
Author by

thevan

Updated on December 10, 2020

Comments

  • thevan
    thevan over 3 years

    I want to join three tables and calculate Sum(Quantity) from Table A. I tried something and got the desired output, but I am still confused about how aggregate functions and the Group By clause work together.

    When calculating a sum across two or more joined tables, which columns do we need to list in the Group By clause, and why do we need to list them?

    For example, here are my tables and the query.

    TableA: ItemID, JobOrderID, CustomerID, DivisionID, Quantity
    TableB: ItemID, ItemName, SpecificationID
    TableC: SpecificationID, SpecificationName
    TableD: DivisionID, DivisionName
    TableE: JobOrderID, JobOrderNo.
    TableF: CustomerID, CustomerName
    

    I want to get the Sum(Quantity) based on ItemID, CustomerID, JobOrderID and DivisionID.

    I wrote the following query and it works fine. But if I remove any column from the Group By clause, it no longer gives the desired result. Why? What does the Group By clause do here? How should the Group By clause be specified when using an aggregate function? Here is my query.

        SELECT 
                B.ItemName + ' - ' + C.SpecificationName AS 'ItemName',
                SUM(A.Quantity) AS 'Quantity',
                A.ItemID,
                D.DivisionName,
                F.CustomerName,
                E.JobOrderNo,
                A.DivisionID,
                A.JobOrderID,
                A.CustomerID
    
        FROM
                TableA A  
                INNER JOIN TableB B ON B.ItemID = A.ItemID 
                INNER JOIN TableC C ON C.SpecificationID = B.SpecificationID
                INNER JOIN TableD D ON D.DivisionID = A.DivisionID
                LEFT JOIN TableE E ON E.JobOrderID = A.JobOrderID
                LEFT JOIN TableF F ON F.CustomerID = A.CustomerID
        WHERE
                A.ItemID = @ItemID
        GROUP BY
                A.ItemID,
                A.JobOrderID,
                A.DivisionID,
                A.CustomerID,
                D.DivisionName,
                F.CustomerName,
                E.JobOrderNo,
                B.ItemName,
                C.SpecificationName
    

    Can anyone explain the Group By clause using this query as an example?

  • thevan
    thevan almost 13 years
    So if we select some columns, we need to list those columns in the Group By clause too. Is that right?
  • Piotr Auguscik
    Piotr Auguscik almost 13 years
    Yes, a selected column must be in the group by or inside an aggregate, like: select max(column_name)
  • william.eyidi
    william.eyidi over 6 years
    This makes a lot more sense: group by only shows a representative row for each group, therefore all the non-aggregate fields should be included in the group by clause.