GROUP BY without aggregate function
Solution 1
That's how GROUP BY works. It takes several rows and turns them into one row. Because of this, it has to know what to do with all the combined rows where there have different values for some columns (fields). This is why you have two options for every field you want to SELECT : Either include it in the GROUP BY clause, or use it in an aggregate function so the system knows how you want to combine the field.
For example, let's say you have this table:
Name | OrderNumber
------------------
John | 1
John | 2
If you say GROUP BY Name, how will it know which OrderNumber to show in the result? So you either include OrderNumber in group by, which will result in these two rows. Or, you use an aggregate function to show how to handle the OrderNumbers. For example, MAX(OrderNumber)
, which means the result is John | 2
or SUM(OrderNumber)
which means the result is John | 3
.
Solution 2
Given this data:
Col1 Col2 Col3
A X 1
A Y 2
A Y 3
B X 0
B Y 3
B Z 1
This query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
Would result in exactly the same table.
However, this query:
SELECT Col1, Col2 FROM data GROUP BY Col1, Col2
Would result in:
Col1 Col2
A X
A Y
B X
B Y
B Z
Now, a query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2
Would create a problem: the line with A, Y is the result of grouping the two lines
A Y 2
A Y 3
So, which value should be in Col3, '2' or '3'?
Normally you would use a GROUP BY
to calculate e.g. a sum:
SELECT Col1, Col2, SUM(Col3) FROM data GROUP BY Col1, Col2
So in the line, we had a problem with we now get (2+3) = 5.
Grouping by all your columns in your select is effectively the same as using DISTINCT, and it is preferable to use the DISTINCT keyword word readability in this case.
So instead of
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
use
SELECT DISTINCT Col1, Col2, Col3 FROM data
Solution 3
You're experiencing a strict requirement of the GROUP BY clause. Every column not in the group-by clause must have a function applied to reduce all records for the matching "group" to a single record (sum, max, min, etc).
If you list all queried (selected) columns in the GROUP BY clause, you are essentially requesting that duplicate records be excluded from the result set. That gives the same effect as SELECT DISTINCT which also eliminates duplicate rows from the result set.
Solution 4
The only real use case for GROUP BY without aggregation is when you GROUP BY more columns than are selected, in which case the selected columns might be repeated. Otherwise you might as well use a DISTINCT.
It's worth noting that other RDBMS's do not require that all non-aggregated columns be included in the GROUP BY. For example in PostgreSQL if the primary key columns of a table are included in the GROUP BY then other columns of that table need not be as they are guaranteed to be distinct for every distinct primary key column. I've wished in the past that Oracle did the same as it would have made for more compact SQL in many cases.
Solution 5
Let me give some examples.
Consider this data.
CREATE TABLE DATASET ( VAL1 CHAR ( 1 CHAR ),
VAL2 VARCHAR2 ( 10 CHAR ),
VAL3 NUMBER );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'b', 'b-details', 2 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'a-details', 1 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 3 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'dup', 4 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 5 );
COMMIT;
Whats there in table now
SELECT * FROM DATASET;
VAL1 VAL2 VAL3
---- ---------- ----------
b b-details 2
a a-details 1
c c-details 3
a dup 4
c c-details 5
5 rows selected.
--aggregate with group by
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1;
VAL1 COUNT(*)
---- ----------
b 1
a 2
c 2
3 rows selected.
--aggregate with group by multiple columns but select partial column
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1
----
b
c
a
a
4 rows selected.
--No aggregate with group by multiple columns
SELECT
VAL1,
VAL2
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1
----
b b-details
c c-details
a dup
a a-details
4 rows selected.
--No aggregate with group by multiple columns
SELECT
VAL1
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1
----
b
c
a
a
4 rows selected.
You have N columns in select (excluding aggregations), then you should have N or N+x columns
Related videos on Youtube
XForCE07
Updated on August 19, 2020Comments
-
XForCE07 over 3 years
I am trying to understand GROUP BY (new to oracle dbms) without aggregate function.
How does it operate?
Here is what i have tried.EMP table on which i will run my SQL.
SELECT ename , sal FROM emp GROUP BY ename , sal
SELECT ename , sal FROM emp GROUP BY ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 397 Column: 16SELECT ename , sal FROM emp GROUP BY sal;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 411 Column: 8SELECT empno , ename , sal FROM emp GROUP BY sal , ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 425 Column: 8SELECT empno , ename , sal FROM emp GROUP BY empno , ename , sal;
So, basically the number of columns have to be equal to the number of columns in the GROUP BY clause, but i still do not understand why or what is going on.
-
Bhaskar over 9 yearsIf there's no aggregate function and if you don't mind getting the result in ascending or descending order, you can use sorting instead (ORDER BY).
-
ZeroK over 8 yearsFunctionally, if you use GROUP BY with no Aggregate functions in the select, you are just doing a DISTINCT. Oracle seems to use different methods for each, but it ends with the same result.
-
-
ZeroK over 8 yearsOne note: You can also have Constant columns that are not in the GROUP BY clause. But it is true all columns must be in one of three categories: An aggregate function, a constant, or it must appear in the GROUP BY clause. For clarity, when I say constant I mean something like "Select 1 sort_order FROM table1" where you are assigning a constant value in the actual SQL.
-
Santanu Sur about 6 yearswhat would result in
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1
? -
oerkelens about 6 years@SantanuSur That simply creates the same problem I explained for
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2
but with an extra problematic column. What values would you expect forCol2
andCol3
for the line whereCol1
= A? -
Santanu Sur about 6 yearsI just want to
group the data
with respect to one column -
Santanu Sur about 6 yearssuppose i have a table with 3 columns...and the third column has a number of duplicates..i want to extract that table...without the third column getting jumbled up... for example 3rd column :-
A B A B
i want to get all columns with 3rd column result like this :-A A B B
-
Santanu Sur about 6 yearsWill
select * from table group by 3rd Column
will work ?? -
A. Cedano about 6 yearsAs @Varun says, ¡best explanation ever! Helped me to understand simply what happens with
GROUP BY
,ORDER BY
and aggregate functions. Simply, clair, with one very easy example. ¡Thanks a lot! -
Merin Nakarmi over 3 yearsThis is the best answer!
-
beedrill about 3 yearsThis is the answer I am looking for, this should go as accepted.