I have my query like this:
Select
a.abc,
a.cde,
a.efg,
a.agh,
c.dummy,
p.test,
max(b.this),
sum(b.sugar),
sum(b.bucket),
sum(b.something)
Generally, any column that isn't in the group by section can only be included in the select section if it has an aggregating function applied to it. Or, put another way, any non-aggregated data in the select section must be grouped on.
Otherwise, how would the DBMS know what you want done with it? For example, if you group on a.abc, there can only be one thing that a.abc can be for that grouped row (since all other values of a.abc will come out in a different row). Here's a short example, with a table containing:
LastName  FirstName  Salary
--------  ---------  ------
Smith     John       123456
Smith     George     111111
Diablo    Pax        999999
With the query select LastName, Salary from Employees group by LastName, you would expect to see:
LastName  Salary
--------  ------
Smith     ??????
Diablo    999999
The salary for the Smiths is incalculable since you don't know what function to apply to it, which is what's causing that error. In other words, the DBMS doesn't know what to do with 123456 and 111111 to get a single value for the grouped row.
If you instead used select LastName, sum(Salary) from Employees group by LastName (or max(), min(), avg(), or any other aggregating function), the DBMS would know what to do. For sum(), it will simply add them and give you 234567.
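To make the Smith example concrete, here is a minimal runnable sketch. It assumes an in-memory SQLite database driven from Python purely for convenience (the question is about DB2, but these aggregate queries are plain standard SQL), and the helper name salary_by_last_name is made up for illustration.

```python
import sqlite3

# In-memory SQLite database holding the three example rows from above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Employees (LastName text, FirstName text, Salary integer)"
)
conn.executemany(
    "insert into Employees values (?, ?, ?)",
    [
        ("Smith", "John", 123456),
        ("Smith", "George", 111111),
        ("Diablo", "Pax", 999999),
    ],
)


def salary_by_last_name(aggregate):
    """Hypothetical helper: apply one aggregate to Salary per LastName."""
    # Only a fixed set of function names is accepted, so building the
    # query text with an f-string is safe here.
    if aggregate not in {"sum", "max", "min", "avg"}:
        raise ValueError(f"unsupported aggregate: {aggregate}")
    query = (
        f"select LastName, {aggregate}(Salary) "
        "from Employees group by LastName"
    )
    return dict(conn.execute(query).fetchall())


totals = salary_by_last_name("sum")
highest = salary_by_last_name("max")
lowest = salary_by_last_name("min")

print(totals["Smith"])   # 234567
print(highest["Smith"])  # 123456
print(lowest["Smith"])   # 111111
```

Each aggregate gives the Smith row a different single value, which is exactly the choice the DBMS cannot make on your behalf. Note that SQLite itself is lenient about non-aggregated columns, so it is not a good tool for reproducing the DB2 error; the sketch only shows the aggregated, valid form.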
In your query, the equivalent of trying to use Salary without an aggregating function is to change max(b.this) to just b.this but not include it in the group by section. Or alternatively, remove one of the group by columns without changing it to an aggregation in the select section.
In both cases, you'll have one row that has multiple possible values for the column.
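If it isn't obvious which column is the ambiguous one, one way to check is to count how many distinct values it takes within each group. The following is only a sketch under the same assumptions as the Employees example (SQLite in memory via Python, standing in for DB2); the query itself is valid grouped SQL because Salary appears only inside an aggregate.

```python
import sqlite3

# Same example table as in the Smith/Diablo illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Employees (LastName text, FirstName text, Salary integer)"
)
conn.executemany(
    "insert into Employees values (?, ?, ?)",
    [
        ("Smith", "John", 123456),
        ("Smith", "George", 111111),
        ("Diablo", "Pax", 999999),
    ],
)

# For each LastName group, count the distinct Salary values it contains
# and keep only the groups where more than one candidate value exists.
ambiguous = conn.execute(
    "select LastName, count(distinct Salary) "
    "from Employees "
    "group by LastName "
    "having count(distinct Salary) > 1"
).fetchall()

print(ambiguous)  # [('Smith', 2)]
```

Any group that shows up here has more than one possible value for the column, so the column needs either an aggregate in the select section or a place in the group by section.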
The DB2 docs at publib for sqlstate 42803 describe your problem:
A column reference in the SELECT or HAVING clause is invalid, because it is not a grouping column; or a column reference in the GROUP BY clause is invalid.
SQL will insist that any column in the SELECT section is either included in the GROUP BY section or has an aggregate function applied to it in the SELECT section.
This article gives a nice explanation of why this is the case. The article is SQL Server specific, but the principle should be roughly similar for all RDBMSs.