I saw an example where there was a list (table) of employees with their respective monthly salaries. I did a sum of the salaries and saw the exact same table in the output.
The sad thing is that there is at least one database, MySQL, that accepts the syntax you are suggesting:
SELECT EmployeeID, SUM (MonthlySalary)
FROM Employee
However, MySQL does not do what you expect. It returns the overall sum of MonthlySalary for everyone, and one arbitrary EmployeeID. (That is with the ONLY_FULL_GROUP_BY SQL mode turned off; with it on, MySQL rejects the query.) Alas.
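To see the "one total, one arbitrary ID" behavior without a MySQL server, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. SQLite is only a stand-in here: it happens to accept the same query without a GROUP BY, but it is not MySQL, and the three rows are made up for illustration.

```python
import sqlite3

# Hypothetical three-row table; IDs and amounts are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeID INTEGER, MonthlySalary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, 200), (2, 300), (3, 500)],
)

# No GROUP BY: the aggregate collapses the whole table into a single row.
rows = conn.execute(
    "SELECT EmployeeID, SUM(MonthlySalary) FROM Employee"
).fetchall()
conn.close()

print(len(rows))   # one row comes back, not three
print(rows[0][1])  # the overall total
print(rows[0][0])  # some EmployeeID from the table; which one is not meaningful
```

The EmployeeID in that single row should not be relied on for anything, which is why the query is a bad idea even where it is allowed.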
Your question is about SQL syntax. The answer is that this is how SQL has been defined, and it is not going to change. Determining the aggregation fields from the SELECT clause is not unreasonable, but it is not how this language is defined.
I do, however, have some sympathy for the question. Many people learning SQL think of "grouping" as something done in the context of sorting the rows. Something like "sort the cities in the US and group them by state in the output". Makes sense. But "group by" in SQL really means "summarize by" not "keep together".
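The difference between "keep together" and "summarize by" can be sketched with the cities example. This is an illustration only: the City table, its columns, and the four rows are hypothetical, and SQLite (via Python's sqlite3 module) is used just because it needs no setup.

```python
import sqlite3

# Hypothetical data: two states with two cities each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE City (State TEXT, Name TEXT)")
conn.executemany(
    "INSERT INTO City VALUES (?, ?)",
    [("TX", "Austin"), ("CA", "Fresno"), ("TX", "Dallas"), ("CA", "Oakland")],
)

# "Keep together": ORDER BY returns every city, with same-state rows adjacent.
kept_together = conn.execute(
    "SELECT State, Name FROM City ORDER BY State, Name"
).fetchall()

# "Summarize by": GROUP BY returns one row per state; individual cities are gone.
summarized = conn.execute(
    "SELECT State, COUNT(*) FROM City GROUP BY State ORDER BY State"
).fetchall()
conn.close()

print(kept_together)  # four rows, one per city
print(summarized)     # two rows, one per state
```

If what you want is the first result, ORDER BY is the tool; GROUP BY is for the second.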
If you don't specify GROUP BY, aggregate functions operate over all the records selected. In that case, it doesn't make sense to also select a specific column like EmployeeID. Either you want per-employee totals, in which case you select the employee ID and group by employee, or you want a total across the entire table, so you leave out the employee ID and the GROUP BY clause.
In your query, if you leave out the GROUP BY, which employee ID would you like it to show?
If you wanted to add up all the numbers you would not have a GROUP BY:
SELECT SUM(MonthlySalary) AS TotalSalary
FROM Employee
+-----------+
|TotalSalary|
+-----------+
|777400 |
+-----------+
The point of the GROUP BY is that you get a separate total for each employee.
+--------+------+
|Employee|Salary|
+--------+------+
|John |123400|
+--------+------+
|Frank |413000|
+--------+------+
|Bill |241000|
+--------+------+
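Both results above can be reproduced from one table. The sketch below is an illustration under assumptions: the underlying rows are not shown in the answer, so each employee's total is split across two made-up salary rows, and the column name Name is hypothetical. SQLite via Python's sqlite3 module stands in for whatever database you use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Name TEXT, MonthlySalary INTEGER)")
# Assumed rows: chosen only so the sums match the totals shown above.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        ("John", 100000), ("John", 23400),
        ("Frank", 400000), ("Frank", 13000),
        ("Bill", 200000), ("Bill", 41000),
    ],
)

# No GROUP BY: one total across the entire table.
total = conn.execute(
    "SELECT SUM(MonthlySalary) AS TotalSalary FROM Employee"
).fetchone()[0]

# GROUP BY: a separate total for each employee.
per_employee = dict(
    conn.execute(
        "SELECT Name AS Employee, SUM(MonthlySalary) AS Salary "
        "FROM Employee GROUP BY Name"
    ).fetchall()
)
conn.close()

print(total)         # 777400
print(per_employee)  # John 123400, Frank 413000, Bill 241000 (row order is not guaranteed)
```

Note that the per-employee totals add up to the overall total: GROUP BY only changes how the rows are partitioned before SUM is applied.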
It might be easier if you think of GROUP BY as "for each" for the sake of explanation. The query below:
SELECT empid, SUM (MonthlySalary)
FROM Employee
GROUP BY EmpID
is saying:
"Give me the sum of the MonthlySalary values for each empid"
So if your table looked like this:
+-----+-------------+
|empid|MonthlySalary|
+-----+-------------+
|1    |200          |
+-----+-------------+
|2    |300          |
+-----+-------------+
result:
+-+---+
|1|200|
+-+---+
|2|300|
+-+---+
SUM wouldn't appear to do anything, because the sum of one number is that number. On the other hand, if the table looked like this:
+-----+-------------+
|empid|MonthlySalary|
+-----+-------------+
|1    |200          |
+-----+-------------+
|1    |300          |
+-----+-------------+
|2    |300          |
+-----+-------------+
result:
+-+---+
|1|500|
+-+---+
|2|300|
+-+---+
Then SUM would make a visible difference, because there are two rows with empid 1 to add together. Not sure if this explanation helps or not, but I hope it makes things a little clearer.
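If you want to try the two cases yourself, here is a small sketch that runs the same GROUP BY query against both tables. It uses SQLite through Python's sqlite3 module purely for convenience; the helper name sum_per_empid is made up for this example.

```python
import sqlite3

def sum_per_empid(rows):
    """Load (empid, MonthlySalary) rows into a scratch table and total them per empid."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employee (empid INTEGER, MonthlySalary INTEGER)")
    conn.executemany("INSERT INTO Employee VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT empid, SUM(MonthlySalary) FROM Employee "
        "GROUP BY empid ORDER BY empid"
    ).fetchall()
    conn.close()
    return result

# One row per empid: the output looks just like the input.
one_row_each = sum_per_empid([(1, 200), (2, 300)])

# Two rows for empid 1: they are collapsed into a single summed row.
two_rows_for_1 = sum_per_empid([(1, 200), (1, 300), (2, 300)])

print(one_row_each)    # [(1, 200), (2, 300)]
print(two_rows_for_1)  # [(1, 500), (2, 300)]
```

The first call is the situation from the question: with only one salary row per employee, the grouped output is identical to the table you started with.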