using the SQL query
select u.name, count(u.name) as 'followers'
from user u, follow f
where u.type = 'c' AND f.followee = u.email
group by u.name
When you use COUNT without GROUP BY, it counts all the matching records and returns a single row. When you use COUNT with GROUP BY, it groups the users by name and returns the count for each group.
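The difference can be sketched with a small in-memory database. This is only an illustration: SQLite (through Python's sqlite3 module) stands in for MySQL, the sample rows are made up, and the follower column of the follow table is an assumption, since the question only shows followee.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (email TEXT PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE follow (follower TEXT, followee TEXT);
    INSERT INTO user VALUES
        ('ann@example.com', 'Ann', 'c'),
        ('bob@example.com', 'Bob', 'c'),
        ('dan@example.com', 'Dan', 'x');
    INSERT INTO follow VALUES
        ('dan@example.com', 'ann@example.com'),
        ('bob@example.com', 'ann@example.com'),
        ('dan@example.com', 'bob@example.com');
""")

# COUNT without GROUP BY: all matching rows form one group, one row comes back.
total = conn.execute("""
    SELECT COUNT(*) AS followers
    FROM user u, follow f
    WHERE u.type = 'c' AND f.followee = u.email
""").fetchall()

# COUNT with GROUP BY: one row per distinct name.
per_name = conn.execute("""
    SELECT u.name, COUNT(u.name) AS followers
    FROM user u, follow f
    WHERE u.type = 'c' AND f.followee = u.email
    GROUP BY u.name
    ORDER BY u.name
""").fetchall()

print(total)     # [(3,)]
print(per_name)  # [('Ann', 2), ('Bob', 1)]
```

The ungrouped query collapses the three matching rows into a single total, while the grouped query reports one count per name.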
the exact same query without the group by line only gives me the first value.
Not quite.
The query without GROUP BY looks like this:
select u.name, count(u.name) as 'followers'
from user u, follow f
where u.type = 'c' AND f.followee = u.email
The query uses COUNT(), which is a GROUP BY aggregate function. Such functions normally accompany a GROUP BY clause in the query. However, the SQL standard is tolerant: when the clause is missing, it accepts the query and creates a single group from all the rows filtered by the WHERE clause.
On the other hand, your query without the GROUP BY clause is invalid, because it also selects u.name.
This is how GROUP BY queries work:
1. The rows filtered by the WHERE clause are grouped; all the rows from a group have the same value for the first expression present in the GROUP BY clause.
2. If the GROUP BY clause contains two or more expressions, each group created in the first step is split into sub-groups using the second expression from the GROUP BY clause.
3. The splitting continues with the remaining expressions of the GROUP BY clause, creating nested sub-groups.
If a column or an expression from the SELECT clause does not use a GROUP BY aggregate function and is not present in the GROUP BY clause, then some groups may contain rows having different values for that column or expression; this is an error.
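The nested grouping described above can be sketched as follows. The sample rows are invented for illustration, and SQLite (via Python's sqlite3 module) is used as a stand-in for MySQL; grouping by type and then by name is just one possible pair of expressions.

```python
import sqlite3

# Hypothetical sample data; two different users share the name 'Ann'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (email TEXT PRIMARY KEY, name TEXT, type TEXT);
    INSERT INTO user VALUES
        ('ann1@example.com', 'Ann', 'c'),
        ('ann2@example.com', 'Ann', 'c'),
        ('bob@example.com',  'Bob', 'c'),
        ('dan@example.com',  'Dan', 'x');
""")

# One expression in GROUP BY: one group per distinct type.
by_type = conn.execute("""
    SELECT type, COUNT(*) FROM user
    GROUP BY type
    ORDER BY type
""").fetchall()

# Two expressions: each type group is split into sub-groups by name.
by_type_and_name = conn.execute("""
    SELECT type, name, COUNT(*) FROM user
    GROUP BY type, name
    ORDER BY type, name
""").fetchall()

print(by_type)           # [('c', 3), ('x', 1)]
print(by_type_and_name)  # [('c', 'Ann', 2), ('c', 'Bob', 1), ('x', 'Dan', 1)]
```

The group for type 'c' holds three rows with two different names, which is exactly why selecting name while grouping only by type would be ambiguous.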
To prevent this, the SQL standard allows in the SELECT clause only expressions that satisfy one of these conditions:
1. they are also present in the GROUP BY clause;
2. they use GROUP BY aggregate functions;
3. they are functionally dependent on columns that are present in the GROUP BY clause.
Let's analyze the expressions in the SELECT clause of your query:
- u.name: in the initial query it satisfies condition #1; in the query without GROUP BY it doesn't satisfy any condition. This makes the query invalid SQL.
- count(u.name): it satisfies condition #2 in both versions of the query, so it causes no problems.
Even though the version of the query without GROUP BY is not valid SQL, MySQL versions before 5.7.5 allow it by default, but reserve the right to return indeterminate values for the invalid expressions (u.name).
A quote from the documentation:
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.
In plain English, this means that your query without GROUP BY returns the correct value for followers, but the value returned for name can differ between executions of the same query. You will probably not observe this if you simply run the query several times, but it is likely to happen after you add or remove rows from the table, or after you back up the table, truncate it and then restore it from the backup (or recreate it on a different machine or a different version of MySQL).
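A rough way to see this effect is sketched below. It uses SQLite through Python's sqlite3 module as a stand-in, because SQLite also tolerates a bare column next to an aggregate and evaluates it for an arbitrary row; it is not MySQL, and the sample data and the follower column are assumptions made for the example.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for a pre-5.7.5 MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (email TEXT PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE follow (follower TEXT, followee TEXT);
    INSERT INTO user VALUES
        ('ann@example.com', 'Ann', 'c'),
        ('bob@example.com', 'Bob', 'c');
    INSERT INTO follow VALUES
        ('bob@example.com', 'ann@example.com'),
        ('ann@example.com', 'bob@example.com'),
        ('bob@example.com', 'ann@example.com');
""")

# The query from the question with the GROUP BY line removed.
name, followers = conn.execute("""
    SELECT u.name, COUNT(u.name) AS followers
    FROM user u, follow f
    WHERE u.type = 'c' AND f.followee = u.email
""").fetchone()

# The count is well defined; the name is whichever row the engine picked.
print(followers)
print(name in {"Ann", "Bob"})
```

The count is always the total number of matching rows, but nothing in the query says which of the candidate names should accompany it, so the code can only check that the name is one of them.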