问题
I would like to count duplicate date values. I already know that my field "WHEN" is a timestamp, so I have to cast it to type of date.
my actual query looks like this:
SELECT
u.USERNAME,
r."WHEN",
r.UPDATEINOUT,
case (r.UPDATEINOUT) when 0 then 0 when 1 then 1 else r.INOUT end INOUT
FROM
ATTENDANT r
LEFT JOIN
USERS u ON r.USERID = u.ID
where
u.USERNAME = 'rk' and (r.UPDATEINOUT = 1 or r.UPDATEINOUT = 0 or r.UPDATEINOUT is null)
group by
r."WHEN",
INOUT,
u.USERNAME,
r.UPDATEINOUT
order by
r."WHEN"
And this is the result:
Username WHEN UPDATEINOUT INOUT
rk 09.04.2018, 14:59:45.000 [null] 0
rk 09.04.2018, 14:59:51.000 [null] 1
rk 11.04.2018, 08:31:02.000 [null] 0
rk 11.04.2018, 12:06:52.000 [null] 1
rk 11.04.2018, 12:10:29.000 [null] 0
rk 11.04.2018, 12:23:09.000 [null] 1
rk 11.04.2018, 12:43:47.000 [null] 0
rk 11.04.2018, 17:07:40.000 [null] 1
Now I would like to count the duplicate dates, the result should look like this:
Username WHEN UPDATEINOUT INOUT Count
rk 09.04.2018, 14:59:45.000 [null] 0 2
rk 09.04.2018, 14:59:51.000 [null] 1 2
rk 11.04.2018, 08:31:02.000 [null] 0 6
rk 11.04.2018, 12:06:52.000 [null] 1 6
rk 11.04.2018, 12:10:29.000 [null] 0 6
rk 11.04.2018, 12:23:09.000 [null] 1 6
rk 11.04.2018, 12:43:47.000 [null] 0 6
rk 11.04.2018, 17:07:40.000 [null] 1 6
When I add
count(cast(r."WHEN" as date))
then it shows me only one.
回答1:
You could upgrade to Firebird 3.0 and use window functions.
Alternatively, you can use CTEs
with t as ( < your query here > )
select t.*, tw.cnt
from t join
(select cast("WHEN" as date) as wdate, count(*) as cnt
from t
group by cast("WHEN" as date)
) tw
on cast(t."WHEN" as date) = tw.wdate;
回答2:
It seems there are answers suggesting "what to do" but not trying to explain why the results are like what they are. What is happening. It seems like the topic starter is being given a fish, not a fishing rod.
When I add
count(cast(r."WHEN" as date))
then it shows me only one.
This is a bit funny. This line shows novice misconception between natural human language like English, and mathematical sets languages like SQL.
R.Kut reads this his added line as "i want to count
how many [distinct] values of the given expression
are there".
But that is not what this command actually means in SQL. In SQL it means "I want to count
how many rows are there, where the given expression
is not null."
And so there is actually NO DIFFERENCE between count(cast(r."WHEN" as date))
and count(r."WHEN")
- those two parameters-expressions are either both NULL
or both NOT NULL
. Hence the value of count
of those equally nullable parameters is equal too.
The very attempt to truncate a value of the parameter of the aggregate function, like if it could alter nullability, is a misconception. I was there too. It takes time to get used what aggregates over mathematical sets really mean, and that you do not reading English when you are reading SQL.
Frankly, you could do just count (1)
here removing not only the typecast but the column itself - it would still be the same, as those are rows, not values that are being asked to be counted. Unless there are rows where "WHEN" IS NULL
- that would be accounted for by group by
but not by count
. After you read and think through the next section, come back and toy with http://sqlfiddle.com/#!9/ee09a/7
Now, there is another parameter to the count
function that I kind of mentioned above. It is that "distinct" parameter.
Remark: One may say distinct
is a keyword of the SQL language, not a parameter for a function, but de facto, not de jure, it changes the way the function works, so in my perception it is the parameter, in that unusual SQLish way the parameters are often given to functions. Or, another way to reason about this, it may be seen a part of the function name, if to imagine have we two functions to choose from, count
and count-distinct
.
So, the topic starter could had added count(distinct cast(r."WHEN" as date))
instead, and...
....and see that nothing changed. Because while this time he would really say to the server to count row with not-null (always only not-null!) and distinct
values - the counting goes within the group.
And what those groups are?
group by
r."WHEN",
INOUT,
u.USERNAME,
r.UPDATEINOUT
See, across the groups we have rows with distinct time-and-date values of "WHEN". And other columns too, but I am not focusing on them. The thing here is that within every group both time and date parts of "WHEN" are kept the same. And "being the same" means "there is one distinct
value, repeated again and again". And if there is only one distinct
time-and-date value then the reduced values of only-time or only-date would have the same counts (reducing a value can only make to previously different values equal, but not make previously equal values now different).
Granted, in other situations, when the counting is going over columns not included in the group (or when there is not group by
clause at all), the result may differ. There count(distinct ...)
might do what the topic starter is expecting it to do.
http://sqlfiddle.com/#!9/0d65bf/7 - example.
However one has to keep in mind:
- that happens at the expense of the extra sorting and grouping work done by a server, potentially making the query slow or consuming lot of memory
- still it would work within the group (just groups then would start to contain different values for the
count
ed columns). - ....just sometimes the group is the whole query resultset (most simple variant - the whole table), if the programmer did not set it to be different by adding
group by
clause :-D
The code of the example, linked above:
create table X(a integer, b integer);
insert into X values (1,1);
insert into X values (1,2);
insert into X values (1,2);
commit;
select count(distinct b) from x group by a
-- Result: 1 row: 2
-- or if the whole table is the group
select count(distinct b) from x
-- Result: 1 row: 2
-- but if the group includes the counted column
-- then every group would contain EXACTLY ONE
-- row with a not-null distinct value
select count(distinct b) from x group by b
-- Result: 2 rows: 1 and 1
Here we come to yet another modification to toy with.
group by
cast(r."WHEN" as date), -- <====
INOUT,
u.USERNAME,
r.UPDATEINOUT
Now, this time we are telling the server to assemble the groups, in which only "date" part of "WHEN" is the same, while "time" part might differ.
However....
- I think not every SQL server supports expressions in
group by
clause. - Unless there is an
index
made exactly by that expression, the server would have to do much extra work, probably resorting to natural scan and temporary files grouping, which would make the query heavy and slow. - And of course you would HAVE to modify the columns list accordingly.
.
SELECT
u.USERNAME,
cast(r."WHEN" as date), -- <=== no more raw r."WHEN"
r.UPDATEINOUT,
Because, well, you just cannot have "One True Value" for WHEN
in the row, if you explicitly asked server to group different values of it.
Here you may come to the conclusion, that there is no simple straightforward way to have both groups and across-the-groups aggregates. Or to put it differently, to have two different sets of group by
in one query (I do not want to mention UNION
here, mkay?).
You would need ONE set of group by
criteria for counting rows with the same date part (but maybe different time part), and yet ANOTHER group by
criteria for picking and rendering groups different by both date-and-time.
In plain and simple SQL 1999 that means you would have to make TWO select
s to make both groupings one way or another, and here Yogesh's and Gordon's answer kick in.
Like Gordon mentioned in post-99 SQL there came Window Functions which could let you have those criteria sets in one select
, but they are not available in Firebird 2.x anyway.
回答3:
I think you need subquery
:
SELECT u.USERNAME, r."WHEN", r.UPDATEINOUT,
case (r.UPDATEINOUT) when 0 then 0 when 1 then 1 else r.INOUT end INOUT,
(SELECT COUNT(*)
FROM ATTENDANT r1
WHERE cast(r1."WHEN" as date)) = cast(r."WHEN" as date)
) as Count
FROM ATTENDANT r LEFT JOIN
USERS u
ON r.USERID = u.ID
WHERE u.USERNAME = 'rk' AND
(r.UPDATEINOUT = 1 or r.UPDATEINOUT = 0 or r.UPDATEINOUT is null)
GROUP BY r."WHEN", INOUT, u.USERNAME, r.UPDATEINOUT
ORDER BY r."WHEN";
来源:https://stackoverflow.com/questions/51152947/count-equal-date-parts-of-the-timestamp-column-across-the-groups-by-the-said-col