Question
I have the following scenario in Postgres (I'm using 9.4.1).
I have a table of this format:
create table test(
id serial,
val numeric not null,
created timestamp not null default(current_timestamp),
fk integer not null
);
What I then have is a numeric threshold field in another table, which should be used to label each row of test. For every row where the running total of val is >= threshold, I want to have that record marked as true, and once it is true the running total should reset to 0 at that point, e.g.
Data set:
insert into test(val, created, fk) values
(100, now() + interval '10 minutes', 5),
(25, now() + interval '20 minutes', 5),
(30, now() + interval '30 minutes', 5),
(45, now() + interval '40 minutes', 5),
(10, now() + interval '50 minutes', 5);
With a threshold of 50 I would like to get the output as:
100 -> true (as 100 > 50) [reset]
25 -> false (as 25 < 50)
30 -> true (as 25 + 30 > 50) [reset]
45 -> false (as 45 < 50)
10 -> true (as 45 + 10 > 50)
Is it possible to do this in a single SQL query? So far I have experimented with using a window function.
select t.*,
sum(t.val) over (
partition by t.fk order by t.created
) as threshold_met
from test t
where t.fk = 5;
As you can see, I have got it to the point where I have a cumulative sum, and I suspect that tweaking rows between x preceding and current row may be what I'm looking for. I just can't work out how to perform the reset, i.e. how to set x in the above to the appropriate value.
Answer 1:
Create your own aggregate function, which can be used as a window function.
Specialized aggregate function
It's easier than one might think:
CREATE OR REPLACE FUNCTION f_sum_cap50 (numeric, numeric)
RETURNS numeric LANGUAGE sql AS
'SELECT CASE WHEN $1 > 50 THEN 0 ELSE $1 END + $2';
CREATE AGGREGATE sum_cap50 (numeric) (
sfunc = f_sum_cap50
, stype = numeric
, initcond = 0
);
Then:
SELECT *, sum_cap50(val) OVER (PARTITION BY fk
ORDER BY created) > 50 AS threshold_met
FROM test
WHERE fk = 5;
Result exactly as requested.
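To see why this matches, it can help to expose the aggregate's running state next to the boolean. A minimal sketch, assuming the test table holds exactly the five sample rows from the question and sum_cap50 has been created as above; the column alias running_sum and the window name w are only illustrative:

```sql
-- running_sum is the capped running total held by the aggregate.
-- Once it exceeds 50, the next row starts again from 0.
SELECT val
     , sum_cap50(val) OVER w      AS running_sum
     , sum_cap50(val) OVER w > 50 AS threshold_met
FROM   test
WHERE  fk = 5
WINDOW w AS (PARTITION BY fk ORDER BY created)
ORDER  BY created;

-- val | running_sum | threshold_met
-- 100 |         100 | t
--  25 |          25 | f
--  30 |          55 | t
--  45 |          45 | f
--  10 |          55 | t
```

The reset happens one row late by design: the transition function checks the previous state, so the row that crosses the threshold still reports the full sum.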
Generic aggregate function
To make it work for any threshold and any (numeric) data type, and also allow NULL values:
CREATE OR REPLACE FUNCTION f_sum_cap (anyelement, anyelement, anyelement)
RETURNS anyelement
LANGUAGE sql STRICT AS
$$SELECT CASE WHEN $1 > $3 THEN '0' ELSE $1 END + $2;$$;
CREATE AGGREGATE sum_cap (anyelement, anyelement) (
sfunc = f_sum_cap
, stype = anyelement
, initcond = '0'
);
Then, to call with a limit of, say, 110 with any numeric type:
SELECT *
, sum_cap(val, '110') OVER (PARTITION BY fk
ORDER BY created) AS capped_at_110
, sum_cap(val, '110') OVER (PARTITION BY fk
ORDER BY created) > 110 AS threshold_met
FROM test
WHERE fk = 5;
Explanation
In your case we don't have to defend against NULL values since val is defined NOT NULL. If NULL can be involved, define f_sum_cap() as STRICT and it works because (per documentation):
If the state transition function is declared "strict", then it cannot be called with null inputs. With such a transition function, aggregate execution behaves as follows. Rows with any null input values are ignored (the function is not called and the previous state value is retained) [...]
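A small sketch of that behavior, assuming sum_cap has been created as above; the inline rows and the column names ord and v are made up for illustration:

```sql
-- The NULL row is skipped by the STRICT transition function,
-- so the previous state (30) is carried over unchanged.
SELECT ord, v
     , sum_cap(v, 50) OVER (ORDER BY ord) AS capped_at_50
FROM  (VALUES (1, 30), (2, NULL), (3, 30), (4, 10)) AS t(ord, v);

-- ord |  v   | capped_at_50
--   1 |   30 |           30
--   2 | NULL |           30
--   3 |   30 |           60
--   4 |   10 |           10
```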
Compared to the specialized version, both the function and the aggregate take one more argument: the threshold. In the polymorphic variant, its type can be hard-coded or the same polymorphic type as the leading arguments.
About polymorphic functions:
- Initial array in function to aggregate multi-dimensional array
Note the use of untyped string literals, not numeric literals, which would default to integer!
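To illustrate the point about literals: both arguments of sum_cap share the polymorphic type anyelement, so they have to resolve to the same data type. A sketch, assuming val is numeric as in the question's table and sum_cap exists as defined above:

```sql
-- Works: the untyped literal '110' takes on the type of val (numeric).
SELECT sum_cap(val, '110') OVER (PARTITION BY fk ORDER BY created)
FROM   test;

-- Also works: an explicit cast to the matching type.
SELECT sum_cap(val, 110::numeric) OVER (PARTITION BY fk ORDER BY created)
FROM   test;

-- Does not resolve: 110 is an integer literal while val is numeric,
-- and anyelement requires identical types for both arguments.
-- SELECT sum_cap(val, 110) OVER (PARTITION BY fk ORDER BY created)
-- FROM   test;
```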
Source: https://stackoverflow.com/questions/29417300/cumulative-adding-with-dynamic-base-in-postgres