Question
When I put the following query into a function, it runs 76 times slower. The only difference in the plans is a bitmap index scan vs. an index scan.
Plan1: http://tatiyants.com/pev/#/plans/plan_1562919134481
Plan2: http://tatiyants.com/pev/#/plans/plan_1562918860704
Plan 1:
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT
sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma,
*
FROM (
SELECT
sum( ocd.item_cost ) AS group_cost,
sum( ocd.item_suma ) AS group_suma,
max( (ocd.ic).consumed ) AS consumed,
(ocd.ic).consumed_period,
ocd.o
FROM order_cost_details( tstzrange( '2019-04-01', '2019-05-01' ) ) ocd
GROUP BY ocd.o, (ocd.ic).consumed_period
) t
WHERE (t.o).id IN ( 6154 ) AND t.consumed_period @> '2019-04-01'::timestamptz
;
Plan 2:
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM order_total_suma( tstzrange( '2019-04-01', '2019-05-01' ) ) ots
WHERE (ots.o).id IN ( 6154 ) AND ots.consumed_period @> '2019-04-01'::timestamptz
;
The function:
CREATE FUNCTION "order_total_suma" (in _target_range tstzrange default app_period())
RETURNS table(
total_suma double precision,
group_cost double precision,
group_suma double precision,
consumed double precision,
consumed_period tstzrange,
o order_bt
)
LANGUAGE sql
STABLE
AS $$
SELECT
sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma,
*
FROM (
SELECT
sum( ocd.item_cost ) AS group_cost,
sum( ocd.item_suma ) AS group_suma,
max( (ocd.ic).consumed ) AS consumed,
(ocd.ic).consumed_period,
ocd.o
FROM order_cost_details( _target_range ) ocd
GROUP BY ocd.o, (ocd.ic).consumed_period
) t
$$
;
Why is the filtering applied only at the last subquery scan when the query runs inside the function?
Is there anything I can do so that both perform equally?
UPD
The server version is PostgreSQL 12beta2.
Because of the 30,000-character limit I posted the plans here and here.
Answer 1:
Thanks to RhodiumToad from IRC:
I suspect something's stopping the planner from being able to deduce that (t.o).id is safe to push through a GROUP BY ocd.o
that might be fixable by making it a separate column of its own
So I additionally GROUP BY the (ocd.o).id column. My final query is:
SELECT * FROM (
SELECT
sum( t.group_suma ) OVER( PARTITION BY t.order_id ) AS total_suma,
-- sum( t.group_suma ) OVER( PARTITION BY (t.o).id ) AS total_suma, -- For any WHERE this takes 2700ms
*
FROM (
SELECT
sum( ocd.item_cost ) AS group_cost,
sum( ocd.item_suma ) AS group_suma,
max( (ocd.ic).consumed ) AS consumed,
(ocd.ic).consumed_period,
ocd.o,
(ocd.o).id as order_id
FROM order_cost_details( tstzrange( '2019-04-01', '2019-05-01' ) ) ocd
GROUP BY ocd.o, (ocd.o).id, (ocd.ic).consumed_period
) t
) t
WHERE t.order_id = 6154 AND t.consumed_period @> '2019-04-01'::timestamptz -- This takes 2ms
-- WHERE (t.o).id = 6154 AND t.consumed_period @> '2019-04-01'::timestamptz -- This takes 2700ms
This change also makes the call via the function faster. I just need to filter by the order_id field:
SELECT * FROM order_total_suma( tstzrange( '2019-04-01', '2019-05-01' ) ) ots
-- This WHERE takes 2.5ms
WHERE ots.order_id IN ( 6154 ) AND ots.consumed_period @> '2019-04-01'::timestamptz
-- This WHERE takes 2500ms
-- WHERE (ots.o).id IN ( 6154 ) AND ots.consumed_period @> '2019-04-01'::timestamptz
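For completeness, a sketch of how the function itself might be updated to expose order_id as its own output column, mirroring the final query above. The type of order_id is assumed to be integer (use whatever type (order_bt).id actually has), and since the result columns change, the old function has to be dropped rather than replaced:
DROP FUNCTION order_total_suma( tstzrange );

CREATE FUNCTION "order_total_suma" (in _target_range tstzrange default app_period())
RETURNS table(
    total_suma double precision,
    group_cost double precision,
    group_suma double precision,
    consumed double precision,
    consumed_period tstzrange,
    o order_bt,
    order_id int  -- assumed type; must match the actual type of (order_bt).id
)
LANGUAGE sql
STABLE
AS $$
    SELECT
        sum( t.group_suma ) OVER( PARTITION BY t.order_id ) AS total_suma,
        *
    FROM (
        SELECT
            sum( ocd.item_cost )     AS group_cost,
            sum( ocd.item_suma )     AS group_suma,
            max( (ocd.ic).consumed ) AS consumed,
            (ocd.ic).consumed_period,
            ocd.o,
            (ocd.o).id               AS order_id
        FROM order_cost_details( _target_range ) ocd
        GROUP BY ocd.o, (ocd.o).id, (ocd.ic).consumed_period
    ) t
$$
;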
Answer 2:
The plans are quite different.
The problem is the misestimated row count of the join between public.order_bt and the split_period subquery. That causes the function public.service_level_price to be evaluated 2882 times rather than once, which is where the time is spent.
Not sure what to do about this (we don't have the view definition, and it's probably nasty). Raising the COST of the function probably doesn't help, as the optimizer thinks it will call it only once.
Actually, the best bet may be
ALTER FUNCTION public.calc_item_suma ROWS 1;
which might get the optimizer to choose a different plan.
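For reference, a compact sketch of both planner knobs discussed here. The argument lists are omitted, which only works while the function names are not overloaded; the COST value is purely illustrative, and ROWS applies only to set-returning functions:
-- Hint that the set-returning function yields about one row per call
ALTER FUNCTION public.calc_item_suma ROWS 1;

-- The per-call cost knob mentioned above; probably ineffective here,
-- since the optimizer expects a single call anyway
ALTER FUNCTION public.service_level_price COST 10000;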
Source: https://stackoverflow.com/questions/57003113/why-the-query-is-executed-76-times-slower-when-i-put-it-into-function