Reuse computed select value

旧街凉风 提交于 2019-12-01 06:50:02

Test timing

You don't see the evaluation of individual functions per row in the EXPLAIN output.

Test with EXPLAIN ANALYZE to get actual query times to compare overall effectiveness. Run a couple of times to rule out caching artifacts. For simple queries like this, you get more reliable numbers for the total runtime with:

EXPLAIN (ANALYZE, TIMING OFF) SELECT ...

Requires Postgres 9.2+. Per documentation:

TIMING

Include actual startup time and time spent in each node in the output. The overhead of repeatedly reading the system clock can slow down the query significantly on some systems, so it may be useful to set this parameter to FALSE when only actual row counts, and not exact times, are needed. Run time of the entire statement is always measured, even when node-level timing is turned off with this option. This parameter may only be used when ANALYZE is also enabled. It defaults to TRUE.

Prevent repeated evaluation

Normally, expressions in a subquery are evaluated once. But Postgres can collapse trivial subqueries if it thinks that will be faster.

To introduce an optimization barrier, you could use a CTE instead of the subquery. This guarantees that Postgres computes ST_SnapToGrid(geom, 50) once only:

WITH cte AS (
   SELECT ST_SnapToGrid(geom, 50) AS geom1
   FROM   points
   )
SELECT COUNT(*)   AS n
     , ST_X(geom1) AS x
     , ST_Y(geom1) AS y
FROM   cte
GROUP  BY geom1;         -- see below

However, this it's probably slower than a subquery due to more overhead for a CTE. The function call is probably very cheap. Generally, Postgres knows better how to optimize a query plan. Only introduce such an optimization barrier if you know better.

Simplify

I changed the name of the computed point in the subquery / CTE to geom1 to clarify it's different from the original geom. That helps to clarify the more important thing here:

GROUP BY geom1

instead of:

GROUP BY x, y

That's obviously cheaper - and may have an influence on whether the function call is repeated. So, this is probably fastest:

SELECT COUNT(*) AS n
     , ST_X(ST_SnapToGrid(geom, 50)) AS x
     , ST_y(ST_SnapToGrid(geom, 50)) AS y
FROM   points
GROUP  BY ST_SnapToGrid(geom, 50);         -- same here!

Or maybe this:

SELECT COUNT(*)    AS n
     , ST_X(geom1) AS x
     , ST_y(geom1) AS y
FROM (
   SELECT ST_SnapToGrid(geom, 50) AS geom1
   FROM   points
   ) AS tmp
GROUP  BY geom1;

Test all three with EXPLAIN ANALYZE or EXPLAIN (ANALYZE, TIMING OFF) and see for yourself. Testing >> guessing.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!