Interpreting Results From Explain Analyze in Postgres

问题

Recently did a query which took about 9 minutes to complete. Attempting to determine why, I used EXPLAIN ANALYZE to help solve the problem. From the output, it looks as though everything has the appropriate indexes, it's just taking an extremely long time.

I've put the query and the results below. Is it just taking this long due to the amount of data? Or is there something which I am doing wrong. Does my query need to fundamentally change, in order to improve the performance?

Additional Info: PostgreSQL 10.6. System in a Xeon @ 3.4Ghz, with SSD and 24GB of memory, so it's fairly good spec desktop.

SELECT      s.start_date, s.end_date, s.resources, s.activity_index, r.resource_id, sa.usedresourceset 
FROM        rm_o_resource_usage_instance_splits_new s 
INNER JOIN  rm_o_resource_usage r ON s.usage_id = r.id 
INNER JOIN  scheduledactivities sa ON s.activity_index = sa.activity_index 
    AND r.schedule_id = sa.solution_id 
    and s.solution = sa.solution_id 
WHERE       r.schedule_id = 14349 
ORDER BY   r.resource_id, s.start_date
----------------------------------------------------------------
"Sort  (cost=18.01..18.01 rows=1 width=86) (actual time=541075.198..541099.504 rows=65354 loops=1)"
"  Sort Key: r.resource_id, s.start_date"
"  Sort Method: external merge  Disk: 8016kB"
"  ->  Nested Loop  (cost=0.85..18.00 rows=1 width=86) (actual time=6.946..540671.934 rows=65354 loops=1)"
"        Join Filter: (s.activity_index = sa.activity_index)"
"        Rows Removed by Join Filter: 3550029280"
"        ->  Nested Loop  (cost=0.42..12.93 rows=1 width=98) (actual time=0.074..1795.054 rows=65354 loops=1)"
"              Join Filter: (s.usage_id = r.id)"
"              Rows Removed by Join Filter: 248018"
"              ->  Index Scan using rm_o_resource_usage_instance_splits_new_solution_idx on rm_o_resource_usage_instance_splits_new s  (cost=0.42..4.76 rows=1 width=69) (actual time=0.032..36.395 rows=65354 loops=1)"
"                    Index Cond: (solution = 14349)"
"              ->  Seq Scan on rm_o_resource_usage r  (cost=0.00..8.15 rows=2 width=45) (actual time=0.018..0.019 rows=5 loops=65354)"
"                    Filter: (schedule_id = 14349)"
"                    Rows Removed by Filter: 332"
"        ->  Index Scan using scheduled_activities_idx on scheduledactivities sa  (cost=0.42..5.06 rows=1 width=16) (actual time=0.007..4.937 rows=54321 loops=65354)"
"              Index Cond: (solution_id = 14349)"
"Planning time: 1.547 ms"
"Execution time: 541104.491 ms"

回答1:

Looking at your query exec plan a lot of time is spent on sorting

    "Sort  (cost=18.01..18.01 rows=1 width=86) (actual time=541075.198..541099.504 rows=65354 loops=1)"
"  Sort Key: r.resource_id, s.start_date"
"  Sort Method: external merge  Disk: 8016kB"

I would recommend you to increase work_mem param for PostgreSQL as described here

If you do a lot of complex sorts, and have a lot of memory, then increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than disk-based equivalents.

This size is applied to each and every sort done by each user, and complex queries can use multiple working memory sort buffers. Set it to 50MB, and have 30 users submitting queries, and you are soon using 1.5GB of real memory. Furthermore, if a query involves doing merge sorts of 8 tables, that requires 8 times work_mem. You need to consider what you set max_connections to in order to size this parameter correctly. This is a setting where data warehouse systems, where users are submitting very large queries, can readily make use of many gigabytes of memory.

log_temp_files can be used to log sorts, hashes, and temp files which can be useful in figuring out if sorts are spilling to disk instead of fitting in memory. You can see sorts spilling to disk using EXPLAIN ANALYZE plans as well. For example, if you see a line like Sort Method: external merge Disk: 7526kB in the output of EXPLAIN ANALYZE, a work_mem of at least 8MB would keep the intermediate data in memory and likely improve the query response time (although it may take substantially more than 8MB to do the sort entirely in memory, as data on disk is stored in a more compact format).

来源：https://stackoverflow.com/questions/56329107/interpreting-results-from-explain-analyze-in-postgres

标签

sql

postgresql

postgresql-10