Question
This is a follow-up to the question at Slow Postgres 9.3 queries.
The new indexes definitely help. But we're seeing that queries are sometimes much slower in practice than when we run EXPLAIN ANALYZE. An example is the following, run on the production database:
explain analyze SELECT * FROM messages WHERE groupid=957 ORDER BY id DESC LIMIT 20 OFFSET 31980;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=127361.90..127441.55 rows=20 width=747) (actual time=152.036..152.143 rows=20 loops=1)
   ->  Index Scan Backward using idx_groupid_id on messages  (cost=0.43..158780.12 rows=39869 width=747) (actual time=0.080..150.484 rows=32000 loops=1)
         Index Cond: (groupid = 957)
Total runtime: 152.186 ms
(4 rows)
With slow query logging turned on, we see instances of this query taking over 2 seconds. We also have log_lock_waits=true, and no lock waits are reported around the same time. What could explain the vast difference in execution times?
Answer 1:
LIMIT x OFFSET y generally performs not much faster than LIMIT x + y. A large OFFSET is always comparatively expensive. The suggested index in the linked question helps, but since you cannot get index-only scans out of it, Postgres still has to check visibility in the heap (the main relation) for at least x + y rows to determine the correct result.
SELECT *
FROM messages
WHERE groupid = 957
ORDER BY id DESC
LIMIT 20
OFFSET 31980;
CLUSTER on your index (groupid, id) would help to increase locality of data in the heap and reduce the number of data pages to be read per query. Definitely a win. But if all groupid values are equally likely to be queried, that's not going to remove the bottleneck of too little RAM for cache. If you have concurrent access, consider pg_repack instead of CLUSTER (a CLUSTER sketch follows the link below):
- Optimize Postgres timestamp query range
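A minimal sketch of the one-time rewrite, using the index name idx_groupid_id from the plan above. CLUSTER takes an ACCESS EXCLUSIVE lock and is not maintained automatically, so it has to be repeated as the table churns:
CLUSTER messages USING idx_groupid_id;  -- rewrites the heap in index order
ANALYZE messages;                       -- refresh statistics after the rewrite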
Do you actually need all columns returned (SELECT *)? A covering index enabling index-only scans might help if you only need a few small columns returned. (autovacuum must be strong enough to cope with writes to the table, though. A read-only table would be ideal.)
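On 9.3 there is no INCLUDE clause (that arrived in Postgres 11), so a "covering" index simply appends the payload columns to the key. A sketch, where subject is a hypothetical stand-in for the few small columns you actually need:
CREATE INDEX idx_messages_covering ON messages (groupid, id, subject);  -- "subject" is hypothetical
-- if enough heap pages are all-visible, this allows an index-only scan for:
SELECT id, subject FROM messages WHERE groupid = 957 ORDER BY id DESC LIMIT 20 OFFSET 31980;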
Also, according to your linked question, your table is 32 GB on disk (typically a bit more in RAM). The index on (groupid, id) adds another 308 MB at least (without any bloat; an actual-size check follows the link below):
SELECT pg_size_pretty(7337880.0 * 44); -- row count * tuple size
- Making sense of Postgres row sizes
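To compare that estimate with the actual on-disk size of the index (which includes any bloat), a quick check:
SELECT pg_size_pretty(pg_relation_size('idx_groupid_id'));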
You have 8 GB of RAM, of which you expect around 4.5 GB to be used for cache (effective_cache_size = 4608MB). That's enough to cache the index for repeated use, but not nearly enough to also cache the whole table.
If your query happens to find data pages in cache, it's fast. Else, not so much. Big difference, even with SSD storage (much more with HDD).
Not directly related to this query, but 8 MB of work_mem (work_mem = 7864kB) seems way too small for your setup. Depending on various other factors, I would set this to at least 64MB (unless you have many concurrent queries with sort/hash operations). Like @Craig commented, EXPLAIN (BUFFERS, ANALYZE) might tell us more.
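A sketch for testing both suggestions in one session before changing postgresql.conf (64MB is the value suggested above, not a measured optimum):
SET work_mem = '64MB';  -- per-session override for testing
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM messages WHERE groupid = 957 ORDER BY id DESC LIMIT 20 OFFSET 31980;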
The best query plan also depends on value frequencies. If only a few rows pass the filter, the result might be empty for certain groupid values and the query is comparatively fast. If a large portion of the table has to be fetched, a plain sequential scan wins. You need valid table statistics (autovacuum again), and possibly a larger statistics target for groupid (a sketch follows the link below):
- Keep PostgreSQL from sometimes choosing a bad query plan
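A sketch of raising the per-column statistics target (1000 is an arbitrary example; the default is 100), followed by ANALYZE so the new target takes effect:
ALTER TABLE messages ALTER COLUMN groupid SET STATISTICS 1000;
ANALYZE messages;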
Answer 2:
Since OFFSET is slow, an alternative is to simulate OFFSET using another column and some index preparation. We require a UNIQUE column (like a PRIMARY KEY) on the table. If there is none, one can be added with:
CREATE SEQUENCE messages_pkey_seq;
ALTER TABLE messages
ADD COLUMN message_id integer DEFAULT nextval('messages_pkey_seq');
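Note that the DEFAULT alone does not enforce uniqueness. Since the approach relies on message_id being unique, it makes sense to also add a constraint (which creates a supporting index); a sketch:
ALTER TABLE messages ADD CONSTRAINT messages_message_id_key UNIQUE (message_id);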
Next we create the position column for the OFFSET simulation:
ALTER TABLE messages ADD COLUMN position INTEGER;

UPDATE messages SET position = q.position
FROM (SELECT message_id,
             row_number() OVER (PARTITION BY groupid ORDER BY id DESC) AS position
      FROM messages) AS q
WHERE q.message_id = messages.message_id;

CREATE INDEX ON messages (groupid, position);
Now we are ready for the new version of the query in the OP. Note that row_number() starts at 1, so OFFSET 31980 corresponds to positions 31981 through 32000:
SELECT * FROM messages WHERE groupid = 957
AND position BETWEEN 31980 + 1 AND 31980 + 20;
Source: https://stackoverflow.com/questions/41092627/slow-postgres-9-3-queries-again