Indexed ORDER BY with LIMIT 1

余生颓废 提交于 2019-11-28 07:58:03

问题


I'm trying to fetch most recent row in a table. I have a simple timestamp created_at which is indexed. When I query ORDER BY created_at DESC LIMIT 1, it takes far more than I think it should (about 50ms on my machine on 36k rows).

EXPLAIN-ing claims that it uses backwards index scan, but I confirmed that changing the index to be (created_at DESC) does not change the cost in query planner for a simple index scan.

How can I optimize this use case?

Running postgresql 9.2.4.

Edit:

# EXPLAIN SELECT * FROM articles ORDER BY created_at DESC LIMIT 1;
                                                  QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.00..0.58 rows=1 width=1752)
   ->  Index Scan Backward using index_articles_on_created_at on articles  (cost=0.00..20667.37 rows=35696 width=1752)
(2 rows)

回答1:


Assuming we are dealing with a big table, a partial index might help:

CREATE INDEX tbl_created_recently_idx ON tbl (created_at DESC)
WHERE created_at > '2013-09-15 0:0'::timestamp;

As you already found out: descending or ascending hardly matters here. Postgres can scan backwards at almost the same speed (exceptions apply with multi-column indices).

Query to use this index:

SELECT * FROM tbl
WHERE  created_at > '2013-09-15 0:0'::timestamp -- matches index
ORDER  BY created_at DESC
LIMIT  1;

The point here is to make the index much smaller, so it should be easier to cache and maintain.

  1. You need to pick a timestamp that is guaranteed to be smaller than the most recent one.
  2. You should recreate the index from time to time to cut off old data.
  3. The condition needs to be IMMUTABLE.

So the one-time effect deteriorates over time. The specific problem is the hard coded condition:

WHERE created_at > '2013-09-15 0:0'::timestamp

Automate

You could update the index and your queries manually from time to time. Or you automate it with the help of a function like this one:

CREATE OR REPLACE FUNCTION f_min_ts()
  RETURNS timestamp LANGUAGE sql IMMUTABLE AS
$$SELECT '2013-09-15 0:0'::timestamp$$

Index:

CREATE INDEX tbl_created_recently_idx ON tbl (created_at DESC);
WHERE created_at > f_min_ts();

Query:

SELECT * FROM tbl
WHERE  created_at > f_min_ts()
ORDER  BY created_at DESC
LIMIT  1;

Automate recreation with a cron job or some trigger-based event. Your queries can stay the same now. But you need to recreate all indices using this function in any way after changing it. Just drop and create each one.

First ..

... test whether you are actually hitting the bottle neck with this.

Try whether a simple DROP index ... ; CREATE index ... does the job. Then your index might have been bloated. Your autovacuum settings may be off.

Or try VACUUM FULL ANALYZE to get your whole table plus indices in pristine condition and check again.

Other options include the usual general performance tuning and covering indexes, depending on what you actually retrieve from the table.



来源:https://stackoverflow.com/questions/18833832/indexed-order-by-with-limit-1

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!