Optimize BETWEEN date statement

这一生的挚爱 submitted on 2019-11-28 11:40:41

The query executes in less than one second. The other 6+ seconds are spent on traffic between server and client.

Erwin Brandstetter

Proper DDL script

I'm not sure what kind of notation you are using in your question; it's not Postgres syntax. A proper setup could look like this:
SQL Fiddle.

More about this fiddle further down.
Assuming data type timestamp for the column datetime.
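
For orientation, a minimal schema consistent with the queries in this answer might look like the sketch below. Table and column names are taken from the queries; the data types, constraints, and index names are assumptions, since the question does not show them in Postgres syntax.

```sql
-- Hypothetical DDL: types, constraints and index names are assumed, not from the question.
CREATE TABLE one (
  one_id   bigint PRIMARY KEY
, cut_time timestamp NOT NULL   -- assumed to be timestamp, as stated above
, f1       integer              -- type assumed
);
CREATE INDEX one_cut_time_idx ON one (cut_time);

CREATE TABLE two (
  one_id bigint REFERENCES one  -- many rows in two per row in one
, f2     integer                -- type assumed
);
CREATE INDEX two_one_id_idx ON two (one_id);
```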

Incorrect query

BETWEEN is almost always wrong on principle with timestamp columns. There are more details in a related answer.

In your query:

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time BETWEEN '2013-01-01' AND '2013-01-31';

... the string constants '2013-01-01' and '2013-01-31' are coerced to the timestamps '2013-01-01 00:00' and '2013-01-31 00:00'. This excludes most of Jan. 31. The timestamp '2013-01-31 12:00' would not qualify, which is almost certainly not what you want.
If you used '2013-02-01' as the upper bound instead, it would include '2013-02-01 00:00'. Still wrong.
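
The coercion is easy to check in isolation. The literals below are made up for illustration and do not touch your tables; the untyped string constants are resolved to timestamp because the left operand is a timestamp.

```sql
-- Illustrative literals only.
SELECT timestamp '2013-01-31 12:00' BETWEEN '2013-01-01' AND '2013-01-31' AS noon_jan_31      -- false: noon on Jan. 31 is cut off
     , timestamp '2013-02-01 00:00' BETWEEN '2013-01-01' AND '2013-02-01' AS midnight_feb_01; -- true: Feb. 1 00:00 slips in
```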

To get all timestamps of "January 2013" it needs to be:

SELECT o.one_id, date(o.cut_time), o.f1, t.f2 
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time >= '2013-01-01'
AND    o.cut_time <  '2013-02-01';

Exclude the upper border.

Optimize query

@Clodoaldo already mentioned the major drag on performance: it's probably pointless to retrieve 1.7 million rows. Aggregate before you retrieve the result.
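
What to aggregate depends on what you actually need, which the question does not say. As a sketch, assuming a row count per day is enough, the server could return one row per day instead of every joined row:

```sql
-- Sketch: count(*) per day is an assumed aggregate; substitute whatever you really need.
SELECT date(o.cut_time) AS cut_day
     , count(*)         AS row_ct
FROM   one o
JOIN   two t USING (one_id)
WHERE  o.cut_time >= '2013-01-01'
AND    o.cut_time <  '2013-02-01'
GROUP  BY 1
ORDER  BY 1;
```

This returns at most 31 rows for January, so hardly any time is spent on traffic between server and client.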

Since table two is so much bigger, the crucial part is the rows you have to retrieve from it. As long as you retrieve a large part of the table (more than roughly 5%), a plain index on two.one_id will not be used, because it is faster to scan the table sequentially right away.

Your table statistics are outdated, or you have messed with cost constants and other parameters (which you obviously have, see below) to force Postgres into using the index anyway.
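
To rule out both causes, something along these lines should do. The choice of which cost settings to inspect is my assumption; there are more planner parameters than the three listed.

```sql
-- Refresh planner statistics for both tables.
ANALYZE one;
ANALYZE two;

-- When were statistics last gathered, manually or by autovacuum?
SELECT relname, last_analyze, last_autoanalyze
FROM   pg_stat_user_tables
WHERE  relname IN ('one', 'two');

-- Compare current cost settings with their built-in defaults (boot_val).
SELECT name, setting, boot_val
FROM   pg_settings
WHERE  name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost');
```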

The only chance I see for an index on two is a covering index, which requires PostgreSQL 9.2 or later (index-only scans). But you neglected to disclose your version number.

CREATE INDEX two_one_id_f2 ON two (one_id, f2);

This way, Postgres could read from the index directly, if some preconditions are met. Might be a bit faster, not much. Didn't test.
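
The main preconditions are that the query touches only columns stored in the index and that the visibility map marks most pages of the table as all-visible, which VACUUM takes care of. A rough way to check, with a made-up value for one_id:

```sql
-- VACUUM updates the visibility map; ANALYZE refreshes statistics.
VACUUM ANALYZE two;

-- Hypothetical lookup; the plan should mention "Index Only Scan" if the index qualifies.
EXPLAIN
SELECT one_id, f2
FROM   two
WHERE  one_id = 1;
```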

Strange numbers in EXPLAIN output

As to the strange numbers in your EXPLAIN ANALYZE output: this SQL Fiddle should explain them.

Seems like you had these debug settings:

SET enable_seqscan = off;
SET enable_indexscan = off;
SET enable_bitmapscan = off;

All of them should be on, except for debugging. Leaving them off would cripple performance! Check with:

SELECT * FROM pg_settings WHERE name ~~ 'enable%';
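
If any of them turn out to be off for your session, the following puts them back. This assumes they were changed with SET in the session; if they were switched off in postgresql.conf or per role or database, RESET only restores that configured value, and you need to fix it there.

```sql
-- Restore the session defaults for the planner method settings shown above.
RESET enable_seqscan;
RESET enable_indexscan;
RESET enable_bitmapscan;

-- Verify one of them; should report "on" with a default configuration.
SHOW enable_seqscan;
```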