postgresql-performance

Can PostgreSQL array be optimized for join?

ⅰ亾dé卋堺 submitted on 2019-12-02 07:34:37
I see that a Postgres array performs well when the array's elements are the data themselves, e.g., tags: http://shon.github.io/2015/12/21/postgres_array_performance.html What if I instead use an array to store integer foreign keys? Setting aside the lack of foreign-key constraints on array elements, is it advisable to store foreign keys in an integer array? The app is optimized for reporting and analytics, so it will end up joining the array back to the referenced table most of the time, e.g., to show the label/title/name behind each foreign key. Is an array still an acceptable way to store foreign keys? Would the performance be …
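
A minimal sketch of the join pattern the question describes, assuming a hypothetical posts table whose int[] column tag_ids holds keys into tags(id) (all names are illustrative, not from the question):

-- Hypothetical schema: tag_ids stores integer foreign keys as an array.
CREATE TABLE tags  (id int PRIMARY KEY, name text NOT NULL);
CREATE TABLE posts (id int PRIMARY KEY, title text, tag_ids int[]);

-- Showing the name behind each key requires unnesting the array first;
-- each unnested element can then join against tags via its primary key.
SELECT p.id, p.title, t.name
FROM   posts p
CROSS  JOIN LATERAL unnest(p.tag_ids) AS u(tag_id)
JOIN   tags t ON t.id = u.tag_id;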

Slow LEFT JOIN on CTE with time intervals

∥☆過路亽.° submitted on 2019-12-02 01:07:31
I am trying to debug a query in PostgreSQL that I've built to bucket market data into time buckets of arbitrary intervals. Here is my table definition: CREATE TABLE historical_ohlcv ( exchange_symbol TEXT NOT NULL, symbol_id TEXT NOT NULL, kafka_key TEXT NOT NULL, open NUMERIC, high NUMERIC, low NUMERIC, close NUMERIC, volume NUMERIC, time_open TIMESTAMP WITH TIME ZONE NOT NULL, time_close TIMESTAMP WITH TIME ZONE, CONSTRAINT historical_ohlcv_pkey PRIMARY KEY (exchange_symbol, symbol_id, time_open) ); CREATE INDEX symbol_id_idx ON historical_ohlcv (symbol_id); CREATE INDEX open_close …
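
A common way to avoid a slow LEFT JOIN of every row against a CTE of generated intervals is to compute each row's bucket arithmetically and group on it; a sketch against the historical_ohlcv table above, with the 5-minute bucket width and one-day range chosen purely for illustration:

-- Assign each row to a 5-minute bucket with epoch arithmetic, then aggregate.
SELECT symbol_id,
       to_timestamp(floor(extract(epoch FROM time_open) / 300) * 300) AS bucket,
       min(low)    AS low,
       max(high)   AS high,
       sum(volume) AS volume
FROM   historical_ohlcv
WHERE  time_open >= now() - interval '1 day'   -- illustrative range
GROUP  BY symbol_id, bucket
ORDER  BY symbol_id, bucket;

Buckets with no rows simply don't appear; if gap-filling is required, a LEFT JOIN from generate_series() onto this aggregate is far cheaper than joining the raw rows interval by interval.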

Why does a slight change in the search term slow down the query so much?

百般思念 submitted on 2019-12-01 19:11:41
Question: I have the following query in PostgreSQL (9.5.1): select e.id, (select count(id) from imgitem ii where ii.tabid = e.id and ii.tab = 'esp') as imgs, e.ano, e.mes, e.dia, cast(cast(e.ano as varchar(4))||'-'||right('0'||cast(e.mes as varchar(2)),2)||'-'|| right('0'||cast(e.dia as varchar(2)),2) as varchar(10)) as data, pl.pltag, e.inpa, e.det, d.ano anodet, coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')' determinador, d.tax, coalesce(v.val,v.valf)||' '||vu.unit as altura, coalesce(v1.val …
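
The per-row correlated count in this query is a typical cost driver; a hedged sketch of an index that usually serves it, assuming the column types implied by the excerpt:

-- Lets the subquery (count of imgitem rows per tab/tabid pair) be answered
-- by an index scan instead of scanning imgitem for every outer row.
CREATE INDEX imgitem_tab_tabid_idx ON imgitem (tab, tabid);

Separately, the cast-and-concatenate expression building "data" can be written as make_date(e.ano, e.mes, e.dia) on Postgres 9.4+, which is clearer and less error-prone.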

How to delete many rows from frequently accessed table

喜欢而已 submitted on 2019-12-01 17:56:46
I need to delete the majority (say, 90%) of a very large table (say, 5M rows). The other 10% of this table is frequently read, but not written to. From "Best way to delete millions of rows by ID", I gather that I should drop any index on the 90% I'm deleting to speed up the process (except the index I'm using to select the rows for deletion). From "PostgreSQL locking mode", I see that this operation will acquire a ROW EXCLUSIVE lock on the entire table. But since I'm only reading the other 10%, this ought not to matter. So, is it safe to delete everything in one command (i.e. DELETE FROM …
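
A pattern often recommended here is to rebuild rather than delete in place: copy the 10% of rows to keep into a new table and swap names. A sketch using illustrative identifiers and an illustrative keep-condition (none of these names are from the question):

-- Rebuild: write the surviving 10% to a fresh table, then swap.
BEGIN;
CREATE TABLE big_table_new (LIKE big_table INCLUDING ALL);
INSERT INTO big_table_new SELECT * FROM big_table WHERE keep;
ALTER TABLE big_table     RENAME TO big_table_old;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;
DROP TABLE big_table_old;  -- once readers are confirmed on the new table

The renames take a brief ACCESS EXCLUSIVE lock, but the expensive copying happens without blocking readers of the original table, and the result carries no dead tuples to vacuum.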

Optimize performance for queries on recent rows of a large table

时间秒杀一切 submitted on 2019-12-01 09:02:34
I have a large table: CREATE TABLE "orders" ( "id" serial NOT NULL, "person_id" int4, "created" int4, CONSTRAINT "orders_pkey" PRIMARY KEY ("id") ); 90% of all requests are for orders from the last 2-3 days by person_id, like: select * from orders where person_id = 1 and created >= extract(epoch from current_timestamp)::int - 60 * 60 * 24 * 3; How can I improve performance? I know about partitioning, but what about existing rows? And it looks like I would need to create INHERITS tables manually every 2-3 days. A partial, multicolumn index on (person_id, created) with a pseudo-IMMUTABLE …
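
A sketch of the partial, multicolumn index with a pseudo-IMMUTABLE function that the excerpt breaks off at; the cutoff constant and names are illustrative:

-- The function is declared IMMUTABLE (a deliberate lie) so it can appear in an
-- index predicate; replace the constant and recreate the index periodically.
CREATE OR REPLACE FUNCTION f_orders_cutoff()
  RETURNS int LANGUAGE sql IMMUTABLE AS
$$SELECT 1500000000$$;  -- illustrative epoch cutoff

CREATE INDEX orders_recent_idx ON orders (person_id, created)
WHERE created >= f_orders_cutoff();

-- Queries must repeat the predicate so the planner can match the partial index:
SELECT *
FROM   orders
WHERE  person_id = 1
AND    created >= f_orders_cutoff()
AND    created >= extract(epoch FROM current_timestamp)::int - 60 * 60 * 24 * 3;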

Spatial query on large table with multiple self joins performing slow

心不动则不痛 submitted on 2019-12-01 06:59:39
I am working on queries on a large table in Postgres 9.3.9. It is a spatial dataset and it is spatially indexed. Say I need to find three types of objects: A, B and C. The criterion is that B and C are both within a certain distance of A, say 500 meters. My query is like this: select school.osm_id as school_osm_id, school.name as school_name, school.way as school_way, restaurant.osm_id as restaurant_osm_id, restaurant.name as restaurant_name, restaurant.way as restaurant_way, bar.osm_id as bar_osm_id, bar.name as bar_name, bar.way as bar_way from ( select osm_id, name, amenity, way, way_geo …
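
Queries like this usually hinge on whether the distance test is index-assisted; a minimal sketch using ST_DWithin, which can be driven by the spatial GiST index where ST_Distance(...) < 500 cannot (the table name is illustrative; 500 is in the units of the layer's SRID, i.e. meters for a metric projection):

-- Each ST_DWithin pair can use the spatial index.
SELECT s.osm_id AS school_osm_id,
       r.osm_id AS restaurant_osm_id,
       b.osm_id AS bar_osm_id
FROM   planet_osm_point s              -- illustrative table name
JOIN   planet_osm_point r ON ST_DWithin(s.way, r.way, 500)
JOIN   planet_osm_point b ON ST_DWithin(s.way, b.way, 500)
WHERE  s.amenity = 'school'
AND    r.amenity = 'restaurant'
AND    b.amenity = 'bar';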

Reuse computed select value

旧街凉风 submitted on 2019-12-01 06:50:02
I'm trying to use ST_SnapToGrid and then GROUP BY the grid cells (x, y). Here is what I did first: SELECT COUNT(*) AS n, ST_X(ST_SnapToGrid(geom, 50)) AS x, ST_Y(ST_SnapToGrid(geom, 50)) AS y FROM points GROUP BY x, y I don't want to recompute ST_SnapToGrid for both x and y, so I changed it to use a sub-query: SELECT COUNT(*) AS n, ST_X(geom) AS x, ST_Y(geom) AS y FROM ( SELECT ST_SnapToGrid(geom, 50) AS geom FROM points ) AS tmp GROUP BY x, y But when I run EXPLAIN, both of these queries have exactly the same execution plan: GroupAggregate (...) -> Sort (...) Sort Key: (st_x(st_snaptogrid …
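
The identical plans suggest the planner inlined the sub-query, folding the snap back into both expressions; one way to make the reuse explicit is a LATERAL join that computes the snapped geometry once per row, sketched against the points table from the question:

-- LATERAL computes ST_SnapToGrid once per row; ST_X/ST_Y then read from it.
SELECT count(*)     AS n,
       ST_X(g.geom) AS x,
       ST_Y(g.geom) AS y
FROM   points p
CROSS  JOIN LATERAL (SELECT ST_SnapToGrid(p.geom, 50) AS geom) AS g
GROUP  BY g.geom;

Whether this actually saves measurable work depends on the planner; ST_SnapToGrid is cheap, so the duplicated call is rarely the real bottleneck.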
