Optimize performance for queries on recent rows of a large table

前端未结

关注

 2  564

I have a large table:

CREATE TABLE \"orders\" (
\"id\" serial NOT NULL,
\"person_id\" int4,
\"created\" int4,
CONSTRAINT \"orders_pkey\" PRIMARY KEY (\"id\")


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2021-01-14 19:18
              
            
            
                                                                       


A partial, multicolumn index on (person_id, created) with a pseudo-IMMUTABLE condition would help (a lot). Needs to be recreated from time to time to keep performance up.

Note, if your table is not very big, you can largely simplify and use a plain multicolumn index.

Or consider table partitioning in Postgres 12 or later (where the feature finally matured).

A primitive function provides a constant point in time, 3 or more days back (represented by a unix epoch in your case):

CREATE OR REPLACE FUNCTION f_orders_idx_start()
  RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE COST 1 AS
'SELECT 1387497600';


PARALLEL SAFE only for Postgres 10 or later.

1387497600 being the result of:

SELECT extract(epoch from now())::integer - 259200;
-- 259200 being the result of 60 * 60 * 24 * 3


Base your partial index on this pseudo-IMMUTABLE condition:

CREATE INDEX orders_created_recent_idx ON orders (person_id, created)
WHERE created >= f_orders_idx_start();


Base your query on the same condition:

SELECT *
FROM   orders
WHERE  person_id = 1
AND    created >= f_orders_idx_start()  -- match partial idx condition
AND    created >= extract(epoch from now())::integer - 259200;  -- actual condition


The line AND created >= f_orders_idx_start() seems redundant, but is instrumental to convince Postgres to use the partial index.

A function to recreate function and index from time to time. Possibly with a cron-job every night:

CREATE OR REPLACE FUNCTION f_orders_reindex_partial()
  RETURNS void AS
$func$
DECLARE
   -- 3 days back, starting at 00:00
   _start int := extract(epoch from now()::date -3)::int;
BEGIN       
   IF _start = f_orders_idx_start() THEN
      -- do nothing, nothing changes.
   ELSE
      DROP INDEX IF EXISTS orders_created_recent_idx;
      -- Recreate IMMUTABLE function
      EXECUTE format('
         CREATE OR REPLACE FUNCTION f_orders_idx_start()
           RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE COST 1 AS
         $$SELECT %s $$'
       , _start
      );
      -- Recreate partial index
      CREATE INDEX orders_created_recent_idx ON orders (person_id, created)
      WHERE created >= f_orders_idx_start();
   END IF;    
END
$func$  LANGUAGE plpgsql;


Then, to rebase your index, call (ideally with little or no concurrent load):

SELECT f_orders_reindex_partial();  -- that's all


If you cannot afford dropping and recreating the index due to concurrent load, consider REINDEX CONCURRENTLY in Postgres 12 or later. It's dead simple:

REINDEX INDEX orders_created_recent_idx;


All queries continue to work, even if you never call this function. Performance slowly deteriorates over time with the growing partial index.

I am using this regime successfully with a couple of big tables and similar requirements. Very fast.

For Postgres 9.2 or later, and if your table has only few, small columns, and if the table is not heavily written, it might pay to make that a covering index:

CREATE INDEX orders_created_recent_idx ON orders (person_id, created, id)
WHERE created >= f_orders_idx_start();

In Postgres 11 or later, you might want to use INCLUDE instead:

CREATE INDEX orders_created_recent_idx ON orders (person_id, created) INCLUDE (id)
WHERE created >= f_orders_idx_start();
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一个人的身影        
                
              
                            
                2021-01-14 19:39
              
            
            
                                                                       
Suggesstion:- 

It may help you. 
Since the table size is growing, your query performance  going to be decreased gradually. Better maintain 3-5 days ( If you are very sure about going to access only 2-3 days ) records and periodically migrate the old records to backup table.  
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复