Why does PostgreSQL query performance drop over time, but get restored when rebuilding the index?


According to this page in the manual, indexes don't need to be maintained. However, we are running with a PostgreSQL table that has a continuous rate of upd


                      
5 answers

無奈伤痛, 2021-01-31 19:22

That's a textbook case. You should set up autovacuum to be a lot more aggressive.
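
To see what "more aggressive" could mean in numbers, here is a rough sketch of the trigger rule described in the PostgreSQL documentation: autovacuum vacuums a table once its dead rows exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of rows`. The defaults (50 and 0.2) come from the documentation; the table size and the lower scale factor are made-up values for illustration, not a recommendation for this particular table.

```python
# Defaults from the PostgreSQL documentation; the table size below is made up.
DEFAULT_THRESHOLD = 50        # autovacuum_vacuum_threshold
DEFAULT_SCALE_FACTOR = 0.2    # autovacuum_vacuum_scale_factor


def dead_rows_before_autovacuum(live_rows, threshold=DEFAULT_THRESHOLD,
                                scale_factor=DEFAULT_SCALE_FACTOR):
    """Dead rows that must accumulate before autovacuum vacuums the table."""
    return threshold + scale_factor * live_rows


live_rows = 10_000_000  # hypothetical table size

# With the defaults, roughly a fifth of the table must be dead first.
print(dead_rows_before_autovacuum(live_rows))

# A smaller scale factor (assumed value) makes autovacuum kick in far earlier.
print(dead_rows_before_autovacuum(live_rows, scale_factor=0.01))
```

On a heavily updated table the scale factor can be lowered for that table only, with `ALTER TABLE ... SET (autovacuum_vacuum_scale_factor = ...)`, rather than changing it for the whole server.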
                                                                        
                                                        
            
            
              
                
遥遥无期, 2021-01-31 19:37

This smells like index bloat to me. I'll refer you to this page

http://www.postgresql.org/docs/8.3/static/routine-reindex.html

which says at the bottom:


  Also, for B-tree indexes a
  freshly-constructed index is somewhat
  faster to access than one that has
  been updated many times, because
  logically adjacent pages are usually
  also physically adjacent in a newly
  built index. (This consideration does
  not currently apply to non-B-tree
  indexes.) It might be worthwhile to
  reindex periodically just to improve
  access speed.


Which does seem to conflict with the page you referenced saying that indexes "don't require maintenance or tuning".

Have you tried CREATE INDEX CONCURRENTLY?
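
One common way to use CREATE INDEX CONCURRENTLY for a rebuild is to build a fresh copy of the index next to the old one and then swap them. The sketch below only generates the statements; the table, index, and column order in the usage lines are hypothetical, since the original schema is not shown, and the helper does no identifier quoting, so it is not meant for untrusted input.

```python
def rebuild_index_statements(table, index, columns):
    """Return SQL statements that swap in a freshly built copy of an index."""
    new_index = index + "_new"
    column_list = ", ".join(columns)
    return [
        # Builds without blocking writes; cannot run inside a transaction block.
        f"CREATE INDEX CONCURRENTLY {new_index} ON {table} ({column_list})",
        # Dropping the old index is quick, but it does take a lock.
        f"DROP INDEX {index}",
        f"ALTER INDEX {new_index} RENAME TO {index}",
    ]


# Hypothetical names, for illustration only.
for statement in rebuild_index_statements(
        "scheduled_jobs", "scheduled_jobs_idx",
        ["campaignfqname", "currentstate", "xmlscheduledtime"]):
    print(statement + ";")
```

This applies to plain indexes; an index that backs a primary key or unique constraint needs a different procedure.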
                                                                        
                                                        
            
            
              
                
暗喜, 2021-01-31 19:41

As for performance, storing time and status information as strings is a real bottleneck. First of all, indexes on text columns are much less efficient than indexes on native types: comparing two times on the same day needs at least 11 character comparisons (in the format you used), whereas with a time type it is reduced to a single comparison. This also affects the size of the index; a large index is slower to search, and the database is less likely to keep it in memory. The same considerations apply to the state column. If it represents a small set of states, you should use integers mapped to states, which will shrink the index entries and the index size accordingly. Furthermore, this index will be useless, even with these built-in types, if you don't specify the actual time in your query.
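
A small sketch of the point about text versus native types, using Python values as a stand-in for database columns. The two timestamps and the state names are assumptions for illustration; only the timestamp format comes from the question.

```python
from datetime import datetime
from enum import IntEnum


def shared_prefix_length(a, b):
    """Number of leading characters two strings have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


class State(IntEnum):
    # Hypothetical state names; the original schema's states are not shown.
    QUEUED = 1
    RUNNING = 2
    DONE = 3


early = "2010-05-20T13:00:00.000"
late = "2010-05-20T14:30:00.000"

# As text, a comparison has to walk past the shared date prefix first.
print(shared_prefix_length(early, late))

# Parsed into native timestamps, the ordering is decided on the values.
fmt = "%Y-%m-%dT%H:%M:%S.%f"
print(datetime.strptime(early, fmt) < datetime.strptime(late, fmt))

# A small integer code stands in for a status string several bytes long.
print(int(State.QUEUED), len("QUEUED"))
```

In PostgreSQL itself a `timestamp` is a fixed 8-byte value, so both the comparison cost and the index entry size stay constant regardless of how the value is formatted for display.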
                                                                        
                                                        
            
            
              
                
借酒劲吻你, 2021-01-31 19:46

Is the '2010-05-20T13:00:00.000' value that xmlscheduledtime is being compared to, part of the SQL, or supplied as a parameter?

When planning how to run the query, saying that a field must be less than a supplied parameter with an as yet unknown value doesn't give PostgreSQL much to go on. It doesn't know whether that'll match nearly all the rows, or hardly any of the rows.

Reading about how the planner uses statistics helps tremendously when trying to figure out why your database is using the plans it is.

You might get better SELECT performance by changing the order of the fields in that multi-column index, or by creating a new index with the fields ordered (campaignfqname, currentstate, xmlscheduledtime). The index then takes you straight to the campaign FQ name and current state you are interested in, and every row in the index scan over the xmlscheduledtime range is a row you're after.
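
The effect of column order can be illustrated with a toy model in which a sorted list of tuples plays the role of a B-tree index. This is only a sketch: the rows are invented, and a real index scan is more involved, but it shows why putting the equality columns first narrows the range that has to be visited.

```python
from bisect import bisect_left

# Invented rows: (campaignfqname, currentstate, xmlscheduledtime)
rows = [
    ("camp_a", "QUEUED", "2010-05-20T09:00"),
    ("camp_a", "QUEUED", "2010-05-20T12:00"),
    ("camp_a", "DONE",   "2010-05-20T10:00"),
    ("camp_b", "QUEUED", "2010-05-20T08:00"),
    ("camp_b", "QUEUED", "2010-05-20T11:00"),
    ("camp_b", "DONE",   "2010-05-20T07:00"),
]


def scan_equality_first(rows, campaign, state, cutoff):
    """Index ordered (campaign, state, time): matches form one contiguous range."""
    index = sorted(rows)
    lo = bisect_left(index, (campaign, state, ""))
    hi = bisect_left(index, (campaign, state, cutoff))
    visited = index[lo:hi]
    return visited, len(visited)


def scan_time_first(rows, campaign, state, cutoff):
    """Index ordered (time, campaign, state): every earlier row must be filtered."""
    index = sorted((t, c, s) for c, s, t in rows)
    hi = bisect_left(index, (cutoff, "", ""))
    visited = index[:hi]
    matches = [(c, s, t) for t, c, s in visited if c == campaign and s == state]
    return matches, len(visited)


query = ("camp_a", "QUEUED", "2010-05-20T13:00")
print(scan_equality_first(rows, *query))
print(scan_time_first(rows, *query))
```

Both scans return the same matching rows, but the time-first ordering visits every entry before the cutoff, including other campaigns and states, before filtering them out.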
                                                                        
                                                        
            
            
              
                
你的背包, 2021-01-31 19:48

Autovacuum should do the trick, provided you have configured it for your desired performance.

Notes:
VACUUM FULL: this rewrites the whole table and reclaims the most disk space, returning it to the operating system. It locks the whole table while it runs.

VACUUM: this marks the space held by dead rows as reusable, but usually does not shrink the files on disk. It can be run in parallel with a production system, but generates a lot of I/O, which can impact performance.

ANALYZE: this rebuilds the query planner statistics. A plain VACUUM does not run it; use VACUUM ANALYZE to do both, or run ANALYZE on its own.

More detailed notes found here
                                                                        
                                                        
            
            
              
                