Select partitions based on matches in other table

Having the following table (conversations):

 id | record_id  |  is_response  |         text         |
 ---+------------+---------------+------------


                      
2 Answers

傲寒, 2021-01-23 04:37

Here's my take:

SELECT
    record_id,
    string_agg(text, ' ' ORDER BY id) AS context
FROM (
    SELECT
        *,
        coalesce(sum(incl::integer) OVER (PARTITION BY record_id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),0) AS grp
    FROM (
        SELECT *, is_response AND text IN (SELECT text FROM responses) as incl
        FROM conversations
         ) c
     ) c1
GROUP BY record_id, grp
HAVING bool_or(incl)
ORDER BY max(id);


This scans the conversations table only once, but I am not sure whether it performs better than your solution. The basic idea is to use a window function to count how many preceding rows within the same record end a conversation. We can then group by that count together with record_id and discard incomplete conversations.
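
The counting idea described above can be traced outside the database. The following is only a rough Python sketch, not the query itself: the sample rows, the contents of responses, and the function name complete_conversations are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data; column order: (id, record_id, is_response, text)
conversations = [
    (1, 1, False, "Hi"),
    (2, 1, True, "Goodbye"),    # ends a conversation
    (3, 1, False, "Wait"),
    (4, 1, True, "Sure"),       # a response, but not listed in responses
    (5, 2, False, "Hello"),
    (6, 2, True, "Goodbye"),    # ends a conversation
    (7, 2, False, "One more"),
]
responses = {"Goodbye"}  # assumed content of the responses table


def complete_conversations(rows, endings):
    ends_seen = defaultdict(int)  # record_id -> ends among the preceding rows
    groups = {}                   # (record_id, grp) -> collected texts + flag
    for _id, record_id, is_response, text in sorted(rows):  # ascending id
        incl = is_response and text in endings
        # grp counts only the rows *before* the current one,
        # so an ending row stays in the group it closes.
        key = (record_id, ends_seen[record_id])
        group = groups.setdefault(key, {"texts": [], "has_end": False})
        group["texts"].append(text)
        group["has_end"] = group["has_end"] or incl  # plays the role of bool_or(incl)
        if incl:
            ends_seen[record_id] += 1
    # Keep only groups that contain an ending row (the HAVING step).
    return [(record_id, " ".join(g["texts"]))
            for (record_id, _grp), g in groups.items() if g["has_end"]]


print(complete_conversations(conversations, responses))
# [(1, 'Hi Goodbye'), (2, 'Hello Goodbye')]
```

The unfinished parts ("Wait Sure" and "One more") form groups of their own without an ending row, which is why they drop out.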
                                                                        
                                                        
            
            
              
                
孤独总比滥情好, 2021-01-23 04:57

There is a simple and fast solution:

SELECT record_id, string_agg(text, ' ') AS context
FROM  (
   SELECT c.*, count(r.text) OVER (PARTITION BY c.record_id ORDER BY c.id DESC) AS grp
   FROM   conversations  c
   LEFT   JOIN responses r ON r.text = c.text AND c.is_response
   ORDER  BY record_id, id
   ) sub
WHERE  grp > 0  -- ignore conversation part that does not end with a response
GROUP  BY record_id, grp
ORDER  BY record_id, grp;


count() only counts non-null values. r.text is NULL if the LEFT JOIN to responses comes up empty:


Select rows which are not present in other table
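
To see the NULL behaviour in isolation, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The assumption is that count() and LEFT JOIN behave the same way in both systems for this case. The three sample rows are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE conversations (
        id INTEGER PRIMARY KEY, record_id INTEGER, is_response BOOLEAN, text TEXT);
    CREATE TABLE responses (text TEXT);
    -- Hypothetical rows: only id 2 is a response whose text is listed in responses.
    INSERT INTO conversations VALUES
        (1, 1, 0, 'Hi'),
        (2, 1, 1, 'Goodbye'),
        (3, 1, 0, 'Goodbye');
    INSERT INTO responses VALUES ('Goodbye');
""")

# r.text is NULL (None in Python) wherever the join condition finds no match.
rows = con.execute("""
    SELECT c.id, r.text
    FROM   conversations c
    LEFT   JOIN responses r ON r.text = c.text AND c.is_response
    ORDER  BY c.id
""").fetchall()
print(rows)  # [(1, None), (2, 'Goodbye'), (3, None)]

# count(r.text) skips the NULLs, count(*) counts every row.
matched, total = con.execute("""
    SELECT count(r.text), count(*)
    FROM   conversations c
    LEFT   JOIN responses r ON r.text = c.text AND c.is_response
""").fetchone()
print(matched, total)  # 1 3
```

Row 3 carries the same text as a listed response but is not a response itself, so the join condition fails and it is not counted.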


The value in grp (short for "group") is only increased when a new output row is triggered. All rows belonging to the same output row end up with the same grp number. It's then easy to aggregate in the outer SELECT.

The special trick is to count conversation ends in reverse order. Everything after the last end (coming first when starting from the end) gets grp = 0 and is removed in the outer SELECT.
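
The reverse counting can be followed step by step in a short Python sketch. It only mimics the logic of the window function and the outer SELECT; the sample rows and the helper names reverse_groups and contexts are hypothetical.

```python
from itertools import groupby

# Hypothetical sample data; column order: (id, record_id, is_response, text)
conversations = [
    (1, 1, False, "Hi"),
    (2, 1, True, "Goodbye"),   # end of the first conversation in record 1
    (3, 1, False, "Wait"),
    (4, 1, True, "Goodbye"),   # end of the second conversation in record 1
    (5, 1, False, "Anyone?"),  # trailing part without an end
    (6, 2, False, "Hello"),
    (7, 2, True, "Goodbye"),
]
responses = {"Goodbye"}  # assumed content of the responses table


def reverse_groups(rows, endings):
    """Map each id to its grp: ends counted from the last row of the record backwards."""
    grp_by_id = {}
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))  # by record_id, then id
    for _record_id, record_rows in groupby(ordered, key=lambda r: r[1]):
        ends = 0
        for id_, _, is_response, text in reversed(list(record_rows)):
            if is_response and text in endings:
                ends += 1  # the running count includes the current row
            grp_by_id[id_] = ends
    return grp_by_id


def contexts(rows, endings):
    grp = reverse_groups(rows, endings)
    out = {}
    for id_, record_id, _, text in sorted(rows):  # ascending id keeps the text order
        if grp[id_] > 0:                          # grp = 0: no end follows, drop the row
            out.setdefault((record_id, grp[id_]), []).append(text)
    return [(record_id, " ".join(texts)) for (record_id, _g), texts in sorted(out.items())]


print(reverse_groups(conversations, responses))
# {5: 0, 4: 1, 3: 1, 2: 2, 1: 2, 7: 1, 6: 1}
print(contexts(conversations, responses))
# [(1, 'Wait Goodbye'), (1, 'Hi Goodbye'), (2, 'Hello Goodbye')]
```

Because grp is counted from the end, sorting by record_id and grp lists the latest conversation of each record first.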

Similar cases with more explanation:


Row number with reset in PostgreSQL
Select longest continuous sequence

                                                                        
                                                        
            
            
              
                