How to unnest and pivot two columns in BigQuery


Say I have a BigQuery table containing the following information:

| id    | test.name     | test.score    |
|----   |-----------    |------------   |
| 1     | a


                      
3 answers

梦谈多话, 2021-01-27 08:18

One option is conditional aggregation:
select id,
       max(case when t.name = 'a' then t.score end) as a,
       max(case when t.name = 'b' then t.score end) as b,
       max(case when t.name = 'c' then t.score end) as c
from
(
-- flatten the repeated field: one row per (id, test element)
select a.id, t
from `table` as a,
unnest(a.test) as t
) as x
group by id
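
To try the conditional-aggregation idea without a real table, here is a minimal self-contained sketch. The `sample` CTE is an assumption: the question's table is cut off, so the rows are reconstructed from the output shown in a later answer. The sketch also drops the derived table and groups directly over the unnested rows, which should be equivalent here.

```sql
-- Hypothetical sample rows (an assumption; the question's table is truncated).
WITH sample AS (
  SELECT 1 AS id, [STRUCT('a' AS name, 5 AS score), STRUCT('b' AS name, 7 AS score)] AS test
  UNION ALL
  SELECT 2 AS id, [STRUCT('a' AS name, 8 AS score), STRUCT('c' AS name, 3 AS score)] AS test
)
SELECT s.id,
       -- CASE without ELSE yields NULL, and MAX ignores NULLs
       MAX(CASE WHEN t.name = 'a' THEN t.score END) AS a,
       MAX(CASE WHEN t.name = 'b' THEN t.score END) AS b,
       MAX(CASE WHEN t.name = 'c' THEN t.score END) AS c
FROM sample AS s,
     UNNEST(s.test) AS t   -- one row per (id, test element)
GROUP BY s.id
ORDER BY s.id
```

With these assumed rows, id 1 gets a = 5, b = 7 and a NULL c, while id 2 gets a = 8, a NULL b and c = 3.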

                                                                        
                                                        
            
            
              
                
心在旅途, 2021-01-27 08:18

Below is a generic, dynamic way to handle your case:
EXECUTE IMMEDIATE (
  -- Build one MAX(IF(...)) column per distinct name, then run the resulting query.
  -- ORDER BY inside STRING_AGG makes the column order deterministic.
  SELECT """
  SELECT id, """ || 
    STRING_AGG("""MAX(IF(name = '""" || name || """', score, NULL)) AS """ || name, ', ' ORDER BY name) 
  || """
  FROM `project.dataset.table` t, t.test
  GROUP BY id
  """
  FROM (
    SELECT DISTINCT name
    FROM `project.dataset.table` t, t.test
  )
);

Applied to the sample data from your question, the output is:
Row     id      a       b       c
1       1       5       7       null
2       2       8       null    3
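
It can help to see what the dynamic statement expands to. The sketch below writes out the query that the concatenated string should produce, assuming the distinct names are a, b, and c in that order. The `sample` CTE is a hypothetical stand-in for `project.dataset.table`, with rows reconstructed from the output above.

```sql
-- Hypothetical sample rows, reconstructed from the output table above.
WITH sample AS (
  SELECT 1 AS id, [STRUCT('a' AS name, 5 AS score), STRUCT('b' AS name, 7 AS score)] AS test
  UNION ALL
  SELECT 2 AS id, [STRUCT('a' AS name, 8 AS score), STRUCT('c' AS name, 3 AS score)] AS test
)
-- Expanded form of the generated string: one MAX(IF(...)) column per name.
SELECT id,
       MAX(IF(name = 'a', score, NULL)) AS a,
       MAX(IF(name = 'b', score, NULL)) AS b,
       MAX(IF(name = 'c', score, NULL)) AS c
FROM sample t, t.test   -- the comma join on t.test unnests the array
GROUP BY id
ORDER BY id
```

Because each name is spliced into the statement both as a string literal and as a column alias, this approach assumes the names are valid column identifiers and contain no quotes.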

                                                                        
                                                        
            
            
              
                
傲寒, 2021-01-27 08:22

Conditional aggregation is a good approach.  If your tables are large, you might find that this has the best performance:
select t.id,
       -- each subquery scans only the current row's test array
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'a') as a,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'b') as b,
       (select max(tt.score) from unnest(t.test) tt where tt.name = 'c') as c
from `table` t;

I recommend this because it avoids the outer aggregation. The unnest() happens within each row, without shuffling the data around, and I have found that this is a big win in terms of performance.
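
A minimal self-contained sketch of the per-row subquery approach follows. The `sample` CTE is an assumption, with rows reconstructed from the output shown in the earlier answer; the performance claim is the answerer's and is not measured here.

```sql
-- Hypothetical sample rows (an assumption; the question's table is truncated).
WITH sample AS (
  SELECT 1 AS id, [STRUCT('a' AS name, 5 AS score), STRUCT('b' AS name, 7 AS score)] AS test
  UNION ALL
  SELECT 2 AS id, [STRUCT('a' AS name, 8 AS score), STRUCT('c' AS name, 3 AS score)] AS test
)
SELECT s.id,
       -- Each scalar subquery reads only the current row's array,
       -- so there is no GROUP BY across rows.
       (SELECT MAX(t.score) FROM UNNEST(s.test) AS t WHERE t.name = 'a') AS a,
       (SELECT MAX(t.score) FROM UNNEST(s.test) AS t WHERE t.name = 'b') AS b,
       (SELECT MAX(t.score) FROM UNNEST(s.test) AS t WHERE t.name = 'c') AS c
FROM sample AS s
ORDER BY s.id
```

Unlike the grouped version, this returns exactly one output row per input row, so two input rows sharing an id would stay separate rather than being merged.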
                                                                        
                                                        
            
            
              
                