Streaming buffer - Google BigQuery

青春惊慌失措 2021-01-16 21:15

I'm developing a Python program to use as a Google Dataflow template.

What I'm doing is writing data into BigQuery from Pub/Sub:

 pipeline_options         
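
A simplified sketch of this kind of pipeline, with placeholder project, subscription, table and schema names, might look like this:

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

    # Placeholder names -- substitute your own subscription, table and schema.
    SUBSCRIPTION = 'projects/my-project/subscriptions/my-subscription'
    TABLE = 'my-project:my_dataset.my_table'
    SCHEMA = 'event_id:STRING,payload:STRING,event_time:TIMESTAMP'

    options = PipelineOptions()
    options.view_as(StandardOptions).streaming = True  # Pub/Sub is an unbounded source

    with beam.Pipeline(options=options) as p:
        (p
         | 'ReadFromPubSub' >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
         | 'ParseJson' >> beam.Map(lambda msg: json.loads(msg.decode('utf-8')))
         | 'WriteToBigQuery' >> beam.io.WriteToBigQuery(
             TABLE,
             schema=SCHEMA,
             create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
             write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))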


        
2 Answers
  • 2021-01-16 22:00

    In your example you create a Dataflow pipeline that streams data into BigQuery. Streaming means, as you wrote, that the data does not reach its permanent place instantly but only after a while (up to 2 hours); until then it sits in the streaming buffer. There is no difference between the runners in this case, whether you run locally (DirectRunner) or in the cloud (DataflowRunner), because both write directly into BigQuery in the cloud. Using an emulator for local development would be a different story, but as far as I know BigQuery does not have one yet.

    Here is a pretty good article on what this architecture looks like and how streaming into BigQuery works in depth: https://cloud.google.com/blog/products/gcp/life-of-a-bigquery-streaming-insert.

    The reason you could not see your data immediately is probably that the Preview button reads from BigQuery's permanent columnar storage, not from the streaming buffer.

    If you'd like to see the data in the buffer, use a query like:

    SELECT * FROM `project_id.dataset_id.table_id` WHERE _PARTITIONTIME IS NULL

    Querying the buffer is free of charge, by the way.
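
    If you want to check programmatically whether a table still has rows in the streaming buffer, the table metadata exposes the buffer statistics. Here is a small sketch using the google-cloud-bigquery client (the table name below is a placeholder):

     from google.cloud import bigquery

     client = bigquery.Client()
     # Placeholder table reference -- replace with your own project, dataset and table.
     table = client.get_table('my-project.my_dataset.my_table')

     buf = table.streaming_buffer  # None once everything has been flushed to permanent storage
     if buf is not None:
         print('Estimated rows still in the buffer:', buf.estimated_rows)
         print('Oldest buffered entry:', buf.oldest_entry_time)
     else:
         print('No data in the streaming buffer.')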

    I hope this helps clear things up a bit.

  • 2021-01-16 22:17

    This was the problem:

     beam.io.Write(beam.io.BigQuerySink
    

    It should be:

     beam.io.WriteToBigQuery
    

    The first worked well while I was reading from a file; the second works when reading from Pub/Sub.
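
    The difference is that BigQuerySink only handles bounded (batch) inputs such as files, while WriteToBigQuery also supports streaming writes from an unbounded source like Pub/Sub. A minimal sketch of the streaming variant (placeholder subscription, table and schema, not the original code):

     import apache_beam as beam
     from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

     options = PipelineOptions()
     options.view_as(StandardOptions).streaming = True  # required for the unbounded Pub/Sub source

     with beam.Pipeline(options=options) as p:
         rows = (p
                 | beam.io.ReadFromPubSub(subscription='projects/my-project/subscriptions/my-sub')
                 | beam.Map(lambda m: {'payload': m.decode('utf-8')}))

         # Batch-only sink -- works for bounded sources such as text files:
         # rows | beam.io.Write(beam.io.BigQuerySink('my-project:my_dataset.my_table',
         #                                           schema='payload:STRING'))

         # Streaming-capable transform -- use this with Pub/Sub:
         rows | beam.io.WriteToBigQuery(
             'my-project:my_dataset.my_table',
             schema='payload:STRING',
             write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)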
