Observing duplicates using sqoop with Oozie

Backend · unresolved · 1 answer · 1029 views

Asked by 情话喂你 on 2021-01-26 03:00

I've built a Sqoop program to import data from MySQL to HDFS using a pre-built Sqoop job:

    sqoop job -fs $driver_path -D mapreduce.map.ja
1 Answer
  •  迷失自我
     2021-01-26 03:30

    Ask yourself a question: where does Sqoop store that "last value" information?

    The answer is: for Sqoop1, by default, in a file on the local filesystem of whichever machine ran the job. But Oozie launches your Sqoop action on an arbitrary worker node each time, so successive executions do not see each other's state.
    And Sqoop2 (which has a proper Metastore database) is more or less in limbo; at the very least, it is not supported by Oozie.
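
    To make that concrete, here is an illustrative sketch (assuming default Sqoop1 settings; the exact file names may vary by version): saved-job state, including the incremental "last value", lives in per-user HSQLDB files in the home directory of the user who ran the job, on that one machine only.

```shell
# With default settings, Sqoop1 keeps saved-job metadata in local,
# per-user HSQLDB files -- so two Oozie worker nodes each have their
# own, independent copy of the "last value":
ls ~/.sqoop/
# typically contains files such as metastore.db.properties
# and metastore.db.script
```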

    The solution is to start a shared HSQLDB database service to store the "last value" information for all Sqoop1 jobs, whatever machine they are running on.
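
    A minimal sketch of that setup (host name and database/table names below are placeholders, not from the original post):

```shell
# On one dedicated host, start the shared HSQLDB-backed metastore
# service; it listens on port 16000 by default:
sqoop metastore &

# From any Oozie worker node, create and run the job against that
# shared metastore via --meta-connect, instead of the default
# per-user local files ("metastore-host" is a placeholder):
sqoop job --meta-connect jdbc:hsqldb:hsql://metastore-host:16000/sqoop \
  --create my_incremental_import \
  -- import --connect jdbc:mysql://db-host/mydb --table orders \
  --incremental append --check-column id

sqoop job --meta-connect jdbc:hsqldb:hsql://metastore-host:16000/sqoop \
  --exec my_incremental_import
```

    Alternatively, setting `sqoop.metastore.client.autoconnect.url` in `sqoop-site.xml` on every node makes the shared metastore the default, so the `--meta-connect` flag can be dropped.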

    Please read the Sqoop1 documentation about its lame Metastore and about how to use it (the original answer linked to the relevant sections, but the links were lost).
    And for a more professional handling of that obsolete HSQLDB database, see an earlier post of mine on the topic (link also lost).
