In the Hadoop API documentation it's given that:
setJarByClass
public void setJarByClass(Class<?> cls)
Set the Jar by finding where a given class came from.
job.setJarByClass(WordCount.class);
It helps to identify the Jar which contains the Mapper and Reducer by specifying a class in that Jar.
Please note that the above method on the Job class is called in the driver. Your driver is invoked from a client, typically your desktop or an edge machine which is not part of the cluster, and your classes (in jar files) sit on that machine. For your MapReduce job to run on the cluster, you need to send your Mapper, Reducer, and any other required classes to the cluster from your client machine. Your driver class takes care of sending the jar file containing the required classes to the cluster. You need to specify which jar to send, because the driver does not know which one should be sent amongst the heap of jar files on your driver's classpath. This is done by using the method setJarByClass, or setJar, or another variant of these methods on the Job class.
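For context, here is a minimal driver sketch modeled on the classic WordCount example from the Hadoop tutorial (class names and paths are illustrative); note where setJarByClass is called:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    // Tell the framework which jar to ship to the cluster: the jar
    // that contains WordCount (and here, its nested Mapper/Reducer).
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}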
Obviously, if you don't specify this (meaning you don't call this method, or you comment it out), you will get a ClassNotFoundException on the slave nodes.
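As a quick illustration of the setJar variant mentioned above (the jar path here is made up):

// Variant: name the jar explicitly rather than deriving it from a class.
// The path below is illustrative only.
job.setJar("/path/to/wordcount.jar");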
Hope this clarifies!
This method sets the jar file in which each node will look for the Mapper and Reducer classes.
It does not create a jar from the given class. Rather, it identifies the jar containing the given class. And yes, that jar file is "executed" (really the Mapper and Reducer in that jar file are executed) for the MapReduce job.
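To make the "identifies the jar" part concrete, here is a simplified sketch in plain Java of how the jar containing a class can be located via a classloader resource lookup. This is an approximation for illustration, not Hadoop's actual implementation:

import java.net.URL;
import java.net.URLDecoder;

public class FindJar {

  // Return the path of the jar a class was loaded from, or null if it
  // was loaded from a plain directory (or by the bootstrap loader).
  static String findContainingJar(Class<?> cls) throws Exception {
    ClassLoader loader = cls.getClassLoader();
    if (loader == null) {
      return null; // bootstrap classes such as java.lang.String
    }
    String resource = cls.getName().replace('.', '/') + ".class";
    URL url = loader.getResource(resource);
    if (url == null || !"jar".equals(url.getProtocol())) {
      return null; // not loaded from a jar
    }
    // A jar URL looks like: jar:file:/path/to/app.jar!/pkg/Name.class
    String path = url.getPath();
    path = path.substring(0, path.indexOf('!')); // drop the entry part
    if (path.startsWith("file:")) {
      path = path.substring("file:".length());
    }
    return URLDecoder.decode(path, "UTF-8");
  }

  public static void main(String[] args) throws Exception {
    // Prints the jar path if this class itself was run from a jar,
    // or null if it was loaded from an unpacked classes directory.
    System.out.println(findContainingJar(FindJar.class));
  }
}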
(Also see Stanley Xu's answer to a similar question about why this method is needed even though you already give the jar on the command line.)