Hadoop gen1 vs Hadoop gen2


I am a bit confused about the place of the tasktracker in Hadoop-2.x.

Daemons in Hadoop-1.x are namenode, datanode, jobtracker, tasktracker and secondaryna


                      


      
      
        
9 Answers

        
                         				            
            
           
            
                              
                
              
              
                
南旧, 2021-02-10 10:15
              
            
            
                                                                       
namenode, datanode, resourcemanager, applicationmaster

You missed another daemon in Hadoop-2.x from the above list: the NodeManager. This daemon runs on the individual nodes, like the tasktracker did. On startup, it registers with the RM and sends information about the resources available on its node. Subsequent NM-RM communication provides updates on container statuses: new containers running on the node, completed containers, etc.
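To make the registration and status-update exchange concrete, here is a rough toy model in Python. It is only a sketch: `ToyResourceManager`, `NodeReport`, `register_node`, `heartbeat` and the state strings are names made up for illustration, not Hadoop classes or YARN protocol messages.

```python
from dataclasses import dataclass, field


@dataclass
class NodeReport:
    """What the toy RM remembers about one NodeManager (hypothetical)."""
    memory_mb: int
    vcores: int
    running: set = field(default_factory=set)
    completed: set = field(default_factory=set)


class ToyResourceManager:
    """Stand-in for the RM's node-tracking role; not a Hadoop class."""

    def __init__(self):
        self.nodes = {}

    def register_node(self, node_id, memory_mb, vcores):
        # On startup a NodeManager announces itself and its capacity.
        self.nodes[node_id] = NodeReport(memory_mb, vcores)

    def heartbeat(self, node_id, container_states):
        # Later updates carry the status of containers on that node.
        report = self.nodes[node_id]
        for container_id, state in container_states.items():
            if state == "RUNNING":
                report.running.add(container_id)
            elif state == "COMPLETE":
                report.running.discard(container_id)
                report.completed.add(container_id)


rm = ToyResourceManager()
rm.register_node("node1", memory_mb=8192, vcores=4)
rm.heartbeat("node1", {"c1": "RUNNING", "c2": "RUNNING"})
rm.heartbeat("node1", {"c1": "COMPLETE"})
print(sorted(rm.nodes["node1"].running))    # ['c2']
print(sorted(rm.nodes["node1"].completed))  # ['c1']
```

The point of the sketch is only that the RM's picture of the cluster is built from what each NodeManager reports about itself.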

So here is what happens. The RM allocates resources to the job. A container on one of the allocated nodes runs the ApplicationMaster, which communicates with the other nodes. In simple terms, you can think of the ApplicationMaster as the jobtracker for that one job and of the other nodes as tasktracker nodes. The RM is then free to serve other users and more jobs. That is the beauty of MR v2: you can run multiple MR jobs, as well as other applications such as Spark jobs, on the same cluster. The ResourceManager is responsible for managing the cluster and allocating resources (containers on nodes) to jobs, and one of the allocated containers hosts the ApplicationMaster.

Shahzad 
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
萌比男神i, 2021-02-10 10:16
              
            
            
                                                                       
Yes, the JobTracker was split into the ResourceManager and the ApplicationMaster. There is one ApplicationMaster per submitted job, so depending on the number of jobs, ApplicationMasters may be running on one or many NodeManager instances. When a job is submitted, the ResourceManager asks a free NodeManager to launch the ApplicationMaster. That ApplicationMaster then plays the JobTracker role for its job, and the other NodeManagers play the TaskTracker role, executing tasks in YarnChild processes.
Find details here:
http://ercoppa.github.io/HadoopInternals/HadoopArchitectureOverview.html
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
天命终不由人, 2021-02-10 10:16
              
            
            
                                                                       
Just remember the comparisons below.
JobTracker = ResourceManager scheduler (FIFO, Fair Scheduler or Capacity Scheduler) + ApplicationMaster (one per job, runs in container 0)

TaskTracker = NodeManager

In Hadoop v1, when a job was submitted, the JobTracker was responsible for calculating the mappers and reducers for the job, monitoring dead/live TaskTrackers, and re-spawning mappers and reducers if they failed.

In Hadoop v2, when we submit a job:

The ResourceManager Java process (the same process also acts as the scheduler) first spawns an ApplicationMaster (AM) on any node; this is also known as container 0. The AM then reads the job code, calculates the resources the job requires, and asks the scheduler for them (the scheduler also tracks how many resources the job's queue has). The scheduler works out where capacity is available and gives the AM the names of the nodes on which it can spawn containers. The AM then spawns containers on those nodes and monitors them. If a container dies, it is the AM that goes back to the scheduler and negotiates for more resources.
Hence the work of the JobTracker is divided between the AM and the YARN scheduler. Note that each submitted job gets a new AM, so there can be multiple AMs running but only one scheduler per cluster.
The AM is spawned on a NodeManager, and the scheduler runs on the RM node.
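The submission flow described here can be sketched as a small Python toy. Everything in it is an assumption for illustration: `ToyScheduler`, `ToyApplicationMaster`, `submit` and the slot counts are made up, real YARN negotiates memory and vcores rather than whole slots, and a failed slot is simply treated as lost.

```python
class ToyScheduler:
    """Hypothetical single scheduler: hands out free container slots."""

    def __init__(self, slots_per_node):
        self.free = dict(slots_per_node)  # node name -> free slots

    def allocate(self, count):
        # Return the names of nodes on which containers may be started.
        granted = []
        for node in sorted(self.free):
            while self.free[node] > 0 and len(granted) < count:
                self.free[node] -= 1
                granted.append(node)
        return granted


class ToyApplicationMaster:
    """One instance per job; it lives in that job's first container."""

    def __init__(self, job, scheduler, am_node):
        self.job, self.scheduler, self.am_node = job, scheduler, am_node
        self.containers = []  # nodes hosting this job's task containers

    def run(self, tasks_needed):
        # The AM works out what the job needs and asks the scheduler.
        self.containers = self.scheduler.allocate(tasks_needed)

    def on_container_failed(self, node):
        # The AM, not the RM, negotiates a replacement container.
        self.containers.remove(node)
        self.containers += self.scheduler.allocate(1)


def submit(job, scheduler, tasks_needed):
    # The RM first places the AM itself ("container 0") on some node.
    am_node = scheduler.allocate(1)[0]
    am = ToyApplicationMaster(job, scheduler, am_node)
    am.run(tasks_needed)
    return am


scheduler = ToyScheduler({"node1": 2, "node2": 2, "node3": 2})
am_a = submit("job-a", scheduler, tasks_needed=2)
am_b = submit("job-b", scheduler, tasks_needed=1)
am_a.on_container_failed(am_a.containers[0])
print(am_a.am_node, am_a.containers)  # node1 ['node2', 'node3']
print(am_b.am_node, am_b.containers)  # node2 ['node3']
```

Two jobs end up with two separate AMs on different nodes, while both talk to the same scheduler object, which is the split of the old JobTracker's duties that this answer describes.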
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
   
          