Spark - repartition() vs coalesce()


According to Learning Spark

Keep in mind that repartitioning your data is a fairly expensive operation. Spark also has an optimized version of repartition() called coalesce() that allows avoiding data movement, but only if you are decreasing the number of RDD partitions.
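
To make the quoted point concrete, here is a minimal sketch (Scala, Spark SQL API) that compares the physical plans of the two calls. The app name, master setting, and partition counts are illustrative assumptions, not from the book.

```scala
// Hypothetical standalone example: repartition() always triggers a shuffle
// (an Exchange node in the physical plan), while coalesce() to fewer
// partitions does not. Partition counts are arbitrary.
import org.apache.spark.sql.SparkSession

object RepartitionVsCoalescePlan {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("repartition-vs-coalesce")
      .master("local[4]") // assumption: run locally for the demo
      .getOrCreate()

    val df = spark.range(0, 1000000).toDF("id")

    val repartitioned = df.repartition(2) // full shuffle of the data
    val coalesced     = df.coalesce(2)    // merges existing partitions, no shuffle

    repartitioned.explain() // look for an "Exchange RoundRobinPartitioning(2)" node
    coalesced.explain()     // look for "Coalesce 2" with no Exchange node

    println(s"repartition -> ${repartitioned.rdd.getNumPartitions} partitions")
    println(s"coalesce    -> ${coalesced.rdd.getNumPartitions} partitions")

    spark.stop()
  }
}
```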

14 Answers

    You should also make sure that the nodes receiving the coalesced data are well provisioned if you are dealing with huge data, because all of that data will be loaded onto those nodes and may cause an out-of-memory exception. Although repartition() is costly, I prefer to use it, since it shuffles the data and distributes it equally across partitions.

    Choose wisely between coalesce() and repartition().
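
A minimal sketch (Scala, RDD API) of the answer's point about even distribution: repartition() reshuffles into roughly equal partitions, while coalesce() only merges existing partitions, so any input skew carries over. The data, filter, and partition counts below are made up for illustration.

```scala
// Hypothetical example: compare partition sizes after repartition() vs coalesce()
// on a deliberately skewed RDD.
import org.apache.spark.sql.SparkSession
import org.apache.spark.rdd.RDD

object PartitionSizes {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-sizes")
      .master("local[4]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Skewed input: 8 partitions, but the filter keeps only values that
    // land in the first partition, so the other partitions are empty.
    val skewed = sc.parallelize(1 to 800000, numSlices = 8).filter(_ <= 100000)

    // repartition() performs a full shuffle and balances the output.
    val balanced = skewed.repartition(4)

    // coalesce() without shuffle merges neighbouring partitions, so the
    // skew of the input is preserved in the output.
    val merged = skewed.coalesce(4)

    def sizes(rdd: RDD[Int]): String =
      rdd.glom().map(_.length).collect().mkString(", ")

    println(s"repartition(4) partition sizes: ${sizes(balanced)}")
    println(s"coalesce(4)    partition sizes: ${sizes(merged)}")

    spark.stop()
  }
}
```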
