Why does Spark fail with “Failed to get broadcast_0_piece0 of broadcast_0” in local mode?

Asked by 孤城傲影 on 2021-02-06 10:59

I'm running this snippet to sort an RDD of points, ordering the RDD and taking the K-nearest points from a given point:

def getKNN(sparkContext:SparkContext, k:
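
A minimal sketch of what the truncated getKNN might look like, under my own assumptions (the Point type, the distance helper, and the remaining parameters below are not from the original code): order by distance to the query point and keep the k closest, which Spark's takeOrdered does without sorting the whole RDD.

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical point type; the original snippet is cut off before its definition.
case class Point(x: Double, y: Double)

object Knn {
  def distance(a: Point, b: Point): Double =
    math.sqrt(math.pow(a.x - b.x, 2) + math.pow(a.y - b.y, 2))

  // Return the k points nearest to `query` without sorting the entire RDD.
  // sparkContext is kept only to mirror the truncated signature above.
  def getKNN(sparkContext: SparkContext, k: Int, points: RDD[Point], query: Point): Array[Point] =
    points.takeOrdered(k)(Ordering.by((p: Point) => distance(p, query)))
}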
5 answers
  • 2021-02-06 11:32

    Related to the above answers, I encountered this issue when I inadvertently serialized a Datastax connector (i.e. a Cassandra connection driver) query to a Spark worker. That spun off its own SparkContext, and within 4 seconds the entire application had crashed.
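
    A minimal sketch of that anti-pattern, using a hypothetical DriverSideClient in place of the real Cassandra driver (none of these names come from the answer): anything captured by a task closure gets serialized and shipped to the executors, so driver-side resources should instead be constructed inside the task.

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical stand-in for a non-serializable driver-side resource,
    // e.g. a database session.
    class DriverSideClient {
      def lookup(x: Int): Int = x * 2
    }

    object ClosureCaptureSketch {
      def main(args: Array[String]): Unit = {
        val sc = SparkContext.getOrCreate(
          new SparkConf().setMaster("local[*]").setAppName("closure-capture"))
        val rdd = sc.parallelize(1 to 10)

        // Problematic: `client` lives on the driver and is captured by the task
        // closure, so Spark tries to serialize it and ship it to the executors.
        // val client = new DriverSideClient
        // rdd.map(client.lookup).collect()   // Task not serializable

        // Safer: build the resource inside the task, once per partition.
        val doubled = rdd.mapPartitions { rows =>
          val client = new DriverSideClient   // created on the executor
          rows.map(client.lookup)
        }.collect()

        println(doubled.mkString(","))
        sc.stop()
      }
    }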

  • 2021-02-06 11:34

    Just discovered why I was getting this exception: for some reason my SparkContext object was being started and stopped several times between ScalaTest methods. Fixing that behaviour got Spark working the way I expected.
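
    One way to avoid the repeated start/stop, sketched below assuming ScalaTest 3.1+ (the suite name and test are made up), is to create a single SparkContext per suite in beforeAll and stop it once in afterAll:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.scalatest.BeforeAndAfterAll
    import org.scalatest.funsuite.AnyFunSuite

    // One SparkContext for the whole suite instead of one per test method.
    class KnnSuite extends AnyFunSuite with BeforeAndAfterAll {

      @transient private var sc: SparkContext = _

      override def beforeAll(): Unit = {
        super.beforeAll()
        sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("knn-tests"))
      }

      override def afterAll(): Unit = {
        try if (sc != null) sc.stop()
        finally super.afterAll()
      }

      test("parallelize and collect round-trip") {
        assert(sc.parallelize(Seq(1, 2, 3)).collect().toSeq == Seq(1, 2, 3))
      }
    }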

  • 2021-02-06 11:34

    This helped in my case, because a SparkContext had already been created:

    val sc = SparkContext.getOrCreate()
    

    Before that I had tried this:

    val conf = new SparkConf().setAppName("Testing").setMaster("local").set("spark.driver.allowMultipleContexts", "true")
    val sc = new SparkContext(conf)
    

    But it broke when I ran:

     spark.createDataFrame(rdd, schema)
    
  • 2021-02-06 11:38

    I was also facing the same issue. After a lot of googling I found that I had made a singleton class for SparkContext initialization, which is only valid within a single JVM instance. In Spark, however, that singleton class is invoked from each worker node running in its own JVM instance, which leads to multiple SparkContext objects.
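
    A minimal sketch of the safer pattern (the object name is my own): keep the singleton on the driver and go through SparkContext.getOrCreate, so a second call returns the already-running context instead of constructing another one, and never touch it from code that executes inside executor closures.

    import org.apache.spark.{SparkConf, SparkContext}

    // Driver-side holder; getOrCreate reuses an existing context if one is running.
    object SparkContextHolder {
      @transient lazy val sc: SparkContext =
        SparkContext.getOrCreate(
          new SparkConf().setMaster("local[*]").setAppName("knn"))
    }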

  • 2021-02-06 11:40

    I was getting this error as well. I haven't really seen any concrete coding examples, so I will share my solution. This cleared the error for me, but I have a sense that there may be more than one solution to this problem. Still, it is worth a go, since it keeps everything within the code.

    It looks as though the SparkContext was shutting down, thus throwing the error. I think the issue is that the SparkContext is created in one class and that class is then extended by other classes; the extension causes it to shut down, which is a bit annoying. Below is the implementation I used to get the error to clear.

    Spark Initialisation Class:

    import org.apache.spark.{SparkConf, SparkContext}

    class Spark extends Serializable {
      def getContext: SparkContext = {
        @transient lazy val conf: SparkConf =
          new SparkConf()
            .setMaster("local")
            .setAppName("test")

        @transient lazy val sc: SparkContext = new SparkContext(conf)
        sc.setLogLevel("OFF")

        sc
      }
    }

    Main Class:

    import org.apache.spark.rdd.RDD

    object Test extends Spark {

      def main(args: Array[String]): Unit = {
        val sc = getContext
        val irisRDD: RDD[String] = sc.textFile("...")
        // ...
      }
    }

    Then just extend your other class with the Spark class and it should all work out.

    I was getting the error while running LogisticRegression models, so I would assume this should fix it with other machine learning libraries as well.
