PySpark 2.1: Importing module with UDF's breaks Hive connectivity

-上瘾入骨i 2021-01-21 07:57

I'm currently working with Spark 2.1 and have a main script that calls a helper module containing all my transformation methods. In other words:

main.py
helper.py
1 Answer
  • 2021-01-21 08:10

    Prior to Spark 2.2.0, UserDefinedFunction eagerly creates a UserDefinedPythonFunction object, which represents the Python UDF on the JVM. This process requires access to a SparkContext and SparkSession. If there are no active instances when UserDefinedFunction.__init__ is called, Spark automatically initializes the contexts for you.
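
    The import-time side effect can be sketched in plain Python (not actual PySpark code; udf here is a hypothetical stand-in for the pre-2.2.0 behavior):

    ```python
    # Sketch of why merely importing the helper module triggers context
    # creation: a module-level udf(...) call runs at import time, before
    # any code in main.py that follows the import statement.

    CREATED = []

    def udf(f):
        """Stand-in for pre-2.2.0 udf(): eagerly initializes a context."""
        if not CREATED:
            CREATED.append("default-context")  # created WITHOUT Hive support
        return f

    # --- helper.py (its top level runs at import time) ---
    reformat_udf = udf(lambda s: s.strip())

    # --- main.py, after `from helper import reformat_udf` ---
    # The context already exists; it was created as an import side effect.
    assert CREATED == ["default-context"]
    ```

    By the time main.py gets a chance to build its own session, the default one is already in place.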

    When you call SparkSession.builder.getOrCreate after importing the UserDefinedFunction object, it returns the existing SparkSession instance, and only some configuration changes can still be applied (enableHiveSupport is not among them).
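
    The getOrCreate pattern itself explains why the later builder options are lost. A minimal plain-Python sketch of that pattern (the Session and Builder classes are illustrative, not PySpark's implementation):

    ```python
    # Once an instance exists, getOrCreate returns it and silently drops
    # any new builder options -- which is why enableHiveSupport() has no
    # effect after the session was already created by the UDF import.

    class Session:
        _active = None  # singleton slot

        def __init__(self, options):
            self.options = dict(options)

    class Builder:
        def __init__(self):
            self._options = {}

        def config(self, key, value):
            self._options[key] = value
            return self

        def get_or_create(self):
            # Create only if nothing is active; otherwise reuse as-is.
            if Session._active is None:
                Session._active = Session(self._options)
            return Session._active

    # First creation (e.g. triggered implicitly by the UDF import) wins:
    first = Builder().get_or_create()
    # A later builder requesting Hive support gets the same, unchanged session:
    second = Builder().config("hive", True).get_or_create()
    assert second is first
    assert "hive" not in second.options
    ```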

    To address this problem, initialize the SparkSession before you import the UDF:

    from pyspark.sql.session import SparkSession
    
    # Create the Hive-enabled session before any UDF-defining imports run
    spark = SparkSession.builder.enableHiveSupport().getOrCreate()
    
    from helper import reformat_udf
    

    This behavior is described in SPARK-19163 and fixed in Spark 2.2.0. Other API improvements include decorator syntax (SPARK-19160) and improved docstrings handling (SPARK-19161).
