I am trying to load a Parquet file in Spark as a DataFrame:
val df = spark.read.parquet(path)
I am getting:
org.apache.spark.SparkException: …
Take 1
SPARK-12854 (Vectorize Parquet reader) indicates that "ColumnarBatch supports structs and arrays" (cf. GitHub pull request 10820), starting with Spark 2.0.0.
And SPARK-13518 (Enable vectorized parquet reader by default), also starting with Spark 2.0.0, deals with the property spark.sql.parquet.enableVectorizedReader (cf. GitHub commit e809074).
My 2 cents: disable that "VectorizedReader" optimization and see what happens.
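A minimal sketch of that test, assuming Spark 2.x and an existing SparkSession named spark (path is the same placeholder as in the question):

    // Turn the vectorized Parquet reader off for this session only;
    // Spark then falls back to its row-based Parquet reader.
    spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")

    val df = spark.read.parquet(path)

If the read succeeds with the flag off, the vectorized reader is the likely culprit.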
Take 2
Since the problem has been narrowed down to some empty files that do not have the same schema as the "real" files, my 3 cents: experiment with spark.sql.parquet.mergeSchema to see whether the schema from the real files takes precedence after merging.
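A hedged sketch of both ways to turn it on, via the per-read "mergeSchema" option or the session-wide property (path is again a placeholder):

    // Per read: merge the schemas found across all Parquet part files.
    val df = spark.read
      .option("mergeSchema", "true")
      .parquet(path)

    // Or session-wide:
    spark.conf.set("spark.sql.parquet.mergeSchema", "true")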
Other than that, you might try to avoid producing the empty files at write time, with some kind of re-partitioning, e.g. coalesce(1) (OK, 1 is a bit extreme, but you see the point).
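A minimal sketch of that write-time workaround (df and outPath are placeholders, and the partition count is illustrative):

    // Collapse to a handful of partitions before writing, so that no
    // task ends up writing an empty part file with a divergent schema.
    df.coalesce(4)
      .write
      .mode("overwrite")
      .parquet(outPath)

coalesce is used here rather than repartition because it narrows the existing partitions without a full shuffle.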