How do I read a Parquet file in PySpark that was written from Spark?


无人及你 2021-01-31 03:36

I am using two Jupyter notebooks to do different things in an analysis. In my Scala notebook, I write some of my cleaned data to parquet:

partitionedDF.select(\


      
      
        
2 Answers

予麋鹿 (original poster) 2021-01-31 04:08

You can read Parquet files with the parquet method of the Spark session's reader (spark.read), like this:

df = spark.read.parquet("swift2d://xxxx.keystone/commentClusters.parquet")


That said, there should be no difference between the parquet and load functions, since load reads Parquet by default unless spark.sql.sources.default has been changed. It may be that load could not infer the schema of the data in the file (e.g., a data type that load does not recognize or that is specific to Parquet).
    
             
                                                        
            
            
              
                