Should we parallelize a DataFrame like we parallelize a Seq before training

前端未结

关注

 2  742

无人及你 2021-02-04 10:11

Consider the code given here,

https://spark.apache.org/docs/1.2.0/ml-guide.html

import org.apache.spark.ml.classification.LogisticRegression
val training


      
      
        
          2条回答        

        
                    
            
            
                         
                
              
              
                
                   南笙
                                             
                
                
                (楼主)
            
              
              
                2021-02-04 10:31
              

            
            
                        
You should maybe check out the difference between RDD and DataFrame and how to convert between the two: Difference between DataFrame and RDD in Spark

To answer your question directly: A DataFrame is already optimized for parallel execution. You do not need to do anything and you can pass it to any spark estimators fit() method directly. The parallel executions are handled in the background. 
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它2个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复