Return a data frame from function


I have the following code inside a function

Myfunc<- function(directory, MyFiles, id = 1:332) {
# uncomment the 3 lines below for testing
#directory<-\


                      
2 Answers
盖世英雄少女心, 2020-11-29 10:17
              
            
            
                                                                       
If I understand you correctly, you are trying to create a data frame with the number of complete cases for each id. Supposing your files are named with the id numbers as you specified (e.g. f2.csv), you can simplify your function as follows:

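myfunc <- function(directory, id = 1:332) {
  # Empty vector that will collect one count per id
  y <- vector()
  # seq_along() avoids the 1:0 trap when id is empty
  for(i in seq_along(id)){
    x <- id
    # Read the file for this id and count its complete rows
    y <- c(y, sum(complete.cases(
      read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))))))
  }
  # Combine ids and counts into one data frame
  df <- data.frame(x, y)
  colnames(df) <- c("id","ret2")
  return(df)
}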


You can call this function like this:

myfunc("name-of-your-directory",25:87)




An explanation of the above code. You have to break down your problem into steps:


1. You need a vector of the ids. That's done by x <- id.
2. For each id you want the number of complete cases. To get that, you have to read the file first. That's done by read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))). To get the number of complete cases for that file, wrap the read.csv code inside sum and complete.cases.
3. Now you want to add that number to a vector. Therefore you need an empty vector (y <- vector()) to which you can add the number of complete cases from step 2. That's done by wrapping the code from step 2 inside y <- c(y, "code step 2"). With this you add the number of complete cases for each id to the vector y.
4. The final step is to combine these two vectors into a data frame with df <- data.frame(x, y) and assign some meaningful column names.


By including steps 1, 2 and 3 (except the y <- vector() part) in a for-loop, you can iterate over the specified ids. The empty vector has to be created with y <- vector() before the for-loop, so that the for-loop can add values to y.
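The steps above can also be written without growing a vector inside a loop. The following is only a sketch of one possible variant, not the answer author's code: the name count_complete is made up for illustration, and it assumes the same f<id>.csv naming inside the directory and at least one id.

```r
# Sketch: count complete cases per id with vapply() instead of a for-loop.
# count_complete is a hypothetical name; files are assumed to be f<id>.csv.
count_complete <- function(directory, id = 1:332) {
  # Build all file paths up front; file.path() inserts the separator
  paths <- file.path(directory, paste0("f", id, ".csv"))
  # One integer count of complete rows per file
  counts <- vapply(
    paths,
    function(p) sum(complete.cases(read.csv(p))),
    integer(1),
    USE.NAMES = FALSE
  )
  # Same column names as in the answer above
  data.frame(id = id, ret2 = counts)
}
```

It would be called the same way, e.g. count_complete("name-of-your-directory", 25:87). Because vapply() checks that each result is a single integer, an unexpected return value raises an error instead of passing through silently.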
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
轮回少年, 2020-11-29 10:26
              
            
            
                                                                       
This one is actually pretty easy to get around by changing scope.

The issue is that you're creating the initial data frame as a local variable and then just swapping out the rows, so you'll wind up with only the first and last results in the data frame.

When I write a for loop in R and want to add the results of successive queries etc. to some initial data frame, I do this:



my_func <- function(items) {
  # items is a placeholder for whatever you want to iterate over,
  # like 1:10, a given list, etc.
  # Create the initial data frame from the first iteration and use the
  # global assignment ('<<-'), not '<-' or '='.
  main_dataframe <<- do_something(items[1])

  # seq_along(items)[-1] covers items 2..n and is empty for a single item
  for (i in seq_along(items)[-1]) {
    next_dataframe <- do_something(items[i])

    main_dataframe <<- rbind(main_dataframe, next_dataframe)
  }
}


The scoping will allow each iteration to create a dataframe that you can append to the original without losing any of the iterations in between the first and the last. 
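For comparison, the same accumulation can be sketched with a local result that the function returns, which matches the question's goal of returning a data frame. This is an alternative under stated assumptions, not the approach described in this answer: do_something below is a hypothetical stand-in that returns a one-row data frame, and collect_results is a made-up name.

```r
# Hypothetical stand-in for do_something(): one row per item.
do_something <- function(item) {
  data.frame(item = item, squared = item^2)
}

# Sketch: build one data frame per item, then bind them in a single call.
collect_results <- function(items) {
  pieces <- lapply(items, do_something)
  # do.call(rbind, ...) stacks all pieces; nothing is assigned globally
  do.call(rbind, pieces)
}

result <- collect_results(1:3)
```

Here result holds one row for each of the three items, and no variable is created outside the function.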
                                                                        
                                                        
            
            
              
                