Linearly apportion amounts by month


Please consider the following synthetic data frame:

#Learning to enable splitting contributions spanning two months

start = c(as.Date("2013-01-01"), as.Date(


                      
1 Answer

时光取名叫无心
2021-01-24 16:28

Create a function, explode, that expands an interval into a data frame with one row per day, each row carrying an equal share of the interval's amount.  Use Map to apply explode to each interval, producing a list of data frames, one per interval.  Then rbind the data frames in the list into one big data frame, by.date, with one row per day.  Finally, aggregate by.date into one row per year/month:

library(zoo) # as.yearmon

explode <- function(start, end, amount) {
   dates <- seq(start, end, "day")
   data.frame(dates, yearmon = as.yearmon(dates), amount = amount / length(dates))
}
by.date <- do.call("rbind", Map(explode, df$start, df$end, df$amount))
aggregate(amount ~ yearmon, by.date, sum)


Using the data in the question (assuming the occurrence of 2010 was supposed to be 2013) we get:

   yearmon    amount
1 Jan 2013 100.00000
2 Feb 2013  94.91525
3 Mar 2013 105.08475
4 Apr 2013 100.00000
5 May 2013 100.00000
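
As a sanity check on the per-day apportioning, here is a minimal sketch on one made-up interval (not the question's data). It assumes format(dates, "%Y-%m") as a stand-in for zoo's as.yearmon so that it runs in base R; the names explode_base and res are hypothetical.

```r
# Hypothetical single interval: 2013-01-16 to 2013-02-14 is 30 days,
# 16 of them in January and 14 in February.
explode_base <- function(start, end, amount) {
  dates <- seq(start, end, by = "day")
  data.frame(month  = format(dates, "%Y-%m"),
             amount = amount / length(dates))  # equal share per day
}

res <- aggregate(amount ~ month,
                 explode_base(as.Date("2013-01-16"), as.Date("2013-02-14"), 300),
                 sum)
print(res)
# 300 over 30 days is 10 per day, so 2013-01 gets 160 and 2013-02 gets 140
```

Because each day receives the same share, a month's total is simply proportional to the number of days of the interval that fall in it.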


UPDATE: If memory is a problem, use this version of explode instead.  It aggregates within explode first, so its output is smaller.  The dates column has also been dropped from DF, since it was only included for debugging:

explode <- function(start, end, amount) {
   dates <- seq(start, end, "day")
   DF <- data.frame(yearmon = as.yearmon(dates), amount = amount / length(dates))
   aggregate(amount ~ yearmon, DF, sum)
}


UPDATE 2:  Here is another attempt.  It uses rowsum, which is specialized for aggregating sums.  In my test, this one ran 10x faster on the data in the post.

explode2 <- function(start, end, amount) {
  dates <- seq(start, end, "day")
  n <- length(dates)
  rowsum(rep(amount, n) / n, format(dates, "%Y-%m"))
}
by.date <- do.call("rbind", Map(explode2, df$start, df$end, df$amount))
rowsum(by.date, rownames(by.date))
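
The rowsum version can be checked on a small example. The sketch below runs explode2 on a hypothetical data frame (df_demo, parts and totals are invented for illustration and are not the question's data) and confirms that the monthly totals add back up to the original amounts.

```r
explode2 <- function(start, end, amount) {
  dates <- seq(start, end, "day")
  n <- length(dates)
  rowsum(rep(amount, n) / n, format(dates, "%Y-%m"))
}

# Hypothetical intervals, each spanning a month boundary
df_demo <- data.frame(
  start  = as.Date(c("2013-01-16", "2013-02-20")),
  end    = as.Date(c("2013-02-14", "2013-03-11")),
  amount = c(300, 200)
)

# One small matrix per interval, stacked, then summed by month label
parts  <- do.call("rbind", Map(explode2, df_demo$start, df_demo$end, df_demo$amount))
totals <- rowsum(parts, rownames(parts))
print(totals)
# 2013-01: 160, 2013-02: 230 (140 + 90), 2013-03: 110

# Apportioning should neither create nor lose any of the amount
stopifnot(isTRUE(all.equal(sum(totals), sum(df_demo$amount))))
```

Note that the result is a one-column matrix with the months as row names rather than a data frame.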

                                                                        
                                                        
            
            
              
                