Create class intervals in R and sum values

Question

I have a set of data (cost and distance). I want to aggregate the rows into classes depending on the distance and find the sum of the cost for each class.

Here are some example tables.

Nam Cost    distance
1   1005    10
2   52505   52
3   51421   21
4   651     10
5   656     0
6   5448    1

Classes

Class   From    To
1       0       5
2       5       15
3       15      100

Result

Class   Sum
    1   6104
    2   1656
    3   103926

I am doing this, but it takes a lot of time to process. I am sure that there is a better way to do it:

for (i in 1:6)
{
  for (j in 1:3)
  {
    # add the cost when the distance lies in [From, To) of class j
    if ((Table_numbers[i,3] >= classes[j,2]) && (Table_numbers[i,3] < classes[j,3]))
    {
      result_table[j,2] <- result_table[j,2] + Table_numbers[i,2]
    }
  }
}

I used classIntervals as well, but for each class I get the count of the distances, whereas I need the sum of the cost.

I tried to use group_by as well, but I don't know whether I can use the classes for grouping.

Do you have any idea how I can do this more efficiently?

Answer 1:

Here's a simple base R solution combining findInterval and tapply:

tapply(Table$Cost, findInterval(Table$distance, c(0, Classes$To)), sum)
#    1      2      3 
# 6104   1656 103926

If the class names are not just a counter, you could modify this to

tapply(Table$Cost, Classes$Class[findInterval(Table$distance, c(0, Classes$To))], sum)
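For a self-contained run, the sketch below rebuilds the two example tables as data frames and applies the same findInterval/tapply combination. The object names Table and Classes are assumed from the code above, since the question never shows how they are defined, and the classes are assumed to be contiguous and sorted by From.

```r
# Example data from the question; the object names are assumptions.
Table <- data.frame(
  Nam      = 1:6,
  Cost     = c(1005, 52505, 51421, 651, 656, 5448),
  distance = c(10, 52, 21, 10, 0, 1)
)
Classes <- data.frame(Class = 1:3, From = c(0, 5, 15), To = c(5, 15, 100))

# Break points 0, 5, 15, 100; findInterval() treats each class as [From, To).
breaks <- c(Classes$From[1], Classes$To)
idx <- findInterval(Table$distance, breaks)

# Sum Cost within each class, labelled by the Class column.
sums <- tapply(Table$Cost, Classes$Class[idx], sum)
print(sums)
```

A distance equal to the last To value (100 here) falls outside the last half-open interval and would be dropped; passing rightmost.closed = TRUE to findInterval is one way to keep it in the last class.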

Answer 2:

Here is a solution with cut to produce the classes and dplyr::group_by to sum by group:

library(dplyr)

mutate(df,class=cut(distance,c(0,5,15,100),include.lowest = TRUE)) %>% 
  group_by(class) %>% 
  summarize(sum=sum(Cost))

Data

df <- read.table(text="Nam Cost    distance
1   1005    10
2   52505   52
3   51421   21
4   651     10
5   656     0
6   5448    1",header=TRUE)
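One caveat: cut() builds right-closed intervals by default, i.e. (0,5], (5,15], (15,100], whereas the Classes table in the question reads more naturally as [From, To). The two agree on the sample data, but a distance of exactly 5 or 15 would land in a different class. The sketch below assumes half-open [From, To) classes are wanted and labels them with the class ids; it uses base R's aggregate in place of the dplyr pipeline so that it runs without extra packages.

```r
df <- read.table(text = "Nam Cost distance
1 1005 10
2 52505 52
3 51421 21
4 651 10
5 656 0
6 5448 1", header = TRUE)

# right = FALSE gives [0,5), [5,15), [15,100); include.lowest = TRUE then
# closes the last interval so that a distance of exactly 100 is kept.
df$class <- cut(df$distance, breaks = c(0, 5, 15, 100),
                labels = 1:3, right = FALSE, include.lowest = TRUE)

# Base-R counterpart of group_by() + summarize(): sum Cost per class.
res <- aggregate(Cost ~ class, data = df, FUN = sum)
print(res)
```

The same cut() call can be dropped into the mutate() step of the dplyr version if that pipeline is preferred.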

Source: https://stackoverflow.com/questions/34653577/create-class-intervals-in-r-and-sum-values

Tags

classification

intervals