How to sum the values of a numeric variable based on a string variable [duplicate]

Posted by 只谈情不闲聊 on 2021-01-27 21:16:59

Question


Consider the following dataframe:

df <- data.frame(numeric=c(1,2,3,4,5,6,7,8,9,10), string=c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f"))
print(df)
   numeric string
1        1      a
2        2      a
3        3      b
4        4      b
5        5      c
6        6      d
7        7      d
8        8      e
9        9      d
10      10      f

It has a numeric variable and a string variable. I would like to create another data frame in which the string variable contains only the unique values "a", "b", "c", "d", "e", "f", and the numeric variable holds the sum of the numeric values for each string in the original data frame, giving:

print(new_df)
  numeric string
1        3      a
2        7      b
3        5      c
4       22      d
5        8      e
6       10      f

This can be done with a for loop, but that would be inefficient on large datasets, so I would prefer another option. I tried the dplyr package, but did not get the expected result:

library(dplyr)
df %>% group_by(string) %>% summarize(result = sum(numeric))
  result
1     55

Answer 1:


This is probably a masking issue: summarise and mutate also exist in plyr, and if the plyr versions mask the dplyr ones, the grouping is ignored and a single total is returned. Calling summarise from dplyr explicitly avoids this:

library(dplyr)
df %>% 
    group_by(string) %>%
    dplyr::summarise(numeric = sum(numeric))



Answer 2:


You can do this without loading any extra packages using tapply or aggregate.
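
The answer names the two functions without showing them, so here is a minimal sketch of how they might be applied to the data frame from the question. The names new_df and sums are chosen for illustration only; the column reordering is just there to match the layout requested in the question.

```r
# Rebuild the data frame from the question
df <- data.frame(
  numeric = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  string  = c("a", "a", "b", "b", "c", "d", "d", "e", "d", "f")
)

# aggregate() with a formula: sum `numeric` within each level of `string`.
# The result is a data frame with one row per unique string.
new_df <- aggregate(numeric ~ string, data = df, FUN = sum)

# aggregate() puts the grouping column first; reorder to match the question
new_df <- new_df[, c("numeric", "string")]
print(new_df)

# tapply() computes the same group sums but returns a named vector
# (one element per unique string) rather than a data frame
sums <- tapply(df$numeric, df$string, sum)
print(sums)
```

aggregate is the closer match when a data frame is wanted as the result; tapply is handy when a named vector of totals is enough, for example sums["d"] to look up a single group.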



Source: https://stackoverflow.com/questions/56028131/how-to-sum-the-values-of-a-numeric-variable-based-on-a-string-variable
