Scale all values depending on group [duplicate]

Posted by 别来无恙 on 2019-12-11 02:57:37

Question


I have a dataframe similar to this one

ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543,3524, 353, 3432, 4542, 6343, 4534 )
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

Now I would like to scale the values in p1 and p2 within each ID. That is, the column should not be scaled as a whole; instead, scaling should be done once over all values for ID 1, then over all values for ID 2, and so on. The same applies to p2. The new dataframe should contain the scaled values.

I already tried

df_scaled <- ddply(my.df, my.df$ID, scale(my.df$p1))

but get the error message

.fun is not a function.

Thanks for your help!


Answer 1:


dplyr makes this easy:

ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543,3524, 353, 3432, 4542, 6343, 4534 )
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

library(dplyr)
df_scaled <- my.df %>% group_by(ID) %>% mutate(p1 = scale(p1), p2=scale(p2))

Note that, at the time of writing, the stable version of dplyr had a bug when working with scale(); you might need to update to the dev version (see comments).
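
If installing a different dplyr version is not an option, the same per-group scaling can probably be done in base R. The following is only a sketch, not part of the original answer: it uses ave() to apply a function within each ID, and the helper name scale_vec is made up for illustration. It assumes the underlying issue is that scale() returns a one-column matrix rather than a plain vector, so the helper converts the result with as.numeric().

```r
# Example data from the question
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543, 3524, 353, 3432, 4542, 6343, 4534)
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

# Hypothetical helper (not from the original answer):
# scale() returns a one-column matrix, so convert it to a plain numeric vector
scale_vec <- function(x) as.numeric(scale(x))

df_scaled <- my.df
# ave() applies FUN separately within each ID group and
# returns the results in the original row order
df_scaled$p1 <- ave(my.df$p1, my.df$ID, FUN = scale_vec)
df_scaled$p2 <- ave(my.df$p2, my.df$ID, FUN = scale_vec)

# Sanity check: within each ID the scaled values should have
# mean (approximately) 0 and standard deviation 1
tapply(df_scaled$p1, df_scaled$ID, mean)
tapply(df_scaled$p1, df_scaled$ID, sd)
```

This also hints at why the ddply() call in the question fails: scale(my.df$p1) is evaluated immediately, so ddply() receives its result instead of a function, which matches the ".fun is not a function" message.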



Source: https://stackoverflow.com/questions/41761018/scale-all-values-depending-on-group
