Question
I have a dataset as follows:
V1 <- c(5,5,5,45,45,77)
V2 <- c("low", "low", "medium", "low", "low", "high")
V3 <- c(10,3,6,10,3,1)
df <- cbind.data.frame(V1,V3,V2)
V1 V3 V2
5 10 low
5 3 low
5 6 medium
45 10 low
45 3 low
77 1 high
I want it to be:
V1 low medium high
5 13 6 0
45 13 0 0
77 0 0 1
I have tried cast/melt with little success.
Answer 1:
Using reshape2, as Frank answered in the comments:
library(reshape2)
dcast(df, V1 ~ V2, value.var = "V3", fun.aggregate = sum, fill = 0)
Output:
V1 high low medium
1 5 0 13 6
2 45 0 13 0
3 77 1 0 0
To keep the original column order, convert V2 to a factor whose levels follow their order of appearance:
dcast(df, V1 ~ factor(V2, levels = unique(V2)), value.var = "V3", fun.aggregate = sum)
Output:
V1 low medium high
1 5 13 6 0
2 45 13 0 0
3 77 0 0 1
Answer 2:
Since you are doing a sum and then transforming to wide format, I would suggest using xtabs in base R:
df <- data.frame(V1, V3, V2) ## Keeps numeric data as numeric....
xtabs(V3 ~ V1 + V2, df)
# V2
# V1 high low medium
# 5 0 13 6
# 45 0 13 0
# 77 1 0 0
Or, if you care about the column order, you can try:
xtabs(V3 ~ V1 + factor(V2, c("low", "medium", "high")), df)
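Note that xtabs returns a table object rather than a data frame. If a plain data frame is needed, something along these lines should work. This is only a minimal sketch in base R, and the names `tab` and `wide` are placeholders chosen for illustration:

```r
V1 <- c(5, 5, 5, 45, 45, 77)
V2 <- c("low", "low", "medium", "low", "low", "high")
V3 <- c(10, 3, 6, 10, 3, 1)
df <- data.frame(V1, V3, V2)

# Fix the column order by giving V2 explicit factor levels
df$V2 <- factor(df$V2, levels = c("low", "medium", "high"))

# Sum V3 for every V1/V2 combination; empty cells become 0
tab <- xtabs(V3 ~ V1 + V2, data = df)

# as.data.frame() on a table gives long format, so use the
# matrix method to keep the wide layout
wide <- as.data.frame.matrix(tab)

# Move the V1 values from the row names into a regular column
wide <- cbind(V1 = as.numeric(rownames(wide)), wide)
rownames(wide) <- NULL
wide
```

The result has one row per V1 value and one column per level of V2, in the order given by the factor levels.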
Answer 3:
V1 <- c(5, 5, 5, 45, 45, 77)
V2 <- c("low", "low", "medium", "low", "low", "high")
V3 <- c(10, 3, 6, 10, 3, 1)
df <- data.frame(V1, V2, V3)
df$V2 <- factor(df$V2, levels = c("low", "medium", "high"))
library(tidyr)
library(dplyr)
df %>%
  group_by(V1, V2) %>%
  summarise(sum = sum(V3)) %>%
  spread(V2, sum, fill = 0)
# Source: local data frame [3 x 4]
#
# V1 low medium high
# (dbl) (dbl) (dbl) (dbl)
# 1 5 13 6 0
# 2 45 13 0 0
# 3 77 0 0 1
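In more recent versions of tidyr, spread has been superseded by pivot_wider, which can do the aggregation and the reshaping in one call. A rough equivalent is sketched below, assuming tidyr 1.1.0 or later (where values_fn accepts a single function and values_fill a scalar). The name `wide` is illustrative only:

```r
library(tidyr)

V1 <- c(5, 5, 5, 45, 45, 77)
V2 <- c("low", "low", "medium", "low", "low", "high")
V3 <- c(10, 3, 6, 10, 3, 1)
df <- data.frame(V1, V2, V3)

# values_fn sums the V3 values that share a V1/V2 combination;
# values_fill puts 0 where a combination does not occur
wide <- pivot_wider(df, names_from = V2, values_from = V3,
                    values_fn = sum, values_fill = 0)
wide
```

With this approach the separate group_by and summarise steps are not needed, since the summing is done by values_fn.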
Source: https://stackoverflow.com/questions/33636641/combine-multiple-rows-with-same-field-in-r