I have a large dataframe (df) that looks like this:
structure(list(var1 = c(1, 2, 3, 4, 2, 3, 4, 3, 2), var2 = c(2,
3, 4, 1, 2, 1, 1, 1, 3), var3 = c(4, 4,
This is a solution using base functions:
dd <- t(apply(df, 1, function(x) table(factor(x, levels=1:4))))
colnames(dd) <- paste("n",1:4, sep="_")
cbind(df, dd)
Just use the table command across the rows of your data.frame to get counts of each value from 1 to 4.
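As a minimal sketch of why the factor() call matters (the vector x below is a made-up single row, used only for illustration): table() on its own drops values that never occur, while fixing the levels to 1:4 keeps a zero count for them, so every row yields the same four columns.

```r
# Hypothetical single row of values, for illustration only
x <- c(1, 2, 4, 2, 4)

# Plain table() only reports the values that actually occur (1, 2 and 4 here)
table(x)

# Fixing the levels to 1:4 keeps a zero count for the missing value 3
counts <- table(factor(x, levels = 1:4))
counts
```

Because every row then produces a vector of length four, apply() can stack the results into a matrix with one column per value.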
Here is an approach using qdapTools package:
library(qdapTools)
data.frame(dat, setNames(mtabulate(split(dat, id(dat))), paste0("n_", 1:4)))
##   var1 var2 var3 var4 var5 n_1 n_2 n_3 n_4
## 1    1    2    4    2    4   1   2   0   2
## 2    2    3    4    2    4   0   2   1   2
## 3    3    4    2    2    2   0   3   1   1
## 4    4    1    3    2    3   1   1   2   1
## 5    2    2    3    3    3   0   2   3   0
## 6    3    1    1    2    1   3   1   1   0
## 7    4    1    1    3    1   3   0   1   1
## 8    3    1    1    4    1   3   0   1   1
## 9    2    3    4    1    4   1   1   1   2
This uses rowwise() and do() from dplyr, but it's definitely ugly.
I'm not sure whether this can be modified so that you get a data.frame output directly, as shown at https://github.com/hadley/dplyr/releases.
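library(dplyr)
# For each row, count how often each value between min(df) and max(df) occurs
interim_res <- df %>%
  rowwise() %>%
  do(out = sapply(min(df):max(df), function(i) sum(i==.)))
# Stack the per-row count vectors into a single data.frame
interim_res <- interim_res[[1]] %>% do.call(rbind,.) %>% as.data.frame(.)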
Then, to get the intended result:
res <- cbind(df,interim_res)
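For comparison, here is a rough sketch of a different, base-R route that avoids the intermediate list column. It is not the dplyr approach above but a swapped-in technique using rowSums(), and the small df below is made-up toy data rather than the data from the question.

```r
# Hypothetical toy data, not the question's full data frame
df <- data.frame(var1 = c(1, 2, 3), var2 = c(2, 3, 4), var3 = c(4, 4, 2))

# For each value 1..4, count how often it appears in each row;
# sapply() collects the per-value counts as the columns of a matrix
counts <- sapply(1:4, function(v) rowSums(df == v))
colnames(counts) <- paste0("n_", 1:4)

# Attach the count columns to the original data
res <- cbind(df, counts)
res
```

The comparison df == v is vectorised over the whole data frame, so this may scale better on a large df than a row-by-row loop, although that has not been measured here.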