How can I transform data X to Y as in
X = data.frame( ID = c(1,1,1,2,2), NAME = c("MIKE","MIKE","MIKE","LUCY","LUCY"), SEX = c("MALE","MALE","MALE","FEMALE","FEMALE"), TEST = c(1,2,3,1,2), SCORE = c(70,80,90,65,75) ) Y = data.frame( ID = c(1,2), NAME = c("MIKE","LUCY"), SEX = c("MALE","FEMALE"), TEST_1 =c(70,65), TEST_2 =c(80,75), TEST_3 =c(90,NA) )
The dcast
function in reshape2
seems to work but it can not include other columns in the data like ID, NAME and SEX in the example above.
Assuming all other columns by a ID column are consistent, like Mike can only be a male with ID 1, how can we do it?
According to the documentation (?reshape2::dcast
), dcast()
allows for ...
in the formula:
"..." represents all other variables not used in the formula ...
This is true for both the reshape2
and the data.table
packages which both support dcast()
.
So, you can write:
reshape2::dcast(X, ... ~ TEST, value.var = "SCORE") # ID NAME SEX 1 2 3 #1 1 MIKE MALE 70 80 90 #2 2 LUCY FEMALE 65 75 NA
However, if the OP insists that the column names should be TEST_1
, TEST_2
, etc., the TEST
column needs to be modified before reshaping. Here, data.table
is used:
library(data.table) dcast(setDT(X)[, TEST := paste0("TEST_", TEST)], ... ~ TEST, value.var = "SCORE") # ID NAME SEX TEST_1 TEST_2 TEST_3 #1: 1 MIKE MALE 70 80 90 #2: 2 LUCY FEMALE 65 75 NA
which is in line with the expected answer given as data.frame Y
.