Question
I would like to reshape a dataset from long to wide. Specifically, the new wide dataset should have one row per unique ID in the long dataset, and its number of columns should depend on the number of unique values of another variable.
Let's say this is the original dataset:
ID a b C d e f g
1 1 1 1 1 2 3 4
1 1 1 2 5 6 7 8
2 2 2 1 1 2 3 4
2 2 2 3 9 0 1 2
2 2 2 2 5 6 7 8
3 3 3 3 9 0 1 2
3 3 3 2 5 6 7 8
3 3 3 1 1 2 3 4
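For anyone who wants to reproduce this, here is a minimal sketch that reads the table above into a data frame with base R. The object name `dat` is only an assumption made for illustration.

```r
# Read the example table into a data frame (the name `dat` is an assumption).
dat <- read.table(header = TRUE, text = "
ID a b C d e f g
1 1 1 1 1 2 3 4
1 1 1 2 5 6 7 8
2 2 2 1 1 2 3 4
2 2 2 3 9 0 1 2
2 2 2 2 5 6 7 8
3 3 3 3 9 0 1 2
3 3 3 2 5 6 7 8
3 3 3 1 1 2 3 4
")

# Quick sanity check on the shape: 8 rows, 8 columns.
dim(dat)
```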
In the new dataset, the number of rows is the number of IDs, and the number of columns is 3 plus 4 times the number of unique values found in variable C (here 3 + 4 × 3 = 15). The values from variables d to g are filled in after sorting variable C in ascending order. It should look something like this:
ID a b d1 e1 f1 g1 d2 e2 f2 g2 d3 e3 f3 g3
1 1 1 1 2 3 4 5 6 7 8 NA NA NA NA
2 2 2 1 2 3 4 5 6 7 8 9 0 1 2
3 3 3 1 2 3 4 5 6 7 8 9 0 1 2
Answer 1:
You can use dcast from data.table:
data.table::setDT(df)
data.table::dcast(df, ID + a + b ~ C, sep = "", value.var = c("d", "e", "f", "g"), fill=NA)
ID a b d1 d2 d3 e1 e2 e3 f1 f2 f3 g1 g2 g3
1: 1 1 1 1 5 NA 2 6 NA 3 7 NA 4 8 NA
2: 2 2 2 1 5 9 2 6 0 3 7 1 4 8 2
3: 3 3 3 1 5 9 2 6 0 3 7 1 4 8 2
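Note that this output groups the columns by variable (d1 d2 d3 e1 ...), whereas the question asks for them grouped by the value of C (d1 e1 f1 g1 d2 ...). One possible way to get the requested order is to reorder the columns afterwards with setcolorder. This is only a sketch: the names `wide`, `vars`, `times` and `new_order` are made up for illustration, and the data is rebuilt here so the snippet runs on its own.

```r
library(data.table)

# Example data from the question.
df <- data.table(
  ID = c(1, 1, 2, 2, 2, 3, 3, 3),
  a  = c(1, 1, 2, 2, 2, 3, 3, 3),
  b  = c(1, 1, 2, 2, 2, 3, 3, 3),
  C  = c(1, 2, 1, 3, 2, 3, 2, 1),
  d  = c(1, 5, 1, 9, 5, 9, 5, 1),
  e  = c(2, 6, 2, 0, 6, 0, 6, 2),
  f  = c(3, 7, 3, 1, 7, 1, 7, 3),
  g  = c(4, 8, 4, 2, 8, 2, 8, 4)
)

vars <- c("d", "e", "f", "g")
wide <- dcast(df, ID + a + b ~ C, sep = "", value.var = vars)

# Build the target order: all variables for C = 1, then C = 2, and so on.
# outer() fills column by column, so as.vector() yields d1 e1 f1 g1 d2 ...
times <- sort(unique(df$C))
new_order <- c("ID", "a", "b", as.vector(outer(vars, times, paste0)))

# Reorder the columns by reference.
setcolorder(wide, new_order)
names(wide)
```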
Answer 2:
A base R reshape version: just use C as your time variable and away you go.
reshape(dat, idvar=c("ID","a","b"), direction="wide", timevar="C", sep="")
# ID a b d1 e1 f1 g1 d2 e2 f2 g2 d3 e3 f3 g3
#1 1 1 1 1 2 3 4 5 6 7 8 NA NA NA NA
#3 2 2 2 1 2 3 4 5 6 7 8 9 0 1 2
#6 3 3 3 1 2 3 4 5 6 7 8 9 0 1 2
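One caveat, offered as a hedged sketch rather than part of the original answer: reshape builds the wide columns in the order in which the values of C first appear, so the ascending order above relies on C = 1 and C = 2 showing up before C = 3 in the data. If the rows might arrive in a different order, sorting by C before reshaping should keep the columns ascending. The shuffled input and the names `dat_sorted` and `wide` below are assumptions for illustration.

```r
# Same rows as the example, deliberately shuffled so that C = 3 comes first.
dat <- read.table(header = TRUE, text = "
ID a b C d e f g
2 2 2 3 9 0 1 2
1 1 1 2 5 6 7 8
3 3 3 1 1 2 3 4
1 1 1 1 1 2 3 4
3 3 3 3 9 0 1 2
2 2 2 2 5 6 7 8
3 3 3 2 5 6 7 8
2 2 2 1 1 2 3 4
")

# Sort by C (then ID) so the time values are first seen in ascending order.
dat_sorted <- dat[order(dat$C, dat$ID), ]

wide <- reshape(dat_sorted, idvar = c("ID", "a", "b"),
                direction = "wide", timevar = "C", sep = "")

# Drop the leftover row names inherited from the long data.
rownames(wide) <- NULL
names(wide)
```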
Source: https://stackoverflow.com/questions/37800717/r-partial-reshape-data-from-long-to-wide