Another data.table
solution, one that doesn't rely on any unique field existing in the original data:
library(data.table)
DT = data.table(read.table(header = TRUE, text = "blah | splitme
T | a,b,c
T | a,c
F | b,d
F | e,f", stringsAsFactors = FALSE, sep = "|", strip.white = TRUE))
DT[,.( blah
, splitme
, splitted=unlist(strsplit(splitme, ","))
),by=seq_len(nrow(DT))]
The important part is by=seq_len(nrow(DT)): it acts as the 'fake' unique ID that the splitting is grouped on, so the split happens once per original row. It's tempting to use by=.I instead, since it seems like it should mean the same thing, but .I is a special symbol that is only guaranteed to be defined inside j, so whether it behaves as a row index in by depends on the data.table version; it's safer to stick with by=seq_len(nrow(DT)).
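If you'd rather avoid that question entirely, an equivalent approach is to materialise an explicit row-id column first (using .I in j, where it is well defined) and then group on that column. A minimal sketch, assuming the column name rid and the result name out (both are mine):
DT[, rid := .I]                  # .I is fine here: we are in j, not in by
out <- DT[, .( blah
             , splitme
             , splitted = unlist(strsplit(splitme, ","))
             ), by = rid]
DT[, rid := NULL]                # drop the helper column again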
The output has three columns: the two existing columns are kept under their own names, and the third is computed by splitting splitme on the comma:
.( blah # first column of original
, splitme # second column of original
, splitted = unlist(strsplit(splitme, ",")) # one output row per comma-separated element
)
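As a quick sanity check, every comma-separated element should become exactly one row in the result, so the row counts have to match. A small sketch, assuming the one-liner above is saved to a variable res (the name is mine):
res <- DT[, .( blah
             , splitme
             , splitted = unlist(strsplit(splitme, ","))
             ), by = seq_len(nrow(DT))]
stopifnot(nrow(res) == sum(lengths(strsplit(DT$splitme, ","))))   # 9 rows for this example data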