So I have the data.frame
dat = data.frame(x = c('Sir Lancelot the Brave', 'King Arthur',
'The Black Knight', 'The Rabbit'), stri
Here is a nice and simple approach with tidyr.
library(tidyr)
# Largest number of space-separated pieces in any row of x
ncol <- max(lengths(strsplit(as.character(dat$x), " ")))
dat %>%
  separate(x, paste0("V", seq_len(ncol)))
Note: You will get a warning, but it is only telling you that separate is padding the shorter rows with NAs, so you can safely ignore it (or pass fill = "right" to separate to silence it).
Here's one option. The single complication is that you first need to convert each vector to a one-row data.frame, since data.frames are what rbind.fill() expects.
library(plyr)
rbind.fill(lapply(sbt, function(X) data.frame(t(X))))
#     X1       X2     X3    X4
# 1  Sir Lancelot    the Brave
# 2 King   Arthur   <NA>  <NA>
# 3  The    Black Knight  <NA>
# 4  The   Rabbit   <NA>  <NA>
My own inclination, though, would be to just use base R, like this:
n <- max(sapply(sbt, length))
l <- lapply(sbt, function(X) c(X, rep(NA, n - length(X))))
data.frame(t(do.call(cbind, l)))
#     X1       X2     X3    X4
# 1  Sir Lancelot    the Brave
# 2 King   Arthur   <NA>  <NA>
# 3  The    Black Knight  <NA>
# 4  The   Rabbit   <NA>  <NA>
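A slightly shorter base R variant relies on the fact that indexing a vector past its end returns NA, so no explicit rep() padding is needed. This is only a sketch: it assumes dat$x is a character column (the stringsAsFactors = FALSE setting below is an assumption), and it rebuilds dat and sbt so that it runs on its own. The names m and res are illustrative.

```r
# Example data; stringsAsFactors = FALSE is an assumption about the original call
dat <- data.frame(x = c("Sir Lancelot the Brave", "King Arthur",
                        "The Black Knight", "The Rabbit"),
                  stringsAsFactors = FALSE)

sbt <- strsplit(dat$x, " ")   # list of character vectors, one per row
n <- max(lengths(sbt))        # length of the longest split

# v[seq_len(n)] pads short vectors with NA, because out-of-range
# positions of a character vector are returned as NA_character_
m <- t(vapply(sbt, function(v) v[seq_len(n)], character(n)))

# One row per original string, columns V1..Vn
res <- as.data.frame(m, stringsAsFactors = FALSE)
res
```

vapply() is used instead of sapply() so that the result is guaranteed to be a matrix with n rows, even if every string happens to split into the same number of pieces.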
sbt = strsplit(dat$x, " ")
sbt
#[[1]]
#[1] "Sir" "Lancelot" "the" "Brave"
#[[2]]
#[1] "King" "Arthur"
#[[3]]
#[1] "The" "Black" "Knight"
#[[4]]
#[1] "The" "Rabbit"
ncol = max(sapply(sbt,length))
ncol
# [1] 4
library(data.table)  # provides as.data.table()
as.data.table(lapply(1:ncol, function(i) sapply(sbt, "[", i)))
#      V1       V2     V3    V4
# 1:  Sir Lancelot    the Brave
# 2: King   Arthur     NA    NA
# 3:  The    Black Knight    NA
# 4:  The   Rabbit     NA    NA
Using data.table, as it appears you are trying to use it.
library(data.table)
DT <- data.table(dat)
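# One row per word, grouped by the original string
DTB <- DT[, list(y = unlist(strsplit(x, ' '))), by = x]
# Template of NAs as long as the longest split, named V1, V2, ...
new <- rep(NA_character_, DTB[, .N, by = x][which.max(N), N])
names(new) <- paste0('V', seq_along(new))
# Fill a copy of the template with each group's words and return it as one row
DTB[, {.new <- new
       .new[seq_len(.N)] <- y
       as.list(.new)}, by = x]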
Or using reshape2's dcast to reshape:
library(reshape2)
dcast(DTB[,list(id = seq_len(.N),y),by= x ], x ~id, value.var = 'y')
This is an old question, I know, but I thought I would share two additional options.
concat.split from my "splitstackshape" package was designed exactly for this type of thing.
library(splitstackshape)
concat.split(dat, "x", " ")
#                        x  x_1      x_2    x_3   x_4
# 1 Sir Lancelot the Brave  Sir Lancelot    the Brave
# 2            King Arthur King   Arthur
# 3       The Black Knight  The    Black Knight
# 4             The Rabbit  The   Rabbit
data.table has recently (as of version 1.8.11, I believe) had some additions to its arsenal, notably in this case dcast.data.table. To use it, unlist the split data (as was done in @mnel's answer), create a "time" variable using .N (how many new values per row), and use dcast.data.table to transform the data into the form you are looking for.
library(data.table)
library(reshape2)
packageVersion("data.table")
# [1] ‘1.8.11’
DT <- data.table(dat)
S1 <- DT[, list(X = unlist(strsplit(x, " "))), by = seq_len(nrow(DT))]
S1[, Time := sequence(.N), by = seq_len]
dcast.data.table(S1, seq_len ~ Time, value.var="X")
#    seq_len    1        2      3     4
# 1:       1  Sir Lancelot    the Brave
# 2:       2 King   Arthur     NA    NA
# 3:       3  The    Black Knight    NA
# 4:       4  The   Rabbit     NA    NA
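The same three steps (unlist the pieces, number them within each row, cast to wide) can also be sketched without any packages, using reshape() from base R. This is not the data.table method above, just the same idea in another form. The names parts, long, and wide are illustrative only, and dat is rebuilt here with character strings as an assumption.

```r
# Example data; stringsAsFactors = FALSE is an assumption about the original call
dat <- data.frame(x = c("Sir Lancelot the Brave", "King Arthur",
                        "The Black Knight", "The Rabbit"),
                  stringsAsFactors = FALSE)

parts <- strsplit(dat$x, " ")

# Long format: one row per word, with the source row (id) and the
# position of the word within that row (time)
long <- data.frame(
  id   = rep(seq_along(parts), lengths(parts)),
  time = sequence(lengths(parts)),
  X    = unlist(parts),
  stringsAsFactors = FALSE
)

# Cast to wide: one row per id, one column per time value (X.1, X.2, ...);
# combinations that do not occur are filled with NA
wide <- reshape(long, idvar = "id", timevar = "time", direction = "wide")
wide
```

The id column plays the role of seq_len in the data.table version, and time corresponds to the Time variable created with sequence(.N).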