I have code that at one place ends up with a list of data frames which I really want to convert to a single big data frame.
I got some pointers from an earlier ques
For the purpose of completeness, I thought the answers to this question required an update. "My guess is that using do.call("rbind", ...)
is going to be the fastest approach that you will find..." It was probably true for May 2010 and some time after, but in about Sep 2011 a new function rbindlist
was introduced in the data.table
package version 1.8.2, with a remark that "This does the same as do.call("rbind",l)
, but much faster". How much faster?
library(rbenchmark)
benchmark(
do.call = do.call("rbind", listOfDataFrames),
plyr_rbind.fill = plyr::rbind.fill(listOfDataFrames),
plyr_ldply = plyr::ldply(listOfDataFrames, data.frame),
data.table_rbindlist = as.data.frame(data.table::rbindlist(listOfDataFrames)),
replications = 100, order = "relative",
columns=c('test','replications', 'elapsed','relative')
)
test replications elapsed relative
4 data.table_rbindlist 100 0.11 1.000
1 do.call 100 9.39 85.364
2 plyr_rbind.fill 100 12.08 109.818
3 plyr_ldply 100 15.14 137.636