Question
I have data like the SampleData below: lists of different lengths that I'd like to combine into a data frame like the Desired Result below. I've tried using lapply and cbind.na from the qpcR package, as in the example below, but for some reason it won't let me turn the result into a data frame. If I use cbind.na on just two of the lists, it combines them and adds the NA at the end like I want, but when I use it in lapply it just leaves them as a list of different-length lists. Any tips are greatly appreciated.
SampleData<-list(list(1,2,3),list(1,2),list(3,4,6,7))
Desired Result:
structure(list(V1 = c(1, 2, 3, NA), V2 = c(1, 2, NA, NA), V3 = c(3,
4, 6, 7)), .Names = c("V1", "V2", "V3"), row.names = c(NA, -4L
), class = "data.frame")
Example Code:
lapply(SampleData,qpcR:::cbind.na)
Answer 1:
My first instinct looking at your data is that, by using a data.frame, you are implicitly stating that items across a row are paired. That is, in your example, the "3" of $V1 and the "6" of $V3 are meant to be associated with each other. (If you look at mtcars, each column of the first row is associated directly and solely with the "Mazda RX4".) If this is not true, then warping them into a data.frame like this misrepresents your data and is likely to encourage incorrect analysis or assumptions.
Assuming that they are in fact "paired", my next instinct is to try something like do.call(cbind, SampleData), but that leads to recycled data, which is not what you want. So the trick to prevent recycling is to force them all to be the same length.
maxlen <- max(lengths(SampleData))
SampleData2 <- lapply(SampleData, function(lst) c(lst, rep(NA, maxlen - length(lst))))
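As a quick check of the padding step, here is a minimal sketch that redefines SampleData from the question so it runs on its own. After padding, every element should have the same length, with NA filling the tail of the shorter ones:

```r
SampleData <- list(list(1, 2, 3), list(1, 2), list(3, 4, 6, 7))

maxlen <- max(lengths(SampleData))
SampleData2 <- lapply(SampleData, function(lst) c(lst, rep(NA, maxlen - length(lst))))

# Every element is now a list of the same length (the longest original length)
padded_lengths <- lengths(SampleData2)
print(padded_lengths)

# The second element originally had two items; the rest of it is now NA
str(SampleData2[[2]])
```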
We can name the elements first, so that they become the column names:
names(SampleData2) <- paste("V", seq_along(SampleData2), sep = "")
Since the data appears homogeneous (and it should be, if you intend to make each element a column of a data.frame), it is useful to unlist each element into an atomic vector:
SampleData3 <- lapply(SampleData2, unlist)
Then it's as straightforward as:
as.data.frame(SampleData3)
#   V1 V2 V3
# 1  1  1  3
# 2  2  2  4
# 3  3 NA  6
# 4 NA NA  7
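If this comes up often, the steps above could be wrapped in a small helper. This is only a sketch: the function name lists_to_padded_df is hypothetical (it is not part of base R or qpcR), and it assumes each inner list is flat and holds values of a single type.

```r
# Hypothetical helper (not from the original answer): pad each inner list
# with NA up to the longest length, then build a data.frame with columns
# named V1, V2, ...
lists_to_padded_df <- function(lists) {
  maxlen <- max(lengths(lists))
  padded <- lapply(lists, function(lst) {
    vec <- unlist(lst)                      # flatten list(1, 2) to c(1, 2)
    c(vec, rep(NA, maxlen - length(vec)))   # pad the tail with NA
  })
  names(padded) <- paste0("V", seq_along(padded))
  as.data.frame(padded)
}

SampleData <- list(list(1, 2, 3), list(1, 2), list(3, 4, 6, 7))
result <- lists_to_padded_df(SampleData)
print(result)
```

Using seq_along() for the names means the helper does not depend on there being exactly three lists.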
Answer 2:
Here is a modified version using length<- assignment:
setNames(do.call(cbind.data.frame, lapply(lapply(SampleData, unlist),
`length<-`, max(lengths(SampleData)))), paste0("V", 1:3))
#  V1 V2 V3
#1  1  1  3
#2  2  2  4
#3  3 NA  6
#4 NA NA  7
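The reason this works is that assigning a longer length to a vector extends it with NA. A minimal standalone sketch of that behavior, using made-up vectors x and y for illustration:

```r
x <- c(1, 2)

# Replacement form: extends x to length 4, filling the new slots with NA
length(x) <- 4
print(x)

# Functional form, which is what lapply() calls in the one-liner above:
# it takes the vector and the new length, and returns the extended copy
y <- `length<-`(c(3, 4, 6), 4)
print(y)
```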
Source: https://stackoverflow.com/questions/42774843/combining-lists-of-different-lengths-into-data-frame