Question
I have a data frame that contains lists, like below:
# Load packages
library(dplyr)
# Create data frame
df <- structure(list(ID = 1:3,
A = structure(list(c(9, 8), c(7,6), c(6, 9)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
B = structure(list(c(3, 5), c(2, 6), c(1, 5)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
C = structure(list(c(6, 5), c(7, 6), c(8, 7)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr")),
D = structure(list(c(5, 3), c(4, 1), c(6, 5)), ptype = numeric(0), class = c("vctrs_list_of", "vctrs_vctr"))),
row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
# Peek at data
df
#> # A tibble: 3 x 5
#>      ID A         B         C         D
#>   <int> <list>    <list>    <list>    <list>
#> 1     1 <dbl [2]> <dbl [2]> <dbl [2]> <dbl [2]>
#> 2     2 <dbl [2]> <dbl [2]> <dbl [2]> <dbl [2]>
#> 3     3 <dbl [2]> <dbl [2]> <dbl [2]> <dbl [2]>
I'd like to unnest the lists, and I can do so using pmap_dfr.
# Expand rows
df %>% purrr::pmap_dfr(function(...)data.frame(...))
#>   ID A B C D
#> 1  1 9 3 6 5
#> 2  1 8 5 5 3
#> 3  2 7 2 7 4
#> 4  2 6 6 6 1
#> 5  3 6 1 8 6
#> 6  3 9 5 7 5
Created on 2019-06-28 by the reprex package (v0.3.0)
This is the desired result, but it seems to be reinventing the wheel, because tidyr::unnest is designed to flatten list columns back into regular columns. Using tidyr::unnest produces the following error, however:
df %>% tidyr::unnest(cols = c(A, B, C, D))
#Error: No common type for `x` <tbl_df<A:double>> and `y` <double>.
#Call `rlang::last_error()` to see a backtrace
How would I apply unnest in this case to flatten my data frame with list columns?
Version information
> packageVersion("tidyr")
[1] ‘0.8.3.9000’
Answer 1:
Note: Hadley Wickham has flagged this issue on GitHub as a bug in tidyr version 0.8.3.9000 (see here). I'll leave the answer below as a potential workaround until the issue is fixed.
It looks like nest is used more specifically to create list-columns of data frames in 0.8.3.9000. From the docs: "Nesting creates a list-column of data frames; unnesting flattens it back out into regular columns." For example, try:
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1) %>%
nest(data = c(y, z))
Which returns:
# A tibble: 3 x 2
x data
<dbl> <list<df[,2]>>
1 1 [2]
2 2 [2]
3 3 [2]
Then look at df$data:
<list_of<
tbl_df<
y: integer
z: integer
>
>[3]>
[[1]]
# A tibble: 3 x 2
y z
<int> <int>
1 1 6
2 2 5
3 3 4
[[2]]
# A tibble: 2 x 2
y z
<int> <int>
1 4 3
2 5 2
[[3]]
# A tibble: 1 x 2
y z
<int> <int>
1 6 1
Your data frame's columns are list-columns of vectors, which seem to fall under the purview of chop, which shortens a data frame while preserving its width. For example, try:
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1) %>%
chop(c(y, z))
Which returns:
# A tibble: 3 x 3
x y z
<dbl> <list> <list>
1 1 <int [3]> <int [3]>
2 2 <int [2]> <int [2]>
3 3 <int [1]> <int [1]>
And take a look at df$y:
[[1]]
[1] 1 2 3
[[2]]
[1] 4 5
[[3]]
[1] 6
Knowing this, the appropriate method for your data would be chop's counterpart, unchop. So given your data frame:
# A tibble: 3 x 5
ID A B C D
<int> <list<dbl>> <list<dbl>> <list<dbl>> <list<dbl>>
1 1 [2] [2] [2] [2]
2 2 [2] [2] [2] [2]
3 3 [2] [2] [2] [2]
Try unchop(df, c(A, B, C, D)) or unchop(df, A:D), which should return:
# A tibble: 6 x 5
ID A B C D
<int> <dbl> <dbl> <dbl> <dbl>
1 1 9 3 6 5
2 1 8 5 5 3
3 2 7 2 7 4
4 2 6 6 6 1
5 3 6 1 8 6
6 3 9 5 7 5
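For anyone who wants to reproduce this end to end, here is a minimal sketch. It assumes a released tidyr (1.0.0 or later) that exports unchop, and it rebuilds the question's data with plain list columns rather than the original vctrs list_of columns, so the column types print slightly differently. The name flat is just an illustrative variable.

```r
library(tibble)
library(tidyr)

# Assumption: plain list-columns of numeric vectors standing in for the
# question's vctrs list_of columns (same values, simpler construction)
df <- tibble(
  ID = 1:3,
  A = list(c(9, 8), c(7, 6), c(6, 9)),
  B = list(c(3, 5), c(2, 6), c(1, 5)),
  C = list(c(6, 5), c(7, 6), c(8, 7)),
  D = list(c(5, 3), c(4, 1), c(6, 5))
)

# unchop() expands the selected list-columns in parallel:
# each element of a vector becomes its own row, and ID is repeated to match
flat <- unchop(df, c(A, B, C, D))

flat
```

As far as I know, in released versions of tidyr (1.0.0 and later) unnest(df, cols = c(A, B, C, D)) should also handle list-columns of vectors like these, so the workaround mainly matters for the development version mentioned in the question.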
Source: https://stackoverflow.com/questions/56811233/unnesting-a-data-frame-containing-lists