Convert Nested List into data.frame with different column length

Submitted by 拟墨画扇 on 2019-12-13 08:07:46

Question


I am trying to convert the nested list below into a data.frame, without success. The main complication is that the "results" element is inconsistent across positions: position 1 holds a data.frame, while position 2 holds an empty list because no result was returned.

The item length is inconsistent across positions:

[[1]]
[[1]]$html_attributions
list()

[[1]]$results
  geometry.location.lat geometry.location.lng
1              25.66544             -100.4354
                                        id                    place_id
1 6ce0a030663144c8e992cbce51eb00479ef7db89 ChIJVy7b7FW9YoYRdaH2I_gOJIk
                                                                                                                                                                                       reference
1 CmRSAAAATdtVfB4Tz1aQ8GhGaw4-nRJ5lZlVNgiOR3ciF4QjmYC56bn6b7omWh1SJEWWqQQEFNXxGZndgEwSgl8sRCOtdF8aXpngUY878Q__yH4in8EMZMCIqSHLARqNgGlV4mKgEhDlvkHLXLiBW4F_KQVT83jIGhS5DJipk6PAnpPDXP2p-4X5NPuG9w

[[1]]$status
[1] "OK"

[[2]]
[[2]]$html_attributions
list()

[[2]]$results
list()

[[2]]$status
[1] "ZERO_RESULTS"

I tried the following approaches, but none of them work.

#1
m1 <- do.call(rbind, lapply(myDataFrames, function(y) do.call(rbind, y)))
relist(m1, skeleton = myDataFrames)

#2
relist(matrix(unlist(myDataFrames), ncol = 4, byrow = T), skeleton = myDataFrames)

#3
library(data.table)

df<-rbindlist(myDataFrames, idcol = "index")
df<-rbindlist(myDataFrames, fill=TRUE)

#4 
myDataFrame <- do.call(rbind.data.frame, c(myDataFrames, list(stringsAsFactors = FALSE)))

Answer 1:


I think I have enough of the original JSON to be able to create a reproducible example:

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
jsons <- list(okjson, emptyjson, okjson)

From here, I'll step (slowly) through the process. I've included much of the intermediate structure for reproducibility; I apologize for the verbosity. The steps can easily be grouped together and/or put within a magrittr pipeline.

lists <- lapply(jsons, jsonlite::fromJSON)
str(lists)
# List of 3
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          :'data.frame': 1 obs. of  1 variable:
#   .. ..$ geometry:'data.frame':   1 obs. of  3 variables:
#   .. .. ..$ location:'data.frame':    1 obs. of  2 variables:
#   .. .. .. ..$ lat: num 25.7
#   .. .. .. ..$ lon: num -100
#   .. .. ..$ id      : chr "foo"
#   .. .. ..$ place_id: chr "quux"
#   ..$ status           : chr "OK"
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          : list()
#   ..$ status           : chr "ZERO_RESULTS"
#  $ :List of 3
#   ..$ html_attributions: list()
#   ..$ results          :'data.frame': 1 obs. of  1 variable:
#   .. ..$ geometry:'data.frame':   1 obs. of  3 variables:
#   .. .. ..$ location:'data.frame':    1 obs. of  2 variables:
#   .. .. .. ..$ lat: num 25.7
#   .. .. .. ..$ lon: num -100
#   .. .. ..$ id      : chr "foo"
#   .. .. ..$ place_id: chr "quux"
#   ..$ status           : chr "OK"
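
The structure above shows where the binding attempts in the question run into trouble: "results" is a data.frame where matches were returned, and an empty list() where none were. A small sketch that makes the difference explicit, one row per top-level entry. The helper name summarize_entry is hypothetical and only used for illustration.

```r
library(jsonlite)

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
lists <- lapply(list(okjson, emptyjson, okjson), fromJSON)

# Hypothetical helper: summarize one parsed response.
summarize_entry <- function(a) {
  data.frame(
    status        = a$status,
    results_class = class(a$results)[1],  # "data.frame" or "list"
    n_results     = NROW(a$results),      # 0 for an empty list()
    stringsAsFactors = FALSE
  )
}

overview <- do.call(rbind, lapply(lists, summarize_entry))
print(overview)
```

Checking the class of "results" up front like this makes it easier to decide which entries can be flattened and which should be skipped.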


goodlists <- Filter(function(a) "results" %in% names(a) && length(a$results) > 0, lists)
goodresults <- lapply(goodlists, `[[`, "results")
str(goodresults)
# List of 2
#  $ :'data.frame': 1 obs. of  1 variable:
#   ..$ geometry:'data.frame':  1 obs. of  3 variables:
#   .. ..$ location:'data.frame':   1 obs. of  2 variables:
#   .. .. ..$ lat: num 25.7
#   .. .. ..$ lon: num -100
#   .. ..$ id      : chr "foo"
#   .. ..$ place_id: chr "quux"
#  $ :'data.frame': 1 obs. of  1 variable:
#   ..$ geometry:'data.frame':  1 obs. of  3 variables:
#   .. ..$ location:'data.frame':   1 obs. of  2 variables:
#   .. .. ..$ lat: num 25.7
#   .. .. ..$ lon: num -100
#   .. ..$ id      : chr "foo"
#   .. ..$ place_id: chr "quux"
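
Filter discards the empty entries, so the surviving elements no longer carry their original positions. If those positions matter (for example, to match results back to the queries that produced them), one option is to name the list by position before filtering. A sketch under that assumption, reusing the sample JSON; the helper name has_results is hypothetical.

```r
library(jsonlite)

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
lists <- lapply(list(okjson, emptyjson, okjson), fromJSON)

# Name each entry by its position so the index survives filtering.
names(lists) <- seq_along(lists)

# Same predicate as in the Filter step, given a (hypothetical) name.
has_results <- function(a) "results" %in% names(a) && length(a$results) > 0

goodlists <- Filter(has_results, lists)
print(names(goodlists))  # original positions of the kept entries

# Entries that were dropped, together with their status.
dropped <- lists[!vapply(lists, has_results, logical(1))]
dropped_status <- vapply(dropped, function(a) a$status, character(1))
print(dropped_status)
```

Filter subsets the list with ordinary indexing, so the names are carried through unchanged.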

goodresultsdf <- lapply(goodresults, function(a) jsonlite::flatten(as.data.frame(a)))
str(goodresultsdf)
# List of 2
#  $ :'data.frame': 1 obs. of  4 variables:
#   ..$ geometry.id          : chr "foo"
#   ..$ geometry.place_id    : chr "quux"
#   ..$ geometry.location.lat: num 25.7
#   ..$ geometry.location.lon: num -100
#  $ :'data.frame': 1 obs. of  4 variables:
#   ..$ geometry.id          : chr "foo"
#   ..$ geometry.place_id    : chr "quux"
#   ..$ geometry.location.lat: num 25.7
#   ..$ geometry.location.lon: num -100

We now have a list of data.frames with matching columns, which is a good place to be: they can be row-bound directly.

do.call(rbind.data.frame, c(goodresultsdf, stringsAsFactors = FALSE))
#   geometry.id geometry.place_id geometry.location.lat geometry.location.lon
# 1         foo              quux              25.66544             -100.4354
# 2         foo              quux              25.66544             -100.4354
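
As noted earlier, these steps can be grouped together. Below is a minimal sketch of one way to do that, with two additions that are assumptions rather than part of the walk-through above: an "index" column recording each row's original position, and padding with NA when results carry different sets of columns. The function name results_to_df and the extrajson sample are hypothetical.

```r
library(jsonlite)

# Hypothetical helper (not from the original answer): parse, drop empty
# results, flatten, and row-bind, keeping the original position in "index".
results_to_df <- function(jsons) {
  lists <- lapply(jsons, fromJSON)
  pieces <- lapply(seq_along(lists), function(i) {
    res <- lists[[i]]$results
    # Empty results come back as list(), not as a data.frame.
    if (!is.data.frame(res) || nrow(res) == 0) return(NULL)
    data.frame(index = i, jsonlite::flatten(res), stringsAsFactors = FALSE)
  })
  pieces <- Filter(Negate(is.null), pieces)
  if (length(pieces) == 0) return(data.frame())

  # Assumption: results may differ in columns, so pad each piece with NA
  # up to the union of all column names before binding.
  allcols <- unique(unlist(lapply(pieces, names)))
  pieces <- lapply(pieces, function(d) {
    for (col in setdiff(allcols, names(d))) d[[col]] <- NA
    d[allcols]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

okjson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":25.66544,"lon":-100.4354},"id":"foo","place_id":"quux"}}],"status":"OK"}'
emptyjson <- '{"html_attributions":[],"results":[],"status":"ZERO_RESULTS"}'
# Hypothetical extra sample with a different set of fields.
extrajson <- '{"html_attributions":[],"results":[{"geometry":{"location":{"lat":1.5,"lon":2.5}},"name":"bar"}],"status":"OK"}'

combined <- results_to_df(list(okjson, emptyjson, extrajson))
print(combined)
```

Padding to the union of column names is only one possible design; data.table::rbindlist(fill = TRUE), which the question already tried on the raw list, should serve the same purpose once it is given the flattened data.frames instead.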


Source: https://stackoverflow.com/questions/46818672/convert-nested-list-into-data-frame-with-different-column-length
