Processing JSON using rjson

折月煮酒 submitted on 2019-12-13 14:17:32

Question


I'm trying to process some data in JSON format. rjson::fromJSON imports the data successfully and places it into a quite unwieldy list.

library(rjson)
y <- fromJSON(file="http://api.lmiforall.org.uk/api/v1/wf/predict/breakdown/region?soc=6145&minYear=2014&maxYear=2020")
str(y)
List of 3
 $ soc                : num 6145
 $ breakdown          : chr "region"
 $ predictedEmployment:List of 7
  ..$ :List of 2
  .. ..$ year     : num 2014
  .. ..$ breakdown:List of 12
  .. .. ..$ :List of 3
  .. .. .. ..$ code      : num 1
  .. .. .. ..$ name      : chr "London"
  .. .. .. ..$ employment: num 74910
  .. .. ..$ :List of 3
  .. .. .. ..$ code      : num 7
  .. .. .. ..$ name      : chr "Yorkshire and the Humber"
  .. .. .. ..$ employment: num 61132
  ...

However, as this is essentially tabular data, I would like it in a succinct data.frame. After much trial and error I have the result:

y.p <- do.call(rbind,lapply(y[[3]], function(p) cbind(p$year,do.call(rbind,lapply(p$breakdown, function(q) data.frame(q$name,q$employment,stringsAsFactors=F))))))
head(y.p)
  p$year                   q.name q.employment
1   2014                   London     74909.59
2   2014 Yorkshire and the Humber     61131.62
3   2014     South West (England)     65833.57
4   2014                    Wales     33002.64
5   2014  West Midlands (England)     68695.34
6   2014     South East (England)     98407.36

But the command seems overly fiddly and complex. Is there a simpler way of doing this?


Answer 1:


I am not sure this is simpler, but the result is more complete and, I think, easier to read. The idea is to use Map: for each (year, breakdown) pair, bind the breakdown records into a single table and then combine that table with the year.

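dat <- y[[3]]
## for each year, bind its region records into one table and prepend the year
res <- Map(function(regions, yr)
               data.frame(year = yr,
                          do.call(rbind, lapply(regions, as.data.frame))),
           lapply(dat, `[[`, "breakdown"),
           lapply(dat, `[[`, "year"))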
## transform the list to a big data.frame
do.call(rbind,res)
   year code                     name employment
1  2014    1                   London   74909.59
2  2014    7 Yorkshire and the Humber   61131.62
3  2014    4     South West (England)   65833.57
4  2014   10                    Wales   33002.64
5  2014    5  West Midlands (England)   68695.34
6  2014    2     South East (England)   98407.36
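
The Map idea can also be tried without network access on a small hand-made list. This is only a sketch: the list `mock` and its employment figures are invented for illustration and merely mimic the shape of `y` from the question. The argument names `regions` and `yr` are chosen here for readability, and `stringsAsFactors = FALSE` is an addition so that the names stay character on R versions before 4.0.

```r
# Hypothetical stand-in for y: same nesting as the API result, made-up numbers.
mock <- list(
  soc = 6145,
  breakdown = "region",
  predictedEmployment = list(
    list(year = 2014,
         breakdown = list(
           list(code = 1, name = "London", employment = 100),
           list(code = 7, name = "Wales",  employment = 200))),
    list(year = 2015,
         breakdown = list(
           list(code = 7, name = "Wales",  employment = 210),
           list(code = 1, name = "London", employment = 110)))
  )
)

dat <- mock[[3]]

# One data.frame per year: bind that year's region rows, then prepend the year.
res <- Map(function(regions, yr) {
         rows <- lapply(regions, as.data.frame, stringsAsFactors = FALSE)
         data.frame(year = yr, do.call(rbind, rows))
       },
       lapply(dat, `[[`, "breakdown"),
       lapply(dat, `[[`, "year"))

# Stack the per-year tables into one data.frame.
out <- do.call(rbind, res)
print(out)
```

Because every inner record has the same three fields, `rbind` can stack them directly; records with missing fields would need extra handling that this sketch does not attempt.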



Answer 2:


Here I recover the geometry of the list, that is, the number of years (ni) and the number of regions per year (nj)

ni <- seq_along(y[[3]])
nj <- seq_along(y[[c(3, 1, 2)]])
nij <- as.matrix(expand.grid(3, ni=ni, 2, nj=nj))

then extract the relevant variable information using the rows of nij as an index into the nested list

data <- apply(nij, 1, function(ij) y[[ij]])
year <- apply(cbind(nij[,1:2], 1), 1, function(ij) y[[ij]])

and make it into a more friendly structure

data.frame(year, do.call(rbind, data))
   year code                     name employment
1  2014    1                   London   74909.59
2  2015    5  West Midlands (England)   69132.34
3  2016   12         Northern Ireland   24313.94
4  2017    5  West Midlands (England)    71723.4
5  2018    9     North East (England)   27199.99
6  2019    4     South West (England)   71219.51
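
The indexing in this answer relies on the fact that `[[` with a vector index descends one level per element, so `y[[c(3, 1, 2)]]` is the same as `y[[3]][[1]][[2]]`. A tiny sketch with an invented list `toy` (not the API data) illustrates this. It also shows that `expand.grid` varies its earliest varying column fastest, which is why consecutive rows of the result above step through the years rather than through the regions of a single year.

```r
# Hypothetical nested list, invented for illustration only.
toy <- list("a", "b",
            list(list(year = 2014, breakdown = list("r1", "r2")),
                 list(year = 2015, breakdown = list("r3", "r4"))))

# A vector index in [[ ]] is applied one level at a time.
same <- identical(toy[[c(3, 2, 1)]], toy[[3]][[2]][[1]])  # year of the 2nd entry

# Index matrix: 3rd element, entry ni, 2nd field (breakdown), region nj.
ni  <- seq_along(toy[[3]])
nj  <- seq_along(toy[[c(3, 1, 2)]])
nij <- as.matrix(expand.grid(3, ni = ni, 2, nj = nj))

# Each row of nij picks out one region; ni changes fastest down the rows.
regions <- apply(nij, 1, function(ij) toy[[ij]])
# Replacing the last two index positions with 1 picks out the matching year.
years   <- apply(cbind(nij[, 1:2], 1), 1, function(ij) toy[[ij]])

print(data.frame(year = years, region = regions))
```

If rows grouped by year are preferred, sorting the final data.frame by `year` afterwards is one option.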


Source: https://stackoverflow.com/questions/17674623/processing-json-using-rjson
