Parse multiple XBRL files stored in a zip file

南楼画角 submitted on 2019-12-03 22:28:33

Following Karsten's suggestion in the comments, I unzipped the files to a temporary directory and then parsed each file. I used the snow package to run the parsing in parallel.

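  library(snow)  # provides makeCluster, clusterCall, parLapply, stopCluster

  # `temp` is assumed to be the directory holding the downloaded zip files
  # Parse one zip file to start
  fls <- list.files(temp)[[1]]

  # Unzip into the session's temporary directory; unzip() returns the extracted paths
  tmp <- tempdir()
  lst <- unzip(file.path(temp, fls), exdir = tmp)

  # Only parse the first 10 records (head() avoids NAs if there are fewer than 10)
  inst <- head(lst, 10)

  # Start one worker per core and load XBRL on each worker
  cl <- makeCluster(parallel::detectCores())
  clusterCall(cl, function() library(XBRL))

  # Time the parallel parse
  st <- Sys.time()

  out <- parLapply(cl, inst, function(i)
                                  xbrlDoAll(i,
                                            cache.dir = "temp/hmrcCache",
                                            prefix.out = NULL, verbose = TRUE))

  stopCluster(cl)

  Sys.time() - st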

(I am not sure that I am using tempdir() correctly, as this seems to save large amounts of data to the Local\Temp directory. I would welcome comments if I have approached this incorrectly, thanks.)
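
On the tempdir() concern: tempdir() returns R's per-session temporary directory, so everything unzipped into it stays on disk until it is deleted or the session ends. One possible way to keep this under control is to extract into a dedicated subdirectory and remove it once parsing is done. The sketch below is only an illustration: `with_unzipped` is a hypothetical helper name, not part of XBRL or snow, and the commented usage assumes the same `temp` and `fls` as above.

```r
# Hypothetical helper (not part of XBRL or snow): unzip into a private
# subdirectory of tempdir(), run `fun` on the extracted paths, then delete
# the subdirectory again, even if `fun` raises an error.
with_unzipped <- function(zipfile, fun) {
  exdir <- tempfile("xbrl_unzip_")        # unique path inside tempdir()
  dir.create(exdir)
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)

  files <- unzip(zipfile, exdir = exdir)  # character vector of extracted paths
  fun(files)
}

# Possible usage (sequential, assuming `temp` and `fls` as defined earlier):
# out <- with_unzipped(file.path(temp, fls), function(files)
#   lapply(head(files, 10), XBRL::xbrlDoAll,
#          cache.dir = "temp/hmrcCache", prefix.out = NULL))
```

The parsed results are returned before the cleanup runs, so only the extracted instance files are discarded; the taxonomy cache in cache.dir is left in place.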
