Can't use jsonlite in R to read json format file

耗尽温柔 submitted on 2019-12-11 15:24:16

Question


I can't read the .json file in R, although I can open it in a browser.

The data is at this URL:

https://data.kcg.gov.tw/dataset/7999ac19-e7dc-496a-9b7d-bd8daec107bd/resource/19d06299-a80c-42c2-a9b8-63d4466161a0/download/priceshistory_20160101-20161231.json

Here is my code.

library(jsonlite)
link <- "https://data.kcg.gov.tw/dataset/7999ac19-e7dc-496a-9b7d-bd8daec107bd/resource/19d06299-a80c-42c2-a9b8-63d4466161a0/download/priceshistory_20160101-20161231.json"
kh <- fromJSON(link)

Error in open.connection(con, "rb") : Couldn't connect to server

Any help would be appreciated.

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Answer 1:


Your main error is very likely the firewall issue others have pointed out. You may be able to use httr to triage better:

library(httr)
library(jsonlite)

link <- "https://data.kcg.gov.tw/dataset/7999ac19-e7dc-496a-9b7d-bd8daec107bd/resource/19d06299-a80c-42c2-a9b8-63d4466161a0/download/priceshistory_20160101-20161231.json"

The connection, here, worked for me but the data has some issues (which is the main reason I posted this answer):

kh <- jsonlite::fromJSON(link)
## Error in parse_con(txt, bigint_as_char) : 
##   lexical error: invalid char in json text.
##                                        [   {     "result":{       "
##                      (right here) ------^
## In addition: Warning message:
## JSON string contains (illegal) UTF8 byte-order-mark! 

That error means the UTF-8 byte-order mark (BOM) at the start of the file wasn't removed, so we'll have to strip it ourselves.

Here's a way you can triage the connection a bit using httr::GET():

httr::GET(
  link, 
  progress(), # it's a 13MB file on a slow connection for North America, so this helps
  verbose()   # this lets you see the connection info to make sure nothing is wrong
) -> res

This had no errors so I'm not pasting the verbose output, but you should look at the verbose output and see what HTTP errors show up. That may help diagnose any proxy/firewall issues. Using the latest curl and httr packages may also help get through this as they play nicer with Windows OS now.
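If the verbose output is too noisy, a smaller check is to separate "could not connect at all" from "connected, but the server returned an HTTP error". The following is only a sketch: the 30-second timeout is an arbitrary assumption, and the URL may no longer be live.

```r
library(httr)

link <- "https://data.kcg.gov.tw/dataset/7999ac19-e7dc-496a-9b7d-bd8daec107bd/resource/19d06299-a80c-42c2-a9b8-63d4466161a0/download/priceshistory_20160101-20161231.json"

# A connection-level failure (firewall, proxy, DNS) raises an R error before
# any HTTP response exists, so catch it and keep NULL as a marker.
res <- tryCatch(
  httr::GET(link, httr::timeout(30)),  # 30 s is an assumed, arbitrary limit
  error = function(e) {
    message("Connection failed before any HTTP response: ", conditionMessage(e))
    NULL
  }
)

# If a response did come back, report its status instead of assuming success.
if (!is.null(res)) {
  message("HTTP status: ", httr::status_code(res))
  if (httr::http_error(res)) {
    message("The server answered, but with an HTTP error status.")
  }
}
```

A NULL result here points at the network path (the "Couldn't connect to server" case from the question), whereas an HTTP error status means the request reached a server and the problem lies elsewhere.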

Back to the BOM issue, which is still likely going to be an issue for you:

hk_raw <- httr::content(res, as="raw")

hk_raw[1:10]
## [1] ef bb bf ef bb bf 5b 0a 20 20

I'm not sure why the UTF-8 BOM sequence appears twice, but that's easy to deal with (and it does need to be dealt with): drop the first six bytes before parsing.

hk <- jsonlite::fromJSON(rawToChar(hk_raw[-(1:6)]))

That should give you the data structure fully read in.
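Hard-coding six bytes only works while the file starts with exactly two BOMs. A slightly more defensive variant, sketched here in base R, removes however many leading BOMs are present. The helper name strip_utf8_bom and the small demo_raw input are made up for illustration; they are not part of jsonlite or httr.

```r
# Hypothetical helper: drop every leading UTF-8 BOM (ef bb bf) from a raw vector.
strip_utf8_bom <- function(x) {
  bom <- as.raw(c(0xef, 0xbb, 0xbf))
  # Keep removing three bytes while the vector still starts with a BOM.
  while (length(x) >= 3 && identical(x[1:3], bom)) {
    x <- x[-(1:3)]
  }
  x
}

# Stand-in for the downloaded bytes: two BOMs followed by a tiny JSON array.
demo_raw <- c(
  as.raw(c(0xef, 0xbb, 0xbf, 0xef, 0xbb, 0xbf)),
  charToRaw('[{"a":1}]')
)

# The cleaned text is plain JSON and can be handed to a parser.
clean_txt <- rawToChar(strip_utf8_bom(demo_raw))
```

With the real download, the same idea would read hk <- jsonlite::fromJSON(rawToChar(strip_utf8_bom(hk_raw))), which keeps working if the publisher later fixes the file to contain one BOM or none.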



Source: https://stackoverflow.com/questions/47528321/cant-use-jsonlite-in-r-to-read-json-format-file
