Pandas json_normalize produces confusing `KeyError` message?


I'm trying to convert a nested JSON to a Pandas dataframe. I've been using json_normalize with success until I came across a certain JSON. I've made a smalle


3 Answers

深忆病人

2021-01-12 02:58
In this case, I think you'd just use this:
```
In [57]: json_normalize(data[0]['events'])
Out[57]: 
  group  schedule.ID schedule.date schedule.location.building  \
0     A          815    2015-08-27                        BDC   
1     A          816    2015-08-27                        BDC   

   schedule.location.floor  
0                        5  
1                        5  
```
The meta paths (`[['schedule','date']...]`) are for specifying data at the same level of nesting as your records, i.e. at the same level as 'events'. `json_normalize` doesn't seem to handle dicts with nested lists particularly well, so you may need to reshape the data manually if your actual data is much more complicated.
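To make the distinction between records and meta concrete, here is a minimal sketch. The original JSON is not shown, so the `data` below is a hypothetical reconstruction shaped to match the output above, and the top-level `source` key is an invented field added only to illustrate `meta`. It uses `pd.json_normalize`, the top-level name available in pandas 1.0 and later.

```python
import pandas as pd

# Hypothetical data: shaped to match the output shown above.
# "source" is an invented key sitting at the same level as "events".
data = [
    {
        "source": "feed-1",
        "events": [
            {
                "group": "A",
                "schedule": {
                    "ID": 815,
                    "date": "2015-08-27",
                    "location": {"building": "BDC", "floor": 5},
                },
            },
            {
                "group": "A",
                "schedule": {
                    "ID": 816,
                    "date": "2015-08-27",
                    "location": {"building": "BDC", "floor": 5},
                },
            },
        ],
    }
]

# record_path names the list that becomes the rows; nested dicts inside
# each record are flattened into dotted column names automatically.
# meta names fields that live alongside "events", not inside each event.
flat = pd.json_normalize(data, record_path="events", meta=["source"])
print(flat)
```

Each event becomes one row, and the `source` value is repeated on every row that came from the same parent object.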
花落未央

2021-01-12 02:58
I had this same problem! This thread helped, especially parachute py's answer.

I found a solution using:
```
# Replace "nested_column" with the column(s) that hold the nested data
df = df.dropna(subset=["nested_column"])
```
then saving the resulting df as a new JSON file. Load the new JSON and you'll be able to flatten the nested columns.

There's probably a more efficient way to get around this, but my solution works.

edit: forgot to mention, I tried using the `errors='ignore'` arg in `json_normalize()` and it didn't help.
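A rough sketch of the same idea, with one variation: instead of saving to a new JSON file and reloading it, the cleaned column is passed straight to `pd.json_normalize`. The DataFrame and the column names (`id`, `details`) are made up for illustration.

```python
import pandas as pd

# Hypothetical frame: "details" holds nested dicts, with one missing value.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "details": [
            {"city": "Oslo", "zip": "0150"},
            None,
            {"city": "Bergen", "zip": "5003"},
        ],
    }
)

# Drop the rows whose nested column is missing, then renumber the index
# so it lines up with the flattened frame built below.
clean = df.dropna(subset=["details"]).reset_index(drop=True)

# Flatten the remaining dicts into ordinary columns.
nested = pd.json_normalize(clean["details"].tolist())

# Put the flattened columns next to the untouched ones.
flat = pd.concat([clean.drop(columns="details"), nested], axis=1)
print(flat)
```

Note that this discards the rows with missing nested data, which may or may not be acceptable for your use case.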
迷失自我

2021-01-12 03:12

I got the KeyError when the structure of the JSON was not consistent. That is, when one of the nested structures was missing from the JSON, I got a KeyError.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.json.json_normalize.html

Taking the examples from the pandas documentation site: if you remove the nested tag (counties) from one of the records, you will get a KeyError. To circumvent this, you might have to make sure the missing tag is ignored, or consider only the records that have the nested column/tag populated with data.
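The behaviour can be reproduced with a small sketch modelled loosely on the documentation's states/counties example. The values are made up, and filtering the records first is just one possible workaround. As far as I know, `errors='ignore'` only covers keys listed in `meta`, not the `record_path` itself, which may be why it doesn't help here.

```python
import pandas as pd

# Made-up data in the style of the pandas docs example.
data = [
    {
        "state": "Florida",
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
        ],
    },
    {"state": "Ohio"},  # the nested "counties" key is missing here
]

# With an inconsistent structure, the record_path lookup fails.
try:
    pd.json_normalize(data, record_path="counties", meta=["state"])
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# Workaround: keep only the records that actually have the nested key.
complete = [rec for rec in data if "counties" in rec]
flat = pd.json_normalize(complete, record_path="counties", meta=["state"])
print(flat)
```

If the incomplete records should not be dropped silently, it may be worth logging which ones were filtered out before normalizing.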
