I have the following data
{ "results": [
{
"company": "XYZ",
"createdAt": "2014-03-27T23:21:48.758Z",
"email": "abc@
I'm not sure how your multiple observations are organized in the JSON, but the problem is clearly caused by the nested structure of the "profilePicture" field: each observation is expressed as a nested dictionary. You need to convert each observation to a DataFrame and concat them into the final DataFrame, as in this solution.
In [3]:
print df
results
0 {u'linkedinAccount': u'', u'username': u'abc@g...
1 {u'linkedinAccount': u'', u'username': u'abc@g...
[2 rows x 1 columns]
In [4]:
print pd.concat([pd.DataFrame.from_dict(item, orient='index').T for item in df.results])
linkedinAccount username registrationGate firstName title lastName \
0 abc@gmail.com normal abc AA xyz
0 abc@gmail.com normal abc AA xyz
company telephone profilePicture \
0 XYZ {u'url': u'url.url.com', u'__type': u'File', u...
0 ABC {u'url': u'url.url.com', u'__type': u'File', u...
location updatedAt email createdAt \
0 2014-03-27T23:24:20.220Z abc@gmail.com 2014-03-27T23:21:48.758Z
0 2014-03-27T23:24:20.220Z abc@gmail.com 2014-03-27T23:21:48.758Z
zipcode
0 00000
0 00000
[2 rows x 14 columns]
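For anyone on Python 3 and a recent pandas, here is a minimal, self-contained sketch of the same idea. The payload is hypothetical (only a few fields, with made-up values), and it builds each single-row frame with pd.DataFrame([item]) rather than from_dict(...).T, which is a substitution on my part and not what the session above ran. It also passes ignore_index=True, so the rows are numbered 0, 1 instead of 0, 0.

```python
import json
import pandas as pd

# Hypothetical payload shaped like the question's data; all values are made up.
raw = """
{"results": [
  {"company": "XYZ", "username": "abc@gmail.com",
   "profilePicture": {"url": "url.url.com", "__type": "File"}},
  {"company": "ABC", "username": "abc@gmail.com",
   "profilePicture": {"url": "url.url.com", "__type": "File"}}
]}
"""

data = json.loads(raw)

# One single-row frame per observation, then stack them into the final frame.
frames = [pd.DataFrame([item]) for item in data["results"]]
combined = pd.concat(frames, ignore_index=True)

print(combined.columns.tolist())
print(combined.shape)
```

The profilePicture cells still hold dictionaries at this point, exactly as in the output above.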
Then you may want to think about how to deal with the profilePicture column. You can do what @U2EF1 suggested in the link, but I would probably just break that column into three columns: pfPIC_url, pfPIC_type, and pfPIC_name.
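A rough sketch of that column split, assuming Python 3 and a recent pandas. The sample records are hypothetical, and the 'name' key inside profilePicture is an assumption: the output above truncates the third key, so adjust the mapping to whatever your data actually contains.

```python
import pandas as pd

# Hypothetical records; the 'name' key and every value here are assumptions.
records = [
    {"company": "XYZ", "username": "abc@gmail.com",
     "profilePicture": {"url": "url.url.com", "__type": "File", "name": "pic1.jpg"}},
    {"company": "ABC", "username": "abc@gmail.com",
     "profilePicture": {"url": "url.url.com", "__type": "File", "name": "pic2.jpg"}},
]
df = pd.DataFrame(records)

# Expand each nested dict into its own columns, keeping the original row index.
pic = pd.DataFrame(df["profilePicture"].tolist(), index=df.index)
pic = pic.rename(columns={"url": "pfPIC_url",
                          "__type": "pfPIC_type",
                          "name": "pfPIC_name"})

# Swap the nested column for the three flat ones.
flat = pd.concat([df.drop(columns="profilePicture"), pic], axis=1)
print(flat)
```

If you don't care about the exact column names, pd.json_normalize(records) in newer pandas versions should do much the same in one call, producing dotted names such as profilePicture.url.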