parent-child relationship query in simple_salesforce python, extracting from ordered dicts

老子叫甜甜 提交于 2021-01-18 06:10:55

问题


I'm trying to query information from salesforce using the simple_salesforce package in python.

The problem is that it's nesting fields that are a part of a parent-child relationship into an ordered dict within an ordered dict

I want.. from the Opportunity object, to find the id, and the accountid associated with that record.

The SOQL query may look like..

query = "select id, account.id from opportunity where closedate = last_n_days:5"

in SOQL (salesforce object query language), a dot denotes a parent child relationship in the database. So I'm trying to get the id from the opportunity object, and then the related id from the account object on that record.

for some reason the Id comes in fine, but the account.id is nested in an ordered dict within an ordered dict:

q = sf.query_all(query)

this pulls back an ordered dictionary..

OrderedDict([('totalSize', 455),
             ('done', True),
             ('records',
              [OrderedDict([('attributes',
                             OrderedDict([('type', 'Opportunity'),
                                          ('url',

I would pull the records piece of the ordereddict to create a df

df = pd.DataFrame(q['records'])

This gives me 3 columns, an ordered dict called 'attributes', Id and another ordered dict called 'Account'. I'm looking for a way to extract the ('BillingCountry', 'United States') piece out of the nested ordered dict 'Account'

[OrderedDict([('attributes',
               OrderedDict([('type', 'Opportunity'),
                            ('url',
                             '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])),
              ('Id', '0061B003451RhZgiHHF'),
              ('Account',
               OrderedDict([('attributes',
                             OrderedDict([('type', 'Account'),
                                          ('url',
                                           '/services/data/v34.0/sobjects/Account/001304300MviPPF3Z')])),
                            ('BillingCountry', 'United States')]))])

Edit: clarifying what I'm looking for.

I want to end with a dataframe with a column for each of the queried fields.

When I put the 'records' piece into a DataFrame using df = pd.DataFrame(sf.query_all(query)['records']) it gives me:

attributes  Id  Account
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])    0061B003451RhZgiHHF OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0013000000MvkRQQAZ')])), ('BillingCountry', 'United States')])
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001Pa52QQAR')]) 0061B00001Pa52QQAR  OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001vQPxqAAG')])), ('BillingCountry', 'United States')])
OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001TRu5mQAD')]) 0061B00001TRu5mQAD  OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001rfRTrAAE')])), ('BillingCountry', 'United States')])

after I remove the 'attributes' column I want the output to be

Id BillingCountry
0061B003451RhZgiHHF 'United States'
0061B00001Pa52QQAR 'United States'
0061B00001TRu5mQAD 'United States'

回答1:


Pandas is an amazing tool for tabular data. But while it can contain Python objects, that is not its sweet spot. I suggest you extract your data from the query prior to inserting them into a pandas.Dataframe:

Extract records:

To extract the desired fields as a list of dictionaries is as easy as:

records = [dict(id=rec['Id'], country=rec['Account']['BillingCountry'])
           for rec in data['records']]

Insert records into a dataframe:

With a list of dicts, a dataframe is as easy as:

df = pd.DataFrame(records)

Test Code:

import pandas as pd
from collections import OrderedDict

data = OrderedDict([
    ('totalSize', 455),
    ('done', True),
    ('records', [
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B003451RhZgiHHF')])),
            ('Id', '0061B003451RhZgiHHF'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0013000000MvkRQQAZ')])),
                                     ('BillingCountry', 'United States')])),
        ]),
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001Pa52QQAR')])),
            ('Id', '0061B00001Pa52QQAR'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001vQPxqAAG')])),
                                     ('BillingCountry', 'United States')])),
        ]),
        OrderedDict([
            ('attributes', OrderedDict([('type', 'Opportunity'), ('url', '/services/data/v34.0/sobjects/Opportunity/0061B00001TRu5mQAD')])),
            ('Id', '0061B00001TRu5mQAD'),
            ('Account', OrderedDict([('attributes', OrderedDict([('type', 'Account'), ('url', '/services/data/v34.0/sobjects/Account/0011300001rfRTrAAE')])),
                                     ('BillingCountry', 'United States')])),
        ]),
    ])
])

records = [dict(id=rec['Id'], country=rec['Account']['BillingCountry'])
           for rec in data['records']]
for r in records:
    print(r)

print(pd.DataFrame(records))

Test Results:

{'country': 'United States', 'id': '0061B003451RhZgiHHF'}
{'country': 'United States', 'id': '0061B00001Pa52QQAR'}
{'country': 'United States', 'id': '0061B00001TRu5mQAD'}

         country                   id
0  United States  0061B003451RhZgiHHF
1  United States   0061B00001Pa52QQAR
2  United States   0061B00001TRu5mQAD



回答2:


Pandas can read ordered dicts.

import pandas as pd
from simple_salesforce import Salesforce

sf = Salesforce(username='your_username',   
                password='your_password',
                security_token='your_token')

query = "select id, account.id from opportunity where closedate = last_n_days:5"
df = pd.DataFrame(sf.query_all(query)['records']).drop(columns='attributes')


来源:https://stackoverflow.com/questions/47860134/parent-child-relationship-query-in-simple-salesforce-python-extracting-from-ord

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!