Pandas dataframe to a dynamic nested JSON

萌比男神i 2021-01-18 18:29

I want to convert my dataframe, which looks like this, into a nested JSON:

    employeeId  firstName   lastName    emailAddress    isDependent employeeIdTypeCode  entityCode  sour         


        
3 Answers
  • 2021-01-18 19:12

    Not really a Pandas solution, but it kind of works.

    Starting from your result dataframe:

    from collections import defaultdict
    import json

    result = 'your data frame'  # placeholder: substitute your actual DataFrame

    dicted = defaultdict(dict)
    for r in result.values.tolist():
        # Columns are unpacked by position, so the DataFrame must have
        # exactly these nine columns in this order.
        (identifierValue, firstName, lastName, emailAddress, isDependent,
         identityTypeCode, entityCode, sourceCode, roleCode) = r
        # Rows describing the same person are merged under one key.
        tupled_criteria = (firstName, lastName, emailAddress)
        info = dicted[tupled_criteria].setdefault("individualInfo", {})

        # These fields are overwritten on every row, so the last row wins.
        info['entityCode'] = entityCode
        info['sourceCode'] = sourceCode
        info['roleCode'] = roleCode
        info['isDependent'] = isDependent
        # Each row contributes one entry to the identifier list.
        info.setdefault("individualIdentifier", []).append({
            "identityTypeCode": identityTypeCode,
            "identifierValue": identifierValue,
            "profileInfo": {
                "firstName": firstName,
                "lastName": lastName,
                "emailAddress": emailAddress}})

    for k, v in dicted.items():
        print(k, '\n', json.dumps(v), '\n\n')
    
  • 2021-01-18 19:13

    Perhaps you can iterate over a groupby, then iterate over each row within each group, building a nested dictionary structure as you go.

    Here is one way to go about it:

    import pandas as pd

    df = pd.DataFrame({"entityCode": [1, 1, 3, 3],
                       "sourceCode": [4, 4, 6, 6],
                       "identityTypeCode": [7, 8, 9, 10]})
    results = []
    # Outer loop: one group per unique (entityCode, sourceCode) pair.
    for i, sub_df in df.groupby(["entityCode", "sourceCode"]):
        entityCode, sourceCode = i
        d = {}
        d["individualInfo"] = {"entityCode": entityCode, "sourceCode": sourceCode}
        sub_result = []
        # Inner loop: one entry per distinct identityTypeCode in the group.
        for _, row in sub_df[["identityTypeCode"]].drop_duplicates().iterrows():
            sub_result.append(row.to_dict())
        d["individualIdentifier"] = sub_result
        results.append(d)
    results
    

    which returns something like this:

    [{'individualInfo': {'entityCode': 1, 'sourceCode': 4},
      'individualIdentifier': [{'identityTypeCode': 7}, {'identityTypeCode': 8}]},
     {'individualInfo': {'entityCode': 3, 'sourceCode': 6},
      'individualIdentifier': [{'identityTypeCode': 9}, {'identityTypeCode': 10}]}]
    

    Afterwards, you can convert the resulting list of dictionaries to JSON.
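
    As a rough sketch of that last step: depending on the pandas version, the group keys coming out of groupby may be NumPy integers, which json.dumps refuses to serialize. The example below assumes the same toy dataframe as above, swaps the inner iterrows loop for to_dict("records"), and passes a small fallback function (to_native, a name made up for this illustration) to json.dumps.

```python
import json

import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame({"entityCode": [1, 1, 3, 3],
                   "sourceCode": [4, 4, 6, 6],
                   "identityTypeCode": [7, 8, 9, 10]})

results = []
for (entity_code, source_code), sub_df in df.groupby(["entityCode", "sourceCode"]):
    # One dict per distinct identityTypeCode in this group.
    identifiers = sub_df[["identityTypeCode"]].drop_duplicates().to_dict("records")
    results.append({
        "individualInfo": {"entityCode": entity_code, "sourceCode": source_code},
        "individualIdentifier": identifiers,
    })


def to_native(obj):
    """Fallback for json.dumps: unwrap NumPy scalars into plain Python values."""
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# `default` is only called for objects json cannot serialize on its own.
json_text = json.dumps(results, default=to_native, indent=2)
print(json_text)
```

    If every value is already a plain Python type, the fallback is never called, so it should be harmless to leave in.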

  • 2021-01-18 19:20

    It sounds like the most sensible way to pull this off is:

    info_dict = df.set_index(['identifierValue', 'identifierValue']).to_dict('index')
    

    Then, every time you reach profileInfo in your JSON, you can look up the matching entry in the info_dict above with the appropriate ('identifierValue', 'identifierValue') key pair.

    I'm confused about what your desired formatting is, but this is a start.
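
    A minimal sketch of that lookup idea, under some loud assumptions: the sample rows are invented, the column names are borrowed from the first answer's code, and the index is assumed to be the (identityTypeCode, identifierValue) pair, which may not be the key this answer had in mind.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "identifierValue": ["E1", "E2"],
    "identityTypeCode": ["EMP", "EMP"],
    "firstName": ["Ann", "Bob"],
    "lastName": ["Lee", "Ray"],
    "emailAddress": ["ann@example.com", "bob@example.com"],
})

# Assumption: this pair of columns is unique per row.
# to_dict("index") raises ValueError if the index contains duplicates.
info_dict = df.set_index(["identityTypeCode", "identifierValue"]).to_dict("index")

# With a two-column index, each key is a tuple and each value maps the
# remaining column names to that row's values.
profile_info = info_dict[("EMP", "E1")]
print(profile_info)
```

    Each looked-up dictionary could then be dropped in wherever the target JSON expects a profileInfo object.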
