I want to create my dataframe which looks like this:
employeeId firstName lastName emailAddress isDependent employeeIdTypeCode entityCode sour
One option is to iterate over a groupby and then, within each group, iterate over its rows, building a nested dictionary structure as you go.
Here is one way to do it:
import pandas as pd

df = pd.DataFrame({"entityCode":[1,1,3,3],"sourceCode":[4,4,6,6],'identityTypeCode':[7,8,9,10]})

results = []
# One output record per unique (entityCode, sourceCode) pair
for i, sub_df in df.groupby(["entityCode","sourceCode"]):
    entityCode, sourceCode = i
    d = {}
    # Group keys are NumPy integers; cast to int so they can be serialized to JSON later
    d["individualInfo"] = {"entityCode":int(entityCode), "sourceCode":int(sourceCode)}
    sub_result = []
    # One entry per distinct identityTypeCode within the group
    for _, row in sub_df[["identityTypeCode"]].drop_duplicates().iterrows():
        sub_result.append(row.to_dict())
    d["individualIdentifier"] = sub_result
    results.append(d)

results
which returns something like this:
[{'individualInfo': {'entityCode': 1, 'sourceCode': 4},
'individualIdentifier': [{'identityTypeCode': 7}, {'identityTypeCode': 8}]},
{'individualInfo': {'entityCode': 3, 'sourceCode': 6},
'individualIdentifier': [{'identityTypeCode': 9}, {'identityTypeCode': 10}]}]
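If you prefer something more compact, the same nesting can probably be written as a list comprehension, with to_dict("records") replacing the inner row loop. This is only a sketch on the same toy data as above; the lowercase names entity_code and source_code are my own, and the int() casts are there on the assumption that you want plain Python integers for the JSON step:

```python
import pandas as pd

df = pd.DataFrame({
    "entityCode": [1, 1, 3, 3],
    "sourceCode": [4, 4, 6, 6],
    "identityTypeCode": [7, 8, 9, 10],
})

results = [
    {
        # Group keys come back as NumPy integers, so cast them to plain ints
        "individualInfo": {
            "entityCode": int(entity_code),
            "sourceCode": int(source_code),
        },
        # One dict per distinct identityTypeCode row in this group
        "individualIdentifier": (
            sub_df[["identityTypeCode"]].drop_duplicates().to_dict("records")
        ),
    }
    for (entity_code, source_code), sub_df in df.groupby(["entityCode", "sourceCode"])
]

print(results)
```

It should give the same list as the loop version; which one reads better is mostly a matter of taste.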
Afterwards, you can convert the resulting list of dictionaries to JSON, for example with json.dumps.
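A minimal sketch of that last step, using the output shown above pasted in as literal data. The to_builtin helper is my own addition, not part of the json module: if NumPy scalars end up in your real results, json.dumps will typically raise a TypeError, and a default= fallback like this is one way around it.

```python
import json

# The nested structure from above, written out as plain Python data
results = [
    {"individualInfo": {"entityCode": 1, "sourceCode": 4},
     "individualIdentifier": [{"identityTypeCode": 7}, {"identityTypeCode": 8}]},
    {"individualInfo": {"entityCode": 3, "sourceCode": 6},
     "individualIdentifier": [{"identityTypeCode": 9}, {"identityTypeCode": 10}]},
]

def to_builtin(value):
    """Fallback for json.dumps: unwrap NumPy-style scalars via .item()."""
    if hasattr(value, "item"):
        return value.item()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")

# default= is only called for objects json cannot serialize on its own
json_text = json.dumps(results, default=to_builtin, indent=2)
print(json_text)
```

Use json.dump with an open file handle instead if you want to write the result straight to disk.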