Getting all descendants of a parent from a pandas dataframe parent child table

悲哀的现实 2021-01-06 05:28

I have a Pandas dataframe containing parent ids and child ids. I need help building an updated dataframe listing every descendant of each parent.

For clarificati

2 Answers

不思量自难忘° (original poster)

2021-01-06 06:10

As long as your IDs never form cycles (a cycle would keep the recursion going until Python raises RecursionError), I think this should work:

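    def get_children(id):
        list_of_children = []

        def dfs(id):
            # Direct children of this id (reads the global dataframe df)
            child_ids = df[df["parent_id"] == id]["child_id"]
            if child_ids.empty:
                return
            for child_id in child_ids:
                # Record the child, then walk down into its own children
                list_of_children.append(child_id)
                dfs(child_id)

        dfs(id)
        return list_of_children

    # One list of descendants per row, keyed on that row's parent_id
    df["list_of_children"] = df["parent_id"].apply(get_children)
    df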

Returns:

        parent_id  child_id  list_of_children
     0       3111      4321  [4321]
     1       2010      3102  [3102, 4001, 3011, 4200, 4010]
     2       3000      4023  [4023, 5321, 5010, 6525, 6100, 6016]
     3       1000      2010  [2010, 3102, 4001, 3011, 4200, 4010, 2110, 3000, 4023, 5321, 5010, 6525, 610...
     4       4023      5321  [5321, 5010, 6525, 6100, 6016]
     5       3011      4200  [4200, 4010]
     6       3033      4113  [4113, 4311]
     7       5010      6525  [6525, 6100, 6016]
     8       3011      4010  [4200, 4010]
     9       3102      4001  [4001]
    10       2010      3011  [3102, 4001, 3011, 4200, 4010]
    11       4023      5010  [5321, 5010, 6525, 6100, 6016]
    12       2110      3000  [3000, 4023, 5321, 5010, 6525, 6100, 6016, 3111, 4321]
    13       2100      3033  [3033, 4113, 4311]
    14       1000      2110  [2010, 3102, 4001, 3011, 4200, 4010, 2110, 3000, 4023, 5321, 5010, 6525, 610...
    15       5010      6100  [6525, 6100, 6016]
    16       2110      3111  [3000, 4023, 5321, 5010, 6525, 6100, 6016, 3111, 4321]
    17       1000      2100  [2010, 3102, 4001, 3011, 4200, 4010, 2110, 3000, 4023, 5321, 5010, 6525, 610...
    18       5010      6016  [6525, 6100, 6016]
    19       3033      4311  [4113, 4311]

One caveat: get_children does not take the dataframe as an argument. The inner dfs function reads a global variable named df, so the code only works if your dataframe has exactly that name. You could probably improve it by passing the dataframe (or a lookup built from it) into the function explicitly.
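For what it's worth, here is a minimal sketch of that improvement. It is not the code from the answer above: the names build_children_map and get_descendants are made up for illustration, and the sample frame is just five of the rows from the output shown earlier. It builds a parent-to-children dictionary once and passes that in, so nothing depends on a global df. It also swaps the recursion for an explicit stack with a seen set, which means a cycle cannot loop forever; as a side effect, a node reachable by two paths is listed only once, which differs from the recursive version.

```python
import pandas as pd


def build_children_map(frame, parent_col="parent_id", child_col="child_id"):
    """Map each parent id to the list of its direct children, in row order."""
    children = {}
    for parent, child in zip(frame[parent_col].tolist(), frame[child_col].tolist()):
        children.setdefault(parent, []).append(child)
    return children


def get_descendants(children, root):
    """Return every descendant of root in depth-first order, without recursion."""
    descendants = []
    seen = {root}  # guards against cycles and keeps root out of its own list
    # Push children reversed so they are popped in their original order.
    stack = list(reversed(children.get(root, [])))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        descendants.append(node)
        stack.extend(reversed(children.get(node, [])))
    return descendants


# Sample data: a subset of the rows shown in the output above.
df = pd.DataFrame(
    {
        "parent_id": [2010, 3011, 3011, 3102, 2010],
        "child_id": [3102, 4200, 4010, 4001, 3011],
    }
)

children = build_children_map(df)
df["list_of_children"] = df["parent_id"].apply(
    lambda parent: get_descendants(children, parent)
)
print(get_descendants(children, 2010))  # [3102, 4001, 3011, 4200, 4010]
```

Building the dictionary up front also avoids filtering the whole dataframe once per node, which is what df[df["parent_id"]==id] does on every call in the recursive version.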
