Question
I have three relevant columns: time, id, and interaction. How can I create a new column that lists, for each row with a 1 in the 'interaction' column, the other id values that also have a 1 in 'interaction' within the same time window?
The result should look something like this:
time  id    vec_len  quadrant  interaction  Paired with
1     3271   0.9     7         0
1     3229   0.1     0         0
1     4228   0.5     0         0
1     2778  -0.3     5         0
2     4228   0.2     0         0
2     3271   0.1     6         0
2     3229  -0.7     5         1            [2778, 4228]
2     3229  -0.3     2         0
2     4228  -0.8     5         1            [2778, 3229]
2     2778  -0.6     5         1            [4228, 3229]
3     4228   0.2     0         0
3     3271   0.1     6         0
3     4228  -0.7     5         1            [3271]
3     3229  -0.3     2         0
3     3271  -0.8     5         1            [4228]
Thank you for helping!!
Answer 1:
import numpy as np
# df is assumed to be the DataFrame shown in the question
# initialize dict for all time blocks
dict_time_ids = dict.fromkeys(df.time.unique(), set())
# populate dictionary with ids for each time block where interaction == 1
dict_time_ids.update(df.query('interaction == 1').groupby('time').id.apply(set).to_dict())
# make new column with set of corresponding ids where interaction == 1
df['paired'] = np.where(df.interaction == 1, df.time.apply(lambda x: dict_time_ids[x]), set())
# remove the id from the set and convert to list
df['paired'] = df.apply(lambda x: list(x.paired - {x.id}), axis=1)
# Out:
    time  id    interaction  paired
0   1     3271  0            []
1   1     3229  0            []
2   1     4228  0            []
3   1     2778  0            []
4   2     4228  0            []
5   2     3271  0            []
6   2     3229  1            [2778, 4228]
7   2     3229  0            []
8   2     4228  1            [2778, 3229]
9   2     2778  1            [4228, 3229]
10  3     4228  0            []
11  3     3271  0            []
12  3     4228  1            [3271]
13  3     3229  0            []
14  3     3271  1            [4228]
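For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea: collect the ids with interaction == 1 per time block, then remove each row's own id. The helper name `add_paired` and the hand-typed sample frame are assumptions for illustration, not part of the answer above. It builds the column with a plain list comprehension instead of `np.where`, and sorts each list so the output is deterministic, which means the element order may differ from the output shown above.

```python
import pandas as pd

# Sample data copied from the question (time, id, interaction only)
df = pd.DataFrame({
    "time": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    "id": [3271, 3229, 4228, 2778, 4228, 3271, 3229, 3229,
           4228, 2778, 4228, 3271, 4228, 3229, 3271],
    "interaction": [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1],
})


def add_paired(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy of frame with a 'paired' column."""
    out = frame.copy()
    # Set of ids with interaction == 1 for each time block
    active = (
        out.loc[out["interaction"] == 1]
        .groupby("time")["id"]
        .apply(set)
        .to_dict()
    )
    # Rows with interaction == 1 get the other active ids of their time
    # block; all other rows get an empty list
    out["paired"] = [
        sorted(active.get(t, set()) - {i}) if flag == 1 else []
        for t, i, flag in zip(out["time"], out["id"], out["interaction"])
    ]
    return out


result = add_paired(df)
print(result)
```

Because the helper works on a copy, the original frame is left unchanged; assign the result back to `df` if the column should be added in place.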
Source: https://stackoverflow.com/questions/58948939/how-can-i-create-a-new-column-that-inserts-the-cell-value-of-grouped-column-id