Efficient way to assign values from another column pandas df

慢半拍i 2021-01-13 07:58

I'm trying to create a more efficient script that creates a new column based off values in another column. The script below performs this but I can only select

4 answers

时光说笑 (original poster)

2021-01-13 08:35

On the second attempt this works.

It was quite hard to understand the question.

I was sure this should be done with pandas groupby() and a DataFrame merge. If you check the edit history of this reply, you can see how I changed the answer to replace slower pure-Python code with faster pandas code.

The code below first collects the unique days per location and numbers them in blocks of three, then uses a helper data frame to give each (Location, block) combination its final label.

I recommend pasting this code into a Jupyter notebook and examining the intermediate steps.

import pandas as pd
import numpy as np

d = {
    'Day': ['Mon', 'Tues', 'Wed', 'Wed', 'Thurs', 'Thurs', 'Fri', 'Mon', 'Sat', 'Fri', 'Sun'],
    'Location': ['Home', 'Home', 'Away', 'Home', 'Away', 'Home', 'Home', 'Home', 'Home', 'Away', 'Home'],
}

df = pd.DataFrame(data=d)

# including the example result
df["example"] = pd.Series(["C" + str(e) for e in [1, 1, 2, 1, 2, 3, 3, 1, 3, 2, 4]])

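# Unique days per location, in order of first appearance
s_grouped = df.groupby(["Location"])["Day"].unique()

# Block number within each location: every 3 unique days share one block (1, 2, ...)
df["Pre-Assign"] = df.apply(
    lambda x: 1 + list(s_grouped[x["Location"]]).index(x["Day"]) // 3, axis=1
)

# Helper frame: one row per unique (Location, Pre-Assign) combination,
# labelled C1, C2, ... in order of first appearance.
# reset_index(drop=True) replaces the older .reset_index().drop("index", 1),
# whose positional axis argument is rejected by recent pandas versions.
df_pre = df[["Location", "Pre-Assign"]].drop_duplicates().reset_index(drop=True)
df_pre["Assign"] = 'C' + (df_pre.index + 1).astype(str)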

# result
df.merge(df_pre, on=["Location", "Pre-Assign"], how="left")
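
If the row-wise apply() becomes the bottleneck on a larger frame, the same labels can probably be produced with grouped operations only. The sketch below is one possible variant, not the code from the answer above: it assumes days are ranked by first appearance within each location (the same order unique() returns), and the names day_rank and block are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Day": ["Mon", "Tues", "Wed", "Wed", "Thurs", "Thurs", "Fri", "Mon", "Sat", "Fri", "Sun"],
    "Location": ["Home", "Home", "Away", "Home", "Away", "Home", "Home", "Home", "Home", "Away", "Home"],
})

# 0-based rank of each day by first appearance within its location
# (hypothetical helper name; factorize numbers values in order of appearance)
day_rank = df.groupby("Location")["Day"].transform(lambda s: pd.factorize(s)[0])

# Every 3 unique days per location share one block: 0, 0, 0, 1, 1, 1, 2, ...
block = day_rank // 3

# Number each (Location, block) combination in order of first appearance;
# sort=False keeps the groups in the order they are first seen
group_no = df.groupby(["Location", block], sort=False).ngroup()

df["Assign"] = "C" + (group_no + 1).astype(str)
```

The lambda here runs once per location rather than once per row, and the final labelling needs no helper frame or merge. It is worth comparing df["Assign"] against the example column from the answer before relying on it.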
