I'm trying to create a more efficient script that creates a new column
based off values in another column. The script below performs this but I can only select
On the second attempt this works.
It was quite hard to understand the question.
I was sure that this should be done with pandas groupby() and dataframe merging. If you check the edit history of this reply, you can see how I changed the answer to replace slower Python code with faster pandas code.
The code below first collects the unique days per location, splits them into groups of three, and then uses a helper data frame to assign the final label to each (location, group) combination.
I recommend pasting this code into a Jupyter notebook and examining the intermediate steps.
import pandas as pd
import numpy as np
d = {
    'Day': ['Mon','Tues','Wed','Wed','Thurs','Thurs','Fri','Mon','Sat','Fri','Sun'],
    'Location': ['Home','Home','Away','Home','Away','Home','Home','Home','Home','Away','Home'],
}
df = pd.DataFrame(data=d)
# including the example result
df["example"] = pd.Series(["C" + str(e) for e in [1, 1, 2, 1, 2, 3, 3, 1, 3, 2, 4]])
# Unique days per location, in order of first appearance
s_grouped = df.groupby(["Location"])["Day"].unique()
# Bucket each location's unique days into groups of 3:
# the first three unique days get 1, the next three get 2, and so on
df["Pre-Assign"] = df.apply(
    lambda x: 1 + list(s_grouped[x["Location"]]).index(x["Day"]) // 3, axis=1
)
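# One label per unique (Location, Pre-Assign) combination
# reset_index(drop=True) replaces .reset_index().drop("index", 1),
# whose positional axis argument is no longer accepted in pandas 2.x
df_pre = df[["Location", "Pre-Assign"]].drop_duplicates().reset_index(drop=True)
df_pre["Assign"] = 'C' + (df_pre.index + 1).astype(str)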
# result
df.merge(df_pre, on=["Location", "Pre-Assign"], how="left")
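The row-wise `df.apply` in the code above is still plain Python under the hood. As a possible variant, here is a minimal sketch that avoids it by combining `cumcount()` on the de-duplicated (Location, Day) pairs with `ngroup()`. It assumes that labels should follow the order of first appearance, as in the example column; the names `uniq` and `out` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Day": ["Mon", "Tues", "Wed", "Wed", "Thurs", "Thurs",
            "Fri", "Mon", "Sat", "Fri", "Sun"],
    "Location": ["Home", "Home", "Away", "Home", "Away", "Home",
                 "Home", "Home", "Home", "Away", "Home"],
})

# One row per unique (Location, Day) pair, in order of first appearance
uniq = df[["Location", "Day"]].drop_duplicates()

# Position of each day within its location, bucketed into groups of 3
uniq["Pre-Assign"] = uniq.groupby("Location").cumcount() // 3 + 1

# Bring the bucket back to every original row (a left merge keeps row order)
out = df.merge(uniq, on=["Location", "Day"], how="left")

# Number each (Location, Pre-Assign) combination in order of first appearance
group_id = out.groupby(["Location", "Pre-Assign"], sort=False).ngroup()
out["Assign"] = "C" + (group_id + 1).astype(str)

print(out)
```

With `sort=False`, `ngroup()` numbers the groups in the order they first occur, which is what the helper data frame `df_pre` does in the original code.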