I need to find the optimal discount for each product (e.g. A, B, C) so that I can maximize total sales. I have existing Random Forest models for each product that map discount an
You can find a complete solution below!

The fundamental differences with your approach are the following:

- Since the models take `season` as a feature, optimal discounts must be computed for every season.
- The `con` function yields an output that must comply with `con(x) >= 0.0`. The correct constraint is therefore `20 - sum(...)` and not the other way around. In addition, the `units` and `mrp` variables were not given; I just assumed a value of 1 for each, so you might want to change those values.

Additional modifications to your original code include:

- The preprocessing is handled by `sklearn` pipelines in order to simplify the preprocessing steps.
- The optimal discounts are saved to an `.xlsx` file.
- The `maxiter` parameter of the PSO has been set to `5` to speed up debugging; you might want to set it to another value (default = `100`).

The code is therefore:

import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.base import clone
# ====================== RF TRAINING ======================
# Preprocessing
def build_sample(season, discount_percentage):
    return pd.DataFrame({
        'season': [season],
        'discount_percentage': [discount_percentage]
    })
columns_to_encode = ["season"]
columns_to_scale = ["discount_percentage"]
encoder = OneHotEncoder()
scaler = StandardScaler()
preproc = ColumnTransformer(
    transformers=[
        ("encoder", Pipeline([("OneHotEncoder", encoder)]), columns_to_encode),
        ("scaler", Pipeline([("StandardScaler", scaler)]), columns_to_scale)
    ]
)
# Model
myRFClassifier = RandomForestRegressor(
    n_estimators = 500,
    random_state = 12,
    bootstrap = True,
    oob_score = True)
pipeline_list = [
    ('preproc', preproc),
    ('clf', myRFClassifier)
]
pipe = Pipeline(pipeline_list)
# Dataset
df_tot = pd.read_excel("so_data.xlsx")
# One sub-dataframe per product, without the 'product' column
df_dict = {
    product: df_tot[df_tot['product'] == product].drop(columns=['product'])
    for product in pd.unique(df_tot['product'])
}
# Fit
print("Training ...")
# One independent (cloned) pipeline per product
pipe_dict = {
    product: clone(pipe) for product in df_dict.keys()
}
for product, df in df_dict.items():
    X = df.drop(columns=["sales_uplift_norm"])
    y = df["sales_uplift_norm"]
    pipe_dict[product].fit(X,y)
# ====================== OPTIMIZATION ======================
from pyswarm import pso
# Parameter of PSO
maxiter = 5
n_product = len(pipe_dict.keys())
# Constraints
budget = 20
units = [1, 1, 1]
mrp = [1, 1, 1]
lb = [0.0, 0.0, 0.0]
ub = [0.3, 0.4, 0.4]
# Must always remain >= 0
def con(x):
    s = 0
    for i in range(n_product):
        s += units[i] * mrp[i] * x[i]
    return budget - s
print("Optimization ...")
# Save optimal discounts for every product and every season
df_opti = pd.DataFrame(data=None, columns=df_tot.columns)
for season in pd.unique(df_tot['season']):

    # Objective function to minimize
    def obj(x):
        s = 0
        for i, product in enumerate(pipe_dict.keys()):
            # predict() returns a 1-element array; take the scalar
            s += pipe_dict[product].predict(build_sample(season, x[i]))[0]
        return -s

    # PSO
    xopt, fopt = pso(obj, lb, ub, f_ieqcons=con, maxiter=maxiter)
    print("Season: {}\t xopt: {}".format(season, xopt))

    # Store result
    df_opti = pd.concat([
        df_opti,
        pd.DataFrame({
            'product': list(pipe_dict.keys()),
            'season': [season] * n_product,
            'discount_percentage': xopt,
            'sales_uplift_norm': [
                pipe_dict[product].predict(build_sample(season, xopt[i]))[0]
                for i, product in enumerate(pipe_dict.keys())
            ]
        })
    ])
# Save result
df_opti = df_opti.reset_index(drop=True)
df_opti.to_excel("so_result.xlsx")
print("Summary")
print(df_opti)

It gives:

Training ...
Optimization ...
Stopping search: maximum iterations reached --> 5
Season: summer xopt: [0.1941521 0.11233673 0.36548761]
Stopping search: maximum iterations reached --> 5
Season: winter xopt: [0.18670604 0.37829516 0.21857777]
Stopping search: maximum iterations reached --> 5
Season: monsoon xopt: [0.14898102 0.39847885 0.18889792]
Summary
  product   season  discount_percentage  sales_uplift_norm
0       A   summer             0.194152           0.175973
1       B   summer             0.112337           0.229735
2       C   summer             0.365488           0.374510
3       A   winter             0.186706          -0.028205
4       B   winter             0.378295           0.266675
5       C   winter             0.218578           0.146012
6       A  monsoon             0.148981           0.199073
7       B  monsoon             0.398479           0.307632
8       C  monsoon             0.188898           0.210134
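
As a quick sanity check on the constraint convention, the sketch below re-declares `con` with the same placeholder values as above (`units` and `mrp` set to 1, `budget = 20`) and evaluates it at the upper bounds `ub` and at the summer discounts printed above. The helper `is_feasible` is a name introduced here purely for illustration. Note that the discounts can sum to at most 0.3 + 0.4 + 0.4 = 1.1, so with these placeholder values the budget of 20 can never be exceeded; the constraint should only start to matter once realistic `units` and `mrp` are plugged in.

```python
# Placeholder values, as assumed in the answer above
budget = 20
units = [1, 1, 1]
mrp = [1, 1, 1]
ub = [0.3, 0.4, 0.4]


def con(x):
    """Remaining budget; a point x is feasible when this is >= 0."""
    spent = sum(u * m * d for u, m, d in zip(units, mrp, x))
    return budget - spent


def is_feasible(x):
    """Hypothetical helper: True when x satisfies con(x) >= 0.0."""
    return con(x) >= 0.0


# Summer discounts taken from the output above
xopt_summer = [0.1941521, 0.11233673, 0.36548761]

print("slack at summer optimum:", con(xopt_summer))
print("slack at upper bounds:  ", con(ub))
print("upper bounds feasible:  ", is_feasible(ub))
```

If the last line prints `True`, every point inside the bounds is feasible and the PSO is effectively solving an unconstrained problem, which may be worth keeping in mind when interpreting the discounts above.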