问题
In Python only, and using data from a Pandas dataframe, how can I use PuLP to solve linear programming problems the same way I can in Excel? How much budget should be allocated to each Channel under the New Budget column so we maximize the total number of estimated successes? I'm really looking for a concrete example using data from a dataframe and not really high-level advice.
Problem Data Setup
Channel 30-day Cost Trials Success Cost Min Cost Max New Budget
0 Channel1 1765.21 9865 812 882.61 2647.82 0
1 Channel2 2700.00 15000 900 1350.00 4050.00 0
2 Channel3 2160.00 12000 333 1080.00 3240.00 0
This is a Maximization problem.
The objective function is:
objective_function = sum((df['New Budget']/(df['30-day Cost']/df['Trials']))*(df['Success']/df['Trials']))
The constraints are:
- The sum of
df['New Budget']
must equal5000
- The
New Budget
for a given channel can go no lower than theCost Min
- The
New Budget
for a given channel can go no higher than theCost Max
Any ideas how to translate this pandas dataframe solver linear problem using PuLP or any other solver approach? The end-result would be what you see in the image below.
回答1:
In general you create a dictionary of variables (x
in this case) and a model variable (mod
in this case). To create the objective you use sum
over the variables times some scalars, adding that result to mod
. You construct constraints by again computing linear combinations of variables, using >=
, <=
, or ==
, and adding that constraint to mod
. Finally you use mod.solve()
to get the solutions.
import pulp
# Create variables and model
x = pulp.LpVariable.dicts("x", df.index, lowBound=0)
mod = pulp.LpProblem("Budget", pulp.LpMaximize)
# Objective function
objvals = {idx: (1.0/(df['30-day Cost'][idx]/df['Trials'][idx]))*(df['Success'][idx]/float(df['Trials'][idx])) for idx in df.index}
mod += sum([x[idx]*objvals[idx] for idx in df.index])
# Lower and upper bounds:
for idx in df.index:
mod += x[idx] >= df['Cost Min'][idx]
mod += x[idx] <= df['Cost Max'][idx]
# Budget sum
mod += sum([x[idx] for idx in df.index]) == 5000.0
# Solve model
mod.solve()
# Output solution
for idx in df.index:
print idx, x[idx].value()
# 0 2570.0
# 1 1350.0
# 2 1080.0
print 'Objective', pulp.value(mod.objective)
# Objective 1798.70495012
Data:
import numpy as np
import pandas as pd
idx = [0, 1, 2]
d = {'channel': pd.Series(['Channel1', 'Channel2', 'Channel3'], index=idx),
'30-day Cost': pd.Series([1765.21, 2700., 2160.], index=idx),
'Trials': pd.Series([9865, 1500, 1200], index=idx),
'Success': pd.Series([812, 900, 333], index=idx),
'Cost Min': pd.Series([882.61, 1350.00, 1080.00], index=idx),
'Cost Max': pd.Series([2647.82, 4050.00, 3240.00], index=idx)}
df = pd.DataFrame(d)
df
# 30-day Cost Cost Max Cost Min Success Trials channel
# 0 1765.21 2647.82 882.61 812 9865 Channel1
# 1 2700.00 4050.00 1350.00 900 1500 Channel2
# 2 2160.00 3240.00 1080.00 333 1200 Channel3
来源:https://stackoverflow.com/questions/33160262/linear-programming-simplex-lp-pulp