Question
I have a Python function of 64 variables, and I tried to optimise it using the L-BFGS-B method of the minimize function. However, this method depends quite strongly on the initial guess, and it failed to find the global minimum.
But I liked its ability to set bounds for the variables. Is there a way or function to find the global minimum while keeping bounds on the variables?
Answer 1:
This can be done with scipy.optimize.basinhopping. Basinhopping is a function designed to find the global minimum of an objective function. It does repeated minimizations using the function scipy.optimize.minimize and takes a random step in coordinate space after each minimization. Basinhopping can still respect bounds by using one of the minimizers that implement bounds (e.g. L-BFGS-B). Here is some code that shows how to do this:
import numpy as np
from scipy.optimize import basinhopping

# an example function with multiple minima
def f(x):
    return x.dot(x) + np.sin(np.linalg.norm(x) * np.pi)

# the starting point
x0 = [10., 10.]
# the bounds
xmin = [1., 1.]
xmax = [11., 11.]
# rewrite the bounds in the way required by L-BFGS-B
bounds = [(low, high) for low, high in zip(xmin, xmax)]
# use method L-BFGS-B because the problem is smooth and bounded
minimizer_kwargs = dict(method="L-BFGS-B", bounds=bounds)
res = basinhopping(f, x0, minimizer_kwargs=minimizer_kwargs)
print(res)
The above code will work for a simple case, but you can still end up in a forbidden region if basinhopping's random displacement routine takes you there. Luckily, that can be overridden by passing a custom step-taking routine using the keyword take_step:
class RandomDisplacementBounds(object):
    """random displacement with bounds"""
    def __init__(self, xmin, xmax, stepsize=0.5):
        self.xmin = xmin
        self.xmax = xmax
        self.stepsize = stepsize

    def __call__(self, x):
        """take a random step but ensure the new position is within the bounds"""
        while True:
            # this could be done in a much more clever way, but it will work for example purposes
            xnew = x + np.random.uniform(-self.stepsize, self.stepsize, np.shape(x))
            if np.all(xnew < self.xmax) and np.all(xnew > self.xmin):
                break
        return xnew

# define the new step taking routine and pass it to basinhopping
take_step = RandomDisplacementBounds(xmin, xmax)
result = basinhopping(f, x0, niter=100, minimizer_kwargs=minimizer_kwargs,
                      take_step=take_step)
print(result)
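A complementary safeguard is basinhopping's accept_test keyword, which takes a callable that can veto any candidate minimum lying outside the box. The sketch below reuses the example objective and bounds from above; the class name BoundsCheck is made up for illustration. With a bounded local minimizer such as L-BFGS-B the test is mostly a safety net; it matters more when the local minimizer does not enforce bounds itself.

```python
import numpy as np
from scipy.optimize import basinhopping


def f(x):
    # same example objective as above
    return x.dot(x) + np.sin(np.linalg.norm(x) * np.pi)


class BoundsCheck:
    """Hypothetical accept test: reject any candidate outside [xmin, xmax]."""

    def __init__(self, xmin, xmax):
        self.xmin = np.asarray(xmin, dtype=float)
        self.xmax = np.asarray(xmax, dtype=float)

    def __call__(self, **kwargs):
        # basinhopping passes f_new, x_new, f_old, x_old as keyword arguments
        x = kwargs["x_new"]
        return bool(np.all(x >= self.xmin) and np.all(x <= self.xmax))


xmin = [1., 1.]
xmax = [11., 11.]
x0 = np.array([10., 10.])
minimizer_kwargs = dict(method="L-BFGS-B", bounds=list(zip(xmin, xmax)))
result = basinhopping(f, x0, niter=20, minimizer_kwargs=minimizer_kwargs,
                      accept_test=BoundsCheck(xmin, xmax))
print(result.x, result.fun)
```

accept_test and take_step can be passed together, so the bounded step routine above and this check do not exclude each other.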
Answer 2:
Some common-sense suggestions for debugging and visualizing any optimizer on your function:
Are your objective function and your constraints reasonable?
If the objective function is a sum, say f() + g(), print those separately for all the x in "fx-opt.nptxt" (below); if f() is 99 % of the sum and g() 1 %, investigate.
Constraints: how many of the components x_i in xfinal are stuck at bounds, x_i <= lo_i or x_i >= hi_i?
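The bounds check can be sketched in a few lines of numpy. The helper name count_at_bounds, the tolerance tol and the sample numbers are all assumptions for illustration, not part of any library:

```python
import numpy as np


def count_at_bounds(x, lo, hi, tol=1e-8):
    """Count components of x sitting at (or beyond) the lower / upper bound."""
    x, lo, hi = (np.asarray(a, dtype=float) for a in (x, lo, hi))
    at_lo = x <= lo + tol
    at_hi = x >= hi - tol
    return int(at_lo.sum()), int(at_hi.sum())


lo = np.array([1.0, 1.0, 0.0, -0.1])
hi = np.array([12.0, 35.0, 6.75, 0.1])
x_final = np.array([1.0, 20.0, 6.75, 0.1])  # made-up optimizer output
n_lo, n_hi = count_at_bounds(x_final, lo, hi)
print("%d at lower bound, %d at upper bound, of %d" % (n_lo, n_hi, x_final.size))
```

If a large share of the components end up pinned like this, the bounds rather than the objective may be deciding the answer.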
How bumpy is your function on a global scale?
Run with several random startpoints, and save the results to analyze / plot:
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

title = "%s n %d ntermhess %d nsample %d seed %d" % (  # all params!
    __file__, n, ntermhess, nsample, seed)
print(title)
...
np.random.seed(seed)  # for reproducible runs
np.set_printoptions(threshold=100, edgeitems=10, linewidth=100,
    formatter=dict(float=lambda x: "%.3g" % x))  # float arrays %.3g
lo, hi = bounds.T  # vecs of numbers or +- np.inf
print("lo:", lo)
print("hi:", hi)
fx = []  # accumulate all the final f, x
for jsample in range(nsample):
    # x0 uniformly random in box lo .. hi --
    x0 = lo + np.random.uniform(size=n) * (hi - lo)
    x, f, d = fmin_l_bfgs_b(func, x0, approx_grad=1,
        m=ntermhess, factr=factr, pgtol=pgtol)
    print("f: %g x: %s x0: %s" % (f, x, x0))
    fx.append(np.r_[f, x])
fx = np.array(fx)  # nsample rows, 1 + dim cols
np.savetxt("fx-opt.nptxt", fx, fmt="%8.3g", header=title)  # to analyze / plot
ffinal = fx[:,0]
xfinal = fx[:,1:]
print("final f values, sorted:", np.sort(ffinal))
jbest = ffinal.argmin()
print("best x:", xfinal[jbest])
If some of the ffinal values look reasonably good, try more random startpoints near those; that's surely better than pure random.
If the xs are curves, or anything real, plot the best few x0 and xfinal.
(A rule of thumb is nsample ~ 5*d or 10*d in d dimensions. Too slow, too many? Reduce maxiter / maxeval, and loosen ftol: you don't need ftol 1e-6 for exploration like this.)
If you want reproducible results, then you must list ALL relevant parameters in the title and in derived files and plots. Otherwise, you'll be asking "where did this come from??"
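One way to draw "startpoints near those" is sketched below. It assumes finite bounds; the function name startpoints_near and the spread parameter (a fraction of the box width) are hypothetical choices, not anything from scipy:

```python
import numpy as np


def startpoints_near(xbest, lo, hi, nsample, spread=0.1, rng=None):
    """Sample nsample start points around xbest, clipped to the box lo .. hi.

    spread is the half-width of the sampling box as a fraction of hi - lo.
    Assumes lo and hi are finite.
    """
    rng = np.random.default_rng(rng)  # pass a seed for reproducible runs
    xbest, lo, hi = (np.asarray(a, dtype=float) for a in (xbest, lo, hi))
    step = spread * (hi - lo)
    x0s = xbest + rng.uniform(-1.0, 1.0, size=(nsample, xbest.size)) * step
    return np.clip(x0s, lo, hi)


lo = np.zeros(3)
hi = np.array([10.0, 10.0, 10.0])
xbest = np.array([0.2, 5.0, 9.9])  # made-up best point from a first round
x0s = startpoints_near(xbest, lo, hi, nsample=5, rng=0)
print(x0s)
```

Each row of x0s can then be fed to the same loop as above in place of the uniformly random x0.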
How bumpy is your function on an epsilon scale, ~ 10^-6?
Methods that approximate a gradient sometimes return their last estimate, but if not:
from scipy.optimize import approx_fprime  # public, forward differences
## from scipy.optimize._numdiff import approx_derivative  # private; 3-point, much better
for eps in [1e-3, 1e-6]:
    grad = approx_fprime(x, func, epsilon=eps)
    print("approx_fprime eps %g: %s" % (eps, grad))
If, however, the gradient estimate was poor / bumpy before the optimizer quit, you won't see that. Then you have to save all the intermediate [f, x, approx_fprime] to watch them too. That is easy in Python; ask if that's not clear.
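A minimal sketch of saving the intermediate values: wrap the objective in a callable that records every evaluation. The class name LoggedFunction and the stand-in objective shifted_sphere are assumptions for illustration; only f and x are logged here, and gradient estimates could be appended in the same way.

```python
import numpy as np
from scipy.optimize import minimize


class LoggedFunction:
    """Hypothetical wrapper that records every (x, f) the optimizer evaluates."""

    def __init__(self, func):
        self.func = func
        self.xs = []
        self.fs = []

    def __call__(self, x):
        fx = self.func(x)
        self.xs.append(np.array(x, dtype=float))  # copy, the optimizer may reuse x
        self.fs.append(float(fx))
        return fx


def shifted_sphere(x):
    # stand-in objective with its minimum at (1, 1), not the asker's merit function
    return float(np.sum((x - 1.0) ** 2))


logged = LoggedFunction(shifted_sphere)
res = minimize(logged, x0=np.array([3.0, -2.0]), method="L-BFGS-B",
               bounds=[(-5, 5), (-5, 5)])
history = np.column_stack([logged.fs, logged.xs])  # one row per evaluation: f, x
print("evaluations:", len(logged.fs), "best f seen:", min(logged.fs))
```

The history array has the same layout as fx above (f in column 0, x in the rest), so it can be written with np.savetxt and plotted the same way. Note that with a finite-difference gradient the log also contains the slightly perturbed points used for the gradient estimate.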
In some problem areas it's common to back up and restart from a purported xmin. For example, if you're lost on a country road, first find a major road, then restart from there.
Summary: don't expect any black-box optimizer to work on a function that's large-scale bumpy, or epsilon-scale bumpy, or both.
Invest in test scaffolding, and in ways to see what the optimizer is doing.
Answer 3:
Many thanks for your detailed reply. As I'm fairly new to Python, I didn't quite know how to work the code into my program, but here is my attempt at the optimisation:
x0=np.array((10, 13, f*2.5, 0.08, 10, f*1.5, 0.06, 20,
10, 14, f*2.5, 0.08, 10, f*1.75, 0.07, 20,
10, 15, f*2.5, 0.08, 10, f*2, 0.08, 20,
10, 16, f*2.5, 0.08, 10, f*2.25, 0.09, 20,
10, 17, f*2.5, -0.08, 10, f*2.5, -0.06, 20,
10, 18, f*2.5, -0.08, 10, f*2.75,-0.07, 20,
10, 19, f*2.5, -0.08, 10, f*3, -0.08, 20,
10, 20, f*2.5, -0.08, 10, f*3.25,-0.09, 20))
# boundary for each variable, each element in this restricts the corresponding element above
bnds=((1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35),
(1,12), (1,35), (0,f*6.75), (-0.1, 0.1),(1,35), (0,f*6.75), (-0.1, 0.1),(13, 35), )
from scipy.optimize import basinhopping
from scipy.optimize import minimize
merit=a*meritoflength + b*meritofROC + c*meritofproximity +d*(distancetoceiling+distancetofloor)+e*heightorder
minimizer_kwargs = {"method": "L-BFGS-B", "bounds": bnds, "tol":1e0}
ret = basinhopping(merit_function, x0, minimizer_kwargs=minimizer_kwargs, niter=10, T=0.01)
zoom = ret['x']
res = minimize(merit_function, zoom, method = 'L-BFGS-B', bounds=bnds, tol=1e-5)
print(res)
The merit function combines x0 with some other values to form 6 control points for 8 curves, then calculates their lengths, radii of curvature, etc. It returns the final merit as a linear combination of those parameters with some weights.
I used basinhopping with low precision to find some minima, then used minimize to increase the precision of the lowest minimum.
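For readers who want to try this coarse-then-refine pattern without the merit function, here is a small self-contained sketch. The objective toy_merit, the two-variable bounds and the tolerance values are stand-ins chosen for illustration, not the real problem:

```python
import numpy as np
from scipy.optimize import basinhopping, minimize


def toy_merit(x):
    # stand-in for merit_function: smooth, with several local minima
    return float(x.dot(x) + np.sin(np.linalg.norm(x) * np.pi))


bnds = [(1.0, 11.0), (1.0, 11.0)]
x0 = np.array([10.0, 10.0])

# stage 1: coarse global search with a loose local tolerance
coarse_kwargs = {"method": "L-BFGS-B", "bounds": bnds, "tol": 1e-2}
ret = basinhopping(toy_merit, x0, minimizer_kwargs=coarse_kwargs, niter=10, T=0.01)

# stage 2: polish the best point found with a tight tolerance
res = minimize(toy_merit, ret.x, method="L-BFGS-B", bounds=bnds, tol=1e-8)
print(res.x, res.fun)
```

The second stage can only polish the basin that the first stage found, so the quality of the result still depends on how well basinhopping explored.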
P.S. The platform I am running on is Enthought Canopy 1.3.0, numpy 1.8.0, scipy 0.13.2, Mac OS X 10.8.3.
Source: https://stackoverflow.com/questions/21670080/how-to-find-global-minimum-in-python-optimization-with-bounds