What is the fastest way to minimize a function in Python?

孤城傲影 · asked 2020-12-31 10:00

So I have the following minimization problem: I need to find the vector w that minimizes the function below, subject to the bounds and equality constraint shown:

import numpy as np
from scipy.optimize import minimize

matrix = np.array([[1.0, 1.5, -2.],
                   [0.5, 3.0, 2.5],
                   [1.0, 0.25, 0.75]])

def fct(x):
    return x.dot(matrix).dot(x)

x0 = np.ones(3) / 3
cons = ({'type': 'eq', 'fun': lambda x: x.sum() - 1.0})
bnds = [(0, 1)] * 3

w = minimize(fct, x0, method='SLSQP', bounds=bnds, constraints=cons)['x']

2 Answers
  • 2020-12-31 10:27

    Based on pylang's comments, I calculated the Jacobian of my objective, which leads to the following function:

    def fct_deriv(x):
        # gradient of x·M·x is (M + Mᵀ)·x; it equals 2·M·x only when M is symmetric
        return (matrix + matrix.T).dot(x)
    

    The optimization call then becomes:

    minimize(fct, x0, method='SLSQP', jac=fct_deriv, bounds=bnds, constraints=cons)['x']
    

    However, that solution does not allow adding the Hessian, since the SLSQP method does not support one. Other optimization methods exist, but among the classic ones SLSQP is the only one accepting bounds and constraints at the same time (which is central to my optimization problem); the newer trust-constr method also accepts both and takes an explicit Hessian (see the sketch at the end of this answer).

    See below for full code:

    import numpy as np
    from scipy.optimize import minimize
    
    matrix = np.array([[1.0, 1.5, -2.],
                       [0.5, 3.0, 2.5],
                       [1.0, 0.25, 0.75]])
    
    def fct(x):
        return x.dot(matrix).dot(x)
    
    def fct_deriv(x):
        # gradient of x·M·x is (M + Mᵀ)·x (the matrix here is not symmetric)
        return (matrix + matrix.T).dot(x)
    
    x0 = np.ones(3) / 3
    cons = ({'type': 'eq', 'fun': lambda x: x.sum() - 1.0})
    bnds = [(0, 1)] * 3
    
    w = minimize(fct, x0, method='SLSQP', jac=fct_deriv, bounds=bnds, constraints=cons)['x']
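
    Since a wrong analytic gradient silently degrades SLSQP, it is worth verifying fct_deriv against a finite-difference approximation; scipy.optimize.check_grad does exactly that:

    from scipy.optimize import check_grad

    # the result should be tiny (~1e-6); a large value signals a wrong derivative
    print(check_grad(fct, fct_deriv, x0))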
    

    Edited (added the Jacobian of the equality constraint):

    cons2 = ({'type': 'eq', 'fun': lambda x: x.sum() - 1.0, 'jac': lambda x: np.ones_like(x)})
    
    w = minimize(fct, x0, method='SLSQP', jac=fct_deriv, bounds=bnds, constraints=cons2)['x']
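
    If the Hessian matters, scipy's trust-constr method (added in SciPy 1.1) accepts bounds, constraints, and an explicit Hessian all at once. A minimal sketch reusing the definitions above (for this quadratic objective the Hessian is the constant matrix M + Mᵀ; the helper fct_hess is just for illustration):

    from scipy.optimize import Bounds, LinearConstraint

    def fct_hess(x):
        # Hessian of x·M·x is the constant matrix M + Mᵀ
        return matrix + matrix.T

    w2 = minimize(fct, x0, method='trust-constr',
                  jac=fct_deriv, hess=fct_hess,
                  bounds=Bounds(0.0, 1.0),
                  constraints=LinearConstraint(np.ones((1, 3)), 1.0, 1.0))['x']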
    
  • 2020-12-31 10:41

    Introduction

    In general, the fastest approach will always be the one most tailored to the problem.

    As all optimization algorithms within scipy.minimize are quite general, there will always be faster methods that gain performance from special characteristics of your problem. It is a trade-off between how much analysis and work you invest and how much performance you gain.

    It's important to note that SLSQP, for example, is an algorithm able to tackle non-convex problems, in which case only convergence to some local optimum is guaranteed (ignoring numerical trouble within the implementation, which is always a possibility).

    This power comes at a price: SLSQP will be slower and less robust than algorithms designed specifically for convex problems (and even among convex problems, although all are polynomially solvable, there are easier classes such as Linear Programming and harder ones such as Semidefinite Programming).

    Problem Analysis

    As indicated in the comments above, this problem is non-convex for a general indefinite matrix M (very likely; I'm not giving a formal proof), which means there is no generally tractable approach without further assumptions (ignoring special cases, since some non-convex problems can be solved globally in polynomial time).

    This means:

    • every optimization algorithm within scipy will at most guarantee a local optimum, which might be arbitrarily bad compared to the global optimum (a crude multi-start mitigation is sketched below)
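
    One crude mitigation is a multi-start: rerun SLSQP from several random feasible starting points and keep the best local optimum found. A minimal sketch (the helper multistart_slsqp is illustrative only, and there is still no global guarantee):

    import numpy as np
    from scipy.optimize import minimize

    def multistart_slsqp(fct, n, n_starts=20, seed=0):
        """Run SLSQP from random feasible points on the simplex and
        keep the best local optimum found (no global guarantee)."""
        rng = np.random.default_rng(seed)
        cons = ({'type': 'eq', 'fun': lambda x: x.sum() - 1.0},)
        bnds = [(0, 1)] * n
        best = None
        for _ in range(n_starts):
            x0 = rng.dirichlet(np.ones(n))   # random point on the unit simplex
            res = minimize(fct, x0, method='SLSQP', bounds=bnds, constraints=cons)
            if res.success and (best is None or res.fun < best.fun):
                best = res
        return best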

    Assumption: M is positive-definite / negative-definite

    If we assume matrix M is positive-definite rather than indefinite, this is a convex optimization problem (the negative-definite case is the mirror image: one would maximize instead of minimize). As you seem to be interested in this case, here are some remarks and approaches; a quick definiteness check is sketched after the list below.

    This means:

    • SLSQP is guaranteed to converge to the global optimum
    • You can use solvers specifically designed for convex optimization problems
      • Commercial solvers: Gurobi, CPLEX, Mosek
      • Open-Source solvers: ECOS, SCS
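
    Such a check might look like this (a small sketch; np.linalg.cholesky succeeds exactly for symmetric positive-definite matrices):

    import numpy as np

    def is_positive_definite(m):
        """True iff m is symmetric positive-definite."""
        if not np.allclose(m, m.T):
            return False
        try:
            np.linalg.cholesky(m)          # fails for non-positive-definite m
            return True
        except np.linalg.LinAlgError:
            return False

    # M is negative-definite iff -M is positive-definite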

    Example code using Python + cvxpy + ecos/scs

    Within scipy there is no specialized convex-optimization solver except linprog, which is for Linear Programming only and is therefore unable to tackle this quadratic problem.

    There are other alternatives though, as mentioned above, and many possible routes to use them.

    Here I will present one of the simplest:

    • cvxpy is used for model formulation
      • It will automatically prove that this problem is convex!
        • (Model building and convexity inference can be costly)
    • ecos
      • General-purpose solver for many convex optimization problems
        • Based on the Interior-Point-Method
    • scs
      • General-purpose solver for many convex optimization problems
        • Based on the alternating direction method of multipliers (ADMM)
      • Supports two different approaches to solve linear equations:
        • direct (factorization based)
        • indirect (conjugate-gradient based)
          • GPU support for this one as it's all about matrix-vector products
      • Less accurate solutions in general compared to ECOS, but often much faster

    Example code:

    • Example using:
      • 1000x1000 matrix
      • Solver: SCS
        • indirect-mode
        • CPU
        • Multithreaded (automatically if BLAS available)

    Code:

    import time
    import numpy as np
    import cvxpy as cp                  # convex-optimization modelling
    
    """ Create some random pos-def matrix """
    N = 1000
    matrix_ = np.random.normal(size=(N, N))
    matrix = np.dot(matrix_, matrix_.T)   # A.dot(A.T) is positive (semi-)definite
    
    """ CVXPY-based Convex-Opt """
    print('\ncvxpy\n')
    x = cp.Variable(N)
    constraints = [x >= 0, x <= 1, cp.sum(x) == 1]
    objective = cp.Minimize(cp.quad_form(x, matrix))
    problem = cp.Problem(objective, constraints)
    time_start = time.perf_counter()
    problem.solve(solver=cp.SCS, use_indirect=True, verbose=True)  # or: solver=cp.ECOS
    time_end = time.perf_counter()
    print(problem.value)
    print('cvxpy (modelling) + ecos/scs (solving) used (secs): ', time_end - time_start)
    

    Example output:

    cvxpy
    
    ----------------------------------------------------------------------------
        SCS v1.2.6 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012-2016
    ----------------------------------------------------------------------------
    Lin-sys: sparse-indirect, nnz in A = 1003002, CG tol ~ 1/iter^(2.00)
    eps = 1.00e-03, alpha = 1.50, max_iters = 2500, normalize = 1, scale = 1.00
    Variables n = 1001, constraints m = 3003
    Cones:  primal zero / dual free vars: 1
        linear vars: 2000
        soc vars: 1002, soc blks: 1
    Setup time: 6.76e-02s
    ----------------------------------------------------------------------------
     Iter | pri res | dua res | rel gap | pri obj | dua obj | kap/tau | time (s)
    ----------------------------------------------------------------------------
         0|      inf       inf      -nan      -inf      -inf       inf  1.32e-01 
       100| 1.54e-02  1.48e-04  7.63e-01 -5.31e+00 -4.28e+01  1.10e-11  1.15e+00 
       200| 1.53e-02  1.10e-04  7.61e-01 -3.87e+00 -3.17e+01  1.08e-11  1.95e+00 
       300| 1.53e-02  7.25e-05  7.55e-01 -2.47e+00 -2.08e+01  1.07e-11  2.79e+00 
       400| 1.53e-02  3.61e-05  7.39e-01 -1.11e+00 -1.03e+01  1.06e-11  3.61e+00 
       500| 7.64e-03  2.55e-04  1.09e-01 -2.01e-01 -6.32e-02  1.05e-11  4.64e+00 
       560| 7.71e-06  4.24e-06  8.61e-04  2.17e-01  2.16e-01  1.05e-11  5.70e+00 
    ----------------------------------------------------------------------------
    Status: Solved
    Timing: Solve time: 5.70e+00s
        Lin-sys: avg # CG iterations: 1.71, avg solve time: 9.98e-03s
        Cones: avg projection time: 3.97e-06s
    ----------------------------------------------------------------------------
    Error metrics:
    dist(s, K) = 5.1560e-16, dist(y, K*) = 0.0000e+00, s'y/|s||y| = 2.4992e-17
    |Ax + s - b|_2 / (1 + |b|_2) = 7.7108e-06
    |A'y + c|_2 / (1 + |c|_2) = 4.2390e-06
    |c'x + b'y| / (1 + |c'x| + |b'y|) = 8.6091e-04
    ----------------------------------------------------------------------------
    c'x = 0.2169, -b'y = 0.2157
    ============================================================================
    0.21689554805292935
    cvxpy (modelling) + ecos/scs (solving) used (secs):  7.105745473999832
    

    Extra example: a 5000x5000 matrix takes ~9 minutes (without tuning parameters; see the sketch below).
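
    The SCS defaults visible in the log above (eps = 1.00e-03, max_iters = 2500) can be adjusted, since cvxpy forwards extra keyword arguments to the solver. A sketch, with option names taken from the log above:

    # solver options are forwarded by cvxpy to SCS
    problem.solve(solver=cp.SCS,
                  use_indirect=True,   # CG-based linear-system solver
                  eps=1e-4,            # tighter convergence tolerance
                  max_iters=10000,     # allow more ADMM iterations
                  verbose=True)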

    Some tiny extra remarks:

    • SCS is faster than ECOS (expected)
    • SCS/ECOS are both faster than naive SLSQP (called without a Jacobian), at least for every N >= 100; for big N the gap is orders of magnitude
    • I checked that this method matches SLSQP on small examples (a sketch of such a check follows)
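
    Such a check might look like this (a sketch, using a random positive-definite matrix so that both solvers target the same global optimum):

    import numpy as np
    import cvxpy as cp
    from scipy.optimize import minimize

    m_ = np.random.normal(size=(3, 3))
    m = m_ @ m_.T                      # random positive-definite matrix

    # SLSQP on the simplex-constrained quadratic
    w_slsqp = minimize(lambda v: v @ m @ v, np.ones(3) / 3, method='SLSQP',
                       bounds=[(0, 1)] * 3,
                       constraints={'type': 'eq', 'fun': lambda v: v.sum() - 1.0}).x

    # the same problem via cvxpy
    x = cp.Variable(3)
    cp.Problem(cp.Minimize(cp.quad_form(x, m)),
               [x >= 0, x <= 1, cp.sum(x) == 1]).solve()

    print(np.allclose(w_slsqp, x.value, atol=1e-4))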