The basic equation of the squared exponential (RBF) kernel is:

k(x, x') = σ² · exp(−‖x − x'‖² / (2l²))

Here l is the length scale and σ² is the variance parameter. The length scale controls how similar two points appear, since it rescales the distance between x and x'; the variance parameter scales the overall amplitude of the function.
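For concreteness, here is a minimal numpy sketch of this kernel (the function name and sample values are illustrative, not from the original post):

import numpy as np

def squared_exponential(x1, x2, length_scale=1.0, sigma=1.0):
    # k(x, x') = sigma^2 * exp(-||x - x'||^2 / (2 * l^2))
    sq_dist = np.sum((np.asarray(x1, float) - np.asarray(x2, float)) ** 2)
    return sigma ** 2 * np.exp(-sq_dist / (2.0 * length_scale ** 2))

# a larger length scale makes the same pair of points look more similar
print(squared_exponential([0.0, 0.0], [1.0, 1.0], length_scale=0.5))  # ~0.018
print(squared_exponential([0.0, 0.0], [1.0, 1.0], length_scale=5.0))  # ~0.96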
I want to optimize/train these parameters (l and σ) with my training data set. My training data set has the following form:
X: 2-D Cartesian coordinates as input data
y: received signal strength (RSS) of a Wi-Fi device at the 2-D coordinate points, as the observed output
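For instance, such data might look like this (the values are made up for illustration):

import numpy as np
X = np.array([[0.0, 0.0], [1.5, 2.0], [3.0, 1.0]])  # 2-D coordinates, shape (n, 2)
y = np.array([-42.0, -55.5, -60.1])                 # RSS at each point (e.g. in dBm)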
According to sklearn, the GaussianProcessRegressor class is defined as:
class sklearn.gaussian_process.GaussianProcessRegressor(kernel=None, alpha=1e-10, optimizer='fmin_l_bfgs_b', n_restarts_optimizer=0, normalize_y=False, copy_X_train=True, random_state=None)
Here, the optimizer is a string or callable, with the L-BFGS-B algorithm as the default optimization algorithm ('fmin_l_bfgs_b'). The optimizer can either be one of the internally supported optimizers for optimizing the kernel's parameters, specified by a string, or an externally defined optimizer passed as a callable. Furthermore, the only internally available optimizer in scikit-learn is fmin_l_bfgs_b.
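For reference, the documentation sketches the expected shape of such a callable roughly as follows (the body here is a placeholder; a real implementation would run an actual search):

def optimizer(obj_func, initial_theta, bounds):
    # obj_func: the objective to minimize; called as obj_func(theta) or
    #           obj_func(theta, eval_gradient=True)
    # initial_theta: the starting value for theta
    # bounds: the bounds on the values of theta
    theta_opt = initial_theta  # placeholder: run the actual optimization here
    func_min = obj_func(initial_theta, eval_gradient=False)
    return theta_opt, func_min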
However, I learned that the scipy package has many more optimizers. Since I wanted to use the trust-region-reflective algorithm to optimize the hyperparameters, I tried to implement it as follows:
import numpy as np
from scipy.optimize import least_squares
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

def fun_rosenbrock(Xvariable):
    return np.array([10*(Xvariable[1]-Xvariable[0]**2), (1-Xvariable[0])])

Xvariable = [1.0, 1.0]
kernel = C(1.0, (1e-5, 1e5)) * RBF(1, (1e-1, 1e3))
trust_region_method = least_squares(fun_rosenbrock, [10, 20, 30, 40, 50], bounds=[0, 100], method='trf')
gp = GaussianProcessRegressor(kernel=kernel, optimizer=trust_region_method, alpha=1.2, n_restarts_optimizer=10)
gp.fit(X, y)
Since I couldn't figure out what the parameter 'fun' actually is in my case, I resorted to using the Rosenbrock function from this example (the example is at the bottom of the page). I get an error in the console.
Is my approach of using the scipy package to optimize the kernel parameters correct? How can I print the optimized values of the parameters? And what is the parameter 'fun' in scipy.optimize.least_squares in my case?
Thank you!
There are three primary problems here:
- The objective function being optimized is the Rosenbrock function, which is a test function for benchmarking optimizers. What is needed instead is the cost function defined on the kernel parameters; internally, for GaussianProcessRegressor, this is the log-marginal-likelihood, and it is handed to the optimizer callable as a parameter (obj_func).
- The log-marginal-likelihood needs to be maximized (see section 1.7.1 of the scikit-learn user guide). Scipy's least_squares minimizes its objective, but the obj_func that scikit-learn passes to a custom optimizer is already the negative log-marginal-likelihood, so it can be minimized directly; there is no need to invert it.
- The optimizer passed into GaussianProcessRegressor has the wrong format: it must be a callable of the form specified under the 'optimizer' parameter in the docs (taking obj_func, initial_theta, and bounds, and returning the optimized theta and the objective value there), not the result of a finished least_squares call.
As a partially working example, using the kernel from the question, with the emphasis on the optimizer:
import numpy as np
from scipy.optimize import least_squares
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

def trust_region_optimizer(obj_func, initial_theta, bounds):
    # obj_func is already the negative log-marginal-likelihood, so minimize it
    # directly; caveat: least_squares minimizes the squared residual, which
    # matches minimizing obj_func only while obj_func stays positive
    result = least_squares(lambda theta: obj_func(theta, eval_gradient=False),
                           initial_theta, bounds=(bounds[:, 0], bounds[:, 1]),
                           method='trf')
    return result.x, result.fun[0]

X = np.random.random((10, 4))
y = np.random.random((10, 1))
kernel = C(1.0, (1e-5, 1e5)) * RBF(1.0, (1e-1, 1e3))
gp = GaussianProcessRegressor(kernel=kernel, optimizer=trust_region_optimizer,
                              alpha=1.2, n_restarts_optimizer=10)
gp.fit(X, y)
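To answer the question about printing the optimized values: after fitting, the tuned kernel is exposed on the regressor, so the hyperparameters can be printed directly:

print(gp.kernel_)                # the kernel with its optimized hyperparameters
print(gp.kernel_.theta)          # the same hyperparameters, log-transformed
print(np.exp(gp.kernel_.theta))  # back on the original scale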
The scipy optimizers return an OptimizeResult object. Using the minimization of the Rosenbrock test function as an example:
import numpy as np
from scipy.optimize import least_squares, rosen

res = least_squares(rosen, np.array([0.0, 0.0]), method='trf')
The optimized parameter values can then be accessed using:
res.x
and the resulting value of the function to be minimized:
res.fun
which is what the 'fun' parameter refers to: 'fun' is the objective (residual) function you pass to least_squares, and res.fun is its value at the solution. However, now that the optimizer is working internally, you will need to access the resulting function value from scikit-learn:
gp.log_marginal_likelihood_value_
Source: https://stackoverflow.com/questions/50265501/optimize-the-kernel-parameters-of-rbf-kernel-for-gpr-in-scikit-learn-using-inter