Question
While doing GP regression in GPflow 2.0, I want to set hard bounds on the lengthscale (i.e. limit the lengthscale optimization range). Following this thread (Setting hyperparameter optimization bounds in GPflow 2.0), I constructed a TensorFlow Probability bijector chain (see the bounded_lengthscale function below). However, the bijector chain does not prevent the model from optimizing outside the intended bounds. What do I need to change so that bounded_lengthscale puts hard bounds on the optimization?
Below is the MRE:
import gpflow
import numpy as np
from gpflow.utilities import print_summary
import tensorflow as tf
from tensorflow_probability import bijectors as tfb
# Noisy training data
noise = 0.3
X = np.arange(-3, 4, 1).reshape(-1, 1).astype('float64')
Y = (np.sin(X) + noise * np.random.randn(*X.shape)).reshape(-1,1)
def bounded_lengthscale(low, high, lengthscale):
    """Returns lengthscale Parameter with optimization bounds."""
    affine = tfb.AffineScalar(shift=low, scale=high-low)
    sigmoid = tfb.Sigmoid()
    logistic = tfb.Chain([affine, sigmoid])
    parameter = gpflow.Parameter(lengthscale, transform=logistic, dtype=tf.float32)
    parameter = tf.cast(parameter, dtype=tf.float64)
    return parameter
# build GPR model
k = gpflow.kernels.Matern52()
m = gpflow.models.GPR(data=(X, Y), kernel=k)
m.kernel.lengthscale.assign(bounded_lengthscale(0, 1, 0.5))
print_summary(m)
# train model
@tf.function(autograph=False)
def objective_closure():
    return -m.log_marginal_likelihood()
opt = gpflow.optimizers.Scipy()
opt_logs = opt.minimize(objective_closure,
                        m.trainable_variables)
print_summary(m)
Thanks!
Answer 1:
In the MWE you assign a new value to a Parameter that already exists (and does not have the logistic transform). The value you assign is the constrained-space value of the Parameter constructed with the logistic transform, but the transform itself is not carried over. Instead, you need to replace the Parameter that lacks the logistic transform with one that has the transform you want: m.kernel.lengthscale = bounded_lengthscale(0, 1, 0.5).
Note that the object you assign to the kernel.lengthscale attribute must be a Parameter instance; if you assign the return value of tf.cast(parameter) as in the MWE, it is equivalent to a constant and won't actually be optimised!
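As a quick sanity check (a minimal sketch reusing m and print_summary from the question), you can confirm that the attribute is still a gpflow.Parameter and therefore trainable:
# Sanity check: a plain Tensor (e.g. the result of tf.cast) is not a
# gpflow.Parameter and would be silently ignored by the optimiser.
assert isinstance(m.kernel.lengthscale, gpflow.Parameter)
print_summary(m)  # the lengthscale row should show its transform and trainable=True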
Simply removing the tf.cast in the MWE won't immediately work, due to a float32/float64 mismatch. To fix it, the AffineScalar bijector needs to be in float64. It does not have a dtype argument, so instead cast its shift= and scale= arguments to the required type. (Recall that tfb.Chain applies bijectors right-to-left, so this chain maps an unconstrained value x to low + (high - low) * sigmoid(x), which lies strictly inside (low, high).)
def bounded_lengthscale(low, high, lengthscale):
    """Make lengthscale tfp Parameter with optimization bounds."""
    affine = tfb.AffineScalar(shift=tf.cast(low, tf.float64),
                              scale=tf.cast(high - low, tf.float64))
    sigmoid = tfb.Sigmoid()
    logistic = tfb.Chain([affine, sigmoid])
    parameter = gpflow.Parameter(lengthscale, transform=logistic, dtype=tf.float64)
    return parameter
m.kernel.lengthscale = bounded_lengthscale(0, 1, 0.5)
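With that replacement in place (reusing the model m and data from the question), re-running the optimisation shows the bound being respected. A minimal usage sketch:
# Usage sketch: re-optimise with the bounded Parameter in place.
# A fresh closure is defined here rather than reusing the earlier
# @tf.function, which was traced against the old, unbounded variable.
opt = gpflow.optimizers.Scipy()
opt.minimize(lambda: -m.log_marginal_likelihood(), m.trainable_variables)
print(m.kernel.lengthscale.numpy())  # optimised value lies strictly in (0, 1)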
(GPflow should probably contain a helper function like this to make bounded parameter transforms easier to use - GPflow always appreciates people helping out, so if you want to turn this into a pull request, please do!)
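As an aside, newer TensorFlow Probability releases let the Sigmoid bijector take low/high arguments directly (AffineScalar was later deprecated in favour of Shift/Scale), so the whole chain can be collapsed. A hedged sketch, assuming a TFP version that supports these arguments; bounded_parameter is just an illustrative name:
def bounded_parameter(low, high, value):
    """gpflow.Parameter constrained to (low, high) via a scaled sigmoid."""
    sigmoid = tfb.Sigmoid(low=tf.cast(low, tf.float64),
                          high=tf.cast(high, tf.float64))
    return gpflow.Parameter(value, transform=sigmoid, dtype=tf.float64)

m.kernel.lengthscale = bounded_parameter(0, 1, 0.5)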
Source: https://stackoverflow.com/questions/59504125/bounding-hyperparameter-optimization-with-tensorflow-bijector-chain-in-gpflow-2