Acquisition function optimization with torch.optim
In this tutorial, we show how to use PyTorch's optim module for optimizing BoTorch MC
acquisition functions. This is useful if the acquisition function is stochastic in
nature (for example, because the base samples are re-sampled when using the
reparameterization trick, or because the model posterior itself is stochastic).
Note: A pre-packaged, more user-friendly version of the optimization loop we will
develop below is contained in the gen_candidates_torch function in the botorch.gen
module. This tutorial should be quite useful if you would like to implement custom
optimizers beyond what is contained in gen_candidates_torch.
As discussed in the CMA-ES tutorial, for deterministic acquisition functions BoTorch uses quasi-second order methods (such as L-BFGS-B or SLSQP) by default, which provide superior convergence speed in this situation.
Set up a toy model
We'll fit a SingleTaskGP model on noisy observations of the function
$f(x) = 1 - \|x\|_2$ in $d=5$ dimensions on the hypercube $[-1, 1]^d$.
# Install dependencies if we are running in colab
import sys
if 'google.colab' in sys.modules:
%pip install botorch
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood
d = 5
bounds = torch.stack([-torch.ones(d), torch.ones(d)])
# generate 50 random training inputs in [-1, 1]^d and evaluate f(x) = 1 - ||x||_2
train_X = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(50, d)
train_Y = 1 - torch.linalg.norm(train_X, dim=-1, keepdim=True)
# fit a SingleTaskGP model by maximizing the exact marginal log likelihood
model = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll);
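As a quick sanity check (not part of the original tutorial; test_X is an illustrative name), we can query the fitted model's posterior at a few random test points:
test_X = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(4, d)
posterior = model.posterior(test_X)
print(posterior.mean.shape, posterior.variance.shape)  # torch.Size([4, 1]) each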
Define acquisition function
We'll use qExpectedImprovement
with a StochasticSampler
that uses a small number of
MC samples. This results in a stochastic acquisition function that one should not
attempt to optimize with the quasi-second order methods that are used by default in
BoTorch's optimize_acqf
function.
from botorch.acquisition import qExpectedImprovement
from botorch.sampling.stochastic_samplers import StochasticSampler
# StochasticSampler re-draws base samples on each forward pass, so the
# resulting acquisition values are stochastic across evaluations
sampler = StochasticSampler(sample_shape=torch.Size([128]))
qEI = qExpectedImprovement(model, best_f=train_Y.max(), sampler=sampler)
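To see the stochasticity in action, we can evaluate qEI twice on the same candidate set; because the base samples are re-drawn on every forward pass, the two values will generally differ slightly. This is a small illustration rather than part of the tutorial's workflow (X_demo is an illustrative name):
X_demo = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(1, 2, d)  # a single q-batch with q=2
print(qEI(X_demo))
print(qEI(X_demo))  # a (slightly) different value for the same input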
Optimizing the acquisition function
We will perform optimization over N=5
random initial q
-batches with q=2
in
parallel. We use N
random restarts because the acquisition function is non-convex and
as a result we may get stuck in local minima.
N = 5
q = 2
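Concretely, the initial conditions form an N x q x d tensor (here 5 x 2 x 5) with gradients enabled, so that we can backpropagate through the acquisition function. As a rough sketch of a purely random initialization (X_init is an illustrative name; the next section replaces this naive draw with a smarter heuristic):
X_init = bounds[0] + (bounds[1] - bounds[0]) * torch.rand(N, q, d)
X_init.requires_grad_(True)  # we will take gradient steps on the candidates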
Choosing initial conditions via a heuristic
Using random initial conditions in conjunction with gradient-based optimizers can be problematic because qEI values and their corresponding gradients are often zero in large parts of the feature space. To mitigate this issue, BoTorch provides a heuristic for generating promising initial conditions (this dirty and not-so-little secret of Bayesian Optimization is actually very important for overall closed-loop performance).
Given a set of q-batches and associated acquisition function values, the
initialize_q_batch_nonneg heuristic samples promising initial conditions (without
replacement) from the multinomial distribution