Implementing a custom acquisition function
Upper Confidence Bound (UCB)
The Upper Confidence Bound (UCB) acquisition function balances exploration and
exploitation by assigning a score of $\mu + \sqrt{\beta} \cdot \sigma$ if the posterior
distribution is normal with mean $\mu$ and variance $\sigma^2$. This "analytic" version
is implemented in the UpperConfidenceBound class. The Monte Carlo version of UCB is
implemented in the qUpperConfidenceBound class, which also allows for q-batches of
size greater than one. (The derivation of q-UCB is given in Appendix A of
Wilson et al., 2017.)
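As a quick numeric illustration of the analytic score, the sketch below computes
$\mu + \sqrt{\beta} \cdot \sigma$ in plain Python for a normal posterior. The function
name ucb_score and the posterior values are hypothetical and chosen for illustration
only; this is not BoTorch's implementation.

```python
import math


def ucb_score(mean: float, variance: float, beta: float) -> float:
    """Analytic UCB for a normal posterior: mean + sqrt(beta) * std."""
    if variance < 0 or beta < 0:
        raise ValueError("variance and beta must be non-negative")
    return mean + math.sqrt(beta * variance)


# Hypothetical posterior summaries at two candidate points.
# Point A has the higher mean; point B has the higher uncertainty.
point_a = {"mean": 0.50, "variance": 0.01}
point_b = {"mean": 0.45, "variance": 0.25}

for beta in (0.01, 4.0):
    score_a = ucb_score(point_a["mean"], point_a["variance"], beta)
    score_b = ucb_score(point_b["mean"], point_b["variance"], beta)
    preferred = "A" if score_a > score_b else "B"
    print(f"beta={beta}: A={score_a:.4f}, B={score_b:.4f}, preferred={preferred}")
```

With these values, the small beta favors the point with the higher mean (exploitation),
while the large beta favors the more uncertain point (exploration).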
A scalarized version of q-UCB
Suppose now that we are in a multi-output setting, where, e.g., we model the effects of
a design on multiple metrics. We first show a simple extension of the q-UCB acquisition
function that accepts a multi-output model and performs q-UCB on a scalarized version of
the multiple outputs, achieved via a vector of weights. Implementing a new acquisition
function in BoTorch is straightforward: one only needs to implement the constructor and
a forward method.
import math
from typing import Optional

import torch
from botorch.acquisition.monte_carlo import MCAcquisitionFunction
from botorch.models.model import Model
from botorch.sampling.base import MCSampler
from botorch.sampling.normal import SobolQMCNormalSampler
from botorch.utils import t_batch_mode_transform
from torch import Tensor


class qScalarizedUpperConfidenceBound(MCAcquisitionFunction):
    def __init__(
        self,
        model: Model,
        beta: Tensor,
        weights: Tensor,
        sampler: Optional[MCSampler] = None,
    ) -> None:
        # we use the AcquisitionFunction constructor, since that of
        # MCAcquisitionFunction performs some validity checks that we don't want here
        super(MCAcquisitionFunction, self).__init__(model=model)
        if sampler is None:
            sampler = SobolQMCNormalSampler(sample_shape=torch.Size([512]))
        self.sampler = sampler
        self.register_buffer("beta", torch.as_tensor(beta))
        self.register_buffer("weights", torch.as_tensor(weights))

    @t_batch_mode_transform()
    def forward(self, X: Tensor) -> Tensor:
        """Evaluate scalarized qUCB on the candidate set `X`.

        Args:
            X: A `(b) x q x d`-dim Tensor of `(b)` t-batches with `q` `d`-dim
                design points each.

        Returns:
            Tensor: A `(b)`-dim Tensor of Upper Confidence Bound values at the
                given design points `X`.
        """
        posterior = self.model.posterior(X)
        samples = self.get_posterior_samples(posterior)  # n x b x q x o
        scalarized_samples = samples.matmul(self.weights)  # n x b x q
        mean = posterior.mean  # b x q x o
        scalarized_mean = mean.matmul(self.weights)  # b x q
        ucb_samples = (
            scalarized_mean
            + math.sqrt(self.beta * math.pi / 2)
            * (scalarized_samples - scalarized_mean).abs()
        )
        # max over the q points of each sample, then average over the n samples
        return ucb_samples.max(dim=-1)[0].mean(dim=0)
Note that qScalarizedUpperConfidenceBound is very similar to qUpperConfidenceBound
and only requires a few lines of new code to accommodate scalarization of multiple
outputs. The @t_batch_mode_transform decorator ensures that the input X has an
explicit t-batch dimension (code comments are added with shapes for clarity).
See the end of this tutorial for a quick and easy way of achieving the same
scalarization effect using ScalarizedPosteriorTransform.
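The factor math.sqrt(self.beta * math.pi / 2) in forward can be sanity-checked for q=1
and a single output. For a standard normal $Z$, $E|Z| = \sqrt{2/\pi}$, so the
expectation of $\mu + \sqrt{\beta \pi / 2} \, |\gamma - \mu|$ with
$\gamma \sim N(\mu, \sigma^2)$ equals the analytic score $\mu + \sqrt{\beta} \cdot \sigma$.
The sketch below checks this with plain-Python sampling. The helper name mc_ucb, the
sample count, and the input values are assumptions for illustration, and the sketch
does not use BoTorch's samplers.

```python
import math
import random


def mc_ucb(mean: float, std: float, beta: float, n_samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of UCB for q=1.

    Averages mean + sqrt(beta * pi / 2) * |sample - mean| over normal samples.
    """
    rng = random.Random(seed)
    scale = math.sqrt(beta * math.pi / 2)
    total = 0.0
    for _ in range(n_samples):
        sample = rng.gauss(mean, std)  # draw from N(mean, std^2)
        total += mean + scale * abs(sample - mean)
    return total / n_samples


mean, std, beta = 0.5, 0.2, 0.1  # hypothetical values
analytic = mean + math.sqrt(beta) * std
estimate = mc_ucb(mean, std, beta, n_samples=100_000)
print(f"analytic={analytic:.4f}, monte_carlo={estimate:.4f}")
```

For q > 1, forward takes the maximum over the q points within each sample before
averaging over the samples.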
Ad-hoc testing q-Scalarized-UCB
Before hooking the newly defined acquisition function into a Bayesian Optimization loop,
we should test it. For this we'll just make sure that it properly evaluates on a
compatible multi-output model. Here we just define a basic multi-output SingleTaskGP
model trained on synthetic data.
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood
torch.set_default_dtype(torch.double)
# generate synthetic data
X = torch.rand(20, 2)
Y = torch.stack([torch.sin(X[:, 0]), torch.cos(X[:, 1])], -1)
# construct and fit the multi-output model
gp = SingleTaskGP(train_X=X, train_Y=Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)
# construct the acquisition function
qSUCB = qScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
# evaluate on single q-batch with q=3
qSUCB(torch.rand(3, 2))
tensor([0.5276], grad_fn=<MeanBackward1>)
# batch-evaluate on two q-batches with q=3
qSUCB(torch.rand(2, 3, 2))
tensor([0.5630, 0.5453], grad_fn=<MeanBackward1>)
A scalarized version of analytic UCB (q=1 only)
We can also write an analytic version of UCB for a multi-output model, assuming a
multivariate normal posterior and q=1. The new class ScalarizedUpperConfidenceBound
subclasses AnalyticAcquisitionFunction instead of MCAcquisitionFunction. In contrast
to the MC version, instead of applying the weights to the MC samples, we directly
scalarize the mean vector $\mu$ and covariance matrix $\Sigma$ and apply standard UCB
to the resulting univariate normal distribution, which has mean $w^T \mu$ and variance
$w^T \Sigma w$. In addition to the @t_batch_mode_transform decorator, here we also
pass expected_q=1 to ensure that the input X has q=1.
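The scalarization step can be worked through by hand on a small example. The sketch
below computes $w^T \mu$ and $w^T \Sigma w$ with plain Python lists and then applies
the UCB score to the scalarized distribution. The helper name scalarize and all
numeric values are hypothetical; the posterior here is made up rather than taken from
a fitted model.

```python
import math
from typing import List, Tuple


def scalarize(
    weights: List[float], mean: List[float], cov: List[List[float]]
) -> Tuple[float, float]:
    """Return (w^T mean, w^T cov w) for a weight vector w."""
    o = len(weights)
    scalarized_mean = sum(weights[i] * mean[i] for i in range(o))
    scalarized_variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(o) for j in range(o)
    )
    return scalarized_mean, scalarized_variance


# Hypothetical two-output posterior at a single design point.
weights = [0.1, 0.5]
mean = [1.0, 2.0]
cov = [[0.04, 0.01], [0.01, 0.09]]
beta = 0.1

s_mean, s_var = scalarize(weights, mean, cov)
ucb = s_mean + math.sqrt(beta * s_var)
print(f"mean={s_mean:.4f}, variance={s_var:.4f}, ucb={ucb:.4f}")
```

Note that the off-diagonal covariance terms contribute to the scalarized variance, so
correlated outputs change the exploration bonus even when the weights are fixed.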
Note: BoTorch also provides a ScalarizedPosteriorTransform abstraction that can be
used with any existing analytic acquisition function and automatically performs the
scalarization we implement manually below. See the end of this tutorial for a usage
example.
from botorch.acquisition import AnalyticAcquisitionFunction


class ScalarizedUpperConfidenceBound(AnalyticAcquisitionFunction):
    def __init__(
        self,
        model: Model,
        beta: Tensor,
        weights: Tensor,
        maximize: bool = True,
    ) -> None:
        # we use the AcquisitionFunction constructor, since that of
        # AnalyticAcquisitionFunction performs some validity checks that we don't want here
        super(AnalyticAcquisitionFunction, self).__init__(model)
        self.maximize = maximize
        self.register_buffer("beta", torch.as_tensor(beta))
        self.register_buffer("weights", torch.as_tensor(weights))

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: Tensor) -> Tensor:
        """Evaluate the Upper Confidence Bound on the candidate set `X` using
        scalarization.

        Args:
            X: A `(b) x 1 x d`-dim Tensor of `(b)` t-batches of `d`-dim design
                points each.

        Returns:
            A `(b)`-dim Tensor of Upper Confidence Bound values at the given
                design points `X`.
        """
        self.beta = self.beta.to(X)
        batch_shape = X.shape[:-2]
        posterior = self.model.posterior(X)
        means = posterior.mean.squeeze(dim=-2)  # b x o
        scalarized_mean = means.matmul(self.weights)  # b
        covs = posterior.mvn.covariance_matrix  # b x o x o
        weights = self.weights.view(
            1, -1, 1
        )  # 1 x o x 1 (assume single batch dimension)
        weights = weights.expand(batch_shape + weights.shape[1:])  # b x o x 1
        weights_transpose = weights.permute(0, 2, 1)  # b x 1 x o
        scalarized_variance = torch.bmm(
            weights_transpose, torch.bmm(covs, weights)
        ).view(
            batch_shape
        )  # b
        delta = (self.beta.expand_as(scalarized_mean) * scalarized_variance).sqrt()
        if self.maximize:
            return scalarized_mean + delta
        else:
            return scalarized_mean - delta
Ad-hoc testing Scalarized-UCB
Notice that we pass in an explicit q-batch dimension for consistency, even though q=1.
# construct the acquisition function
SUCB = ScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
# evaluate on single point
SUCB(torch.rand(1, 2))
tensor([0.3412], grad_fn=<AddBackward0>)
# batch-evaluate on 3 points
SUCB(torch.rand(3, 1, 2))
tensor([0.5529, 0.5604, 0.5341], grad_fn=<AddBackward0>)
Appendix: Using ScalarizedPosteriorTransform
Using the ScalarizedPosteriorTransform abstraction, the functionality of
ScalarizedUpperConfidenceBound implemented above can be easily achieved in just a few
lines of code. PosteriorTransforms can be used with both the MC and analytic
acquisition functions.
from botorch.acquisition.objective import ScalarizedPosteriorTransform
from botorch.acquisition.analytic import UpperConfidenceBound
pt = ScalarizedPosteriorTransform(weights=torch.tensor([0.1, 0.5]))
SUCB = UpperConfidenceBound(gp, beta=0.1, posterior_transform=pt)