
Implementing a custom acquisition function

Upper Confidence Bound (UCB)

The Upper Confidence Bound (UCB) acquisition function balances exploration and exploitation by assigning a score of $\mu + \sqrt{\beta} \cdot \sigma$ if the posterior distribution is normal with mean $\mu$ and variance $\sigma^2$. This "analytic" version is implemented in the UpperConfidenceBound class. The Monte Carlo version of UCB is implemented in the qUpperConfidenceBound class, which also allows for q-batches of size greater than one. (The derivation of q-UCB is given in Appendix A of Wilson et al., 2017.)
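
For orientation, here is a minimal sketch (using hypothetical synthetic data, separate from the running example developed below) of how the built-in analytic and MC classes are constructed and evaluated on a fitted single-output model:

import torch

from botorch.acquisition.analytic import UpperConfidenceBound
from botorch.acquisition.monte_carlo import qUpperConfidenceBound
from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood

# toy single-output model on made-up data (illustration only)
train_X = torch.rand(10, 2, dtype=torch.double)
train_Y = torch.sin(train_X).sum(dim=-1, keepdim=True)
model = SingleTaskGP(train_X=train_X, train_Y=train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))

# analytic UCB requires q=1; the MC version accepts arbitrary q
ucb = UpperConfidenceBound(model, beta=0.2)
qucb = qUpperConfidenceBound(model, beta=0.2)
ucb(torch.rand(1, 2, dtype=torch.double))   # a single design point
qucb(torch.rand(4, 2, dtype=torch.double))  # a q-batch of four points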

A scalarized version of q-UCB

Suppose now that we are in a multi-output setting, where, e.g., we model the effects of a design on multiple metrics. We first show a simple extension of the q-UCB acquisition function that accepts a multi-output model and performs q-UCB on a scalarized version of the multiple outputs, achieved via a vector of weights. Implementing a new acquisition function in botorch is easy; one simply needs to implement the constructor and a forward method.

import math
from typing import Optional

import torch
from botorch.acquisition.monte_carlo import MCAcquisitionFunction
from botorch.models.model import Model
from botorch.sampling.base import MCSampler
from botorch.sampling.normal import SobolQMCNormalSampler
from botorch.utils import t_batch_mode_transform
from torch import Tensor


class qScalarizedUpperConfidenceBound(MCAcquisitionFunction):
    def __init__(
        self,
        model: Model,
        beta: Tensor,
        weights: Tensor,
        sampler: Optional[MCSampler] = None,
    ) -> None:
        # we use the AcquisitionFunction constructor, since that of
        # MCAcquisitionFunction performs some validity checks that we don't want here
        super(MCAcquisitionFunction, self).__init__(model=model)
        if sampler is None:
            sampler = SobolQMCNormalSampler(sample_shape=torch.Size([512]))
        self.sampler = sampler
        self.register_buffer("beta", torch.as_tensor(beta))
        self.register_buffer("weights", torch.as_tensor(weights))

    @t_batch_mode_transform()
    def forward(self, X: Tensor) -> Tensor:
        """Evaluate scalarized qUCB on the candidate set `X`.

        Args:
            X: A `(b) x q x d`-dim Tensor of `(b)` t-batches with `q` `d`-dim
                design points each.

        Returns:
            Tensor: A `(b)`-dim Tensor of Upper Confidence Bound values at the
                given design points `X`.
        """
        posterior = self.model.posterior(X)
        samples = self.get_posterior_samples(posterior)  # n x b x q x o
        scalarized_samples = samples.matmul(self.weights)  # n x b x q
        mean = posterior.mean  # b x q x o
        scalarized_mean = mean.matmul(self.weights)  # b x q
        ucb_samples = (
            scalarized_mean
            + math.sqrt(self.beta * math.pi / 2)
            * (scalarized_samples - scalarized_mean).abs()
        )
        return ucb_samples.max(dim=-1)[0].mean(dim=0)

Note that qScalarizedUpperConfidenceBound is very similar to qUpperConfidenceBound and only requires a few lines of new code to accommodate scalarization of multiple outputs. The @t_batch_mode_transform decorator ensures that the input X has an explicit t-batch dimension (the code comments note the tensor shapes for clarity).

See the end of this tutorial for a quick and easy way of achieving the same scalarization effect using ScalarizedPosteriorTransform.

Ad-hoc testing q-Scalarized-UCB

Before hooking the newly defined acquisition function into a Bayesian Optimization loop, we should test it. For this we'll just make sure that it properly evaluates on a compatible multi-output model. Here we just define a basic multi-output SingleTaskGP model trained on synthetic data.

import torch

from botorch.fit import fit_gpytorch_mll
from botorch.models import SingleTaskGP
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood

torch.set_default_dtype(torch.double)

# generate synthetic data
X = torch.rand(20, 2)
Y = torch.stack([torch.sin(X[:, 0]), torch.cos(X[:, 1])], -1)

# construct and fit the multi-output model
gp = SingleTaskGP(train_X=X, train_Y=Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)

# construct the acquisition function
qSUCB = qScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
# evaluate on single q-batch with q=3
qSUCB(torch.rand(3, 2))
Output:
tensor([0.5276], grad_fn=<MeanBackward1>)
# batch-evaluate on two q-batches with q=3
qSUCB(torch.rand(2, 3, 2))
Output:
tensor([0.5630, 0.5453], grad_fn=<MeanBackward1>)

A scalarized version of analytic UCB (q=1 only)

We can also write an analytic version of UCB for a multi-output model, assuming a multivariate normal posterior and q=1. The new class ScalarizedUpperConfidenceBound subclasses AnalyticAcquisitionFunction instead of MCAcquisitionFunction. In contrast to the MC version, instead of applying the weights to the MC samples, we directly scalarize the mean vector $\mu$ and covariance matrix $\Sigma$ and apply standard UCB to the resulting univariate normal distribution, which has mean $w^T \mu$ and variance $w^T \Sigma w$. In addition to the @t_batch_mode_transform decorator, here we also pass expected_q=1 to ensure that the input X has q=1.
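
Concretely, since $w^T Y \sim \mathcal{N}(w^T \mu, \, w^T \Sigma w)$ when $Y \sim \mathcal{N}(\mu, \Sigma)$, the value computed in the forward method below is $w^T \mu(x) + \sqrt{\beta \, w^T \Sigma(x) \, w}$ (with the sign of the second term flipped when maximize=False).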

Note: BoTorch also provides a ScalarizedPosteriorTransform abstraction that can be used with any existing analytic acquisition function and automatically performs the scalarization we implement manually below. See the end of this tutorial for a usage example.

from botorch.acquisition import AnalyticAcquisitionFunction


class ScalarizedUpperConfidenceBound(AnalyticAcquisitionFunction):
    def __init__(
        self,
        model: Model,
        beta: Tensor,
        weights: Tensor,
        maximize: bool = True,
    ) -> None:
        # we use the AcquisitionFunction constructor, since that of
        # AnalyticAcquisitionFunction performs some validity checks that we don't want here
        super(AnalyticAcquisitionFunction, self).__init__(model)
        self.maximize = maximize
        self.register_buffer("beta", torch.as_tensor(beta))
        self.register_buffer("weights", torch.as_tensor(weights))

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: Tensor) -> Tensor:
        """Evaluate the Upper Confidence Bound on the candidate set X using scalarization

        Args:
            X: A `(b) x d`-dim Tensor of `(b)` t-batches of `d`-dim design
                points each.

        Returns:
            A `(b)`-dim Tensor of Upper Confidence Bound values at the given
                design points `X`.
        """
        self.beta = self.beta.to(X)
        batch_shape = X.shape[:-2]
        posterior = self.model.posterior(X)
        means = posterior.mean.squeeze(dim=-2)  # b x o
        scalarized_mean = means.matmul(self.weights)  # b
        covs = posterior.mvn.covariance_matrix  # b x o x o
        weights = self.weights.view(1, -1, 1)  # 1 x o x 1 (assume single batch dimension)
        weights = weights.expand(batch_shape + weights.shape[1:])  # b x o x 1
        weights_transpose = weights.permute(0, 2, 1)  # b x 1 x o
        scalarized_variance = torch.bmm(
            weights_transpose, torch.bmm(covs, weights)
        ).view(batch_shape)  # b
        delta = (self.beta.expand_as(scalarized_mean) * scalarized_variance).sqrt()
        if self.maximize:
            return scalarized_mean + delta
        else:
            return scalarized_mean - delta

Ad-hoc testing Scalarized-UCB

Notice that we pass in an explicit q-batch dimension for consistency, even though q=1.

# construct the acquisition function
SUCB = ScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
# evaluate on single point
SUCB(torch.rand(1, 2))
Output:
tensor([0.3412], grad_fn=<AddBackward0>)
# batch-evaluate on 3 points
SUCB(torch.rand(3, 1, 2))
Output:
tensor([0.5529, 0.5604, 0.5341], grad_fn=<AddBackward0>)

Appendix: Using ScalarizedPosteriorTransform

Using the ScalarizedPosteriorTransform abstraction, the functionality of ScalarizedUpperConfidenceBound implemented above can be easily achieved in just a few lines of code. PosteriorTransforms can be used with both the MC and analytic acquisition functions.

from botorch.acquisition.objective import ScalarizedPosteriorTransform
from botorch.acquisition.analytic import UpperConfidenceBound

pt = ScalarizedPosteriorTransform(weights=torch.tensor([0.1, 0.5]))
SUCB = UpperConfidenceBound(gp, beta=0.1, posterior_transform=pt)
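
The same transform can also be passed to the MC acquisition function. As a minimal sketch (reusing the multi-output model gp fitted above):

from botorch.acquisition.monte_carlo import qUpperConfidenceBound

# the posterior transform scalarizes the multi-output posterior before sampling
qSUCB_pt = qUpperConfidenceBound(gp, beta=0.1, posterior_transform=pt)
# batch-evaluate on two q-batches with q=3
qSUCB_pt(torch.rand(2, 3, 2))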