As seen in the custom BoTorch model in Ax tutorial, Ax's `BotorchModel`

is flexible in allowing different components of the Bayesian optimization loop to be specified through a functional API. This tutorial walks through the steps of writing a custom acquisition function and then inserting it into Ax.

The Upper Confidence Bound (UCB) acquisition function balances exploration and exploitation by assigning a score of $\mu + \sqrt{\beta} \cdot \sigma$ if the posterior distribution is normal with mean $\mu$ and variance $\sigma^2$. This "analytic" version is implemented in the `UpperConfidenceBound`

class. The Monte Carlo version of UCB is implemented in the `qUpperConfidenceBound`

class, which also allows for q-batches of size greater than one. (The derivation of q-UCB is given in Appendix A of Wilson et. al., 2017).

Suppose now that we are in a multi-output setting, where, e.g., we model the effects of a design on multiple metrics. We first show a simple extension of the q-UCB acquisition function that accepts a multi-output model and performs q-UCB on a scalarized version of the multiple outputs, achieved via a vector of weights. Implementing a new acquisition function in botorch is easy; one simply needs to implement the constructor and a `forward`

method.

In [1]:

```
import math
from torch import Tensor
from typing import Optional
from botorch.acquisition import MCAcquisitionObjective
from botorch.acquisition.acquisition import AcquisitionFunction
from botorch.acquisition.monte_carlo import MCAcquisitionFunction
from botorch.models.model import Model
from botorch.sampling.samplers import MCSampler, SobolQMCNormalSampler
from botorch.utils import t_batch_mode_transform
class qScalarizedUpperConfidenceBound(MCAcquisitionFunction):
def __init__(
self,
model: Model,
beta: Tensor,
weights: Tensor,
sampler: Optional[MCSampler] = None,
) -> None:
# we use the AcquisitionFunction constructor, since that of
# MCAcquisitionFunction performs some validity checks that we don't want here
super(MCAcquisitionFunction, self).__init__(model=model)
if sampler is None:
sampler = SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True)
self.sampler = sampler
self.register_buffer("beta", torch.as_tensor(beta))
self.register_buffer("weights", torch.as_tensor(weights))
@t_batch_mode_transform()
def forward(self, X: Tensor) -> Tensor:
"""Evaluate scalarized qUCB on the candidate set `X`.
Args:
X: A `(b) x q x d`-dim Tensor of `(b)` t-batches with `q` `d`-dim
design points each.
Returns:
Tensor: A `(b)`-dim Tensor of Upper Confidence Bound values at the
given design points `X`.
"""
posterior = self.model.posterior(X)
samples = self.sampler(posterior) # n x b x q x o
scalarized_samples = samples.matmul(self.weights) # n x b x q
mean = posterior.mean # b x q x o
scalarized_mean = mean.matmul(self.weights) # b x q
ucb_samples = (
scalarized_mean
+ math.sqrt(self.beta * math.pi / 2)
* (scalarized_samples - scalarized_mean).abs()
)
return ucb_samples.max(dim=-1)[0].mean(dim=0)
```

Note that `qScalarizedUpperConfidenceBound`

is very similar to `qUpperConfidenceBound`

and only requires a few lines of new code to accomodate scalarization of multiple outputs. The `@t_batch_mode_transform`

decorator ensures that the input `X`

has an explicit t-batch dimension (code comments are added with shapes for clarity).

Before hooking the newly defined acquisition function into a Bayesian Optimization loop, we should test it. For this we'll just make sure that it properly evaluates on a compatible multi-output model. Here we just define a basic multi-output `SingleTaskGP`

model trained on synthetic data.

In [2]:

```
import torch
from botorch.fit import fit_gpytorch_model
from botorch.models import SingleTaskGP
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood
# generate synthetic data
X = torch.rand(20, 2)
Y = torch.stack([torch.sin(X[:, 0]), torch.cos(X[:, 1])], -1)
Y = standardize(Y) # standardize to zero mean unit variance
# construct and fit the multi-output model
gp = SingleTaskGP(X, Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_model(mll);
# construct the acquisition function
qSUCB = qScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
```

In [3]:

```
# evaluate on single q-batch with q=3
qSUCB(torch.rand(3, 2))
```

Out[3]:

tensor([-0.1354], grad_fn=<MeanBackward1>)

In [4]:

```
# batch-evaluate on two q-batches with q=3
qSUCB(torch.rand(2, 3, 2))
```

Out[4]:

tensor([-0.0566, 0.2893], grad_fn=<MeanBackward1>)

`q=1`

only)¶We can also write an *analytic* version of UCB for a multi-output model, assuming a multivariate normal posterior and `q=1`

. The new class `ScalarizedUpperConfidenceBound`

subclasses `AnalyticAcquisitionFunction`

instead of `MCAcquisitionFunction`

. In contrast to the MC version, instead of using the weights on the MC samples, we directly scalarize the mean vector $\mu$ and covariance matrix $\Sigma$ and apply standard UCB on the univariate normal distribution, which has mean $w^T \mu$ and variance $w^T \Sigma w$. In addition to the `@t_batch_transform`

decorator, here we are also using `expected_q=1`

to ensure the input `X`

has a `q=1`

.

*Note:* BoTorch also provides a `ScalarizedObjective`

abstraction that can be used with any existing analytic acqusition functions and automatically performs the scalarization we implement manually below. See the end of this tutorial for a usage example.

In [5]:

```
from botorch.acquisition import AnalyticAcquisitionFunction
class ScalarizedUpperConfidenceBound(AnalyticAcquisitionFunction):
def __init__(
self,
model: Model,
beta: Tensor,
weights: Tensor,
maximize: bool = True,
) -> None:
# we use the AcquisitionFunction constructor, since that of
# AnalyticAcquisitionFunction performs some validity checks that we don't want here
super(AnalyticAcquisitionFunction, self).__init__(model)
self.maximize = maximize
self.register_buffer("beta", torch.as_tensor(beta))
self.register_buffer("weights", torch.as_tensor(weights))
@t_batch_mode_transform(expected_q=1)
def forward(self, X: Tensor) -> Tensor:
"""Evaluate the Upper Confidence Bound on the candidate set X using scalarization
Args:
X: A `(b) x d`-dim Tensor of `(b)` t-batches of `d`-dim design
points each.
Returns:
A `(b)`-dim Tensor of Upper Confidence Bound values at the given
design points `X`.
"""
self.beta = self.beta.to(X)
batch_shape = X.shape[:-2]
posterior = self.model.posterior(X)
means = posterior.mean.squeeze(dim=-2) # b x o
scalarized_mean = means.matmul(self.weights) # b
covs = posterior.mvn.covariance_matrix # b x o x o
weights = self.weights.view(1, -1, 1) # 1 x o x 1 (assume single batch dimension)
weights = weights.expand(batch_shape + weights.shape[1:]) # b x o x 1
weights_transpose = weights.permute(0, 2, 1) # b x 1 x o
scalarized_variance = torch.bmm(
weights_transpose, torch.bmm(covs, weights)
).view(batch_shape) # b
delta = (self.beta.expand_as(scalarized_mean) * scalarized_variance).sqrt()
if self.maximize:
return scalarized_mean + delta
else:
return scalarized_mean - delta
```

Notice that we pass in an explicit q-batch dimension for consistency, even though `q=1`

.

In [6]:

```
# construct the acquisition function
SUCB = ScalarizedUpperConfidenceBound(gp, beta=0.1, weights=torch.tensor([0.1, 0.5]))
```

In [7]:

```
# evaluate on single point
SUCB(torch.rand(1, 2))
```

Out[7]:

tensor([0.3753], grad_fn=<AddBackward0>)

In [8]:

```
# batch-evaluate on 3 points
SUCB(torch.rand(3, 1, 2))
```

Out[8]:

tensor([-0.2374, -0.0187, 0.0905], grad_fn=<AddBackward0>)

To use our newly minted acquisition function within Ax, we need to write a custom factory function and pass it to the constructor of Ax's `BotorchModel`

as the `acqf_constructor`

, which has the call signature:

```
def acqf_constructor(
model: Model,
objective_weights: Tensor,
outcome_constraints: Optional[Tuple[Tensor, Tensor]],
X_observed: Optional[Tensor] = None,
X_pending: Optional[Tensor] = None,
**kwargs: Any,
) -> AcquisitionFunction:
```

The argument `objective_weights`

allows for scalarization of multiple objectives, `outcome_constraints`

is used to define constraints on multi-output models, `X_observed`

contains previously observed points (useful for acquisition functions such as Noisy Expected Improvement), and `X_pending`

are the points that are awaiting observations. By default, Ax uses the Noisy Expected Improvement (`qNoisyExpectedImprovement`

) acquisition function and so the default value of `acqf_constructor`

is `get_NEI`

(see documentation for additional details and context).

Note that there is ample flexibility to how the arguments of `acqf_constructor`

are used. In `get_NEI`

, they are used in some preprocessing steps *before* constructing the acquisition function. They could also be directly passed to the botorch acquisition function, or not used at all -- all we need to do is return an `AcquisitionFunction`

. We now give a bare-bones example of a custom factory function that returns our analytic scalarized-UCB acquisition.

```
def get_scalarized_UCB(
model: Model,
objective_weights: Tensor,
**kwargs: Any,
) -> AcquisitionFunction:
return ScalarizedUpperConfidenceBound(model=model, beta=0.2, weights=objective_weights)
```

By following the example shown in the custom botorch model in ax tutorial, a `BotorchModel`

can be instantiated with `get_scalarized_UCB`

and then run in Ax.

`ScalarizedObjective`

¶Using the `ScalarizedObjective`

abstraction, the funcitonality of `ScalarizedUpperConfidenceBound`

implemented above can be easily achieved in just a few lines of code:

In [9]:

```
from botorch.acquisition.objective import ScalarizedObjective
from botorch.acquisition.analytic import UpperConfidenceBound
obj = ScalarizedObjective(weights=torch.tensor([0.1, 0.5]))
SUCB = UpperConfidenceBound(gp, beta=0.1, objective=obj)
```