# botorch.acquisition

## Acquisition Function APIs

### Abstract Acquisition Function APIs

Abstract base module for all botorch acquisition functions.

class botorch.acquisition.acquisition.AcquisitionFunction(model)[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for acquisition functions.

Please note that if your acquisition requires a backwards call, you will need to wrap the backwards call inside of an enable_grad context to be able to optimize the acquisition. See #1164.

Constructor for the AcquisitionFunction base class.

Parameters

model (Model) – A fitted model.

Return type

None

set_X_pending(X_pending=None)[source]

Informs the acquisition function about pending design points.

Parameters

X_pending (Optional[torch.Tensor]) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.

Return type

None

abstract forward(X)[source]

Evaluate the acquisition function on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.

Returns

A (b)-dim Tensor of acquisition function values at the given design points X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.acquisition.OneShotAcquisitionFunction(model)[source]

Bases: botorch.acquisition.acquisition.AcquisitionFunction, abc.ABC

Abstract base class for acquisition functions using one-shot optimization.

Constructor for the AcquisitionFunction base class.

Parameters

model (Model) – A fitted model.

Return type

None

abstract get_augmented_q_batch_size(q)[source]

Get augmented q batch size for one-shot optimization.

Parameters

q (int) – The number of candidates to consider jointly.

Returns

The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions).

Return type

int

abstract extract_candidates(X_full)[source]

Extract the candidates from a full “one-shot” parameterization.

Parameters

X_full (torch.Tensor) – A b x q_aug x d-dim Tensor with b t-batches of q_aug design points each.

Returns

A b x q x d-dim Tensor with b t-batches of q design points each.

Return type

torch.Tensor

training: bool
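
The relationship between `get_augmented_q_batch_size` and `extract_candidates` can be illustrated with a small plain-Python sketch. `ToyOneShotLayout` is a hypothetical helper, not a BoTorch class, and it assumes one particular layout: the q candidates come first along the q-dimension, followed by one extra point per fantasy. Concrete subclasses may lay out their augmented batch differently.

```python
from typing import List

Point = List[float]


class ToyOneShotLayout:
    """Hypothetical stand-in for the two abstract methods (assumed layout)."""

    def __init__(self, num_fantasies: int) -> None:
        self.num_fantasies = num_fantasies

    def get_augmented_q_batch_size(self, q: int) -> int:
        # q real candidates plus one fantasy-solution point per fantasy.
        return q + self.num_fantasies

    def extract_candidates(self, X_full: List[List[Point]]) -> List[List[Point]]:
        # X_full is b x q_aug x d as nested lists; drop the trailing
        # fantasy points of every t-batch to recover b x q x d.
        return [batch[: len(batch) - self.num_fantasies] for batch in X_full]


layout = ToyOneShotLayout(num_fantasies=3)
q_aug = layout.get_augmented_q_batch_size(q=2)
X_full = [[[float(i)] for i in range(q_aug)]]  # b=1, q_aug=5, d=1
candidates = layout.extract_candidates(X_full)
```

Under this assumed layout, the optimizer works on the full `q_aug`-point tensor and only the first `q` points of each t-batch are returned as candidates.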

### Analytic Acquisition Function API

class botorch.acquisition.analytic.AnalyticAcquisitionFunction(model, posterior_transform=None, **kwargs)[source]

Bases: botorch.acquisition.acquisition.AcquisitionFunction, abc.ABC

Base class for analytic acquisition functions.

Base constructor for analytic acquisition functions.

Parameters
• model (Model) – A fitted single-outcome model.

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

Return type

None

set_X_pending(X_pending=None)[source]

Informs the acquisition function about pending design points.

Parameters

X_pending (Optional[torch.Tensor]) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.

Return type

None

training: bool

### Cached Cholesky Acquisition Function API

Abstract class for acquisition functions leveraging a cached Cholesky decomposition of the posterior covariance over f(X_baseline).

class botorch.acquisition.cached_cholesky.CachedCholeskyMCAcquisitionFunction[source]

Bases: abc.ABC

Abstract class for acquisition functions using a cached Cholesky.

Specifically, this is for acquisition functions that require sampling from the posterior P(f(X_baseline, X) | D). The Cholesky of the posterior covariance over f(X_baseline) is cached.

### Monte-Carlo Acquisition Function API

class botorch.acquisition.monte_carlo.MCAcquisitionFunction(model, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]

Bases: botorch.acquisition.acquisition.AcquisitionFunction, abc.ABC

Abstract base class for Monte-Carlo based batch acquisition functions.

Constructor for the MCAcquisitionFunction base class.

Parameters
• model (Model) – A fitted model.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True).

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.

Return type

None

abstract forward(X)[source]

Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

training: bool

### Multi-Objective Analytic Acquisition Function API

class botorch.acquisition.multi_objective.analytic.MultiObjectiveAnalyticAcquisitionFunction(model, objective=None)[source]

Abstract base class for Multi-Objective batch acquisition functions.

Constructor for the MultiObjectiveAnalyticAcquisitionFunction base class.

Parameters
Return type

None

abstract forward(X)[source]

Takes in a batch_shape x 1 x d X Tensor of t-batches with 1 d-dim design point each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

set_X_pending(X_pending=None)[source]

Informs the acquisition function about pending design points.

Parameters

X_pending (Optional[torch.Tensor]) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.

Return type

None

training: bool

### Multi-Objective Monte-Carlo Acquisition Function API

class botorch.acquisition.multi_objective.monte_carlo.MultiObjectiveMCAcquisitionFunction(model, sampler=None, objective=None, constraints=None, X_pending=None)[source]

Abstract base class for Multi-Objective batch acquisition functions.

Constructor for the MCAcquisitionFunction base class.

Parameters
• model (Model) – A fitted model.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=128, collapse_batch_dims=True).

• objective (Optional[MCMultiOutputObjective]) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to IdentityMultiOutputObjective().

• constraints (Optional[List[Callable[[Tensor], Tensor]]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.

Return type

None

abstract forward(X)[source]

Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

training: bool

## Acquisition Functions

### Analytic Acquisition Functions

Analytic Acquisition Functions that evaluate the posterior without performing Monte-Carlo sampling.

class botorch.acquisition.analytic.ExpectedImprovement(model, best_f, posterior_transform=None, maximize=True, **kwargs)[source]

Single-outcome Expected Improvement (analytic).

Computes classic Expected Improvement over the current best observed value, using the analytic formula for a Normal posterior distribution. Unlike the MC-based acquisition functions, this relies on the posterior at a single test point being Gaussian (and requires the posterior to implement mean and variance properties). Only supports the case of q=1. The model must be single-outcome.

EI(x) = E(max(y - best_f, 0)), y ~ f(x)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> EI = ExpectedImprovement(model, best_f=0.2)
>>> ei = EI(test_X)


Single-outcome Expected Improvement (analytic).

Parameters
• model (Model) – A fitted single-outcome model.

• best_f (Union[float, Tensor]) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

• maximize (bool) – If True, consider the problem a maximization problem.

Return type

None

forward(X)[source]

Evaluate Expected Improvement on the candidate set X.

Parameters

X (torch.Tensor) – A (b1 x … bk) x 1 x d-dim batched tensor of d-dim design points. Expected Improvement is computed for each point individually, i.e., what is considered are the marginal posteriors, not the joint.

Returns

A (b1 x … bk)-dim tensor of Expected Improvement values at the given design points X.

Return type

torch.Tensor

training: bool
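
For a Gaussian posterior with mean mu and standard deviation sigma, the expectation EI(x) = E(max(y - best_f, 0)) has the well-known closed form sigma * (pdf(u) + u * cdf(u)) with u = (mu - best_f) / sigma. The sketch below evaluates it in plain Python for a single point. The function names are illustrative and this is not the library's implementation; the sign flip for `maximize=False` is an assumption about how minimization is handled.

```python
import math


def normal_pdf(u: float) -> float:
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u: float) -> float:
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best_f: float,
                         maximize: bool = True) -> float:
    # Standardized improvement over the incumbent.
    u = (mu - best_f) / sigma
    if not maximize:
        # Assumed convention: improvement means going below best_f.
        u = -u
    return sigma * (normal_pdf(u) + u * normal_cdf(u))


ei_at_incumbent = expected_improvement(mu=0.2, sigma=1.0, best_f=0.2)
ei_far_below = expected_improvement(mu=-5.0, sigma=1.0, best_f=0.2)
ei_minimize = expected_improvement(mu=0.2, sigma=1.0, best_f=0.2, maximize=False)
```

When the posterior mean equals `best_f`, the value reduces to sigma * pdf(0); a point whose mean lies far below the incumbent has an expected improvement close to zero.
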
class botorch.acquisition.analytic.PosteriorMean(model, posterior_transform=None, maximize=True)[source]

Single-outcome Posterior Mean.

Only supports the case of q=1. Requires the model’s posterior to have a mean property. The model must be single-outcome.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> PM = PosteriorMean(model)
>>> pm = PM(test_X)


Single-outcome Posterior Mean.

Parameters
• model (Model) – A fitted single-outcome GP model (must be in batch mode if candidate sets X will be)

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

• maximize (bool) – If True, consider the problem a maximization problem. Note that if maximize=False, the posterior mean is negated. As a consequence optimize_acqf(PosteriorMean(gp, maximize=False)) does actually return -1 * minimum of the posterior mean.

Return type

None

forward(X)[source]

Evaluate the posterior mean on the candidate set X.

Parameters

X (torch.Tensor) – A (b1 x … bk) x 1 x d-dim batched tensor of d-dim design points.

Returns

A (b1 x … bk)-dim tensor of Posterior Mean values at the given design points X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.analytic.ProbabilityOfImprovement(model, best_f, posterior_transform=None, maximize=True, **kwargs)[source]

Single-outcome Probability of Improvement.

Probability of improvement over the current best observed value, computed using the analytic formula under a Normal posterior distribution. Only supports the case of q=1. Requires the posterior to be Gaussian. The model must be single-outcome.

PI(x) = P(y >= best_f), y ~ f(x)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> PI = ProbabilityOfImprovement(model, best_f=0.2)
>>> pi = PI(test_X)


Single-outcome analytic Probability of Improvement.

Parameters
• model (Model) – A fitted single-outcome model.

• best_f (Union[float, Tensor]) – Either a scalar or a b-dim Tensor (batch mode) representing the best function value observed so far (assumed noiseless).

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

• maximize (bool) – If True, consider the problem a maximization problem.

Return type

None

forward(X)[source]

Evaluate the Probability of Improvement on the candidate set X.

Parameters

X (torch.Tensor) – A (b1 x … bk) x 1 x d-dim batched tensor of d-dim design points.

Returns

A (b1 x … bk)-dim tensor of Probability of Improvement values at the given design points X.

Return type

torch.Tensor

training: bool
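
Under a Gaussian posterior, PI(x) = P(y >= best_f) is the standard Normal CDF evaluated at (mu - best_f) / sigma. A minimal plain-Python sketch for a single point follows; the function names are illustrative, and the handling of `maximize=False` by flipping the sign is an assumption.

```python
import math


def normal_cdf(u: float) -> float:
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def probability_of_improvement(mu: float, sigma: float, best_f: float,
                               maximize: bool = True) -> float:
    u = (mu - best_f) / sigma
    if not maximize:
        # Assumed convention: improvement means going below best_f.
        u = -u
    return normal_cdf(u)


pi_at_incumbent = probability_of_improvement(mu=0.2, sigma=1.0, best_f=0.2)
pi_one_sigma_above = probability_of_improvement(mu=1.2, sigma=1.0, best_f=0.2)
```
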
class botorch.acquisition.analytic.UpperConfidenceBound(model, beta, posterior_transform=None, maximize=True, **kwargs)[source]

Single-outcome Upper Confidence Bound (UCB).

Analytic upper confidence bound that comprises the posterior mean plus an additional term: the posterior standard deviation weighted by a trade-off parameter, beta. Only supports the case of q=1 (i.e. greedy, non-batch selection of design points). The model must be single-outcome.

UCB(x) = mu(x) + sqrt(beta) * sigma(x), where mu and sigma are the posterior mean and standard deviation, respectively.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> UCB = UpperConfidenceBound(model, beta=0.2)
>>> ucb = UCB(test_X)


Single-outcome Upper Confidence Bound.

Parameters
• model (Model) – A fitted single-outcome GP model (must be in batch mode if candidate sets X will be)

• beta (Union[float, Tensor]) – Either a scalar or a one-dim tensor with b elements (batch mode) representing the trade-off parameter between mean and covariance

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

• maximize (bool) – If True, consider the problem a maximization problem.

Return type

None

forward(X)[source]

Evaluate the Upper Confidence Bound on the candidate set X.

Parameters

X (torch.Tensor) – A (b1 x … bk) x 1 x d-dim batched tensor of d-dim design points.

Returns

A (b1 x … bk)-dim tensor of Upper Confidence Bound values at the given design points X.

Return type

torch.Tensor

training: bool
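
The formula UCB(x) = mu(x) + sqrt(beta) * sigma(x) is simple enough to check by hand. The sketch below evaluates it for a single point in plain Python. Negating the mean when `maximize=False` is an assumption made here by analogy with the behavior documented for PosteriorMean.

```python
import math


def upper_confidence_bound(mu: float, sigma: float, beta: float,
                           maximize: bool = True) -> float:
    # Exploration bonus: posterior standard deviation scaled by sqrt(beta).
    delta = math.sqrt(beta) * sigma
    # Assumed convention: for minimization the mean is negated so that
    # larger acquisition values are still better.
    return (mu if maximize else -mu) + delta


ucb_max = upper_confidence_bound(mu=1.0, sigma=2.0, beta=0.25)
ucb_min = upper_confidence_bound(mu=1.0, sigma=2.0, beta=0.25, maximize=False)
```

Note that beta enters through its square root, so `beta=0.25` weights the standard deviation by 0.5.
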
class botorch.acquisition.analytic.ConstrainedExpectedImprovement(model, best_f, objective_index, constraints, maximize=True)[source]

Constrained Expected Improvement (feasibility-weighted).

Computes the analytic expected improvement for a Normal posterior distribution, weighted by a probability of feasibility. The objective and constraints are assumed to be independent and have Gaussian posterior distributions. Only supports the case q=1. The model should be multi-outcome, with the index of the objective and constraints passed to the constructor.

Constrained_EI(x) = EI(x) * Product_i P(y_i in [lower_i, upper_i]), where y_i ~ constraint_i(x) and lower_i, upper_i are the lower and upper bounds for the i-th constraint, respectively.

Example

>>> # example where 0th output has a non-negativity constraint and
... # 1st output is the objective
>>> model = SingleTaskGP(train_X, train_Y)
>>> constraints = {0: (0.0, None)}
>>> cEI = ConstrainedExpectedImprovement(model, 0.2, 1, constraints)
>>> cei = cEI(test_X)


Analytic Constrained Expected Improvement.

Parameters
• model (Model) – A fitted multi-outcome model.

• best_f (Union[float, Tensor]) – Either a scalar or a b-dim Tensor (batch mode) representing the best feasible function value observed so far (assumed noiseless).

• objective_index (int) – The index of the objective.

• constraints (Dict[int, Tuple[Optional[float], Optional[float]]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None)

• maximize (bool) – If True, consider the problem a maximization problem.

Return type

None

forward(X)[source]

Evaluate Constrained Expected Improvement on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x 1 x d-dim Tensor of (b) t-batches of d-dim design points each.

Returns

A (b)-dim Tensor of Expected Improvement values at the given design points X.

Return type

torch.Tensor

training: bool
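
The feasibility weighting Constrained_EI(x) = EI(x) * Product_i P(y_i in [lower_i, upper_i]) can be sketched for a single point with independent Gaussian posteriors. This is plain Python with illustrative names, not the library's implementation. `constraint_posteriors` is a hypothetical argument mapping each constrained output index to its (mean, standard deviation).

```python
import math
from typing import Dict, Optional, Tuple


def normal_pdf(u: float) -> float:
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u: float) -> float:
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def constrained_ei(
    mu_obj: float,
    sigma_obj: float,
    best_f: float,
    constraint_posteriors: Dict[int, Tuple[float, float]],
    constraints: Dict[int, Tuple[Optional[float], Optional[float]]],
) -> float:
    u = (mu_obj - best_f) / sigma_obj
    ei = sigma_obj * (normal_pdf(u) + u * normal_cdf(u))
    prob_feasible = 1.0
    for i, (lower, upper) in constraints.items():
        mu_i, sigma_i = constraint_posteriors[i]
        # A bound of None is treated as -Inf (lower) or +Inf (upper).
        p_upper = 1.0 if upper is None else normal_cdf((upper - mu_i) / sigma_i)
        p_lower = 0.0 if lower is None else normal_cdf((lower - mu_i) / sigma_i)
        prob_feasible *= p_upper - p_lower
    return ei * prob_feasible


# Output 0 must be non-negative; its posterior is N(0, 1), so P(feasible) = 0.5.
cei = constrained_ei(0.2, 1.0, 0.2, {0: (0.0, 1.0)}, {0: (0.0, None)})
```
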
class botorch.acquisition.analytic.NoisyExpectedImprovement(model, X_observed, num_fantasies=20, maximize=True)[source]

Single-outcome Noisy Expected Improvement (via fantasies).

This computes Noisy Expected Improvement by averaging over the Expected Improvement values of a number of fantasy models. Only supports the case q=1. Assumes that the posterior distribution of the model is Gaussian. The model must be single-outcome.

NEI(x) = E(max(y - max Y_baseline, 0)), (y, Y_baseline) ~ f((x, X_baseline)), where X_baseline are previously observed points.

Note: This acquisition function currently relies on using a FixedNoiseGP (required for noiseless fantasies).

Example

>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar=train_Yvar)
>>> NEI = NoisyExpectedImprovement(model, train_X)
>>> nei = NEI(test_X)


Single-outcome Noisy Expected Improvement (via fantasies).

Parameters
• model (GPyTorchModel) – A fitted single-outcome model.

• X_observed (Tensor) – A n x d Tensor of observed points that are likely to be the best observed points so far.

• num_fantasies (int) – The number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance).

• maximize (bool) – If True, consider the problem a maximization problem.

Return type

None

forward(X)[source]

Evaluate Expected Improvement on the candidate set X.

Parameters

X (torch.Tensor) – A b1 x … bk x 1 x d-dim batched tensor of d-dim design points.

Returns

A b1 x … bk-dim tensor of Noisy Expected Improvement values at the given design points X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.analytic.ScalarizedPosteriorMean(model, weights, posterior_transform=None, **kwargs)[source]

Scalarized Posterior Mean.

This acquisition function returns a scalarized (across the q-batch) posterior mean given a vector of weights.

Scalarized Posterior Mean.

Parameters
• model (Model) – A fitted single-outcome model.

• weights (Tensor) – A tensor of shape q for scalarization.

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

Return type

None

forward(X)[source]

Evaluate the scalarized posterior mean on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x q x d-dim Tensor of (b) t-batches of d-dim design points each.

Returns

A (b)-dim Tensor of Posterior Mean values at the given design points X.

Return type

torch.Tensor

training: bool
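
Scalarizing across the q-batch amounts to a weighted sum of the q posterior means of one t-batch. The following plain-Python sketch assumes exactly that reading; the function name is illustrative.

```python
from typing import Sequence


def scalarized_posterior_mean(means: Sequence[float],
                              weights: Sequence[float]) -> float:
    # One weight per point in the q-batch.
    if len(means) != len(weights):
        raise ValueError("weights must have one entry per q-batch point")
    return sum(w * m for w, m in zip(weights, means))


spm = scalarized_posterior_mean(means=[1.0, 2.0, 3.0], weights=[0.5, 0.25, 0.25])
```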

### Monte-Carlo Acquisition Functions

Batch acquisition functions using the reparameterization trick in combination with (quasi) Monte-Carlo sampling. See [Rezende2014reparam], [Wilson2017reparam] and [Balandat2020botorch].

Rezende2014reparam

D. J. Rezende, S. Mohamed, and D. Wierstra. Stochastic backpropagation and approximate inference in deep generative models. ICML 2014.

Wilson2017reparam

J. T. Wilson, R. Moriconi, F. Hutter, and M. P. Deisenroth. The reparameterization trick for acquisition functions. ArXiv 2017.

class botorch.acquisition.monte_carlo.qExpectedImprovement(model, best_f, sampler=None, objective=None, posterior_transform=None, X_pending=None, **kwargs)[source]

MC-based batch Expected Improvement.

This computes qEI by (1) sampling the joint posterior over q points, (2) evaluating the improvement over the current best for each sample, (3) maximizing over q, and (4) averaging over the samples.

qEI(X) = E(max(max Y - best_f, 0)), Y ~ f(X), where X = (x_1,…,x_q)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> best_f = train_Y.max()
>>> sampler = SobolQMCNormalSampler(1024)
>>> qEI = qExpectedImprovement(model, best_f, sampler)
>>> qei = qEI(test_X)


q-Expected Improvement.

Parameters
• model (Model) – A fitted model.

• best_f (Union[float, Tensor]) – The best objective value observed so far (assumed noiseless). Can be a batch_shape-shaped tensor, which in case of a batched model specifies potentially different values for each element of the batch.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True)

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

• kwargs (Any) –

Return type

None

forward(X)[source]

Evaluate qExpectedImprovement on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns

A batch_shape’-dim Tensor of Expected Improvement values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

Return type

torch.Tensor

training: bool
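
The four steps in the qExpectedImprovement description can be sketched in plain Python for a given joint Gaussian posterior N(mu, cov) over the q points. This sketch uses the reparameterization y = mu + L z with pseudo-random standard Normal draws, where the documented default sampler uses quasi-random Sobol base samples. All function names here are illustrative.

```python
import math
import random
from typing import List


def cholesky(cov: List[List[float]]) -> List[List[float]]:
    # Lower-triangular L with L L^T = cov (cov must be positive definite).
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def mc_q_expected_improvement(mu, cov, best_f, num_samples=20000, seed=0):
    rng = random.Random(seed)
    L = cholesky(cov)
    q = len(mu)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # (1) joint sample via the reparameterization y = mu + L z
        y = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(q)]
        # (2) improvement per point and (3) maximum over the q points
        total += max(max(y) - best_f, 0.0)
    # (4) average over the samples
    return total / num_samples


qei_single = mc_q_expected_improvement([0.0], [[1.0]], best_f=0.0)
qei_pair = mc_q_expected_improvement(
    [0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], best_f=0.0
)
```

For q=1 the estimate approaches the analytic Expected Improvement, and adding a second point to the batch can only increase the expected best improvement.
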
class botorch.acquisition.monte_carlo.qNoisyExpectedImprovement(model, X_baseline, sampler=None, objective=None, posterior_transform=None, X_pending=None, prune_baseline=False, cache_root=True, **kwargs)[source]

MC-based batch Noisy Expected Improvement.

This function does not assume a best_f is known (which would require noiseless observations). Instead, it uses samples from the joint posterior over the q test points and previously observed points. The improvement over previously observed points is computed for each sample and averaged.

qNEI(X) = E(max(max Y - max Y_baseline, 0)), where (Y, Y_baseline) ~ f((X, X_baseline)), X = (x_1,…,x_q)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qNEI = qNoisyExpectedImprovement(model, train_X, sampler)
>>> qnei = qNEI(test_X)


q-Noisy Expected Improvement.

Parameters
• model (Model) – A fitted model.

• X_baseline (Tensor) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True).

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

• prune_baseline (bool) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended. In order to customize pruning parameters, instead manually call botorch.acquisition.utils.prune_inferior_points on X_baseline before instantiating the acquisition function.

• cache_root (bool) – A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates.

• kwargs (Any) –

Return type

None

TODO: similar to qNEHVI, when we are using sequential greedy candidate selection, we could incorporate pending points X_baseline and compute the incremental qNEI from the new point. This would greatly increase efficiency for large batches.

forward(X)[source]

Evaluate qNoisyExpectedImprovement on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns

A batch_shape’-dim Tensor of Noisy Expected Improvement values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.monte_carlo.qProbabilityOfImprovement(model, best_f, sampler=None, objective=None, posterior_transform=None, X_pending=None, tau=0.001)[source]

MC-based batch Probability of Improvement.

Estimates the probability of improvement over the current best observed value by sampling from the joint posterior distribution of the q-batch. An MC-based estimate of a probability involves taking the expectation of an indicator function; to support auto-differentiation, the indicator is replaced with a sigmoid function with temperature parameter tau.

qPI(X) = P(max Y >= best_f), Y ~ f(X), X = (x_1,…,x_q)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> best_f = train_Y.max()
>>> sampler = SobolQMCNormalSampler(1024)
>>> qPI = qProbabilityOfImprovement(model, best_f, sampler)
>>> qpi = qPI(test_X)


q-Probability of Improvement.

Parameters
• model (Model) – A fitted model.

• best_f (Union[float, Tensor]) – The best objective value observed so far (assumed noiseless). Can be a batch_shape-shaped tensor, which in case of a batched model specifies potentially different values for each element of the batch.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True)

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

• tau (float) – The temperature parameter used in the sigmoid approximation of the step function. Smaller values yield more accurate approximations of the function, but result in gradient estimates with higher variance.

Return type

None

forward(X)[source]

Evaluate qProbabilityOfImprovement on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns

A batch_shape’-dim Tensor of Probability of Improvement values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

Return type

torch.Tensor

training: bool
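
The sigmoid relaxation of the indicator can be sketched in plain Python on a handful of hand-written posterior samples. The sample values and function names below are made up for illustration. The sigmoid is written in a numerically stable form because dividing by a small tau produces large arguments.

```python
import math
from typing import List


def stable_sigmoid(z: float) -> float:
    # Avoids overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smoothed_q_pi(samples: List[List[float]], best_f: float,
                  tau: float = 1e-3) -> float:
    # Each sample is one joint draw of the q outcomes; the indicator
    # 1[max Y >= best_f] is replaced by sigmoid((max Y - best_f) / tau).
    total = sum(stable_sigmoid((max(y) - best_f) / tau) for y in samples)
    return total / len(samples)


samples = [[0.1, 0.5], [0.0, 0.1], [0.4, 0.2], [-0.3, 0.15]]
qpi_sharp = smoothed_q_pi(samples, best_f=0.3, tau=1e-3)
qpi_smooth = smoothed_q_pi(samples, best_f=0.3, tau=1.0)
```

With a small tau the estimate matches the fraction of samples whose best point beats `best_f` (two of four here); a large tau blurs the step and pulls each term toward 0.5.
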
class botorch.acquisition.monte_carlo.qSimpleRegret(model, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]

MC-based batch Simple Regret.

Samples from the joint posterior over the q-batch and computes the simple regret.

qSR(X) = E(max Y), Y ~ f(X), X = (x_1,…,x_q)

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qSR = qSimpleRegret(model, sampler)
>>> qsr = qSR(test_X)


Constructor for the MCAcquisitionFunction base class.

Parameters
• model (Model) – A fitted model.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True).

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.

Return type

None

forward(X)[source]

Evaluate qSimpleRegret on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns

A batch_shape’-dim Tensor of Simple Regret values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.monte_carlo.qUpperConfidenceBound(model, beta, sampler=None, objective=None, posterior_transform=None, X_pending=None)[source]

MC-based batch Upper Confidence Bound.

Uses a reparameterization to extend UCB to qUCB for q > 1 (See Appendix A of [Wilson2017reparam].)

qUCB = E(max(mu + |Y_tilde - mu|)), where Y_tilde ~ N(mu, beta pi/2 Sigma) and f(X) has distribution N(mu, Sigma).

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> sampler = SobolQMCNormalSampler(1024)
>>> qUCB = qUpperConfidenceBound(model, 0.1, sampler)
>>> qucb = qUCB(test_X)


q-Upper Confidence Bound.

Parameters
• model (Model) – A fitted model.

• beta (float) – Controls tradeoff between mean and standard deviation in UCB.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=512, collapse_batch_dims=True)

• objective (Optional[MCAcquisitionObjective]) – The MCAcquisitionObjective under which the samples are evaluated. Defaults to IdentityMCObjective().

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

Return type

None

forward(X)[source]

Evaluate qUpperConfidenceBound on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor of t-batches with q d-dim design points each.

Returns

A batch_shape’-dim Tensor of Upper Confidence Bound values at the given design points X, where batch_shape’ is the broadcasted batch shape of model and input X.

Return type

torch.Tensor

training: bool
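
The qUCB expression can be checked numerically: drawing Y ~ N(mu, Sigma) and scaling the absolute deviation |Y - mu| by sqrt(beta * pi / 2) is equivalent to drawing Y_tilde ~ N(mu, beta pi/2 Sigma). Because E|z| = sqrt(2 / pi) for a standard Normal z, the q=1 case recovers mu + sqrt(beta) * sigma. The plain-Python sketch below uses pseudo-random draws and illustrative names, and takes the lower-triangular Cholesky factor of Sigma as input.

```python
import math
import random
from typing import List


def mc_q_ucb(mu: List[float], chol: List[List[float]], beta: float,
             num_samples: int = 20000, seed: int = 0) -> float:
    rng = random.Random(seed)
    q = len(mu)
    scale = math.sqrt(beta * math.pi / 2.0)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Deviation Y - mu for Y ~ N(mu, Sigma), with Sigma = chol chol^T.
        dev = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(q)]
        # Maximum over the q points of mu + scaled absolute deviation.
        total += max(mu[i] + scale * abs(dev[i]) for i in range(q))
    return total / num_samples


# q=1 with mu=1, sigma=2, beta=0.25: analytic UCB is 1 + 0.5 * 2 = 2.
qucb_single = mc_q_ucb([1.0], [[2.0]], beta=0.25)
```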

### Multi-Objective Analytic Acquisition Functions

Analytic Acquisition Functions for Multi-objective Bayesian optimization.

References

Yang2019

Yang, K., Emmerich, M., Deutz, A. et al. Efficient computation of expected hypervolume improvement using box decomposition algorithms. J Glob Optim 75, 3–34 (2019)

class botorch.acquisition.multi_objective.analytic.ExpectedHypervolumeImprovement(model, ref_point, partitioning, objective=None)[source]

Expected Hypervolume Improvement supporting m>=2 outcomes.

This computes EHVI using the algorithm from [Yang2019], but additionally computes gradients via auto-differentiation as proposed by [Daulton2020qehvi].

Note: this is currently inefficient in two ways due to the binary partitioning algorithm that we use for the box decomposition:

• We have more boxes in our decomposition.

• If we used a box decomposition that used inf as the upper bound for the last dimension in all hypercells, then we could reduce the number of terms we need to compute from 2^m to 2^(m-1). [Yang2019] do this by using DKLV17 and LKF17 for the box decomposition.

TODO: Use DKLV17 and LKF17 for the box decomposition as in [Yang2019] for greater efficiency.

TODO: Add support for outcome constraints.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> EHVI = ExpectedHypervolumeImprovement(model, ref_point, partitioning)
>>> ehvi = EHVI(test_X)

Parameters
• model (Model) – A fitted model.

• ref_point (List[float]) – A list with m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).

• partitioning (NondominatedPartitioning) – A NondominatedPartitioning module that provides the non-dominated front and a partitioning of the non-dominated space in hyper-rectangles.

• objective (Optional[AnalyticMultiOutputObjective]) – An AnalyticMultiOutputObjective.

Return type

None

psi(lower, upper, mu, sigma)[source]

Compute Psi function.

For each cell i and outcome k:

Psi(lower_{i,k}, upper_{i,k}, mu_k, sigma_k) = ( sigma_k * PDF((upper_{i,k} - mu_k) / sigma_k) + ( mu_k - lower_{i,k} ) * (1 - CDF((upper_{i,k} - mu_k) / sigma_k)) )

See Equation 19 in [Yang2019] for more details.

Parameters
• lower (torch.Tensor) – A num_cells x m-dim tensor of lower cell bounds

• upper (torch.Tensor) – A num_cells x m-dim tensor of upper cell bounds

• mu (torch.Tensor) – A batch_shape x 1 x m-dim tensor of means

• sigma (torch.Tensor) – A batch_shape x 1 x m-dim tensor of standard deviations (clamped).

Returns

A batch_shape x num_cells x m-dim tensor of values.

Return type

None
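As a sanity check on the Psi definition, here is a scalar sketch for a single cell and a single outcome, using only the standard library. It assumes the standardized argument u = (upper - mu) / sigma is used in both the PDF and the CDF; the function names are illustrative and not the library's tensor implementation.

```python
import math


def normal_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def psi_scalar(lower, upper, mu, sigma):
    """Psi for one cell bound pair (lower, upper) and one outcome (mu, sigma)."""
    u = (upper - mu) / sigma
    return sigma * normal_pdf(u) + (mu - lower) * (1.0 - normal_cdf(u))


# With mu equal to the upper bound, u = 0, so PDF(u) = 1/sqrt(2*pi) and CDF(u) = 0.5.
value = psi_scalar(lower=0.0, upper=1.0, mu=1.0, sigma=2.0)

# With an infinite upper bound both terms vanish.
value_inf = psi_scalar(lower=0.0, upper=math.inf, mu=1.0, sigma=2.0)
```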

nu(lower, upper, mu, sigma)[source]

Compute Nu function.

For each cell i and outcome k:

nu(lower_{i,k}, upper_{i,k}, mu_k, sigma_k) = ( upper_{i,k} - lower_{i,k} ) * (1 - CDF((upper_{i,k} - mu_k) / sigma_k))

See Equation 25 in [Yang2019] for more details.

Parameters
• lower (torch.Tensor) – A num_cells x m-dim tensor of lower cell bounds

• upper (torch.Tensor) – A num_cells x m-dim tensor of upper cell bounds

• mu (torch.Tensor) – A batch_shape x 1 x m-dim tensor of means

• sigma (torch.Tensor) – A batch_shape x 1 x m-dim tensor of standard deviations (clamped).

Returns

A batch_shape x num_cells x m-dim tensor of values.

Return type

None
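The Nu term can be checked the same way. This scalar sketch uses only the standard library; nu_scalar is an illustrative name, not the library's tensor implementation.

```python
import math


def normal_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def nu_scalar(lower, upper, mu, sigma):
    """Nu for one cell bound pair (lower, upper) and one outcome (mu, sigma)."""
    u = (upper - mu) / sigma
    return (upper - lower) * (1.0 - normal_cdf(u))


# With mu equal to the upper bound, half of the probability mass lies above it.
value = nu_scalar(lower=0.0, upper=1.0, mu=1.0, sigma=1.0)

# When the mean is far above the cell, Nu approaches the cell width upper - lower.
value_far = nu_scalar(lower=0.0, upper=1.0, mu=100.0, sigma=1.0)
```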

forward(X)[source]

Takes in a batch_shape x 1 x d X Tensor of t-batches with 1 d-dim design point each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

training: bool

### Multi-Objective Monte-Carlo Acquisition Functions¶

Monte-Carlo Acquisition Functions for Multi-objective Bayesian optimization.

References

Daulton2020qehvi(1,2)

S. Daulton, M. Balandat, and E. Bakshy. Differentiable Expected Hypervolume Improvement for Parallel Multi-Objective Bayesian Optimization. Advances in Neural Information Processing Systems 33, 2020.

Daulton2021nehvi

S. Daulton, M. Balandat, and E. Bakshy. Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement. Advances in Neural Information Processing Systems 34, 2021.

class botorch.acquisition.multi_objective.monte_carlo.qExpectedHypervolumeImprovement(model, ref_point, partitioning, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001)[source]

q-Expected Hypervolume Improvement supporting m>=2 outcomes.

See [Daulton2020qehvi] for details.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> qEHVI = qExpectedHypervolumeImprovement(model, ref_point, partitioning)
>>> qehvi = qEHVI(test_X)

Parameters
• model (Model) – A fitted model.

• ref_point (Union[List[float], Tensor]) – A list or tensor with m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).

• partitioning (NondominatedPartitioning) – A NondominatedPartitioning module that provides the non-dominated front and a partitioning of the non-dominated space in hyper-rectangles. If constraints are present, this partitioning must only include feasible points.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=128, collapse_batch_dims=True).

• objective (Optional[MCMultiOutputObjective]) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to IdentityMultiOutputObjective().

• constraints (Optional[List[Callable[[Tensor], Tensor]]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

• eta (float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints.

Return type

None
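The interplay between constraints and eta can be illustrated with a small sketch. It assumes the differentiable approximation has the form sigmoid(-c / eta) for a constraint value c, so that negative (feasible) values map to weights near 1. The helper name soft_feasibility is hypothetical and the code uses plain floats rather than tensors.

```python
import math


def soft_feasibility(constraint_value, eta=1e-3):
    """Smooth stand-in for the indicator that constraint_value <= 0."""
    z = -constraint_value / eta
    # Numerically stable sigmoid: only exponentiate non-positive numbers.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Clearly feasible, on the boundary, and clearly infeasible constraint values.
weights = [soft_feasibility(c) for c in (-0.1, 0.0, 0.1)]

# A larger temperature gives a softer transition around the boundary.
soft_weight = soft_feasibility(0.1, eta=1.0)
```

Under this assumption, a smaller eta approximates the hard feasibility indicator more closely, at the cost of steeper gradients near the constraint boundary.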

forward(X)[source]

Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

training: bool
class botorch.acquisition.multi_objective.monte_carlo.qNoisyExpectedHypervolumeImprovement(model, ref_point, X_baseline, sampler=None, objective=None, constraints=None, X_pending=None, eta=0.001, prune_baseline=False, alpha=0.0, cache_pending=True, max_iep=0, incremental_nehvi=True, cache_root=True, **kwargs)[source]

q-Noisy Expected Hypervolume Improvement supporting m>=2 outcomes.

See [Daulton2021nehvi] for details.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> ref_point = [0.0, 0.0]
>>> qNEHVI = qNoisyExpectedHypervolumeImprovement(model, ref_point, train_X)
>>> qnehvi = qNEHVI(test_X)

Parameters
• model (Model) – A fitted model.

• ref_point (Union[List[float], Tensor]) – A list or tensor with m elements representing the reference point (in the outcome space) with respect to which the hypervolume is computed. This is a reference point for the objective values (i.e. after applying objective to the samples).

• X_baseline (Tensor) – A r x d-dim Tensor of r design points that have already been observed. These points are considered as potential approximate Pareto-optimal design points.

• sampler (Optional[MCSampler]) – The sampler used to draw base samples. Defaults to SobolQMCNormalSampler(num_samples=128, collapse_batch_dims=True). Note: a Pareto front is created for each MC sample, which can be computationally intensive for m > 2.

• objective (Optional[MCMultiOutputObjective]) – The MCMultiOutputObjective under which the samples are evaluated. Defaults to IdentityMultiOutputObjective().

• constraints (Optional[List[Callable[[Tensor], Tensor]]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility. The acquisition function will compute expected feasible hypervolume.

• X_pending (Optional[Tensor]) – A batch_shape x m x d-dim Tensor of m design points that have been submitted for function evaluation, but have not yet been evaluated.

• eta (float) – The temperature parameter for the sigmoid function used for the differentiable approximation of the constraints.

• prune_baseline (bool) – If True, remove points in X_baseline that are highly unlikely to be Pareto optimal and better than the reference point. This can significantly improve computation time and is generally recommended. To customize the pruning parameters, instead call prune_inferior_points_multi_objective manually on X_baseline before instantiating the acquisition function.

• alpha (float) – The hyperparameter controlling the approximate non-dominated partitioning. The default value of 0.0 means an exact partitioning is used. As the number of objectives m increases, consider increasing this parameter in order to limit computational complexity.

• cache_pending (bool) – A boolean indicating whether to use cached box decompositions (CBD) for handling pending points. This is generally recommended.

• max_iep (int) – The maximum number of pending points before the box decompositions will be recomputed.

• incremental_nehvi (bool) – A boolean indicating whether to compute the incremental NEHVI from the ith point where i=1, …, q under sequential greedy optimization, or the full qNEHVI over q points.

• cache_root (bool) – A boolean indicating whether to cache the root decomposition over X_baseline and use low-rank updates.

• kwargs (Any) –

Return type

None

property X_baseline: torch.Tensor

Return X_baseline augmented with pending points cached using CBD.

set_X_pending(X_pending=None)[source]

Informs the acquisition function about pending design points.

Parameters

X_pending (Optional[torch.Tensor]) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.

Return type

None

forward(X)[source]

Takes in a batch_shape x q x d X Tensor of t-batches with q d-dim design points each, and returns a Tensor with shape batch_shape’, where batch_shape’ is the broadcasted batch shape of model and input X. Should utilize the result of set_X_pending as needed to account for pending function evaluations.

Parameters

X (torch.Tensor) –

Return type

torch.Tensor

training: bool

### The One-Shot Knowledge Gradient¶

Batch Knowledge Gradient (KG) via one-shot optimization as introduced in [Balandat2020botorch]. For broader discussion of KG see also [Frazier2008knowledge] and [Wu2016parallelkg].

Balandat2020botorch(1,2)

M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33, 2020.

Frazier2008knowledge

P. Frazier, W. Powell, and S. Dayanik. A Knowledge-Gradient policy for sequential information collection. SIAM Journal on Control and Optimization, 2008.

Wu2016parallelkg

J. Wu and P. Frazier. The parallel knowledge gradient method for batch bayesian optimization. NIPS 2016.

class botorch.acquisition.knowledge_gradient.qKnowledgeGradient(model, num_fantasies=64, sampler=None, objective=None, posterior_transform=None, inner_sampler=None, X_pending=None, current_value=None, **kwargs)[source]

Batch Knowledge Gradient using one-shot optimization.

This computes the batch Knowledge Gradient using fantasies for the outer expectation and either the model posterior mean or MC-sampling for the inner expectation.

In addition to the design variables, the input X also includes variables for the optimal designs for each of the fantasy models. For a fixed number of fantasies, all parts of X can be optimized in a “one-shot” fashion.

q-Knowledge Gradient (one-shot optimization).

Parameters
• model (Model) – A fitted model. Must support fantasizing.

• num_fantasies (Optional[int]) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if sampler is specified.

• sampler (Optional[MCSampler]) – The sampler used to sample fantasy observations. Optional if num_fantasies is specified.

• objective (Optional[MCAcquisitionObjective]) – The objective under which the samples are evaluated. If None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).

• posterior_transform (Optional[PosteriorTransform]) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If objective is None, then the analytic posterior mean of the transformed posterior is used. If objective is given, the inner_sampler is used to draw samples from the transformed posterior, which are then evaluated under the objective.

• inner_sampler (Optional[MCSampler]) – The sampler used for inner sampling. Ignored if the objective is None.

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.

• current_value (Optional[Tensor]) – The current value, i.e. the expected best objective given the observed points D. If omitted, forward will not return the actual KG value, but the expected best objective given the data set D ∪ X.

• kwargs (Any) –

Return type

None

forward(X)[source]

Evaluate qKnowledgeGradient on the candidate set X.

Parameters

X (torch.Tensor) –

A b x (q + num_fantasies) x d Tensor with b t-batches of q + num_fantasies design points each. We split this X tensor into two parts in the q dimension (dim=-2). The first q are the q-batch of design points and the last num_fantasies are the current solutions of the inner optimization problem.

X_fantasies = X[..., -num_fantasies:, :]
X_fantasies.shape = b x num_fantasies x d

X_actual = X[..., :-num_fantasies, :]
X_actual.shape = b x q x d

Returns

A Tensor of shape b. For t-batch b, the q-KG value of the design X_actual[b] is averaged across the fantasy models, where X_fantasies[b, i] is chosen as the final selection for the i-th fantasy model. NOTE: If current_value is not provided, then this is not the true KG value of X_actual[b], and X_fantasies[b, :] must be maximized at fixed X_actual[b].

Return type

torch.Tensor
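The split of the one-shot input described for forward can be sketched without tensors. In this illustration a single t-batch is a plain list of q + num_fantasies points; split_one_shot is a hypothetical helper and is not part of the library.

```python
def split_one_shot(X, num_fantasies):
    """Split one t-batch of q + num_fantasies points into (X_actual, X_fantasies)."""
    if num_fantasies < 1 or len(X) <= num_fantasies:
        raise ValueError("X must hold q >= 1 design points plus num_fantasies points")
    X_actual = X[:-num_fantasies]     # the first q points: the actual candidates
    X_fantasies = X[-num_fantasies:]  # the last num_fantasies points: inner solutions
    return X_actual, X_fantasies


q, num_fantasies, d = 2, 4, 3
# Point i has all d coordinates equal to i, which makes the split easy to read.
X = [[float(i)] * d for i in range(q + num_fantasies)]
X_actual, X_fantasies = split_one_shot(X, num_fantasies)
```

After optimization only X_actual is of interest as a set of candidates, which is the part that extract_candidates returns.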

evaluate(X, bounds, **kwargs)[source]

Evaluate qKnowledgeGradient on the candidate set X_actual by solving the inner optimization problem.

Parameters
• X (torch.Tensor) – A b x q x d Tensor with b t-batches of q design points each. Unlike forward(), this does not include solutions of the inner optimization problem.

• bounds (torch.Tensor) – A 2 x d tensor of lower and upper bounds for each column of the solutions to the inner problem.

• kwargs (Any) – Additional keyword arguments. This includes the options for optimization of the inner problem, i.e. num_restarts, raw_samples, an options dictionary to be passed on to the optimization helpers, and a scipy_options dictionary to be passed to scipy.optimize.minimize.

Returns

A Tensor of shape b. For t-batch b, the q-KG value of the design X[b] is averaged across the fantasy models. NOTE: If current_value is not provided, then this is not the true KG value of X[b].

Return type

torch.Tensor

get_augmented_q_batch_size(q)[source]

Get augmented q batch size for one-shot optimization.

Parameters

q (int) – The number of candidates to consider jointly.

Returns

The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions).

Return type

int

extract_candidates(X_full)[source]

We only return X as the set of candidates post-optimization.

Parameters

X_full (torch.Tensor) – A b x (q + num_fantasies) x d-dim Tensor with b t-batches of q + num_fantasies design points each.

Returns

A b x q x d-dim Tensor with b t-batches of q design points each.

Return type

torch.Tensor

training: bool
class botorch.acquisition.knowledge_gradient.qMultiFidelityKnowledgeGradient(model, num_fantasies=64, sampler=None, objective=None, posterior_transform=None, inner_sampler=None, X_pending=None, current_value=None, cost_aware_utility=None, project=<function qMultiFidelityKnowledgeGradient.<lambda>>, expand=<function qMultiFidelityKnowledgeGradient.<lambda>>, valfunc_cls=None, valfunc_argfac=None, **kwargs)[source]

Batch Knowledge Gradient for multi-fidelity optimization.

A version of qKnowledgeGradient that supports multi-fidelity optimization via a CostAwareUtility and the project and expand operators. If none of these are set, this acquisition function reduces to qKnowledgeGradient. Through valfunc_cls and valfunc_argfac, this can be changed into a custom multi-fidelity acquisition function (it is only KG if the terminal value is computed using a posterior mean).

Multi-Fidelity q-Knowledge Gradient (one-shot optimization).

Parameters
• model (Model) – A fitted model. Must support fantasizing.

• num_fantasies (Optional[int]) – The number of fantasy points to use. More fantasy points result in a better approximation, at the expense of memory and wall time. Unused if sampler is specified.

• sampler (Optional[MCSampler]) – The sampler used to sample fantasy observations. Optional if num_fantasies is specified.

• objective (Optional[MCAcquisitionObjective]) – The objective under which the samples are evaluated. If None, then the analytic posterior mean is used. Otherwise, the objective is MC-evaluated (using inner_sampler).

• posterior_transform (Optional[PosteriorTransform]) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If objective is None, then the analytic posterior mean of the transformed posterior is used. If objective is given, the inner_sampler is used to draw samples from the transformed posterior, which are then evaluated under the objective.

• inner_sampler (Optional[MCSampler]) – The sampler used for inner sampling. Ignored if the objective is None.

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated.

• current_value (Optional[Tensor]) – The current value, i.e. the expected best objective given the observed points D. If omitted, forward will not return the actual KG value, but the expected best objective given the data set D ∪ X.

• cost_aware_utility (Optional[CostAwareUtility]) – A CostAwareUtility computing the cost-transformed utility from a candidate set and samples of increases in utility.

• project (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d tensor of design points to a tensor with shape batch_shape x q_term x d projected to the desired target set (e.g. the target fidelities in case of multi-fidelity optimization). For the basic case, q_term = q.

• expand (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x q x d input tensor to a batch_shape x (q + q_e)’ x d-dim output tensor, where the q_e additional points in each q-batch correspond to additional (“trace”) observations.

• valfunc_cls (Optional[Type[AcquisitionFunction]]) – An acquisition function class to be used as the terminal value function.

• valfunc_argfac (Optional[Callable[[Model, Dict[str, Any]]]]) – An argument factory, i.e. callable that maps a Model to a dictionary of kwargs for the terminal value function (e.g. best_f for ExpectedImprovement).

• kwargs (Any) –

Return type

None
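As an illustration of the project operator, the sketch below maps every design point to a target fidelity. It assumes, purely for the example, that the fidelity parameter is stored in the last column and that the target fidelity is 1.0. Nested lists stand in for tensors, and project_to_target_fidelity is a hypothetical name.

```python
def project_to_target_fidelity(X, fidelity_column=-1, target_fidelity=1.0):
    """Return a copy of the q x d points in X with the fidelity column set to the target."""
    projected = []
    for point in X:
        new_point = list(point)  # copy so the input is left unchanged
        new_point[fidelity_column] = target_fidelity
        projected.append(new_point)
    return projected


# Two design points with d = 3, where the last entry is the fidelity.
X = [[0.2, 0.7, 0.25], [0.9, 0.1, 0.5]]
X_proj = project_to_target_fidelity(X)
```

In this basic case the number of points is unchanged, i.e. q_term = q.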

property cost_sampler
forward(X)[source]

Evaluate qMultiFidelityKnowledgeGradient on the candidate set X.

Parameters

X (torch.Tensor) –

A b x (q + num_fantasies) x d Tensor with b t-batches of q + num_fantasies design points each. We split this X tensor into two parts in the q dimension (dim=-2). The first q are the q-batch of design points and the last num_fantasies are the current solutions of the inner optimization problem.

X_fantasies = X[..., -num_fantasies:, :]
X_fantasies.shape = b x num_fantasies x d

X_actual = X[..., :-num_fantasies, :]
X_actual.shape = b x q x d

In addition, X may be augmented with fidelity parameters as part of the d-dimension. Projecting fidelities to the target fidelity is handled by project.

Returns

A Tensor of shape b. For t-batch b, the q-KG value of the design X_actual[b] is averaged across the fantasy models, where X_fantasies[b, i] is chosen as the final selection for the i-th fantasy model. NOTE: If current_value is not provided, then this is not the true KG value of X_actual[b], and X_fantasies[b, :] must be maximized at fixed X_actual[b].

Return type

torch.Tensor

training: bool
class botorch.acquisition.knowledge_gradient.ProjectedAcquisitionFunction(base_value_function, project)[source]

Defines a wrapper around an AcquisitionFunction that incorporates the project operator. Typically used to handle value functions in look-ahead methods.

Constructor for the AcquisitionFunction base class.

Parameters
• model – A fitted model.

• base_value_function (AcquisitionFunction) –

• project (Callable[[Tensor], Tensor]) –

Return type

None

forward(X)[source]

Evaluate the acquisition function on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.

Returns

A (b)-dim Tensor of acquisition function values at the given design points X.

Return type

torch.Tensor

training: bool

### Multi-Step Lookahead Acquisition Functions¶

A general implementation of multi-step look-ahead acquisition functions with configurable value functions. See [Jiang2020multistep].

Jiang2020multistep

S. Jiang, D. R. Jiang, M. Balandat, B. Karrer, J. Gardner, and R. Garnett. Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees. In Advances in Neural Information Processing Systems 33, 2020.

class botorch.acquisition.multi_step_lookahead.qMultiStepLookahead(model, batch_sizes, num_fantasies=None, samplers=None, valfunc_cls=None, valfunc_argfacs=None, objective=None, posterior_transform=None, inner_mc_samples=None, X_pending=None, collapse_fantasy_base_samples=True)[source]

MC-based batch Multi-Step Look-Ahead (one-shot optimization).

q-Multi-Step Look-Ahead (one-shot optimization).

Performs a k-step lookahead by means of repeated fantasizing.

The stage value functions can be specified by passing the respective class objects via the valfunc_cls list. Optionally, valfunc_argfacs takes a list of callables that generate additional kwargs for these constructors. By default, valfunc_cls will be chosen as [None, …, None, PosteriorMean], which corresponds to the (parallel) multi-step KnowledgeGradient. If, in addition, k=1 and q_1 = 1, this reduces to the classic Knowledge Gradient.

WARNING: The complexity of evaluating this function is exponential in the number of lookahead steps!

Parameters
• model (Model) – A fitted model.

• batch_sizes (List[int]) – A list [q_1, …, q_k] containing the batch sizes for the k look-ahead steps.

• num_fantasies (Optional[List[int]]) – A list [f_1, …, f_k] containing the number of fantasy points to use for the k look-ahead steps.

• samplers (Optional[List[MCSampler]]) – A list of MCSampler objects to be used for sampling fantasies in each stage.

• valfunc_cls (Optional[List[Optional[Type[AcquisitionFunction]]]]) – A list of k + 1 acquisition function classes to be used as the (stage + terminal) value functions. Each element (except for the last one) can be None, in which case a zero stage value is assumed for the respective stage. If None, this defaults to [None, …, None, PosteriorMean]

• valfunc_argfacs (Optional[List[Optional[TAcqfArgConstructor]]]) – A list of k + 1 “argument factories”, i.e. callables that map a Model and input tensor X to a dictionary of kwargs for the respective stage value function constructor (e.g. best_f for ExpectedImprovement). If None, only the standard (model, sampler and objective) kwargs will be used.

• objective (Optional[MCAcquisitionObjective]) – The objective under which the output is evaluated. If None, use the model output (requires a single-output model or a posterior transform). Otherwise the objective is MC-evaluated (using inner_sampler).

• posterior_transform (Optional[PosteriorTransform]) – An optional PosteriorTransform. If given, this transforms the posterior before evaluation. If objective is None, then the output of the transformed posterior is used. If objective is given, the inner_sampler is used to draw samples from the transformed posterior, which are then evaluated under the objective.

• inner_mc_samples (Optional[List[int]]) – A list [n_0, …, n_k] containing the number of MC samples to be used for evaluating the stage value function. Ignored if the objective is None.

• X_pending (Optional[Tensor]) – A m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call. Copied and set to have no gradient.

• collapse_fantasy_base_samples (bool) – If True, collapse_batch_dims of the Samplers will be applied on fantasy batch dimensions as well, meaning that base samples are the same in all subtrees starting from the same level.

Return type

None

forward(X)[source]

Evaluate qMultiStepLookahead on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q’ x d-dim Tensor with q’ design points for each batch, where q’ = q_0 + f_1 q_1 + f_2 f_1 q_2 + …. Here q_i is the number of candidates jointly considered in look-ahead step i, and f_i is the respective number of fantasies.

Returns

The acquisition value for each batch as a tensor of shape batch_shape.

Return type

torch.Tensor

get_augmented_q_batch_size(q)[source]

Get augmented q batch size for one-shot optimization.

Parameters

q (int) – The number of candidates to consider jointly.

Returns

The augmented size for one-shot optimization (including variables parameterizing the fantasy solutions): q_0 + f_1 q_1 + f_2 f_1 q_2 + …

Return type

int
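The augmented size q_0 + f_1 q_1 + f_2 f_1 q_2 + … can be computed with a short loop. The sketch below is a standalone illustration with a hypothetical function name; it takes q_0 as q and the remaining quantities as the batch_sizes and num_fantasies lists.

```python
def augmented_q_batch_size(q, batch_sizes, num_fantasies):
    """Compute q' = q_0 + f_1 q_1 + f_2 f_1 q_2 + ... with q_0 = q."""
    total = q
    tree_width = 1  # number of fantasy branches that reach the current step
    for q_i, f_i in zip(batch_sizes, num_fantasies):
        tree_width *= f_i
        total += tree_width * q_i
    return total


# Two look-ahead steps: q_0 = 1, [q_1, q_2] = [2, 1], [f_1, f_2] = [3, 2].
q_aug = augmented_q_batch_size(q=1, batch_sizes=[2, 1], num_fantasies=[3, 2])

# A single look-ahead step with q_1 = 1 adds one point per fantasy.
q_aug_one_step = augmented_q_batch_size(q=2, batch_sizes=[1], num_fantasies=[64])
```

Because the tree width is a product of the f_i, the augmented size grows exponentially in the number of look-ahead steps, in line with the warning on the class.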

get_split_shapes(X)[source]

Get the split shapes from X.

Parameters

X (torch.Tensor) – A batch_shape x q_aug x d-dim tensor including fantasy points.

Returns

A 3-tuple (batch_shape, shapes, sizes), where shape[i] = f_i x …. x f_1 x batch_shape x q_i x d and size[i] = f_i * … f_1 * q_i.

Return type

Tuple[torch.Size, List[torch.Size], List[int]]
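The shapes and sizes described above can be reproduced with plain tuples. This is a sketch under the stated formulas only: split_shapes is a hypothetical standalone function that takes the batch shape, d, q_0, the batch sizes and the fantasy counts as explicit arguments, whereas the method itself takes only X.

```python
def split_shapes(batch_shape, d, q, batch_sizes, num_fantasies):
    """Return (shapes, sizes) for the steps 0..k of the look-ahead tree."""
    q_all = [q] + list(batch_sizes)
    shapes, sizes = [], []
    fantasy_dims = ()  # f_i x ... x f_1, built up step by step
    width = 1          # f_i * ... * f_1
    for i, q_i in enumerate(q_all):
        if i > 0:
            f_i = num_fantasies[i - 1]
            fantasy_dims = (f_i,) + fantasy_dims  # the newest fantasy dim goes first
            width *= f_i
        shapes.append(fantasy_dims + tuple(batch_shape) + (q_i, d))
        sizes.append(width * q_i)
    return shapes, sizes


shapes, sizes = split_shapes(
    batch_shape=(5,), d=4, q=1, batch_sizes=[2, 1], num_fantasies=[3, 2]
)
```

The sizes sum to the augmented q batch size, so they describe how the q’ dimension of X is partitioned across the look-ahead steps.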

get_multi_step_tree_input_representation(X)[source]

Get the multi-step tree representation of X.

Parameters

X (torch.Tensor) – A batch_shape x q’ x d-dim Tensor with q’ design points for each batch, where q’ = q_0 + f_1 q_1 + f_2 f_1 q_2 + …. Here q_i is the number of candidates jointly considered in look-ahead step i, and f_i is the respective number of fantasies.

Returns

A list [X_j, …, X_k] of tensors, where X_i has shape f_i x …. x f_1 x batch_shape x q_i x d.

Return type

List[torch.Tensor]

extract_candidates(X_full)[source]

We only return X as the set of candidates post-optimization.

Parameters

X_full (torch.Tensor) – A batch_shape x q’ x d-dim Tensor with q’ design points for each batch, where q’ = q + f_1 q_1 + f_2 f_1 q_2 + ….

Returns

A batch_shape x q x d-dim Tensor with q design points for each batch.

Return type

torch.Tensor

get_induced_fantasy_model(X)[source]

Fantasy model induced by X.

Parameters

X (torch.Tensor) – A batch_shape x q’ x d-dim Tensor with q’ design points for each batch, where q’ = q_0 + f_1 q_1 + f_2 f_1 q_2 + …. Here q_i is the number of candidates jointly considered in look-ahead step i, and f_i is the respective number of fantasies.

Returns

The fantasy model induced by X.

Return type

botorch.models.model.Model

training: bool
botorch.acquisition.multi_step_lookahead.warmstart_multistep(acq_function, bounds, num_restarts, raw_samples, full_optimizer, **kwargs)[source]

Warm-start initialization for multi-step look-ahead acquisition functions.

For now uses the same q’ as in full_optimizer. TODO: allow different q.

Parameters
• acq_function (botorch.acquisition.multi_step_lookahead.qMultiStepLookahead) – A qMultiStepLookahead acquisition function.

• bounds (torch.Tensor) – A 2 x d tensor of lower and upper bounds for each column of features.

• num_restarts (int) – The number of starting points for multistart acquisition function optimization.

• raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

• full_optimizer (torch.Tensor) – The full tree of optimizers of the previous iteration of shape batch_shape x q’ x d. Typically obtained by passing return_best_only=False and return_full_tree=True into optimize_acqf.

• kwargs (Any) – Optimization kwargs.

Returns

A num_restarts x q’ x d tensor for initial points for optimization.

Return type

torch.Tensor

This is a very simple initialization heuristic. TODO: Use the observed values to identify the fantasy sub-tree that is closest to the observed value.

botorch.acquisition.multi_step_lookahead.make_best_f(model, X)[source]

Extract the best observed training input from the model.

Parameters
Return type

Dict[str, Any]

### Active Learning Acquisition Functions¶

Active learning acquisition functions.

Seo2014activedata

S. Seo, M. Wallat, T. Graepel, and K. Obermayer. Gaussian process regression: Active data selection and test point rejection. IJCNN 2000.

Chen2014seqexpdesign

X. Chen and Q. Zhou. Sequential experimental designs for stochastic kriging. Winter Simulation Conference 2014.

Binois2017repexp

M. Binois, J. Huang, R. B. Gramacy, and M. Ludkovski. Replication or exploration? Sequential design for stochastic simulation experiments. ArXiv 2017.

class botorch.acquisition.active_learning.qNegIntegratedPosteriorVariance(model, mc_points, sampler=None, posterior_transform=None, X_pending=None, **kwargs)[source]

Batch Integrated Negative Posterior Variance for Active Learning.

This acquisition function quantifies the (negative) integrated posterior variance (excluding observation noise, computed using MC integration) of the model. As such, it is a proxy for global model uncertainty, and thus purely focused on “exploration”, rather than the “exploitation” of many of the classic Bayesian Optimization acquisition functions.

q-Integrated Negative Posterior Variance.

Parameters
• model (Model) – A fitted model.

• mc_points (Tensor) – A batch_shape x N x d tensor of points to use for MC-integrating the posterior variance. Usually, these are qMC samples on the whole design space, but biased sampling directly allows weighted integration of the posterior variance.

• sampler (Optional[MCSampler]) – The sampler used for drawing fantasy samples. In the basic setting of a standard GP (default) this is a dummy, since the variance of the model after conditioning does not actually depend on the sampled values.

• posterior_transform (Optional[PosteriorTransform]) – A PosteriorTransform. If using a multi-output model, a PosteriorTransform that transforms the multi-output posterior into a single-output posterior is required.

• X_pending (Optional[Tensor]) – A n’ x d-dim Tensor of n’ design points that have been submitted for function evaluation but have not yet been evaluated.

Return type

None
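A toy version of the quantity being computed may help. The sketch assumes a zero-mean GP on [0, 1] with a unit-variance RBF kernel and no prior data, and evaluates the negative posterior variance averaged over mc_points after one noiseless observation at a candidate location. All names are illustrative and nothing here uses the library.

```python
import math


def rbf(x1, x2):
    """Unit-variance RBF kernel with lengthscale 1."""
    return math.exp(-0.5 * (x1 - x2) ** 2)


def posterior_variance(x, x_obs):
    """GP posterior variance at x after one noiseless observation at x_obs."""
    # Note that the observed value itself does not appear in this formula.
    return rbf(x, x) - rbf(x, x_obs) ** 2 / rbf(x_obs, x_obs)


def neg_integrated_posterior_variance(x_candidate, mc_points):
    """MC estimate of the negative posterior variance integrated over the design space."""
    variances = [posterior_variance(x, x_candidate) for x in mc_points]
    return -sum(variances) / len(variances)


mc_points = [i / 10 for i in range(11)]  # an evenly spaced grid on [0, 1]
center = neg_integrated_posterior_variance(0.5, mc_points)
edge = neg_integrated_posterior_variance(0.0, mc_points)
```

In this toy setting a candidate in the middle of the domain scores higher than one at the boundary, since it reduces the variance over more of the integration points. The fact that posterior_variance never uses an observed value also illustrates why the sampler is a dummy for a standard GP.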

forward(X)[source]

Evaluate the acquisition function on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.

Returns

A (b)-dim Tensor of acquisition function values at the given design points X.

Return type

torch.Tensor

training: bool
class botorch.acquisition.active_learning.PairwiseMCPosteriorVariance(model, objective, sampler=None)[source]

Variance of difference for Active Learning

Given a model and an objective, calculate the posterior sample variance of the objective on the difference of pairs of points. See more implementation details in forward. This acquisition function is typically used with a pairwise model (e.g., PairwiseGP) and a likelihood/link function on the pair difference (e.g., logistic or probit) for pure exploration.

Pairwise Monte Carlo Posterior Variance

Parameters
• model (Model) – A fitted model.

• objective (MCAcquisitionObjective) – An MCAcquisitionObjective representing the link function (e.g., logistic or probit) applied to the difference of two (usually 1-d) samples. Can be implemented via GenericMCObjective.

• sampler (Optional[MCSampler]) – The sampler used for drawing MC samples.

Return type

None

forward(X)[source]

Evaluate PairwiseMCPosteriorVariance on the candidate set X.

Parameters

X (torch.Tensor) – A batch_size x q x d-dim Tensor. q should be a multiple of 2.

Returns

Tensor of shape batch_size x q representing the posterior variance of the link function at X that active learning hopes to maximize.

Return type

torch.Tensor
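The idea of a variance of a link function on pair differences can be sketched with plain Python. The sketch assumes that consecutive points (2i, 2i + 1) form a pair and uses a logistic link; both choices, the function names and the sample values are assumptions for illustration, and how the library aggregates the per-pair values is not shown.

```python
import math
import statistics


def logistic(z):
    """Logistic link function."""
    return 1.0 / (1.0 + math.exp(-z))


def pairwise_mc_variance(samples):
    """Sample variance of the link function for each pair of points.

    samples: list of MC draws, each a list of q utility values with q even.
    """
    q = len(samples[0])
    if q == 0 or q % 2 != 0:
        raise ValueError("q must be a positive multiple of 2")
    pair_variances = []
    for i in range(0, q, 2):
        link_values = [logistic(s[i] - s[i + 1]) for s in samples]
        pair_variances.append(statistics.variance(link_values))
    return pair_variances


# Three MC draws for one pair whose ordering is nearly certain ...
confident = [[2.0, -2.0], [2.5, -2.5], [1.5, -1.5]]
# ... and for one pair whose ordering flips between draws.
uncertain = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]

var_confident = pairwise_mc_variance(confident)[0]
var_uncertain = pairwise_mc_variance(uncertain)[0]
```

The pair with the uncertain ordering has the larger variance, which is the kind of comparison a pure exploration strategy would query next.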

training: bool

### Preference Acquisition Functions¶

Preference acquisition functions. This includes the analytic EUBO acquisition function introduced in [Lin2020preference].

Lin2020preference(1,2)

Lin, Z.J., Astudillo, R., Frazier, P.I. and Bakshy, E. Preference Exploration for Efficient Bayesian Optimization with Multiple Outcomes. International Conference on Artificial Intelligence and Statistics (AISTATS), 2022.

class botorch.acquisition.preference.AnalyticExpectedUtilityOfBestOption(pref_model, outcome_model=None, previous_winner=None)[source]

Analytic Preferential Expected Utility of Best Options, i.e., Analytical EUBO

Analytic implementation of Expected Utility of the Best Option under the Laplace model (assumes a PairwiseGP is used as the preference model) as proposed in [Lin2020preference].

Parameters
• pref_model (Model) – The preference model that maps the outcomes (i.e., Y) to scalar-valued utility.

• outcome_model (Optional[DeterministicModel]) – A deterministic model that maps parameters (i.e., X) to outcomes (i.e., Y). The outcome model f defines the search space of Y = f(X). If outcome_model is None, EUBO is calculated directly on the parameter space. When used with OneSamplePosteriorDrawModel, this yields EUBO-zeta as described in [Lin2020preference].

• previous_winner (Optional[Tensor]) – Tensor representing the previous winner in the Y space. Defaults to None.

Return type

None

forward(X)[source]

Evaluate analytical EUBO on the candidate set X.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim Tensor, where q = 2 if previous_winner is None, and q = 1 otherwise (the previous winner then serves as the second option).

Returns

The acquisition value for each batch as a tensor of shape batch_shape.

Return type

torch.Tensor

training: bool

## Objectives and Cost-Aware Utilities¶

### Objectives¶

Objective Modules to be used with acquisition functions.

class botorch.acquisition.objective.AcquisitionObjective[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for objectives.

DEPRECATED - This will be removed in the next version.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

training: bool
class botorch.acquisition.objective.PosteriorTransform[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for objectives that transform the posterior.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

scalarize: bool
abstract evaluate(Y)[source]

Evaluate the transform on a set of outcomes.

Parameters

Y (torch.Tensor) – A batch_shape x q x m-dim tensor of outcomes.

Returns

A batch_shape x q’ [x m’]-dim tensor of transformed outcomes.

Return type

torch.Tensor

abstract forward(posterior)[source]

Compute the transformed posterior.

Parameters

posterior (botorch.posteriors.posterior.Posterior) – The posterior to be transformed.

Returns

The transformed posterior object.

Return type

botorch.posteriors.posterior.Posterior

class botorch.acquisition.objective.ScalarizedPosteriorTransform(weights, offset=0.0)[source]

An affine posterior transform for scalarizing multi-output posteriors.

For a Gaussian posterior at a single point (q=1) with mean mu and covariance matrix Sigma, this yields a single-output posterior with mean weights^T * mu and variance weights^T * Sigma * weights.
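The scalarized mean and variance can be checked by hand with a few lines of PyTorch. The values of `mu` and `Sigma` below are made up for illustration, and the offset is added to the mean only, as the parameter description states.

```python
import torch

weights = torch.tensor([0.5, 0.25])
offset = 0.0

# Illustrative two-outcome posterior at a single point (q=1).
mu = torch.tensor([1.0, 2.0])
Sigma = torch.tensor([[1.0, 0.2], [0.2, 4.0]])

# Mean of the scalarized posterior: offset + weights^T mu.
scalar_mean = offset + weights @ mu
# Variance of the scalarized posterior: weights^T Sigma weights.
scalar_var = weights @ Sigma @ weights
```

Here `scalar_mean` evaluates to 1.0 and `scalar_var` to 0.55.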

Example

Example for a model with two outcomes:

>>> weights = torch.tensor([0.5, 0.25])
>>> posterior_transform = ScalarizedPosteriorTransform(weights)
>>> EI = ExpectedImprovement(
... model, best_f=0.1, posterior_transform=posterior_transform
... )


Affine posterior transform.

Parameters
• weights (Tensor) – A one-dimensional tensor with m elements representing the linear weights on the outputs.

• offset (float) – An offset to be added to posterior mean.

Return type

None

scalarize: bool = True
evaluate(Y)[source]

Evaluate the transform on a set of outcomes.

Parameters

Y (torch.Tensor) – A batch_shape x q x m-dim tensor of outcomes.

Returns

A batch_shape x q-dim tensor of transformed outcomes.

Return type

torch.Tensor

forward(posterior)[source]

Compute the posterior of the affine transformation.

Parameters

posterior (botorch.posteriors.gpytorch.GPyTorchPosterior) – A posterior with the same number of outputs as the elements in self.weights.

Returns

A single-output posterior.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

class botorch.acquisition.objective.ScalarizedObjective(weights, offset=0.0)[source]

DEPRECATED - Use ScalarizedPosteriorTransform instead.

Affine posterior transform.

Parameters
• weights (Tensor) – A one-dimensional tensor with m elements representing the linear weights on the outputs.

• offset (float) – An offset to be added to posterior mean.

Return type

None

training: bool
class botorch.acquisition.objective.ExpectationPosteriorTransform(n_w, weights=None)[source]

Transform the batch x (q * n_w) x m posterior into a batch x q x m posterior of the expectation. The expectation is calculated over each consecutive n_w block of points in the posterior.

This is intended for use with InputPerturbation or AppendFeatures for optimizing the expectation over n_w points. This should not be used when there are constraints present, since this does not take into account the feasibility of the objectives.

Note: This is different than ScalarizedPosteriorTransform in that this operates over the q-batch dimension.

A posterior transform calculating the expectation over the q-batch dimension.

Parameters
• n_w (int) – The number of points in the q-batch of the posterior to compute the expectation over. This corresponds to the size of the feature_set of AppendFeatures or the size of the perturbation_set of InputPerturbation.

• weights (Optional[Tensor]) – An optional n_w x m-dim tensor of weights. Can be used to compute a weighted expectation. Weights are normalized before use.

Return type

None

evaluate(Y)[source]

Evaluate the expectation of a set of outcomes.

Parameters

Y (torch.Tensor) – A batch_shape x (q * n_w) x m-dim tensor of outcomes.

Returns

A batch_shape x q x m-dim tensor of expectation outcomes.

Return type

torch.Tensor
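The block-wise expectation can be sketched with a reshape followed by a (weighted) mean. This is an illustrative sketch rather than the library code: the helper name `blockwise_expectation` is hypothetical, and normalizing the weights so that each outcome column sums to one is an assumption about what "normalized before use" means.

```python
import torch


def blockwise_expectation(Y, n_w, weights=None):
    """Average each consecutive block of n_w points along the q-batch dim.

    Y: batch_shape x (q * n_w) x m tensor of outcomes.
    weights: optional n_w x m tensor of non-negative weights.
    Returns a batch_shape x q x m tensor.
    """
    batch_shape, m = Y.shape[:-2], Y.shape[-1]
    q = Y.shape[-2] // n_w
    blocks = Y.reshape(*batch_shape, q, n_w, m)
    if weights is None:
        return blocks.mean(dim=-2)
    # Assumed normalization: weights for each outcome sum to one.
    w = weights / weights.sum(dim=0, keepdim=True)
    return (blocks * w).sum(dim=-2)


Y = torch.arange(12.0).reshape(1, 6, 2)  # batch=1, q * n_w = 6, m = 2
expectation = blockwise_expectation(Y, n_w=3)  # shape 1 x 2 x 2
```

With uniform weights the weighted branch reduces to the plain mean, which is a quick sanity check for the reshape.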

forward(posterior)[source]

Compute the posterior of the expectation.

Parameters

posterior (botorch.posteriors.gpytorch.GPyTorchPosterior) – An m-outcome joint posterior over q * n_w points.

Returns

An m-outcome joint posterior over q expectations.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

scalarize: bool
training: bool
class botorch.acquisition.objective.MCAcquisitionObjective[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for MC-based objectives.

Parameters
• _verify_output_shape – If True and X is given, check that the q-batch shape of the objectives agrees with that of X.

• _is_mo – A boolean denoting whether the objectives are multi-output.

Return type

None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract forward(samples, X=None)[source]

Evaluate the objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim Tensor of objective values (assuming maximization).

Return type

Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcome = mc_obj(samples)

class botorch.acquisition.objective.IdentityMCObjective[source]

Trivial objective extracting the last dimension.

Example

>>> identity_objective = IdentityMCObjective()
>>> samples = sampler(posterior)
>>> objective = identity_objective(samples)


Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

forward(samples, X=None)[source]

Evaluate the objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim Tensor of objective values (assuming maximization).

Return type

Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcome = mc_obj(samples)

class botorch.acquisition.objective.LinearMCObjective(weights)[source]

Linear objective constructed from a weight tensor.

For input samples and mc_obj = LinearMCObjective(weights), this produces mc_obj(samples) = sum_{i} weights[i] * samples[…, i]
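The weighted sum over the last dimension is a matrix-vector product, as the following sketch shows. The random tensor stands in for posterior samples and is an assumption for illustration only.

```python
import torch

weights = torch.tensor([0.75, 0.25])

torch.manual_seed(0)
# Stand-in for samples of shape sample_shape x batch_shape x q x m.
samples = torch.randn(4, 2, 3, 2)

# Contract the outcome dimension m against the weights.
objective_values = samples @ weights  # shape 4 x 2 x 3

# The same quantity written as the explicit sum from the formula.
explicit = sum(weights[i] * samples[..., i] for i in range(weights.numel()))
```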

Example

Example for a model with two outcomes:

>>> weights = torch.tensor([0.75, 0.25])
>>> linear_objective = LinearMCObjective(weights)
>>> samples = sampler(posterior)
>>> objective = linear_objective(samples)


Linear Objective.

Parameters

weights (Tensor) – A one-dimensional tensor with m elements representing the linear weights on the outputs.

Return type

None

forward(samples, X=None)[source]

Evaluate the linear objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim tensor of objective values.

Return type

torch.Tensor

class botorch.acquisition.objective.GenericMCObjective(objective)[source]

Objective generated from a generic callable.

Allows constructing arbitrary MC-objective functions from a generic callable. To use gradient-based acquisition function optimization, it must be possible to backpropagate through the callable.
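A quick way to check that a candidate callable supports backpropagation is to run autograd through it on a small tensor. The sketch below uses plain PyTorch and a hypothetical callable named `objective`; it does not construct a GenericMCObjective.

```python
import torch

# Candidate callable with the f(samples, X) signature; X is unused here.
objective = lambda Y, X=None: torch.sqrt(Y).sum(dim=-1)

Y = torch.tensor([[4.0, 9.0]], requires_grad=True)
value = objective(Y)  # sqrt(4) + sqrt(9) = 5

# Gradients flow back to the samples: d/dY sqrt(Y) = 0.5 / sqrt(Y).
value.sum().backward()
grad = Y.grad  # [[0.25, 1/6]]
```

A callable that breaks the graph (for example by converting to NumPy or calling `.item()`) would leave `Y.grad` unset or raise an error at the backward call.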

Example

>>> generic_objective = GenericMCObjective(
...     lambda Y, X: torch.sqrt(Y).sum(dim=-1),
... )
>>> samples = sampler(posterior)
>>> objective = generic_objective(samples)


Parameters

objective (Callable[[Tensor, Optional[Tensor]], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.

Return type

None

forward(samples, X=None)[source]

Evaluate the objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim Tensor of objective values (assuming maximization).

Return type

torch.Tensor

class botorch.acquisition.objective.ConstrainedMCObjective(objective, constraints, infeasible_cost=0.0, eta=0.001)[source]

Feasibility-weighted objective.

An objective that allows maximizing some scalable objective on the model outputs subject to a number of constraints. Constraint feasibility is approximated by a sigmoid function.

mc_acq(X) = ( (objective(X) + infeasible_cost) * prod_i (1 - sigmoid(constraint_i(X))) ) - infeasible_cost

See botorch.utils.objective.apply_constraints for details on the constraint handling.
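The formula above can be sketched directly in PyTorch. This is an illustration under stated assumptions, not the library's `apply_constraints`: the helper name `feasibility_weighted` is hypothetical, and dividing each constraint value by `eta` before the sigmoid is an assumed reading of "temperature parameter".

```python
import torch


def feasibility_weighted(obj, constraint_values, infeasible_cost=0.0, eta=1e-3):
    """Weight objective values by a soft feasibility indicator.

    obj: tensor of objective values.
    constraint_values: list of tensors shaped like obj; negative means feasible.
    """
    weight = torch.ones_like(obj)
    for c in constraint_values:
        # Close to 1 when c < 0 (feasible), close to 0 when c > 0 (infeasible).
        weight = weight * (1 - torch.sigmoid(c / eta))
    # Shift by infeasible_cost so infeasible designs map to -infeasible_cost.
    return (obj + infeasible_cost) * weight - infeasible_cost


obj = torch.tensor([1.0, 2.0])
constraint = torch.tensor([-1.0, 1.0])  # first feasible, second infeasible
result = feasibility_weighted(obj, [constraint], infeasible_cost=5.0)
```

With a small `eta` the sigmoid is nearly a step function, so `result` is approximately `[1.0, -5.0]`: the feasible point keeps its objective value and the infeasible one is assigned `-infeasible_cost`.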

Example

>>> bound = 0.0
>>> objective = lambda Y, X: Y[..., 0]
>>> # apply non-negativity constraint on f(x)[1]
>>> constraint = lambda Y: bound - Y[..., 1]
>>> constrained_objective = ConstrainedMCObjective(objective, [constraint])
>>> samples = sampler(posterior)
>>> objective = constrained_objective(samples)


Parameters
• objective (Callable[[Tensor, Optional[Tensor]], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.

• constraints (List[Callable[[Tensor], Tensor]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.

• infeasible_cost (float) – The cost of a design if all associated samples are infeasible.

• eta (float) – The temperature parameter of the sigmoid function approximating the constraint.

Return type

None

forward(samples, X=None)[source]

Evaluate the feasibility-weighted objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim Tensor of objective values weighted by feasibility (assuming maximization).

Return type

torch.Tensor

class botorch.acquisition.objective.LearnedObjective(pref_model, sampler=None)[source]

Learned preference objective constructed from a preference model.

For input samples, it samples each individual sample again from the latent preference posterior distribution using pref_model and returns the posterior mean.

Example

>>> train_X = torch.rand(2, 2)
>>> train_comps = torch.LongTensor([[0, 1]])
>>> pref_model = PairwiseGP(train_X, train_comps)
>>> learned_pref_obj = LearnedObjective(pref_model)
>>> samples = sampler(posterior)
>>> objective = learned_pref_obj(samples)


Parameters
• pref_model (Model) – A BoTorch model, which models the latent preference/utility function. Given an input tensor of size sample_size x batch_shape x N x d, its posterior method should return a Posterior object with single outcome representing the utility values of the input.

• sampler (Optional[MCSampler]) – Sampler for the preference model to account for uncertainty in preference when calculating the objective; it’s not the one used in MC acquisition functions. If None, it uses IIDNormalSampler(num_samples=1).

forward(samples, X=None)[source]

Sample each element of samples.

Parameters
• samples (torch.Tensor) – A sample_size x batch_shape x N x d-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) –

Returns

A (sample_size * num_samples) x batch_shape x N-dim Tensor of objective values sampled from utility posterior using pref_model.

Return type

torch.Tensor

### Multi-Objective Objectives¶

class botorch.acquisition.multi_objective.objective.MCMultiOutputObjective[source]

Abstract base class for MC multi-output objectives.

Parameters

_is_mo – A boolean denoting whether the objectives are multi-output.

Return type

None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract forward(samples, X=None, **kwargs)[source]

Evaluate the multi-output objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim Tensor of inputs.

Returns

A sample_shape x batch_shape x q x m’-dim Tensor of objective values, with m’ the output dimension. This assumes maximization in each output dimension.

Return type

torch.Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)

class botorch.acquisition.multi_objective.objective.GenericMCMultiOutputObjective(objective)[source]

Multi-output objective generated from a generic callable.

Allows constructing arbitrary MC-objective functions from a generic callable. To use gradient-based acquisition function optimization, it must be possible to backpropagate through the callable.

Parameters

objective (Callable[[Tensor, Optional[Tensor]], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.

Return type

None

class botorch.acquisition.multi_objective.objective.IdentityMCMultiOutputObjective(outcomes=None, num_outcomes=None)[source]

Trivial objective that returns the unaltered samples.

Example

>>> identity_objective = IdentityMCMultiOutputObjective()
>>> samples = sampler(posterior)
>>> objective = identity_objective(samples)


Initialize Objective.

Parameters
• weights – m’-dim tensor of outcome weights.

• outcomes (Optional[List[int]]) – A list of the m’ indices that the weights should be applied to.

• num_outcomes (Optional[int]) – The total number of outcomes m.

Return type

None

forward(samples, X=None)[source]

Evaluate the multi-output objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim Tensor of inputs.

Returns

A sample_shape x batch_shape x q x m’-dim Tensor of objective values, with m’ the output dimension. This assumes maximization in each output dimension.

Return type

torch.Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)

class botorch.acquisition.multi_objective.objective.WeightedMCMultiOutputObjective(weights, outcomes=None, num_outcomes=None)[source]

Objective that reweights samples by given weights vector.
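In tensor terms, the reweighting amounts to selecting the relevant outcome columns and multiplying them elementwise by the weights. The sketch below is an assumption-labeled illustration in plain PyTorch; selecting the outcomes before applying the weights is assumed from the parameter descriptions, and the sample values are made up.

```python
import torch

# Stand-in samples: 2 MC samples x q=1 x m=3 outcomes.
samples = torch.tensor([[[1.0, 2.0, 3.0]], [[4.0, 5.0, 6.0]]])

outcomes = [0, 2]  # the m' = 2 outcomes to keep
weights = torch.tensor([1.0, -1.0])  # negate the second kept outcome

# Select the outcomes, then broadcast the weights over the last dimension.
weighted = samples[..., outcomes] * weights  # shape 2 x 1 x 2
```

A weight of -1.0 is a common way to turn an outcome that should be minimized into one that is maximized.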

Example

>>> weights = torch.tensor([1.0, -1.0])
>>> weighted_objective = WeightedMCMultiOutputObjective(weights)
>>> samples = sampler(posterior)
>>> objective = weighted_objective(samples)


Initialize Objective.

Parameters
• weights (Tensor) – m’-dim tensor of outcome weights.

• outcomes (Optional[List[int]]) – A list of the m’ indices that the weights should be applied to.

• num_outcomes (Optional[int]) – The total number of outcomes m.

Return type

None

forward(samples, X=None)[source]

Evaluate the multi-output objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim Tensor of inputs.

Returns

A sample_shape x batch_shape x q x m’-dim Tensor of objective values, with m’ the output dimension. This assumes maximization in each output dimension.

Return type

torch.Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)

class botorch.acquisition.multi_objective.objective.UnstandardizeMCMultiOutputObjective(Y_mean, Y_std, outcomes=None)[source]

Objective that unstandardizes the samples.
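Unstandardizing is the inverse of the usual standardization, i.e. multiplying by the standard deviation and adding the mean per outcome. The following round trip illustrates this with made-up values; it is a sketch of the arithmetic, not the class itself.

```python
import torch

Y_mean = torch.tensor([10.0, -2.0])  # per-outcome means
Y_std = torch.tensor([2.0, 0.5])  # per-outcome standard deviations

raw = torch.tensor([[12.0, -2.5], [8.0, -1.0]])

# Standardize (as one would before fitting a model) ...
standardized = (raw - Y_mean) / Y_std
# ... and undo it on the samples, broadcasting over the outcome dimension.
recovered = standardized * Y_std + Y_mean
```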

TODO: remove this when MultiTask models support outcome transforms.

Example

>>> unstd_objective = UnstandardizeMCMultiOutputObjective(Y_mean, Y_std)
>>> samples = sampler(posterior)
>>> objective = unstd_objective(samples)


Initialize objective.

Parameters
• Y_mean (Tensor) – m-dim tensor of outcome means.

• Y_std (Tensor) – m-dim tensor of outcome standard deviations.

• outcomes (Optional[List[int]]) – A list of m’ <= m indices that specifies which of the m model outputs should be considered as the outcomes for MOO. If omitted, use all model outcomes. Typically used for constrained optimization.

Return type

None

forward(samples, X=None)[source]

Evaluate the multi-output objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim Tensor of inputs.

Returns

A sample_shape x batch_shape x q x m’-dim Tensor of objective values, with m’ the output dimension. This assumes maximization in each output dimension.

Return type

torch.Tensor

This method is usually not called directly, but via the objectives.

Example

>>> # __call__ method:
>>> samples = sampler(posterior)
>>> outcomes = multi_obj(samples)

class botorch.acquisition.multi_objective.objective.AnalyticMultiOutputObjective[source]

Abstract base class for multi-output analytic objectives.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

abstract forward(posterior)[source]

Transform the posterior.

Parameters

posterior (botorch.posteriors.gpytorch.GPyTorchPosterior) – A posterior.

Returns

A transformed posterior.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

training: bool
class botorch.acquisition.multi_objective.objective.IdentityAnalyticMultiOutputObjective[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

forward(posterior)[source]

Transform the posterior.

Parameters

posterior (botorch.posteriors.gpytorch.GPyTorchPosterior) – A posterior.

Returns

A transformed posterior.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

training: bool
class botorch.acquisition.multi_objective.objective.UnstandardizeAnalyticMultiOutputObjective(Y_mean, Y_std)[source]

Objective that unstandardizes the posterior.

TODO: remove this when MultiTask models support outcome transforms.

Example

>>> unstd_objective = UnstandardizeAnalyticMultiOutputObjective(Y_mean, Y_std)
>>> unstd_posterior = unstd_objective(posterior)


Initialize objective.

Parameters
• Y_mean (Tensor) – m-dim tensor of outcome means.

• Y_std (Tensor) – m-dim tensor of outcome standard deviations.

Return type

None

forward(posterior)[source]

Transform the posterior.

Parameters

posterior (botorch.posteriors.gpytorch.GPyTorchPosterior) – A posterior.

Returns

A transformed posterior.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

training: bool

### Cost-Aware Utility¶

Cost functions for cost-aware acquisition functions, e.g. multi-fidelity KG. To be used in a context where there is an objective/cost tradeoff.

class botorch.acquisition.cost_aware.CostAwareUtility[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for cost-aware utilities.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

abstract forward(X, deltas, **kwargs)[source]

Evaluate the cost-aware utility on the candidates and improvements.

Parameters
• X (torch.Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points for each t-batch.

• deltas (torch.Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasies samples from the marginal improvement in utility over the current state at X for each t-batch.

• kwargs (Any) –

Returns

A num_fantasies x batch_shape-dim Tensor of cost-transformed utilities.

Return type

torch.Tensor

training: bool
class botorch.acquisition.cost_aware.GenericCostAwareUtility(cost)[source]

Generic cost-aware utility wrapping a callable.

Parameters

cost (Callable[[Tensor, Tensor], Tensor]) – A callable mapping a batch_shape x q x d’-dim candidate set to a batch_shape-dim tensor of costs.

Return type

None

forward(X, deltas, **kwargs)[source]

Evaluate the cost function on the candidates and improvements.

Parameters
• X (torch.Tensor) – A batch_shape x q x d’-dim Tensor with q d-dim design points for each t-batch.

• deltas (torch.Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasies samples from the marginal improvement in utility over the current state at X for each t-batch.

• kwargs (Any) –

Returns

A num_fantasies x batch_shape-dim Tensor of cost-weighted utilities.

Return type

torch.Tensor

training: bool
class botorch.acquisition.cost_aware.InverseCostWeightedUtility(cost_model, use_mean=True, cost_objective=None, min_cost=0.01)[source]

A cost-aware utility using inverse cost weighting based on a model.

Computes the cost-aware utility by inverse-weighting samples U = (u_1, …, u_N) of the increase in utility. If use_mean=True, this uses the posterior mean mean_cost of the cost model, i.e. weighted utility = mean(U) / mean_cost. If use_mean=False, it uses samples C = (c_1, …, c_N) from the posterior of the cost model and performs the inverse weighting on the sample level: weighted utility = mean(u_1 / c_1, …, u_N / c_N).

The cost is additive across multiple elements of a q-batch.
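The two weighting modes differ in where the division happens, which the following sketch makes concrete. The utility and cost values are made up, and the clamping mirrors the `min_cost` parameter; applying the clamp in both modes is an assumption of this sketch.

```python
import torch

U = torch.tensor([1.0, 2.0, 3.0])  # samples of the increase in utility
C = torch.tensor([0.5, 1.0, 2.0])  # samples from the cost model posterior
min_cost = 0.01  # lower bound that keeps the division well-behaved

# use_mean=True: divide the mean utility by the (clamped) mean cost.
mean_weighted = U.mean() / C.mean().clamp_min(min_cost)

# use_mean=False: divide sample by sample, then average.
sample_weighted = (U / C.clamp_min(min_cost)).mean()
```

For these values `mean_weighted` is 12/7 (about 1.71) and `sample_weighted` is 11/6 (about 1.83), so the two modes generally do not agree.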

Cost-aware utility that weights increase in utility by inverse cost.

Parameters
• cost_model (Model) – A Model modeling the cost of evaluating a candidate set X, where X are the same features as in the model for the acquisition function this is to be used with. If no cost_objective is specified, the outputs are required to be non-negative.

• use_mean (bool) – If True, use the posterior mean, otherwise use posterior samples from the cost model.

• cost_objective (Optional[MCAcquisitionObjective]) – If specified, transform the posterior mean / the posterior samples from the cost model. This can be used e.g. to un-transform predictions/samples of a cost model fit on the log-transformed cost (often done to ensure non-negativity).

• min_cost (float) – A value used to clamp the cost samples so that they are not too close to zero, which may cause numerical issues.

Returns

The inverse-cost-weighted utility.

Return type

None

forward(X, deltas, sampler=None, **kwargs)[source]

Evaluate the cost function on the candidates and improvements.

Parameters
• X (torch.Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points for each t-batch.

• deltas (torch.Tensor) – A num_fantasies x batch_shape-dim Tensor of num_fantasies samples from the marginal improvement in utility over the current state at X for each t-batch.

• sampler (Optional[botorch.sampling.samplers.MCSampler]) – A sampler used for sampling from the posterior of the cost model (required if use_mean=False, ignored if use_mean=True).

• kwargs (Any) –

Returns

A num_fantasies x batch_shape-dim Tensor of cost-weighted utilities.

Return type

torch.Tensor

training: bool

### Risk Measures¶

Risk Measures implemented as Monte-Carlo objectives, based on Bayesian optimization of risk measures as introduced in [Cakmak2020risk]. For a broader discussion of Monte-Carlo methods for VaR and CVaR risk measures, see also [Hong2014review].

Cakmak2020risk

S. Cakmak, R. Astudillo, P. Frazier, and E. Zhou. Bayesian Optimization of Risk Measures. Advances in Neural Information Processing Systems 33, 2020.

Hong2014review

L. J. Hong, Z. Hu, and G. Liu. Monte carlo methods for value-at-risk and conditional value-at-risk: a review. ACM Transactions on Modeling and Computer Simulation, 2014.

class botorch.acquisition.risk_measures.RiskMeasureMCObjective(n_w, weights=None)[source]

Bases: botorch.acquisition.objective.MCAcquisitionObjective, abc.ABC

Objective transforming the posterior samples to samples of a risk measure.

The risk measure is calculated over joint q-batch samples from the posterior. If the q-batch includes samples corresponding to multiple inputs, it is assumed that the first n_w samples correspond to the first input, the second n_w samples correspond to the second input, etc.

The risk measures are commonly defined for minimization by considering the upper tail of the distribution, i.e., treating larger values as being undesirable. BoTorch by default assumes a maximization objective, so the default behavior here is to calculate the risk measures w.r.t. the lower tail of the distribution. This can be changed by passing weights=torch.tensor([-1.0]).
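The block ordering and the effect of negating the samples can be illustrated with a worst-case computation in plain PyTorch. The sample values are made up and already scalarized (no trailing outcome dimension); this is a sketch of the convention, not the library code.

```python
import torch

n_w = 4
# One MC sample over q * n_w = 8 points: two inputs with four perturbations each.
samples = torch.tensor([[1.0, 2.0, 3.0, 4.0, 10.0, 0.0, 5.0, -2.0]])

# Group consecutive n_w samples so that each block belongs to one input.
blocks = samples.reshape(*samples.shape[:-1], -1, n_w)  # shape 1 x 2 x 4

# Maximization (default): the worst case is the smallest value per block.
lower_tail_worst = blocks.min(dim=-1).values  # [[1., -2.]]

# Negating the samples, as weights=[-1.0] would, targets the upper tail.
upper_tail_worst = (-blocks).min(dim=-1).values  # [[-4., -10.]]
```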

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-output samples before calculating the risk measure.

Return type

None

abstract forward(samples, X=None)[source]

Calculate the risk measure corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q-dim tensor of risk measure samples.

Return type

torch.Tensor

class botorch.acquisition.risk_measures.CVaR(alpha, n_w, weights=None)[source]

The Conditional Value-at-Risk risk measure.

The Conditional Value-at-Risk measures the expectation of the worst outcomes (small rewards or large losses) with a total probability of 1 - alpha. It is commonly defined as the conditional expectation of the reward function, with the condition that the reward is smaller than the corresponding Value-at-Risk (also defined below).

Note: Due to the use of a discrete w_set of samples, the VaR and CVaR calculated here are (possibly biased) Monte-Carlo approximations of the true risk measures.

Transform the posterior samples to samples of a risk measure.

Parameters
• alpha (float) – The risk level, float in (0.0, 1.0].

• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-objective samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the CVaR corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q-dim tensor of CVaR samples.

Return type

torch.Tensor

class botorch.acquisition.risk_measures.VaR(alpha, n_w, weights=None)[source]

The Value-at-Risk risk measure.

Value-at-Risk measures the smallest possible reward (or largest possible loss) after excluding the worst outcomes with a total probability of 1 - alpha. It is commonly used in financial risk management, and it corresponds to the 1 - alpha quantile of a given random variable.
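A rough sample-based sketch of VaR and CVaR over each n_w block is given below. It uses a simplified discretization chosen for this illustration, taking the k = ceil(n_w * (1 - alpha)) smallest values of each block as the "worst outcomes"; the library's exact index rule may differ. The helper name `tail_risk_measures` is hypothetical and the samples are assumed to be already scalarized.

```python
import math

import torch


def tail_risk_measures(samples, n_w, alpha):
    """Return (VaR, CVaR) per block of n_w samples, lower-tail convention.

    samples: sample_shape x batch_shape x (q * n_w) tensor.
    """
    q = samples.shape[-1] // n_w
    blocks = samples.reshape(*samples.shape[:-1], q, n_w)
    # Assumed discretization: number of samples counted as worst outcomes.
    k = max(1, math.ceil(n_w * (1 - alpha)))
    worst_k = blocks.sort(dim=-1).values[..., :k]  # ascending order
    var = worst_k[..., -1]  # boundary of the worst-outcome set
    cvar = worst_k.mean(dim=-1)  # average over the worst outcomes
    return var, cvar


samples = torch.tensor([[1.0, 2.0, 3.0, 4.0, 10.0, 0.0, 5.0, -2.0]])
var, cvar = tail_risk_measures(samples, n_w=4, alpha=0.5)
```

Under this convention CVaR never exceeds VaR, and with alpha = 1.0 both reduce to the worst case of each block.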

Transform the posterior samples to samples of a risk measure.

Parameters
• alpha (float) – The risk level, float in (0.0, 1.0].

• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-objective samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the VaR corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q-dim tensor of VaR samples.

Return type

torch.Tensor

class botorch.acquisition.risk_measures.WorstCase(n_w, weights=None)[source]

The worst-case risk measure.

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-output samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the worst-case measure corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q-dim tensor of worst-case samples.

Return type

torch.Tensor

class botorch.acquisition.risk_measures.Expectation(n_w, weights=None)[source]

The expectation risk measure.

For unconstrained problems, we recommend using the ExpectationPosteriorTransform instead. ExpectationPosteriorTransform directly transforms the posterior distribution over q * n_w to a posterior of q expectations, significantly reducing the cost of posterior sampling as a result.

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-output samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the expectation corresponding to the given samples. This calculates the expectation / mean / average of each n_w samples across the q-batch dimension. If self.weights is given, the samples are scalarized across the output dimension before taking the expectation.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q-dim tensor of expectation samples.

Return type

torch.Tensor
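
The two steps described for forward (scalarize across outputs, then average each block of n_w samples) can be sketched in plain Python. This is an illustration under stated assumptions rather than the library code: `expectation_per_input` is a hypothetical name, and nested lists stand in for a (q * n_w) x m tensor.

```python
from typing import List, Sequence


def expectation_per_input(
    samples: List[Sequence[float]], n_w: int, weights: Sequence[float]
) -> List[float]:
    """Scalarize m-dim samples, then average each block of n_w (hypothetical helper)."""
    # Scalarize each m-dim sample with a dot product against the weights.
    scalar = [sum(w * y for w, y in zip(weights, row)) for row in samples]
    # Average each consecutive block of n_w scalarized samples.
    return [sum(scalar[i:i + n_w]) / n_w for i in range(0, len(scalar), n_w)]


# q = 2 inputs, n_w = 2 perturbations, m = 2 outputs.
samples = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [2.0, 2.0]]
means = expectation_per_input(samples, n_w=2, weights=[1.0, -1.0])
print(means)
```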

### Multi-Output Risk Measures¶

Multi-output extensions of the risk measures, implemented as Monte-Carlo objectives. Except for MVaR, the risk measures are computed over each output dimension independently. In contrast, MVaR is computed using the joint distribution of the outputs, and provides more accurate risk estimates.

References

Prekopa2012MVaR

A. Prekopa. Multivariate value at risk and related topics. Annals of Operations Research, 2012.

Cousin2013MVaR

A. Cousin and E. Di Bernardino. On multivariate extensions of Value-at-Risk. Journal of Multivariate Analysis, 2013.

class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputRiskMeasureMCObjective(n_w, weights=None)[source]

Objective transforming the multi-output posterior samples to samples of a multi-output risk measure.

The risk measure is calculated over joint q-batch samples from the posterior. If the q-batch includes samples corresponding to multiple inputs, it is assumed that the first n_w samples correspond to the first input, the second n_w samples correspond to the second input, etc.

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scaling multi-output samples before calculating the risk measure. This can also be used to make sure that all outputs are correctly aligned for maximization by negating those that are originally defined for minimization.

Return type

None

abstract forward(samples, X=None)[source]

Calculate the risk measure corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of risk measure samples.

Return type

torch.Tensor

class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputExpectation(n_w, weights=None)[source]

A multi-output MC expectation risk measure.

For unconstrained problems, we recommend using the ExpectationPosteriorTransform instead. ExpectationPosteriorTransform directly transforms the posterior distribution over q * n_w to a posterior of q expectations, significantly reducing the cost of posterior sampling as a result.

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scaling multi-output samples before calculating the risk measure. This can also be used to make sure that all outputs are correctly aligned for maximization by negating those that are originally defined for minimization.

Return type

None

forward(samples, X=None)[source]

Calculate the expectation of the given samples. Expectation is calculated over each n_w samples in the q-batch dimension.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of expectation samples.

Return type

torch.Tensor

class botorch.acquisition.multi_objective.multi_output_risk_measures.IndependentCVaR(alpha, n_w, weights=None)[source]

The multi-output Conditional Value-at-Risk risk measure that operates on each output dimension independently. Since this does not consider the joint distribution of the outputs (i.e., that the outputs were evaluated on the same perturbed input and are not independent), the risk estimates provided by IndependentCVaR are in general more optimistic than the definition of CVaR would suggest.

The Conditional Value-at-Risk measures the expectation of the worst outcomes (small rewards or large losses) with a total probability of 1 - alpha. It is commonly defined as the conditional expectation of the reward function, with the condition that the reward is smaller than the corresponding Value-at-Risk (also defined below).

NOTE: Due to the use of a discrete w_set of samples, the VaR and CVaR calculated here are (possibly biased) Monte-Carlo approximations of the true risk measures.

Transform the posterior samples to samples of a risk measure.

Parameters
• alpha (float) – The risk level, float in (0.0, 1.0].

• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-objective samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the CVaR corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of CVaR samples.

Return type

torch.Tensor

class botorch.acquisition.multi_objective.multi_output_risk_measures.IndependentVaR(alpha, n_w, weights=None)[source]

The multi-output Value-at-Risk risk measure that operates on each output dimension independently. For the same reasons as IndependentCVaR, the risk estimates provided by this are in general more optimistic than the definition of VaR would suggest.

Value-at-Risk measures the smallest possible reward (or largest possible loss) after excluding the worst outcomes with a total probability of 1 - alpha. It is commonly used in financial risk management, and it corresponds to the 1 - alpha quantile of a given random variable.

Transform the posterior samples to samples of a risk measure.

Parameters
• alpha (float) – The risk level, float in (0.0, 1.0].

• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scalarizing multi-objective samples before calculating the risk measure.

Return type

None

forward(samples, X=None)[source]

Calculate the VaR corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of VaR samples.

Return type

torch.Tensor
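
To make "operates on each output dimension independently" concrete, the following plain-Python sketch computes a sample-based VaR and CVaR separately for each output column of one n_w block. The function names are hypothetical, and the index convention (keep the k = n_w - ceil(n_w * alpha) + 1 smallest rewards, with the largest of them taken as the VaR) is an assumption chosen for illustration; the library's discretization may differ.

```python
import math
from typing import List, Sequence, Tuple


def var_cvar(values: Sequence[float], alpha: float) -> Tuple[float, float]:
    """Return (VaR, CVaR) of one output over a block of n_w samples."""
    n_w = len(values)
    # Assumed convention: keep the k worst (smallest) rewards.
    k = n_w - math.ceil(n_w * alpha) + 1
    worst = sorted(values)[:k]
    # VaR is the best of the k worst rewards; CVaR is their mean.
    return worst[-1], sum(worst) / k


def independent_var_cvar(
    block: List[Sequence[float]], alpha: float
) -> List[Tuple[float, float]]:
    """Apply var_cvar to each of the m output columns separately."""
    return [var_cvar(column, alpha) for column in zip(*block)]


# One input, n_w = 4 perturbations, m = 2 outputs.
block = [[1.0, 40.0], [2.0, 30.0], [3.0, 20.0], [4.0, 10.0]]
result = independent_var_cvar(block, alpha=0.75)
print(result)
```

In this example the per-output values come from different perturbations (the first output is worst on the first rows, the second on the last rows), which is exactly the joint structure that the independent measures do not take into account.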

class botorch.acquisition.multi_objective.multi_output_risk_measures.MultiOutputWorstCase(n_w, weights=None)[source]

The multi-output worst-case risk measure.

Transform the posterior samples to samples of a risk measure.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scaling multi-output samples before calculating the risk measure. This can also be used to make sure that all outputs are correctly aligned for maximization by negating those that are originally defined for minimization.

Return type

None

forward(samples, X=None)[source]

Calculate the worst-case measure corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of worst-case samples.

Return type

torch.Tensor

class botorch.acquisition.multi_objective.multi_output_risk_measures.MVaR(n_w, alpha, expectation=False, weights=None, pad_to_n_w=False, filter_dominated=True)[source]

The multivariate Value-at-Risk as introduced in [Prekopa2012MVaR].

MVaR is defined as the non-dominated set of points in the extended domain of the random variable that have multivariate CDF greater than or equal to alpha. Note that MVaR is set valued and the size of the set depends on the particular realizations of the random variable. [Cousin2013MVaR] instead propose to use the expectation of the set-valued MVaR as the multivariate VaR. We support this alternative with an expectation flag.

The multivariate Value-at-Risk.

Parameters
• n_w (int) – The size of the w_set to calculate the risk measure over.

• alpha (float) – The risk level of MVaR, float in (0.0, 1.0]. Each MVaR value dominates alpha fraction of all observations.

• expectation (bool) – If True, returns the expectation of the MVaR set as is done in [Cousin2013MVaR]. Otherwise, it returns the union of all values in the MVaR set. Default: False.

• weights (Optional[torch.Tensor]) – An optional m-dim tensor of weights for scaling multi-output samples before calculating the risk measure. This can also be used to make sure that all outputs are correctly aligned for maximization by negating those that are originally defined for minimization.

• pad_to_n_w (bool) – If True, instead of padding up to k’, which is the size of the largest MVaR set across all batches, we pad the MVaR set up to n_w. This produces a return tensor of known size, however, it may in general be much larger than the alternative. See forward for more details on the return shape. NOTE: this is only relevant if expectation=False.

• filter_dominated (bool) – If True, returns the non-dominated subset of alpha level points (this is MVaR as defined by [Prekopa2012MVaR]). Disabling this will make it faster, and may be preferable if the dominated points will be filtered out later, e.g., while calculating the hypervolume. Disabling this is not recommended if expectation=True.

Return type

None
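
The filter_dominated option refers to keeping only the non-dominated subset of a point set. As a minimal sketch of that notion (plain Python, hypothetical helper names, maximization assumed in every output, and not the routine BoTorch uses internally):

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (quadratic-time sketch)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(1.0, 3.0), (2.0, 2.0), (1.0, 2.0), (3.0, 1.0)]
front = non_dominated(candidates)
print(front)
```

Here (1.0, 2.0) is removed because (1.0, 3.0) and (2.0, 2.0) both dominate it; the remaining three points are mutually non-dominated.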

get_mvar_set_cpu(Y)[source]

Find MVaR set based on the definition in [Prekopa2012MVaR].

NOTE: This is much faster on CPU for large n_w than the alternative but it is significantly slower on GPU. Based on empirical evidence, this is recommended when running on CPU with n_w > 64.

This first calculates the CDF for each point on the extended domain of the random variable (the grid defined by the given samples), then takes the values with CDF equal to (rounded if necessary) alpha. The non-dominated subset of these form the MVaR set.

Parameters

Y (torch.Tensor) – A batch x n_w x m-dim tensor of outcomes. This is currently restricted to m = 2 objectives. TODO: Support m > 2 objectives.

Returns

A batch-length list of k x m-dim tensors of MVaR values, where k depends on the corresponding batch inputs. Note that MVaR values in general are not in-sample points.

Return type

torch.Tensor

get_mvar_set_gpu(Y)[source]

Find MVaR set based on the definition in [Prekopa2012MVaR].

NOTE: This is much faster on GPU than the alternative but it scales very poorly on CPU as n_w increases. This should be preferred if a GPU is available or when n_w <= 64. In addition, this supports m >= 2 outcomes (vs m = 2 for the CPU version) and it should be used if m > 2.

This first calculates the CDF for each point on the extended domain of the random variable (the grid defined by the given samples), then takes the values with CDF equal to (rounded if necessary) alpha. The non-dominated subset of these form the MVaR set.

Parameters

Y (torch.Tensor) – A batch x n_w x m-dim tensor of observations.

Returns

A batch-length list of k x m-dim tensors of MVaR values, where k depends on the corresponding batch inputs. Note that MVaR values in general are not in-sample points.

Return type

torch.Tensor

forward(samples, X=None)[source]

Calculate the MVaR corresponding to the given samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x (q * n_w) x m-dim tensor of posterior samples. The q-batches should be ordered so that each n_w block of samples corresponds to the same input.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Ignored.

Returns

A sample_shape x batch_shape x q x m-dim tensor of MVaR values, if self.expectation=True. Otherwise, this returns a sample_shape x batch_shape x (q * k’) x m-dim tensor, where k’ is the maximum k across all batches that is returned by get_mvar_set_…. Each (q * k’) x m corresponds to the k MVaR values for each q batch of n_w inputs, padded up to k’ by repeating the last element. If self.pad_to_n_w, we set k’ = self.n_w, producing a deterministic return shape.

Return type

torch.Tensor
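
The padding rule in the return description (pad each MVaR set up to k’ by repeating its last element, or up to n_w when pad_to_n_w is set) can be sketched as follows. This is a plain-Python illustration with a hypothetical helper name; it assumes every set is non-empty and that n_w, when given, is at least as large as every set.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


def pad_mvar_sets(
    mvar_sets: List[List[Point]], n_w: Optional[int] = None
) -> List[List[Point]]:
    """Pad each set to a common length by repeating its last element."""
    # k' is the largest set size unless a fixed size n_w is requested.
    k_prime = n_w if n_w is not None else max(len(s) for s in mvar_sets)
    return [s + [s[-1]] * (k_prime - len(s)) for s in mvar_sets]


sets = [[(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)], [(0.5, 0.5)]]
padded = pad_mvar_sets(sets)            # pads to k' = 3
padded_fixed = pad_mvar_sets(sets, 4)   # pads to n_w = 4
print([len(s) for s in padded], [len(s) for s in padded_fixed])
```

Repeating an existing element leaves the set of distinct MVaR values unchanged while giving every batch entry the same size, which is what allows the results to be stacked into one tensor.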

## Utilities¶

### Fixed Feature Acquisition Function¶

A wrapper around AcquisitionFunctions to fix certain features for optimization. This is useful e.g. for performing contextual optimization.

class botorch.acquisition.fixed_feature.FixedFeatureAcquisitionFunction(acq_function, d, columns, values)[source]

A wrapper around AcquisitionFunctions to fix a subset of features.

Example

>>> model = SingleTaskGP(train_X, train_Y)  # d = 5
>>> qEI = qExpectedImprovement(model, best_f=0.0)
>>> columns = [2, 4]
>>> values = X[..., columns]
>>> qEI_FF = FixedFeatureAcquisitionFunction(qEI, 5, columns, values)
>>> qei = qEI_FF(test_X)  # d' = 3


Derived Acquisition Function by fixing a subset of input features.

Parameters
• acq_function (AcquisitionFunction) – The base acquisition function, operating on input tensors X_full of feature dimension d.

• d (int) – The feature dimension expected by acq_function.

• columns (List[int]) – d_f < d indices of columns in X_full that are to be fixed to the provided values.

• values (Union[Tensor, Sequence[Union[Tensor, float]]]) – The values to which to fix the columns in columns. Either a full batch_shape x q x d_f tensor of values (if values are different for each of the q input points), or an array-like of values that is broadcastable to the input across t-batch and q-batch dimensions, e.g. a list of length d_f if values are the same across all t and q-batch dimensions, or a combination of Tensors and numbers which can be broadcasted to form a tensor with trailing dimension size of d_f.

Return type

None

forward(X)[source]

Evaluate base acquisition function under the fixed features.

Parameters

X (torch.Tensor) – Input tensor of feature dimension d’ < d such that d’ + d_f = d.

Returns

Base acquisition function evaluated on tensor X_full constructed by adding values in the appropriate places (see _construct_X_full).

training: bool
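
How a reduced input of dimension d’ is expanded to the full dimension d can be sketched for a single design point. The helper below is hypothetical and works on plain lists; it only illustrates the idea of inserting the fixed values at the given columns and is not the library's `_construct_X_full`.

```python
from typing import List, Sequence


def construct_x_full(
    x: Sequence[float], d: int, columns: List[int], values: Sequence[float]
) -> List[float]:
    """Interleave free values x with fixed values at the given column indices."""
    if len(x) + len(columns) != d:
        raise ValueError("expected d' + d_f == d")
    fixed = dict(zip(columns, values))
    free = iter(x)
    # Walk over all d columns: use the fixed value where one is given,
    # otherwise consume the next free value.
    return [fixed[i] if i in fixed else next(free) for i in range(d)]


# d = 5, columns 2 and 4 are fixed, so the free input has d' = 3 entries.
x_full = construct_x_full([0.1, 0.2, 0.3], d=5, columns=[2, 4], values=[9.0, 8.0])
print(x_full)
```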

### Constructors for Acquisition Function Input Arguments¶

A registry of helpers for generating inputs to acquisition function constructors programmatically from a consistent input format.

botorch.acquisition.input_constructors.get_acqf_input_constructor(acqf_cls)[source]

Get acquisition function input constructor from registry.

Parameters

acqf_cls (Type[botorch.acquisition.acquisition.AcquisitionFunction]) – The AcquisitionFunction class (not instance) for which to retrieve the input constructor.

Returns

The input constructor associated with acqf_cls.

Return type

Callable[[…], Dict[str, Any]]

botorch.acquisition.input_constructors.acqf_input_constructor(*acqf_cls)[source]

Decorator for registering acquisition function input constructors.

Parameters

acqf_cls (Type[botorch.acquisition.acquisition.AcquisitionFunction]) – The AcquisitionFunction classes (not instances) for which to register the input constructor.

Return type

Callable[[…], botorch.acquisition.acquisition.AcquisitionFunction]
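
The lookup function and the registering decorator together form a class-keyed registry. The sketch below re-creates that pattern in plain Python to show how the two pieces interact; all names in it (`_REGISTRY`, `register_input_constructor`, `lookup_input_constructor`, `ToyAcquisition`) are hypothetical stand-ins and not BoTorch's internals.

```python
from typing import Any, Callable, Dict, Type

# Hypothetical registry mapping a class to its input constructor.
_REGISTRY: Dict[Type, Callable[..., Dict[str, Any]]] = {}


def register_input_constructor(*classes: Type) -> Callable:
    """Decorator that registers one function for one or more classes."""
    def decorator(fn: Callable[..., Dict[str, Any]]) -> Callable[..., Dict[str, Any]]:
        for cls in classes:
            if cls in _REGISTRY:
                raise ValueError(f"constructor already registered for {cls.__name__}")
            _REGISTRY[cls] = fn
        return fn
    return decorator


def lookup_input_constructor(cls: Type) -> Callable[..., Dict[str, Any]]:
    """Return the constructor registered for the class (not an instance)."""
    if cls not in _REGISTRY:
        raise RuntimeError(f"no input constructor registered for {cls.__name__}")
    return _REGISTRY[cls]


class ToyAcquisition:
    """Stand-in for an AcquisitionFunction subclass."""
    def __init__(self, model: Any, beta: float) -> None:
        self.model, self.beta = model, beta


@register_input_constructor(ToyAcquisition)
def construct_inputs_toy(model: Any, beta: float = 0.2, **kwargs: Any) -> Dict[str, Any]:
    # Extra keyword arguments are accepted and ignored.
    return {"model": model, "beta": beta}


kwargs = lookup_input_constructor(ToyAcquisition)(model="fitted-model")
acqf = ToyAcquisition(**kwargs)
print(kwargs)
```

Keying the registry on the class lets calling code build constructor arguments for any registered acquisition function from one consistent set of inputs.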

botorch.acquisition.input_constructors.construct_inputs_analytic_base(model, training_data, posterior_transform=None, **kwargs)[source]

Construct kwargs for basic analytic acquisition functions.

Parameters
Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_best_f(model, training_data, posterior_transform=None, maximize=True, **kwargs)[source]

Construct kwargs for the acquisition functions requiring best_f.

Parameters
Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_ucb(model, training_data, posterior_transform=None, beta=0.2, maximize=True, **kwargs)[source]

Construct kwargs for UpperConfidenceBound.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. best_f is extracted from here.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – The posterior transform to be used in the acquisition function.

• beta (Union[float, torch.Tensor]) – Either a scalar or a one-dim tensor with b elements (batch mode) representing the trade-off parameter between mean and covariance.

• maximize (bool) – If True, consider the problem a maximization problem.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_constrained_ei(model, training_data, objective_index, constraints, maximize=True, **kwargs)[source]

Construct kwargs for ConstrainedExpectedImprovement.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. best_f is extracted from here.

• objective_index (int) – The index of the objective.

• constraints (Dict[int, Tuple[Optional[float], Optional[float]]]) – A dictionary of the form {i: [lower, upper]}, where i is the output index, and lower and upper are lower and upper bounds on that output (resp. interpreted as -Inf / Inf if None).

• maximize (bool) – If True, consider the problem a maximization problem.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]
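
The constraints format, including the treatment of None as an unbounded side, can be illustrated with a small plain-Python sketch. The helper names are hypothetical, and the hard feasibility check is only there to show how the bounds are read; it is not how the acquisition function itself handles constraints.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Bounds = Tuple[Optional[float], Optional[float]]


def normalize_constraints(
    constraints: Dict[int, Bounds]
) -> Dict[int, Tuple[float, float]]:
    """Replace None by -inf (lower bound) or +inf (upper bound)."""
    return {
        i: (-math.inf if lo is None else lo, math.inf if hi is None else hi)
        for i, (lo, hi) in constraints.items()
    }


def is_feasible(outputs: Sequence[float], constraints: Dict[int, Bounds]) -> bool:
    """Check every constrained output against its (lower, upper) interval."""
    bounds = normalize_constraints(constraints)
    return all(lo <= outputs[i] <= hi for i, (lo, hi) in bounds.items())


# Output 0 is the objective; output 1 must be <= 0.5 and output 2 must be >= 0.
constraints = {1: (None, 0.5), 2: (0.0, None)}
print(is_feasible([3.0, 0.4, 1.0], constraints))
print(is_feasible([3.0, 0.6, 1.0], constraints))
```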

botorch.acquisition.input_constructors.construct_inputs_noisy_ei(model, training_data, num_fantasies=20, maximize=True, **kwargs)[source]

Construct kwargs for NoisyExpectedImprovement.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. best_f is extracted from here.

• num_fantasies (int) – The number of fantasies to generate. The higher this number the more accurate the model (at the expense of model complexity and performance).

• maximize (bool) – If True, consider the problem a maximization problem.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_mc_base(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, **kwargs)[source]

Construct kwargs for basic MC acquisition functions.

Parameters
Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, **kwargs)[source]

Construct kwargs for the qExpectedImprovement constructor.

Parameters
Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qNEI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, X_baseline=None, prune_baseline=False, **kwargs)[source]

Construct kwargs for the qNoisyExpectedImprovement constructor.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. Used e.g. to extract inputs such as best_f for expected improvement acquisition functions. Only block-design training data is currently supported.

• objective (Optional[botorch.acquisition.objective.MCAcquisitionObjective]) – The objective to be used in the acquisition function.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – The posterior transform to be used in the acquisition function.

• X_pending (Optional[torch.Tensor]) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.

• sampler (Optional[botorch.sampling.samplers.MCSampler]) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.

• X_baseline (Optional[torch.Tensor]) – A batch_shape x r x d-dim Tensor of r design points that have already been observed. These points are considered as the potential best design point. If omitted, use training_data.X.

• prune_baseline (bool) – If True, remove points in X_baseline that are highly unlikely to be the best point. This can significantly improve performance and is generally recommended.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qPI(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, tau=0.001, best_f=None, **kwargs)[source]

Construct kwargs for the qProbabilityOfImprovement constructor.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. Used e.g. to extract inputs such as best_f for expected improvement acquisition functions.

• objective (Optional[botorch.acquisition.objective.MCAcquisitionObjective]) – The objective to be used in the acquisition function.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – The posterior transform to be used in the acquisition function.

• X_pending (Optional[torch.Tensor]) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.

• sampler (Optional[botorch.sampling.samplers.MCSampler]) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.

• tau (float) – The temperature parameter used in the sigmoid approximation of the step function. Smaller values yield more accurate approximations of the function, but result in gradient estimates with higher variance.

• best_f (Optional[Union[float, torch.Tensor]]) – The best objective value observed so far (assumed noiseless). Can be a batch_shape-shaped tensor, which in case of a batched model specifies potentially different values for each element of the batch.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qUCB(model, training_data, objective=None, posterior_transform=None, X_pending=None, sampler=None, beta=0.2, **kwargs)[source]

Construct kwargs for the qUpperConfidenceBound constructor.

Parameters
• model (botorch.models.model.Model) – The model to be used in the acquisition function.

• training_data (botorch.utils.containers.TrainingData) – A TrainingData object containing the model’s training data. Used e.g. to extract inputs such as best_f for expected improvement acquisition functions.

• objective (Optional[botorch.acquisition.objective.MCAcquisitionObjective]) – The objective to be used in the acquisition function.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – The posterior transform to be used in the acquisition function.

• X_pending (Optional[torch.Tensor]) – An m x d-dim Tensor of m design points that have been submitted for function evaluation but have not yet been evaluated. Concatenated into X upon forward call.

• sampler (Optional[botorch.sampling.samplers.MCSampler]) – The sampler used to draw base samples. If omitted, uses the acquisition function’s default sampler.

• beta (float) – Controls tradeoff between mean and standard deviation in UCB.

• kwargs (Any) –

Returns

A dict mapping kwarg names of the constructor to values.

Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_EHVI(model, training_data, objective_thresholds, objective=None, **kwargs)[source]

Construct kwargs for ExpectedHypervolumeImprovement constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qEHVI(model, training_data, objective_thresholds, objective=None, **kwargs)[source]

Construct kwargs for qExpectedHypervolumeImprovement constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qNEHVI(model, training_data, objective_thresholds, objective=None, **kwargs)[source]

Construct kwargs for qNoisyExpectedHypervolumeImprovement constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qMES(model, training_data, bounds, objective=None, posterior_transform=None, candidate_size=1000, **kwargs)[source]

Construct kwargs for qMaxValueEntropy constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_mf_base(model, training_data, target_fidelities, fidelity_weights=None, cost_intercept=1.0, num_trace_observations=0, **ignore)[source]

Construct kwargs for a multi-fidelity acquisition function’s constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qKG(model, training_data, bounds, objective=None, posterior_transform=None, target_fidelities=None, num_fantasies=64, **kwargs)[source]

Construct kwargs for qKnowledgeGradient constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qMFKG(model, training_data, bounds, target_fidelities, objective=None, posterior_transform=None, **kwargs)[source]

Construct kwargs for qMultiFidelityKnowledgeGradient constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.construct_inputs_qMFMES(model, training_data, bounds, target_fidelities, objective=None, posterior_transform=None, **kwargs)[source]

Construct kwargs for qMultiFidelityMaxValueEntropy constructor.

Parameters
Return type

Dict[str, Any]

botorch.acquisition.input_constructors.get_best_f_analytic(training_data, posterior_transform=None, **kwargs)[source]
Parameters
Return type

torch.Tensor

botorch.acquisition.input_constructors.get_best_f_mc(training_data, objective=None, posterior_transform=None)[source]
Parameters
Return type

torch.Tensor

botorch.acquisition.input_constructors.optimize_objective(model, bounds, q, objective=None, posterior_transform=None, linear_constraints=None, fixed_features=None, target_fidelities=None, qmc=True, mc_samples=512, seed_inner=None, optimizer_options=None, post_processing_func=None, batch_initial_conditions=None, sequential=False, **ignore)[source]

Optimize an objective under the given model.

Parameters
• model (botorch.models.model.Model) – The model to be used in the objective.

• bounds (torch.Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

• q (int) – The cardinality of input sets on which the objective is to be evaluated.

• objective (Optional[botorch.acquisition.objective.MCAcquisitionObjective]) – The objective to optimize.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – The posterior transform to be used in the acquisition function.

• linear_constraints (Optional[Tuple[torch.Tensor, torch.Tensor]]) – A tuple of (A, b). Given k linear constraints on a d-dimensional space, A is k x d and b is k x 1 such that A x <= b. (Not used by single task models).

• fixed_features (Optional[Dict[int, float]]) – A dictionary of feature assignments {feature_index: value} to hold fixed during generation.

• target_fidelities (Optional[Dict[int, float]]) – A dictionary mapping input feature indices to fidelity values. Defaults to {-1: 1.0}.

• qmc (bool) – Toggle for enabling (qmc=True) or disabling (qmc=False) use of Quasi Monte Carlo.

• mc_samples (int) – Integer number of samples used to estimate Monte Carlo objectives.

• seed_inner (Optional[int]) – Integer seed used to initialize the sampler passed to MCObjective.

• optimizer_options (Optional[Dict[str, Any]]) – Table used to look up keyword arguments for the optimizer.

• post_processing_func (Optional[Callable[[torch.Tensor], torch.Tensor]]) – A function that post-processes an optimization result appropriately (i.e. according to round-trip transformations).

• batch_initial_conditions (Optional[torch.Tensor]) – A Tensor of initial values for the optimizer.

• sequential (bool) – If False, uses joint optimization, otherwise uses sequential optimization.

Returns

A tuple containing the best input locations and corresponding objective values.

Return type

Tuple[torch.Tensor, torch.Tensor]
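
The shapes of bounds, linear_constraints and fixed_features are easiest to read from a small example. The sketch below uses plain lists and hypothetical helper names to show what each argument expresses for a single d-dimensional point; b is written as a flat list of k values rather than a k x 1 tensor, and none of this reflects how the optimizer enforces the constraints internally.

```python
from typing import Dict, List, Sequence, Tuple


def within_bounds(x: Sequence[float], bounds: Tuple[Sequence[float], Sequence[float]]) -> bool:
    """bounds holds two rows: lower bounds first, upper bounds second (2 x d)."""
    lower, upper = bounds
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))


def satisfies_linear(x: Sequence[float], A: List[Sequence[float]], b: Sequence[float]) -> bool:
    """Check A x <= b row by row; A has k rows of length d, b has k entries."""
    return all(sum(a * v for a, v in zip(row, x)) <= rhs for row, rhs in zip(A, b))


def apply_fixed_features(x: Sequence[float], fixed_features: Dict[int, float]) -> List[float]:
    """Overwrite the entries named in {feature_index: value}."""
    return [fixed_features.get(i, v) for i, v in enumerate(x)]


bounds = ([0.0, 0.0], [1.0, 1.0])
A, b = [[1.0, 1.0]], [1.0]          # one constraint: x0 + x1 <= 1
x = [0.25, 0.5]
x_fixed = apply_fixed_features(x, {1: 0.9})

print(within_bounds(x, bounds), satisfies_linear(x, A, b))
print(x_fixed, satisfies_linear(x_fixed, A, b))
```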

### Penalized Acquisition Function Wrapper¶

Modules to add regularization to acquisition functions.

class botorch.acquisition.penalized.L2Penalty(init_point)[source]

Bases: torch.nn.modules.module.Module

L2 penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.

Initializing L2 regularization.

Parameters

init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.

forward(X)[source]
Parameters

X (torch.Tensor) – A “batch_shape x q x dim” tensor representing the points to be evaluated.

Returns

A tensor of size “batch_shape” representing the acqfn for each q-batch.

Return type

torch.Tensor

training: bool
class botorch.acquisition.penalized.L1Penalty(init_point)[source]

Bases: torch.nn.modules.module.Module

L1 penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.

Initializing L1 regularization.

Parameters

init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.

forward(X)[source]
Parameters

X (torch.Tensor) – A “batch_shape x q x dim” tensor representing the points to be evaluated.

Returns

A tensor of size “batch_shape” representing the acqfn for each q-batch.

Return type

torch.Tensor

training: bool
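
As a rough illustration of what the L1 and L2 penalties measure, the sketch below computes a distance-based penalty for one q-batch in plain Python. Two choices in it are assumptions of this sketch rather than statements about the library: the L2 penalty uses the squared Euclidean distance, and the penalty of a q-batch is taken to be the largest per-point penalty. The function names are hypothetical.

```python
from typing import List, Sequence


def l1_penalty(X_q: List[Sequence[float]], init_point: Sequence[float]) -> float:
    """Largest L1 distance between any of the q points and init_point."""
    per_point = [sum(abs(x - c) for x, c in zip(row, init_point)) for row in X_q]
    return max(per_point)


def l2_penalty(X_q: List[Sequence[float]], init_point: Sequence[float]) -> float:
    """Largest squared L2 distance between any of the q points and init_point."""
    per_point = [sum((x - c) ** 2 for x, c in zip(row, init_point)) for row in X_q]
    return max(per_point)


# One q-batch with q = 2 points in dim = 2.
X_q = [[1.0, 2.0], [3.0, 4.0]]
init_point = [1.0, 1.0]
print(l1_penalty(X_q, init_point), l2_penalty(X_q, init_point))
```

Each q-batch yields one scalar, matching the “batch_shape” return size documented for forward.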
class botorch.acquisition.penalized.GaussianPenalty(init_point, sigma)[source]

Bases: torch.nn.modules.module.Module

Gaussian penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.

Initializing Gaussian regularization.

Parameters
• init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.

• sigma (float) – The parameter used in gaussian function.

forward(X)[source]
Parameters

X (torch.Tensor) – A “batch_shape x q x dim” tensor representing the points to be evaluated.

Returns

A tensor of size “batch_shape” representing the acqfn for each q-batch.

Return type

torch.Tensor

training: bool
class botorch.acquisition.penalized.GroupLassoPenalty(init_point, groups)[source]

Bases: torch.nn.modules.module.Module

Group lasso penalty class to be added to any arbitrary acquisition function to construct a PenalizedAcquisitionFunction.

Initializing Group-Lasso regularization.

Parameters
• init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.

• groups (List[List[int]]) – Groups of indices used in group lasso.

forward(X)[source]

X should be a batch_shape x 1 x dim tensor. Evaluation for q-batches with q > 1 is not implemented yet.

Parameters

X (torch.Tensor) – A batch_shape x 1 x dim tensor representing the points to be evaluated.

Return type

torch.Tensor

training: bool
class botorch.acquisition.penalized.PenalizedAcquisitionFunction(raw_acqf, penalty_func, regularization_parameter)[source]

Single-outcome acquisition function regularized by the given penalty.

The usage is similar to:

>>> raw_acqf = NoisyExpectedImprovement(…)
>>> penalty = GroupLassoPenalty(…)
>>> acqf = PenalizedAcquisitionFunction(raw_acqf, penalty, regularization_parameter)

Initializing the penalized acquisition function.

Parameters
• raw_acqf (AcquisitionFunction) – The raw acquisition function that is going to be regularized.

• penalty_func (torch.nn.Module) – The regularization function.

• regularization_parameter (float) – Regularization parameter used in optimization.

Return type

None

forward(X)[source]

Evaluate the acquisition function on the candidate set X.

Parameters

X (torch.Tensor) – A (b) x q x d-dim Tensor of (b) t-batches with q d-dim design points each.

Returns

A (b)-dim Tensor of acquisition function values at the given design points X.

Return type

torch.Tensor

property X_pending: Optional[torch.Tensor]
set_X_pending(X_pending=None)[source]

Informs the acquisition function about pending design points.

Parameters

X_pending (Optional[torch.Tensor]) – n x d Tensor with n d-dim design points that have been submitted for evaluation but have not yet been evaluated.

Return type

None

training: bool
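
The way the three constructor arguments combine can be sketched with plain callables. The snippet assumes the weighted penalty is subtracted from the raw acquisition value, so that maximizing the result favors points with a small penalty; that sign convention, the helper `penalize`, and the toy functions are assumptions for illustration rather than the library code.

```python
from typing import Callable, List

Point = List[float]


def penalize(
    raw_acqf: Callable[[Point], float],
    penalty_func: Callable[[Point], float],
    regularization_parameter: float,
) -> Callable[[Point], float]:
    """Return a callable that subtracts the weighted penalty from raw_acqf."""

    def penalized(x: Point) -> float:
        return raw_acqf(x) - regularization_parameter * penalty_func(x)

    return penalized


# Hypothetical stand-ins for an acquisition function and an L1 penalty around 0.
toy_acqf = lambda x: x[0]
toy_penalty = lambda x: sum(abs(v) for v in x)

acqf = penalize(toy_acqf, toy_penalty, regularization_parameter=0.5)
print(acqf([2.0, 0.0]))
```

Setting regularization_parameter to 0 recovers the raw acquisition value, which is a convenient sanity check when tuning the weight.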
botorch.acquisition.penalized.group_lasso_regularizer(X, groups)[source]

Computes the group lasso regularization function for the given point.

Parameters
• X (torch.Tensor) – A b x d tensor representing the points to evaluate the regularization at.

• groups (List[List[int]]) – List of indices of different groups.

Returns

The computed group lasso norm at the given points.

Return type

torch.Tensor
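
For a single point, the group lasso norm can be written out in a few lines. The sketch below uses the standard definition, the sum over groups of sqrt(group size) times the L2 norm of the entries in that group; the function name `group_lasso_norm` is hypothetical and the code works on plain lists rather than tensors.

```python
import math
from typing import List


def group_lasso_norm(x: List[float], groups: List[List[int]]) -> float:
    """Sum over groups of sqrt(group size) times the L2 norm of the group's entries."""
    total = 0.0
    for group in groups:
        group_norm = math.sqrt(sum(x[i] ** 2 for i in group))
        total += math.sqrt(len(group)) * group_norm
    return total


x = [3.0, 4.0, 0.0, 2.0]
groups = [[0, 1], [2], [3]]  # indices 0 and 1 are regularized jointly
print(group_lasso_norm(x, groups))
```

Because each group contributes an unsquared norm, a group whose entries are all zero contributes nothing, which is what encourages whole groups to be switched off together.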

class botorch.acquisition.penalized.L1PenaltyObjective(init_point)[source]

Bases: torch.nn.modules.module.Module

L1 penalty objective class. An instance of this class can be added to any arbitrary objective to construct a PenalizedMCObjective.

Initializing L1 penalty objective.

Parameters

init_point (Tensor) – The “1 x dim” reference point against which we want to regularize.

forward(X)[source]
Parameters

X (torch.Tensor) – A “batch_shape x q x dim” tensor representing the points to be evaluated.

Returns

A “1 x batch_shape x q” tensor representing the penalty for each point. The first dimension corresponds to the dimension of MC samples.

Return type

torch.Tensor

training: bool
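
Unlike the q-batch penalties above, this objective returns one value per point. A minimal sketch of that shape convention, assuming an empty batch_shape and using nested lists in place of tensors (the function name `l1_penalty_objective` is illustrative):

```python
from typing import List


def l1_penalty_objective(
    X: List[List[float]], init_point: List[float]
) -> List[List[float]]:
    """Return a 1 x q nested list with the L1 distance of each point to init_point."""
    per_point = [sum(abs(a - b) for a, b in zip(x, init_point)) for x in X]
    # The leading dimension of size 1 stands in for the MC-sample dimension.
    return [per_point]


X = [[1.0, -2.0, 0.5], [0.0, 0.0, 3.0]]  # q = 2 points in dim = 3
penalty = l1_penalty_objective(X, init_point=[0.0, 0.0, 0.0])
print(penalty)
```

The size-1 leading dimension lets the penalty broadcast against objective values of shape sample_shape x q.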
class botorch.acquisition.penalized.PenalizedMCObjective(objective, penalty_objective, regularization_parameter)[source]

Penalized MC objective.

Constructs a penalized MC objective by adding a penalty term to the original objective.

mc_acq(X) = objective(X) + penalty_objective(X)

Note: PenalizedMCObjective allows adding penalty at the MCObjective level, different from the AcquisitionFunction level in PenalizedAcquisitionFunction.

Example

>>> regularization_parameter = 0.01
>>> init_point = torch.zeros(3) # assume data dim is 3
>>> objective = lambda Y, X: torch.sqrt(Y).sum(dim=-1)
>>> l1_penalty_objective = L1PenaltyObjective(init_point=init_point)
>>> l1_penalized_objective = PenalizedMCObjective(
...     objective, l1_penalty_objective, regularization_parameter
... )
>>> samples = sampler(posterior)


Penalized MC objective.

Parameters
• objective (Callable[[Tensor, Optional[Tensor]], Tensor]) – A callable f(samples, X) mapping a sample_shape x batch-shape x q x m-dim Tensor samples and an optional batch-shape x q x d-dim Tensor X to a sample_shape x batch-shape x q-dim Tensor of objective values.

• penalty_objective (torch.nn.Module) – A torch.nn.Module f(X) that takes in a batch-shape x q x d-dim Tensor X and outputs a 1 x batch-shape x q-dim Tensor of penalty objective values.

• regularization_parameter (float) – weight of the penalty (regularization) term

Return type

None

forward(samples, X=None)[source]

Evaluate the penalized objective on the samples.

Parameters
• samples (torch.Tensor) – A sample_shape x batch_shape x q x m-dim Tensor of samples from a model posterior.

• X (Optional[torch.Tensor]) – A batch_shape x q x d-dim tensor of inputs. Relevant only if the objective depends on the inputs explicitly.

Returns

A sample_shape x batch_shape x q-dim Tensor of objective values with penalty added for each point.

Return type

torch.Tensor

### Proximal Acquisition Function Wrapper¶

A wrapper around AcquisitionFunctions to add proximal weighting of the acquisition function.

class botorch.acquisition.proximal.ProximalAcquisitionFunction(acq_function, proximal_weights)[source]

A wrapper around AcquisitionFunctions to add proximal weighting of the acquisition function. Acquisition function is weighted via a squared exponential centered at the last training point, with varying lengthscales corresponding to proximal_weights. Can only be used with acquisition functions based on single batch models.

Small values of proximal_weights correspond to strong biasing towards recently observed points, which smooths optimization at the cost of a potentially small decrease in convergence rate.

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> EI = ExpectedImprovement(model, best_f=0.0)
>>> proximal_weights = torch.ones(d)
>>> EI_proximal = ProximalAcquisitionFunction(EI, proximal_weights)
>>> eip = EI_proximal(test_X)


Derived Acquisition Function weighted by proximity to recently observed point.

Parameters
• acq_function (AcquisitionFunction) – The base acquisition function, operating on input tensors of feature dimension d.

• proximal_weights (Tensor) – A d dim tensor used to bias locality along each axis.

Return type

None

forward(X)[source]

Evaluate base acquisition function with proximal weighting.

Parameters

X (torch.Tensor) – Input tensor of feature dimension d.

Returns

Base acquisition function evaluated on tensor X multiplied by proximal weighting.

Return type

torch.Tensor

training: bool
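
The effect of proximal_weights is easiest to see on the weighting factor alone. The sketch below assumes an unnormalized squared exponential, exp(-0.5 * sum(((x - x_last) / w)^2)), with proximal_weights acting as per-dimension lengthscales; the exact normalization used by the library is not shown in this section, so treat the formula and the name `proximal_weight` as assumptions.

```python
import math
from typing import List


def proximal_weight(
    x: List[float], last_x: List[float], proximal_weights: List[float]
) -> float:
    """Squared-exponential factor centered at last_x with per-dimension lengthscales."""
    scaled_sq_dist = sum(
        ((x_i - c_i) / w_i) ** 2
        for x_i, c_i, w_i in zip(x, last_x, proximal_weights)
    )
    return math.exp(-0.5 * scaled_sq_dist)


last_x = [0.5, 0.5]     # stands in for the last training point
candidate = [0.7, 0.5]
wide = proximal_weight(candidate, last_x, [1.0, 1.0])
narrow = proximal_weight(candidate, last_x, [0.1, 0.1])
print(wide, narrow)
```

With the smaller lengthscale the same candidate receives a much smaller factor, which is the strong bias towards recently observed points described above.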

### General Utilities for Acquisition Functions¶

Utilities for acquisition functions.

botorch.acquisition.utils.get_acquisition_function(acquisition_function_name, model, objective, X_observed, posterior_transform=None, X_pending=None, constraints=None, mc_samples=500, qmc=True, seed=None, **kwargs)[source]

Convenience function for initializing botorch acquisition functions.

Parameters
• acquisition_function_name (str) – Name of the acquisition function.

• model (botorch.models.model.Model) – A fitted model.

• objective (botorch.acquisition.objective.MCAcquisitionObjective) – A MCAcquisitionObjective.

• X_observed (torch.Tensor) – A m1 x d-dim Tensor of m1 design points that have already been observed.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – A PosteriorTransform (optional).

• X_pending (Optional[torch.Tensor]) – A m2 x d-dim Tensor of m2 design points whose evaluation is pending.

• constraints (Optional[List[Callable[[torch.Tensor], torch.Tensor]]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility. Used when constraint_transforms are not passed as part of the objective.

• mc_samples (int) – The number of samples to use for (q)MC evaluation of the acquisition function.

• qmc (bool) – If True, use quasi-Monte-Carlo sampling (instead of iid).

• seed (Optional[int]) – If provided, perform deterministic optimization (i.e. the function to optimize is fixed and not stochastic).

Returns

The requested acquisition function.

Return type

botorch.acquisition.monte_carlo.MCAcquisitionFunction

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> obj = LinearMCObjective(weights=torch.tensor([1.0, 2.0]))
>>> acqf = get_acquisition_function("qEI", model, obj, train_X)

botorch.acquisition.utils.get_infeasible_cost(X, model, objective=None, posterior_transform=None)[source]

Get infeasible cost for a model and objective.

Computes an infeasible cost M such that -M < min_x f(x) almost always, so that feasible points are preferred.

Parameters
• X (torch.Tensor) – A n x d Tensor of n design points to use in evaluating the minimum. These points should cover the design space well. The more points the better the estimate, at the expense of added computation.

• model (botorch.models.model.Model) – A fitted botorch model.

• objective (Optional[Callable[[torch.Tensor, Optional[torch.Tensor]], torch.Tensor]]) – The objective with which to evaluate the model output.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – A PosteriorTransform (optional).

Returns

The infeasible cost M value.

Return type

float

Example

>>> model = SingleTaskGP(train_X, train_Y)
>>> objective = lambda Y, X=None: Y[..., -1] ** 2
>>> M = get_infeasible_cost(train_X, model, objective)
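
One way to obtain a cost M with -M < min_x f(x) "almost always" is to take a pessimistic lower bound on the posterior at each design point. The sketch below assumes a bound of the posterior mean minus six standard deviations and clamps M at zero; the number of standard deviations, the clamping, and the helper name `infeasible_cost` are assumptions for illustration. Applying the objective to the bound is only a valid lower bound when the objective is monotone.

```python
from typing import Callable, Sequence


def infeasible_cost(
    means: Sequence[float],
    stds: Sequence[float],
    objective: Callable[[float], float] = lambda y: y,
    num_std: float = 6.0,
) -> float:
    """Return M >= 0 such that -M lies below a pessimistic bound on the objective."""
    lower_bounds = [objective(m - num_std * s) for m, s in zip(means, stds)]
    return max(-min(lower_bounds), 0.0)


# Hypothetical posterior means and standard deviations at three design points.
M = infeasible_cost(means=[1.0, -0.5, 2.0], stds=[0.1, 0.2, 0.5])
print(M)
```

Covering the design space with more points tightens the estimate of the minimum, at the cost of evaluating the posterior at more locations.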

botorch.acquisition.utils.is_nonnegative(acq_function)[source]

Determine whether a given acquisition function is non-negative.

Parameters

acq_function (botorch.acquisition.acquisition.AcquisitionFunction) – The AcquisitionFunction instance.

Returns

True if acq_function is non-negative, False if not, or if the behavior is unknown (for custom acquisition functions).

Return type

bool

Example

>>> qEI = qExpectedImprovement(model, best_f=0.1)
>>> is_nonnegative(qEI)  # returns True

botorch.acquisition.utils.prune_inferior_points(model, X, objective=None, posterior_transform=None, num_samples=2048, max_frac=1.0, sampler=None, marginalize_dim=None)[source]

Prune points from an input tensor that are unlikely to be the best point.

Given a model, an objective, and an input tensor X, this function returns the subset of points in X that have some probability of being the best point under the objective. The probabilities are estimated by sampling, so the more points n there are in X, the larger num_samples should be to obtain accurate estimates.

Parameters
• model (botorch.models.model.Model) – A fitted model. Batched models are currently not supported.

• X (torch.Tensor) – An input tensor of shape n x d. Batched inputs are currently not supported.

• objective (Optional[botorch.acquisition.objective.MCAcquisitionObjective]) – The objective under which to evaluate the posterior.

• posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – A PosteriorTransform (optional).

• num_samples (int) – The number of samples used to compute empirical probabilities of being the best point.

• max_frac (float) – The maximum fraction of points to retain. Must satisfy 0 < max_frac <= 1. Ensures that the number of elements in the returned tensor does not exceed ceil(max_frac * n).

• sampler (Optional[botorch.sampling.samplers.MCSampler]) – If provided, will use this customized sampler instead of automatically constructing one with num_samples.

• marginalize_dim (Optional[int]) – A batch dimension that should be marginalized. For example, this is useful when using a batched fully Bayesian model.

Returns

An n’ x d Tensor containing a subset of the points in X, where

n’ = min(N_nz, ceil(max_frac * n))

with N_nz the number of points in X that have non-zero (empirical, under num_samples samples) probability of being the best point.

Return type

torch.Tensor
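
The rule n’ = min(N_nz, ceil(max_frac * n)) can be illustrated on a small table of sampled objective values. This is a sketch of the counting logic only: `samples` is a hypothetical num_samples x n nested list, points are ranked by how often they are the sample-wise best, and the function returns indices rather than rows of X.

```python
import math
from collections import Counter
from typing import List


def prune_by_argmax(samples: List[List[float]], max_frac: float = 1.0) -> List[int]:
    """Indices of points that are best in at least one sample, capped by max_frac."""
    n = len(samples[0])
    max_points = math.ceil(max_frac * n)
    # Count how often each point attains the maximum across samples.
    counts = Counter(max(range(n), key=row.__getitem__) for row in samples)
    ranked = [idx for idx, _ in counts.most_common()]
    return sorted(ranked[:max_points])


samples = [
    [0.1, 0.9, 0.3],
    [0.2, 0.8, 0.1],
    [0.7, 0.1, 0.2],
    [0.0, 0.5, 0.4],
]
print(prune_by_argmax(samples))                # point 2 is never the best
print(prune_by_argmax(samples, max_frac=0.3))  # keep at most ceil(0.3 * 3) = 1 point
```

With only four samples the empirical probabilities are coarse, which is why num_samples should grow with the number of points.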

botorch.acquisition.utils.project_to_target_fidelity(X, target_fidelities=None)[source]

Project X onto the target set of fidelities.

This function assumes that the set of feasible fidelities is a box, so projecting here just means setting each fidelity parameter to its target value.

Parameters
• X (torch.Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points for each t-batch.

• target_fidelities (Optional[Dict[int, float]]) – A dictionary mapping a subset of columns of X (the fidelity parameters) to their respective target fidelity value. If omitted, assumes that the last column of X is the fidelity parameter with a target value of 1.0.

Returns

A batch_shape x q x d-dim Tensor X_proj with fidelity parameters projected to the provided fidelity values.

Return type

torch.Tensor
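
Since the feasible fidelities are assumed to form a box, the projection amounts to overwriting columns. A plain-list sketch of that behavior for a single q x d batch follows; the helper name `project_to_target` is hypothetical and the default mirrors the documented one (last column, target 1.0).

```python
from typing import Dict, List, Optional


def project_to_target(
    X: List[List[float]], target_fidelities: Optional[Dict[int, float]] = None
) -> List[List[float]]:
    """Return a copy of X with each fidelity column set to its target value."""
    if target_fidelities is None:
        target_fidelities = {-1: 1.0}  # last column is the fidelity parameter
    projected = [list(row) for row in X]  # copy so the input is left untouched
    for row in projected:
        for col, value in target_fidelities.items():
            row[col] = value
    return projected


X = [[0.2, 0.4, 0.3], [0.9, 0.1, 0.6]]
print(project_to_target(X))
print(project_to_target(X, {1: 0.5, 2: 1.0}))
```

The non-fidelity columns pass through unchanged, so the projected points can be scored at the target fidelity directly.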

botorch.acquisition.utils.expand_trace_observations(X, fidelity_dims=None, num_trace_obs=0)[source]

Expand X with trace observations.

Expand a tensor of inputs with “trace observations” that are obtained during the evaluation of the candidate set. This is used in multi-fidelity optimization. It can be thought of as augmenting the q-batch with additional points that are the expected trace observations.

Let f_i be the i-th fidelity parameter. Then this function assumes that for each element of the q-batch, besides the fidelity f_i, we will observe additional fidelities f_i1, …, f_iK, where K = num_trace_obs, during evaluation of the candidate set X. Specifically, this function assumes that f_ij = (K-j) / (num_trace_obs + 1) * f_i for all i. That is, the expansion is performed in parallel for all fidelities (it does not expand out all possible combinations).

Parameters
• X (torch.Tensor) – A batch_shape x q x d-dim Tensor with q d-dim design points (incl. the fidelity parameters) for each t-batch.

• fidelity_dims (Optional[List[int]]) – The indices of the fidelity parameters. If omitted, assumes that the last column of X contains the fidelity parameters.

• num_trace_obs (int) – The number of trace observations to use.

Returns

A batch_shape x (q + num_trace_obs x q) x d Tensor X_expanded that expands X with trace observations.

Return type

torch.Tensor
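
The expansion can be sketched for a single q x d batch with plain lists. The sketch reads the index j in the formula above as running from 0 to K - 1, so the trace fidelities are K/(K+1), …, 1/(K+1) times the original value; that reading, the ordering of the appended rows, and the helper name `expand_trace` are assumptions for illustration.

```python
from typing import List, Optional


def expand_trace(
    X: List[List[float]],
    fidelity_dims: Optional[List[int]] = None,
    num_trace_obs: int = 0,
) -> List[List[float]]:
    """Append num_trace_obs scaled-fidelity copies of every point in X."""
    if fidelity_dims is None:
        fidelity_dims = [-1]  # last column holds the fidelity parameter
    K = num_trace_obs
    expanded = [list(row) for row in X]
    for j in range(K):
        scale = (K - j) / (K + 1)
        for row in X:
            new_row = list(row)
            for dim in fidelity_dims:
                new_row[dim] = scale * row[dim]  # all fidelity dims scaled in parallel
            expanded.append(new_row)
    return expanded


X = [[0.3, 1.0], [0.7, 1.0]]  # q = 2, last column is the fidelity
X_expanded = expand_trace(X, num_trace_obs=3)
print(len(X_expanded))
print(X_expanded[2:])
```

The result has q + num_trace_obs x q rows, matching the documented output shape, and with num_trace_obs = 0 the input is returned as a copy.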

botorch.acquisition.utils.project_to_sample_points(X, sample_points)[source]

Augment X with sample points at which to take weighted average.

Parameters
• X (torch.Tensor) – A batch_shape x 1 x d-dim Tensor with one d-dim design point for each t-batch.

• sample_points (torch.Tensor) – p x d’-dim Tensor (d’ < d) of d’-dim sample points at which to compute the expectation. The d’-dims refer to the trailing columns of X.

Returns

A batch_shape x p x d Tensor where the q-batch includes the p sample points.

Return type

torch.Tensor
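
For one design point, the augmentation replaces the trailing d’ entries with each sample point in turn. The following sketch assumes an empty batch_shape and uses nested lists; the helper name `project_to_samples` is hypothetical.

```python
from typing import List


def project_to_samples(
    x: List[float], sample_points: List[List[float]]
) -> List[List[float]]:
    """Return p copies of x whose trailing d' entries are the p sample points."""
    d_prime = len(sample_points[0])
    leading = list(x[:-d_prime])  # the d - d' entries that are kept as-is
    return [leading + list(s) for s in sample_points]


x = [0.2, 0.4, 0.0]                    # d = 3, last entry will be replaced
sample_points = [[0.1], [0.5], [0.9]]  # p = 3 points with d' = 1
print(project_to_samples(x, sample_points))
```

Each of the p rows can then be evaluated and combined with weights to form the weighted average mentioned above.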

### Multi-Objective Utilities for Acquisition Functions¶

Utilities for multi-objective acquisition functions.

botorch.acquisition.multi_objective.utils.get_default_partitioning_alpha(num_objectives)[source]

Determines an approximation level based on the number of objectives.

If alpha is 0, FastNondominatedPartitioning should be used. Otherwise, an approximate NondominatedPartitioning should be used with approximation level alpha.

Parameters

num_objectives (int) – the number of objectives.

Returns

The approximation level alpha.

Return type

float

botorch.acquisition.multi_objective.utils.prune_inferior_points_multi_objective(model, X, ref_point, objective=None, constraints=None, num_samples=2048, max_frac=1.0, marginalize_dim=None)[source]

Prune points from an input tensor that are unlikely to be pareto optimal.

Given a model, an objective, and an input tensor X, this function returns the subset of points in X that have some probability of being pareto optimal, better than the reference point, and feasible. The probabilities are estimated by sampling, so the more points n there are in X, the larger num_samples should be to obtain accurate estimates.

Parameters
• model (botorch.models.model.Model) – A fitted model. Batched models are currently not supported.

• X (torch.Tensor) – An input tensor of shape n x d. Batched inputs are currently not supported.

• ref_point (torch.Tensor) – The reference point.

• objective (Optional[botorch.acquisition.multi_objective.objective.MCMultiOutputObjective]) – The objective under which to evaluate the posterior.

• constraints (Optional[List[Callable[[torch.Tensor], torch.Tensor]]]) – A list of callables, each mapping a Tensor of dimension sample_shape x batch-shape x q x m to a Tensor of dimension sample_shape x batch-shape x q, where negative values imply feasibility.

• num_samples (int) – The number of samples used to compute empirical probabilities of being pareto optimal.

• max_frac (float) – The maximum fraction of points to retain. Must satisfy 0 < max_frac <= 1. Ensures that the number of elements in the returned tensor does not exceed ceil(max_frac * n).

• marginalize_dim (Optional[int]) – A batch dimension that should be marginalized. For example, this is useful when using a batched fully Bayesian model.

Returns

An n’ x d Tensor containing a subset of the points in X, where

n’ = min(N_nz, ceil(max_frac * n))

with N_nz the number of points in X that have non-zero (empirical, under num_samples samples) probability of being pareto optimal.

Return type

torch.Tensor