botorch.models

Model APIs

Abstract Model API

Abstract base module for all BoTorch models.

class botorch.models.model.Model[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for BoTorch models.

Parameters
  • _has_transformed_inputs – A boolean denoting whether train_inputs are currently stored as transformed or not.

  • _original_train_inputs – A Tensor storing the original train inputs for use in _revert_to_original_inputs. This is necessary because the transform / untransform cycle introduces numerical errors, which lead to upstream errors during training.

Return type

None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Note: The input transforms should be applied here using self.transform_inputs(X) after the self.eval() call and before any model.forward or model.likelihood calls.

Parameters
  • X (torch.Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (bool) – If True, add observation noise to the posterior.

  • posterior_transform (Optional[Callable[[botorch.posteriors.posterior.Posterior], botorch.posteriors.posterior.Posterior]]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A Posterior object, representing a batch of b joint distributions over q points and m outputs each.

Return type

botorch.posteriors.posterior.Posterior

property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.
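As a rough illustration of the shape rule above, the following sketch computes the expected output shape with torch.broadcast_shapes. The concrete sizes are made up for the example, and no BoTorch model is involved.

```python
import torch

# Illustrative sizes (assumptions, not taken from any particular model).
test_batch_shape = torch.Size([5, 1])  # batch shape of the input X
model_batch_shape = torch.Size([3])    # stand-in for model.batch_shape
q, m = 4, 2                            # joint points and model outputs

# broadcast(test_batch_shape, model.batch_shape) x q x m
out_batch_shape = torch.broadcast_shapes(test_batch_shape, model_batch_shape)
posterior_shape = torch.Size([*out_batch_shape, q, m])
print(posterior_shape)
```

Here the singleton dimension of test_batch_shape is broadcast against the model batch dimension, following standard PyTorch broadcasting semantics.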

property num_outputs: int

The number of outputs of the model.

subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

A Model object of the same type and with the same parameters as the current model, subset to the specified output indices.

Return type

botorch.models.model.Model

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.model.Model
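The broadcasting requirement on Y can be pictured with plain tensors. The sketch below only demonstrates the shape convention (a Y without batch dimensions being shared across the batch dimensions of X). It is not the library's internal code, and the sizes are arbitrary.

```python
import torch

X = torch.rand(3, 5, 2)  # batch_shape = (3,), n' = 5, d = 2
Y = torch.rand(5, 1)     # no batch dimension, n' = 5, m = 1

# Expand Y so that its batch dimensions match those of X; the same
# observations are then shared by every batch element.
Y_expanded = Y.expand(*X.shape[:-2], *Y.shape[-2:])
print(Y_expanded.shape)
```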

fantasize(X, sampler, observation_noise=True, **kwargs)[source]

Construct a fantasy model.

Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (including observation noise if observation_noise=True). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • sampler (botorch.sampling.samplers.MCSampler) – The sampler used for sampling from the posterior at X.

  • observation_noise (bool) – If True, include observation noise.

  • kwargs (Any) –

Returns

The constructed fantasy model.

Return type

botorch.models.model.Model
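The three steps described for fantasize can be sketched as follows. ToyModel is a hypothetical stand-in written for this example (it is not a GP and not part of BoTorch), and the sampler is a plain callable. Only the order of operations mirrors the description above.

```python
import torch
from torch.distributions import Normal


class ToyModel:
    """Hypothetical stand-in for a BoTorch Model (assumption, not a real GP)."""

    def __init__(self, train_X, train_Y, noise_std=0.1):
        self.train_X, self.train_Y, self.noise_std = train_X, train_Y, noise_std

    def posterior(self, X, observation_noise=False):
        # Toy predictive distribution: constant mean, unit variance.
        mean = torch.full((X.shape[0], 1), self.train_Y.mean().item())
        std = torch.ones_like(mean)
        if observation_noise:
            std = (std**2 + self.noise_std**2).sqrt()
        return Normal(mean, std)

    def condition_on_observations(self, X, Y):
        # Return a new model whose training data includes (X, Y).
        return ToyModel(
            torch.cat([self.train_X, X]), torch.cat([self.train_Y, Y]), self.noise_std
        )


def fantasize(model, X, sampler, observation_noise=True):
    posterior = model.posterior(X, observation_noise=observation_noise)  # step 1
    Y_fake = sampler(posterior)                                          # step 2
    return model.condition_on_observations(X, Y_fake)                    # step 3


torch.manual_seed(0)
model = ToyModel(torch.rand(20, 2), torch.rand(20, 1))
fantasy_model = fantasize(model, torch.rand(5, 2), sampler=lambda p: p.sample())
print(fantasy_model.train_X.shape, fantasy_model.train_Y.shape)
```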

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
Return type

Dict[str, Any]

transform_inputs(X, input_transform=None)[source]

Transform inputs.

Parameters
  • X (torch.Tensor) – A tensor of inputs

  • input_transform (Optional[torch.nn.modules.module.Module]) – A Module that performs the input transformation.

Returns

A tensor of transformed inputs

Return type

torch.Tensor

eval()[source]

Puts the model in eval mode and sets the transformed inputs.

Return type

botorch.models.model.Model

train(mode=True)[source]

Puts the model in train mode and reverts to the original inputs.

Parameters

mode (bool) – A boolean denoting whether to put in train or eval mode. If False, model is put in eval mode.

Return type

botorch.models.model.Model

class botorch.models.model.ModelList(*models)[source]

Bases: botorch.models.model.Model

Container for a list of models.

A multi-output Model represented by a list of independent models.

Parameters
  • *models – A variable number of models.

  • models (Model) –

Return type

None

Example

>>> m_1 = SingleTaskGP(train_X, train_Y)
>>> m_2 = GenericDeterministicModel(lambda x: x.sum(dim=-1, keepdim=True))
>>> m_12 = ModelList(m_1, m_2)
>>> m_12.posterior(test_X)
posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Note: The input transforms should be applied here using self.transform_inputs(X) after the self.eval() call and before any model.forward or model.likelihood calls.

Parameters
  • X (torch.Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (bool) – If True, add observation noise to the posterior.

  • posterior_transform (Optional[Callable[[botorch.posteriors.posterior.Posterior], botorch.posteriors.posterior.Posterior]]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A Posterior object, representing a batch of b joint distributions over q points and m outputs each.

Return type

botorch.posteriors.posterior.Posterior

property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.

property num_outputs: int

The number of outputs of the model.

Equal to the sum of the number of outputs of the individual models in the ModelList.

subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to. Relative to the overall number of outputs of the model.

Returns

A Model (either a ModelList or one of the submodels) with the outputs subset to the indices in idcs.

Return type

botorch.models.model.Model

Internally, this drops (if single-output) or subsets (if multi-output) the constituent models and returns them as a ModelList. If the result is a single (possibly subset) model from the list, this model is returned directly (instead of forming a degenerate single-model ModelList). For instance, if m = ModelList(m1, m2) with m1 a two-output model and m2 a single-output model, then m.subset_output([1]) will return the model m1 subset to its second output.
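The index bookkeeping behind this behavior can be sketched in plain Python. The helper group_output_indices is hypothetical (it is not a BoTorch function). It only shows how indices relative to the overall number of outputs map to a submodel and a local output index.

```python
from typing import Dict, List


def group_output_indices(num_outputs: List[int], idcs: List[int]) -> Dict[int, List[int]]:
    """Map overall output indices to {submodel index: [local output indices]}."""
    # Offsets at which each submodel's outputs start in the overall numbering.
    offsets = [0]
    for n in num_outputs:
        offsets.append(offsets[-1] + n)

    grouped: Dict[int, List[int]] = {}
    for idx in idcs:
        if not 0 <= idx < offsets[-1]:
            raise IndexError(f"output index {idx} out of range")
        model_i = next(
            i for i in range(len(num_outputs)) if offsets[i] <= idx < offsets[i + 1]
        )
        grouped.setdefault(model_i, []).append(idx - offsets[model_i])
    return grouped


# m = ModelList(m1, m2) with m1 a two-output model and m2 a single-output model.
print(group_output_indices([2, 1], [1]))
print(group_output_indices([2, 1], [0, 2]))
```

In the first call only one submodel is selected, which corresponds to the case where the subset model is returned directly rather than wrapped in a single-model ModelList.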

transform_inputs(X)[source]

Individually transform the inputs for each model.

Parameters

X (torch.Tensor) – A tensor of inputs.

Returns

A list of tensors of transformed inputs.

Return type

List[torch.Tensor]

GPyTorch Model API

Abstract model class for all GPyTorch-based botorch models.

To implement your own, simply inherit from both the provided classes and a GPyTorch Model class such as an ExactGP.

class botorch.models.gpytorch.GPyTorchModel[source]

Bases: botorch.models.model.Model, abc.ABC

Abstract base class for models based on GPyTorch models.

The easiest way to use this is to subclass a model from a GPyTorch model class (e.g. an ExactGP) and this GPyTorchModel. See e.g. SingleTaskGP.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.

property num_outputs: int

The number of outputs of the model.

posterior(X, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A GPyTorchPosterior object, representing a batch of b joint distributions over q points. Includes observation noise if specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.model.Model

Example

>>> train_X = torch.rand(20, 2)
>>> # keep the output dimension explicit: train_Y is 20 x 1
>>> train_Y = (torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])).unsqueeze(-1)
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = (torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])).unsqueeze(-1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
class botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel[source]

Bases: botorch.models.gpytorch.GPyTorchModel

Base class for batched multi-output GPyTorch models with independent outputs.

This model should be used when the same training data is used for all outputs. Outputs are modeled independently by using a different batch for each output.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

static get_batch_dimensions(train_X, train_Y)[source]

Get the raw batch shape and output-augmented batch shape of the inputs.

Parameters
  • train_X (torch.Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.

  • train_Y (torch.Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.

Returns

2-element tuple containing

  • The input_batch_shape

  • The output-augmented batch shape: input_batch_shape x (m)

Return type

Tuple[torch.Size, torch.Size]
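The two returned shapes can be sketched with plain tensor shapes. The helper below is a hypothetical re-implementation for illustration only. It assumes that the parenthesized (m) means the output dimension is appended only when m > 1.

```python
import torch


def batch_dimensions(train_X: torch.Tensor, train_Y: torch.Tensor):
    """Return (input_batch_shape, output-augmented batch shape)."""
    input_batch_shape = train_X.shape[:-2]
    num_outputs = train_Y.shape[-1]
    aug_batch_shape = input_batch_shape
    if num_outputs > 1:  # assumption: single-output models are not augmented
        aug_batch_shape = torch.Size([*input_batch_shape, num_outputs])
    return input_batch_shape, aug_batch_shape


# Batch mode with two outputs: batch_shape = (3,), n = 10, d = 4, m = 2.
multi = batch_dimensions(torch.rand(3, 10, 4), torch.rand(3, 10, 2))
# Non-batch mode with a single output: n = 10, d = 4, m = 1.
single = batch_dimensions(torch.rand(10, 4), torch.rand(10, 1))
print(multi, single)
```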

property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q x m).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes observation noise if specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel

Example

>>> train_X = torch.rand(20, 2)
>>> # stack (not cat) so that train_Y is 20 x 2, one column per output
>>> train_Y = torch.stack(
...     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
... )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel

class botorch.models.gpytorch.ModelListGPyTorchModel(*models)[source]

Bases: botorch.models.gpytorch.GPyTorchModel, botorch.models.model.ModelList, abc.ABC

Abstract base class for models based on multi-output GPyTorch models.

This is meant to be used with a gpytorch ModelList wrapper for independent evaluation of submodels.

A multi-output Model represented by a list of independent models.

Parameters
  • *models – A variable number of models.

  • models (Model) –

Return type

None

Example

>>> m_1 = SingleTaskGP(train_X, train_Y)
>>> m_2 = GenericDeterministicModel(lambda x: x.sum(dim=-1, keepdim=True))
>>> m_12 = ModelList(m_1, m_2)
>>> m_12.posterior(test_X)
property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add the observation noise from the respective likelihoods to the posterior. If a Tensor of shape (batch_shape) x q x m, use it directly as the observation noise (with observation_noise[…,i] added to the posterior of the i-th model).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A GPyTorchPosterior or FullyBayesianPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes measurement noise if observation_noise is specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.model.Model

Example

>>> train_X = torch.rand(20, 2)
>>> # keep the output dimension explicit: train_Y is 20 x 1
>>> train_Y = (torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])).unsqueeze(-1)
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = (torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])).unsqueeze(-1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
class botorch.models.gpytorch.MultiTaskGPyTorchModel[source]

Bases: botorch.models.gpytorch.GPyTorchModel, abc.ABC

Abstract base class for multi-task models based on GPyTorch models.

This class provides the posterior method to models that implement a “long-format” multi-task GP in the style of MultiTaskGP.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A q x d or batch_shape x q x d (batch mode) tensor, where d is the dimension of the feature space (not including task indices) and q is the number of points considered jointly.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add observation noise from the respective likelihoods. If a Tensor, specifies the observation noise levels to add.

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices. Includes measurement noise if observation_noise is specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

Deterministic Model API

Deterministic Models. Simple wrappers that allow the usage of deterministic mappings via the BoTorch Model and Posterior APIs. Useful e.g. for defining known cost functions for cost-aware acquisition utilities.

class botorch.models.deterministic.DeterministicModel[source]

Bases: botorch.models.model.Model, abc.ABC

Abstract base class for deterministic models.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

abstract forward(X)[source]

Compute the (deterministic) model output at X.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

Returns

A batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit if m=1).

Return type

torch.Tensor

property num_outputs: int

The number of outputs of the model.

posterior(X, output_indices=None, posterior_transform=None, **kwargs)[source]

Compute the (deterministic) posterior at X.

Parameters
  • X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior. If omitted, computes the posterior over all model outputs.

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A DeterministicPosterior object, representing batch_shape joint posteriors over n points and the outputs selected by output_indices.

Return type

botorch.posteriors.deterministic.DeterministicPosterior

class botorch.models.deterministic.GenericDeterministicModel(f, num_outputs=1)[source]

Bases: botorch.models.deterministic.DeterministicModel

A generic deterministic model constructed from a callable.

Parameters
  • f (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x n x d-dim input tensor X to a batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit, even if m=1).

  • num_outputs (int) – The number of outputs m.

Return type

None

Example

>>> f = lambda x: x.sum(dim=-1, keepdim=True)
>>> model = GenericDeterministicModel(f)
subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.deterministic.GenericDeterministicModel

forward(X)[source]

Compute the (deterministic) model output at X.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

Returns

A batch_shape x n x m-dimensional output tensor.

Return type

torch.Tensor

class botorch.models.deterministic.AffineDeterministicModel(a, b=0.01)[source]

Bases: botorch.models.deterministic.DeterministicModel

An affine deterministic model.

Affine deterministic model from weights and offset terms.

A simple model of the form

y[…, m] = b[m] + sum_{i=1}^d a[i, m] * X[…, i]

Parameters
  • a (Tensor) – A d x m-dim tensor of linear weights, where m is the number of outputs (must be explicit if m=1)

  • b (Union[Tensor, float]) – The affine (offset) term. Either a float (for single-output models or if the offset is shared), or an m-dim tensor (with different offset values for the m different outputs).

Return type

None
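The affine form y[…, m] = b[m] + sum_i a[i, m] * X[…, i] can be written out directly with tensors. This sketch uses made-up weights and does not construct an AffineDeterministicModel; it only evaluates the formula.

```python
import torch

# Illustrative weights and offsets (d = 3 inputs, m = 2 outputs).
a = torch.tensor([[1.0, 0.0], [2.0, -1.0], [0.0, 3.0]])  # d x m
b = torch.tensor([0.5, -0.5])                             # m-dim offset

X = torch.rand(4, 5, 3)  # batch_shape = (4,), n = 5, d = 3

# Sum over the feature dimension for every output, then add the offset.
y = b + torch.einsum("...d,dm->...m", X, a)
print(y.shape)
```

The einsum is equivalent to the matrix product X @ a, so the output has shape batch_shape x n x m.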

subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.deterministic.AffineDeterministicModel

forward(X)[source]

Compute the (deterministic) model output at X.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

Returns

A batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit if m=1).

Return type

torch.Tensor

class botorch.models.deterministic.PosteriorMeanModel(model)[source]

Bases: botorch.models.deterministic.DeterministicModel

A deterministic model that always returns the posterior mean.

Parameters

model (Model) – The base model.

Return type

None

forward(X)[source]

Compute the (deterministic) model output at X.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

Returns

A batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit if m=1).

Return type

torch.Tensor

class botorch.models.deterministic.FixedSingleSampleModel(model, w=None)[source]

Bases: botorch.models.deterministic.DeterministicModel

A deterministic model defined by a single sample w.

Given a base model f and a fixed sample w, the model always outputs

y = f_mean(x) + f_stddev(x) * w

We assume the outcomes are uncorrelated here.

Parameters
  • model (Model) – The base model.

  • w (Optional[Tensor]) – A 1-d tensor with length model.num_outputs. If None, draw it from a standard normal distribution.

Return type

None
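The expression y = f_mean(x) + f_stddev(x) * w can be illustrated with stand-in tensors. The mean and variance below are random placeholders rather than the output of a real base model, so this only shows how a single length-m sample w broadcasts over batches and points.

```python
import torch

torch.manual_seed(0)
batch_shape, n, m = (2,), 5, 3

f_mean = torch.rand(*batch_shape, n, m)       # stand-in for the posterior mean
f_var = torch.rand(*batch_shape, n, m) + 0.1  # stand-in for the posterior variance
w = torch.randn(m)                            # one fixed draw per output

# w broadcasts over the batch and point dimensions; outputs are treated
# as uncorrelated, so each output uses only its own standard deviation.
y = f_mean + f_var.sqrt() * w
print(y.shape)
```

Because w is fixed, evaluating the expression twice at the same inputs gives the same values, which is what makes the resulting model deterministic.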

forward(X)[source]

Compute the (deterministic) model output at X.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim input tensor X.

Returns

A batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit if m=1).

Return type

torch.Tensor

Models

Cost Models (for cost-aware optimization)

Cost models to be used with multi-fidelity optimization.

class botorch.models.cost.AffineFidelityCostModel(fidelity_weights=None, fixed_cost=0.01)[source]

Bases: botorch.models.deterministic.DeterministicModel

Affine cost model operating on fidelity parameters.

For each (q-batch) element of a candidate set X, this module computes a cost of the form

cost = fixed_cost + sum_j weights[j] * X[fidelity_dims[j]]

Parameters
  • fidelity_weights (Optional[Dict[int, float]]) – A dictionary mapping a subset of columns of X (the fidelity parameters) to their associated weights in the affine cost expression. If omitted, assumes that the last column of X is the fidelity parameter with a weight of 1.0.

  • fixed_cost (float) – The fixed cost of running a single candidate point (i.e. an element of a q-batch).

Return type

None

forward(X)[source]

Evaluate the cost on a candidate set X.

Computes a cost of the form

cost = fixed_cost + sum_j weights[j] * X[fidelity_dims[j]]

for each element of the q-batch

Parameters

X (torch.Tensor) – A batch_shape x q x d’-dim tensor of candidate points.

Returns

A batch_shape x q x 1-dim tensor of costs.

Return type

torch.Tensor
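The cost expression above can be evaluated by hand as a sanity check. The sketch below uses illustrative fidelity weights and plain tensor operations. It does not instantiate AffineFidelityCostModel.

```python
import torch

fidelity_weights = {2: 1.0, 3: 0.5}  # column index -> weight (illustrative values)
fixed_cost = 0.01

fidelity_dims = sorted(fidelity_weights)
weights = torch.tensor([fidelity_weights[i] for i in fidelity_dims])

X = torch.rand(4, 3, 5)  # batch_shape = (4,), q = 3, d' = 5

# cost = fixed_cost + sum_j weights[j] * X[..., fidelity_dims[j]]
cost = fixed_cost + (X[..., fidelity_dims] * weights).sum(dim=-1, keepdim=True)
print(cost.shape)
```

Keeping the trailing dimension (keepdim=True) yields the batch_shape x q x 1 layout described in the return value.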

GP Regression Models

Gaussian Process Regression models based on GPyTorch models.

class botorch.models.gp_regression.SingleTaskGP(train_X, train_Y, likelihood=None, covar_module=None, mean_module=None, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel, gpytorch.models.exact_gp.ExactGP

A single-task exact GP model.

A single-task exact GP using relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).

This model works in batch mode (each batch having its own hyperparameters). When the training observations include multiple outputs, this model will use batching to model outputs independently.

Use this model when you have independent output(s) and all outputs use the same training data. If outputs are independent and outputs have different training data, use the ModelListGP. When modeling correlations between outputs, use the MultiTaskGP.
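The recommended preprocessing (covariates in the unit cube, outcomes with zero mean and unit variance) can be sketched with plain tensor operations. The data and the use of empirical minima and maxima as bounds are assumptions for this example; in practice the known search-space bounds would typically be used, and the input_transform and outcome_transform arguments can take over this step.

```python
import torch

torch.manual_seed(0)
raw_X = 10.0 * torch.rand(20, 2) - 5.0  # covariates in [-5, 5]^2
raw_Y = 3.0 + 2.0 * torch.sin(raw_X).sum(dim=-1, keepdim=True)

# Min-max normalize each covariate to [0, 1].
lower = raw_X.min(dim=0).values
upper = raw_X.max(dim=0).values
train_X = (raw_X - lower) / (upper - lower)

# Standardize outcomes to zero mean and unit variance.
train_Y = (raw_Y - raw_Y.mean(dim=0)) / raw_Y.std(dim=0)
print(train_X.min().item(), train_X.max().item())
```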

Parameters
  • train_X (Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.

  • covar_module (Optional[Module]) – The module computing the covariance (Kernel) matrix. If omitted, use a MaternKernel.

  • mean_module (Optional[Mean]) – The mean function to be used. If omitted, use a ConstantMean.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

Return type

None

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> model = SingleTaskGP(train_X, train_Y)
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

x (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs – None expected for this class.

  • kwargs (Any) –

Return type

Dict[str, Any]

class botorch.models.gp_regression.FixedNoiseGP(train_X, train_Y, train_Yvar, covar_module=None, mean_module=None, outcome_transform=None, input_transform=None, **kwargs)[source]

Bases: botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel, gpytorch.models.exact_gp.ExactGP

A single-task exact GP model using fixed noise levels.

A single-task exact GP that uses fixed observation noise levels. This model also uses relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).

This model works in batch mode (each batch having its own hyperparameters).

Parameters
  • train_X (Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.

  • covar_module (Optional[Module]) – The module computing the covariance (Kernel) matrix. If omitted, use a MaternKernel.

  • mean_module (Optional[Mean]) – The mean function to be used. If omitted, use a ConstantMean.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

  • kwargs (Any) –

Return type

None

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.2)
>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar)
fantasize(X, sampler, observation_noise=True, **kwargs)[source]

Construct a fantasy model.

Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (if observation_noise=True, this includes observation noise taken as the mean across the observation noise in the training data. If observation_noise is a Tensor, use it directly as the observation noise to add). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • sampler (botorch.sampling.samplers.MCSampler) – The sampler used for sampling from the posterior at X.

  • observation_noise (Union[bool, torch.Tensor]) – If True, include the mean across the observation noise in the training data as observation noise in the posterior from which the samples are drawn. If a Tensor, use it directly as the specified measurement noise.

  • kwargs (Any) –

Returns

The constructed fantasy model.

Return type

botorch.models.gp_regression.FixedNoiseGP
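
The three steps of fantasize can be illustrated without BoTorch. The following is a minimal sketch, assuming a toy model with a single unknown constant in place of a GP. ToyGaussianModel and its methods are hypothetical names, not part of the BoTorch API, and only the boolean form of observation_noise is covered.

```python
import random
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class ToyGaussianModel:
    """Hypothetical stand-in for a GP: a Gaussian belief over one constant f."""

    mean: float         # current posterior mean of f
    var: float          # current posterior variance of f
    train_noise: tuple  # fixed noise variances of the training observations

    def posterior(self, observation_noise=False):
        # Step 1: predictive mean and variance. With observation_noise=True,
        # add the mean of the training noise levels.
        extra = fmean(self.train_noise) if observation_noise else 0.0
        return self.mean, self.var + extra

    def condition_on_observation(self, y, noise):
        # Step 3: conjugate Gaussian update with a known noise variance.
        new_var = 1.0 / (1.0 / self.var + 1.0 / noise)
        new_mean = new_var * (self.mean / self.var + y / noise)
        return ToyGaussianModel(new_mean, new_var, self.train_noise + (noise,))


def fantasize(model, rng, observation_noise=True):
    mu, var = model.posterior(observation_noise=observation_noise)
    fake_y = rng.gauss(mu, var ** 0.5)  # Step 2: sample a "fake" observation
    return model.condition_on_observation(fake_y, fmean(model.train_noise))


model = ToyGaussianModel(mean=0.0, var=1.0, train_noise=(0.2, 0.4))
fantasy = fantasize(model, random.Random(0))
```

The fantasy model is a new object with one more (fake) observation and a smaller posterior variance; the original model is left unchanged.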

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

x (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs – None expected for this class.

  • kwargs (Any) –

Return type

Dict[str, Any]

class botorch.models.gp_regression.HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gp_regression.SingleTaskGP

A single-task exact GP model using a heteroskedastic noise model.

This model internally wraps another GP (a SingleTaskGP) to model the observation noise. This allows the likelihood to make out-of-sample predictions for the observation noise levels.

Parameters
  • train_X (Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale). Note that the noise model internally log-transforms the variances, which will happen after this transform is applied.

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

Return type

None

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> se = torch.norm(train_X, dim=1, keepdim=True)
>>> train_Yvar = 0.1 + se * torch.rand_like(train_Y)
>>> model = HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar)
condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.gp_regression.HeteroskedasticSingleTaskGP

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.stack(
>>>     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
>>> )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.gp_regression.HeteroskedasticSingleTaskGP

Multi-Fidelity GP Regression Models

Gaussian Process Regression models based on GPyTorch models.

Wu2019mf

J. Wu, S. Toscano-Palmerin, P. I. Frazier, and A. G. Wilson. Practical multi-fidelity bayesian optimization for hyperparameter tuning. ArXiv 2019.

class botorch.models.gp_regression_fidelity.SingleTaskMultiFidelityGP(train_X, train_Y, iteration_fidelity=None, data_fidelity=None, linear_truncated=True, nu=2.5, likelihood=None, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gp_regression.SingleTaskGP

A single task multi-fidelity GP model.

A SingleTaskGP model using a DownsamplingKernel for the data fidelity parameter (if present) and an ExponentialDecayKernel for the iteration fidelity parameter (if present).

This kernel is described in [Wu2019mf].

Parameters
  • train_X (Tensor) – A batch_shape x n x (d + s) tensor of training features, where s is the dimension of the fidelity parameters (either one or two).

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • iteration_fidelity (Optional[int]) – The column index for the training iteration fidelity parameter (optional).

  • data_fidelity (Optional[int]) – The column index for the downsampling fidelity parameter (optional).

  • linear_truncated (bool) – If True, use a LinearTruncatedFidelityKernel instead of the default kernel.

  • nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Only used when linear_truncated=True.

  • likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

Return type

None

Example

>>> train_X = torch.rand(20, 4)
>>> train_Y = train_X.pow(2).sum(dim=-1, keepdim=True)
>>> model = SingleTaskMultiFidelityGP(train_X, train_Y, data_fidelity=3)
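
The d + s column layout of train_X can be made concrete with a small helper. This is a sketch only: split_fidelity_columns is a hypothetical function, not part of BoTorch, and requiring at least one fidelity column is an assumption of the sketch.

```python
def split_fidelity_columns(num_columns, iteration_fidelity=None, data_fidelity=None):
    """Return (regular_columns, fidelity_columns) for a d + s column layout."""
    fidelity = [i for i in (iteration_fidelity, data_fidelity) if i is not None]
    if not fidelity:
        # Assumption of this sketch: at least one fidelity column is required.
        raise ValueError("expected at least one fidelity column")
    if any(not -num_columns <= i < num_columns for i in fidelity):
        raise IndexError("fidelity column index out of range")
    fidelity = [i % num_columns for i in fidelity]  # resolve negative indices
    if len(set(fidelity)) != len(fidelity):
        raise ValueError("fidelity columns must be distinct")
    regular = [i for i in range(num_columns) if i not in fidelity]
    return regular, fidelity


# train_X with 4 columns and data_fidelity=3, as in the example above:
regular, fidelity = split_fidelity_columns(4, data_fidelity=3)
```

Here d = 3 regular feature columns remain and s = 1 column carries the fidelity parameter.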

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs

    Options expected for this class:

    • fidelity_features: List of columns of X that are fidelity parameters.

Return type

Dict[str, Any]

class botorch.models.gp_regression_fidelity.FixedNoiseMultiFidelityGP(train_X, train_Y, train_Yvar, iteration_fidelity=None, data_fidelity=None, linear_truncated=True, nu=2.5, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gp_regression.FixedNoiseGP

A single task multi-fidelity GP model using fixed noise levels.

A FixedNoiseGP model analogous to SingleTaskMultiFidelityGP, using a DownsamplingKernel for the data fidelity parameter (if present) and an ExponentialDecayKernel for the iteration fidelity parameter (if present).

This kernel is described in [Wu2019mf].

Parameters
  • train_X (Tensor) – A batch_shape x n x (d + s) tensor of training features, where s is the dimension of the fidelity parameters (either one or two).

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.

  • iteration_fidelity (Optional[int]) – The column index for the training iteration fidelity parameter (optional).

  • data_fidelity (Optional[int]) – The column index for the downsampling fidelity parameter (optional).

  • linear_truncated (bool) – If True, use a LinearTruncatedFidelityKernel instead of the default kernel.

  • nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Only used when linear_truncated=True.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

Return type

None

Example

>>> train_X = torch.rand(20, 4)
>>> train_Y = train_X.pow(2).sum(dim=-1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.01)
>>> model = FixedNoiseMultiFidelityGP(
>>>     train_X,
>>>     train_Y,
>>>     train_Yvar,
>>>     data_fidelity=3,
>>> )

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs

    Options expected for this class:

    • fidelity_features: List of columns of X that are fidelity parameters.

Return type

Dict[str, Any]

GP Regression Models for Mixed Parameter Spaces

class botorch.models.gp_regression_mixed.MixedSingleTaskGP(train_X, train_Y, cat_dims, cont_kernel_factory=None, likelihood=None, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gp_regression.SingleTaskGP

A single-task exact GP model for mixed search spaces.

This model uses a kernel that combines a CategoricalKernel (based on Hamming distances) and a regular kernel into a kernel of the form

K((x1, c1), (x2, c2)) =

K_cont_1(x1, x2) + K_cat_1(c1, c2) + K_cont_2(x1, x2) * K_cat_2(c1, c2)

where xi and ci are the continuous and categorical features of the input, respectively. The suffix _i indicates that we fit different lengthscales for the kernels in the sum and product terms.
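
The sum-plus-product structure of this kernel can be sketched in plain Python. The sketch below makes several assumptions that are not taken from BoTorch: an RBF kernel stands in for the continuous kernel, the categorical kernel is written as exp(-mean mismatch / lengthscale), and all function names and lengthscale values are hypothetical.

```python
import math


def rbf(x1, x2, lengthscale):
    """Continuous kernel (RBF here, an assumption of this sketch)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-0.5 * sq_dist / lengthscale**2)


def categorical(c1, c2, lengthscale):
    """Hamming-distance kernel: decays with the fraction of mismatches."""
    mismatch = sum(a != b for a, b in zip(c1, c2)) / len(c1)
    return math.exp(-mismatch / lengthscale)


def mixed_kernel(p1, p2, ls_sum=(1.0, 1.0), ls_prod=(0.5, 0.5)):
    """K = K_cont_1 + K_cat_1 + K_cont_2 * K_cat_2 with separate lengthscales."""
    (x1, c1), (x2, c2) = p1, p2
    sum_term = rbf(x1, x2, ls_sum[0]) + categorical(c1, c2, ls_sum[1])
    prod_term = rbf(x1, x2, ls_prod[0]) * categorical(c1, c2, ls_prod[1])
    return sum_term + prod_term


a = ((0.1, 0.2), (0,))  # (continuous features, categorical features)
b = ((0.1, 0.2), (1,))  # same continuous part, different category
k_aa = mixed_kernel(a, a)
k_ab = mixed_kernel(a, b)
```

Changing only the category lowers the covariance (k_ab < k_aa) even though the continuous features are identical.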

Since this model does not provide gradients for the categorical features, optimization of the acquisition function will need to be performed in a mixed fashion, i.e., treating the categorical features properly as discrete optimization variables.

A single-task exact GP model supporting categorical parameters.

Parameters
  • train_X (Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • cat_dims (List[int]) – A list of indices corresponding to the columns of the input X that should be considered categorical features.

  • cont_kernel_factory (Optional[Callable[[int, List[int]], Kernel]]) – A method that accepts ard_num_dims and active_dims arguments and returns an instantiated GPyTorch Kernel object to be used as the base kernel for the continuous dimensions. If omitted, this model uses a Matern-2.5 kernel as the kernel for the ordinal parameters.

  • likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.

  • outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass. Only input transforms are allowed which do not transform the categorical dimensions. This can be achieved by using the indices argument when constructing the transform.

Return type

None

Example

>>> train_X = torch.cat(
        [torch.rand(20, 2), torch.randint(3, (20, 1))], dim=-1
    )
>>> train_Y = (
        torch.sin(train_X[..., :-1]).sum(dim=1, keepdim=True)
        + train_X[..., -1:]
    )
>>> model = MixedSingleTaskGP(train_X, train_Y, cat_dims=[-1])
classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs – None expected for this class.

  • kwargs (Any) –

Return type

Dict[str, Any]

Model List GP Regression Models

Model List GP Regression models.

class botorch.models.model_list_gp_regression.ModelListGP(*gp_models)[source]

Bases: gpytorch.models.model_list.IndependentModelList, botorch.models.gpytorch.ModelListGPyTorchModel

A multi-output GP model with independent GPs for the outputs.

This model supports different-shaped training inputs for each of its sub-models. It can be used with any BoTorch models.

Internally, this model is just a list of individual models, but it implements the same input/output interface as all other BoTorch models. This makes it very flexible and convenient to work with. The sequential evaluation comes at a performance cost, though: if you are using a block design (i.e., the same number of training examples for each output and a similar model structure), you should consider using a batched GP model instead.
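
The "list of independent models behind one interface" idea can be sketched as follows. ToyModelList and make_mean_model are hypothetical names used for illustration only; they do not reflect how ModelListGP is implemented.

```python
from statistics import fmean


def make_mean_model(train_Y):
    """Hypothetical single-output model: always predicts the training mean."""
    mean = fmean(train_Y)
    return lambda x: mean


class ToyModelList:
    """Hypothetical container exposing one prediction per sub-model."""

    def __init__(self, *models):
        self.models = list(models)

    def predict(self, x):
        # Sub-models are evaluated one after another, not in a batch.
        return [model(x) for model in self.models]

    def subset_output(self, idcs):
        return ToyModelList(*(self.models[i] for i in idcs))


# The two sub-models have different numbers of training examples.
model = ToyModelList(make_mean_model([1.0, 3.0]), make_mean_model([2.0, 4.0, 6.0]))
predictions = model.predict(0.5)
second_only = model.subset_output([1]).predict(0.5)
```

Because each sub-model keeps its own training data, the outputs need not share inputs; the price is the sequential loop in predict.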

Parameters
  • *gp_models – A variable number of single-output BoTorch models. If models have input/output transforms, these are honored individually for each model.

  • gp_models (GPyTorchModel) –

Return type

None

Example

>>> model1 = SingleTaskGP(train_X1, train_Y1)
>>> model2 = SingleTaskGP(train_X2, train_Y2)
>>> model = ModelListGP(model1, model2)
condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A ModelListGPyTorchModel representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs). Here the i-th model has n_i + n’ training examples, where the n’ training examples have been added and all test-time caches have been updated.

Return type

botorch.models.model_list_gp_regression.ModelListGP

subset_output(idcs)[source]

Subset the model along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the model to.

Returns

The current model, subset to the specified output indices.

Return type

botorch.models.model_list_gp_regression.ModelListGP

Multitask GP Models

Multi-Task GP models.

References

Doucet2010sampl

A. Doucet. A Note on Efficient Conditional Simulation of Gaussian Distributions. http://www.stats.ox.ac.uk/~doucet/doucet_simulationconditionalgaussian.pdf, Apr 2010.

Maddox2021bohdo

W. Maddox, M. Balandat, A. Wilson, and E. Bakshy. Bayesian Optimization with High-Dimensional Outputs. https://arxiv.org/abs/2106.12997, Jun 2021.

class botorch.models.multitask.MultiTaskGP(train_X, train_Y, task_feature, covar_module=None, task_covar_prior=None, output_tasks=None, rank=None, input_transform=None, outcome_transform=None)[source]

Bases: gpytorch.models.exact_gp.ExactGP, botorch.models.gpytorch.MultiTaskGPyTorchModel

Multi-Task GP model using an ICM kernel, inferring observation noise.

Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).

This model infers the noise level. WARNING: It currently does not support different noise levels for the different tasks. If you have known observation noise, please use FixedNoiseMultiTaskGP instead.

Parameters
  • train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).

  • train_Y (Tensor) – A n x 1 or b x n x 1 (batch mode) tensor of training observations.

  • task_feature (int) – The index of the task feature (-d <= task_feature <= d).

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

  • rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.

  • task_covar_prior (Optional[Prior]) – A Prior on the task covariance matrix. Must operate on p.s.d. matrices. A common prior for this is the LKJ prior.

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

  • covar_module (Optional[Module]) –

  • outcome_transform (Optional[OutcomeTransform]) –
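
To make the roles of task_feature and rank concrete, here is a minimal sketch of an ICM kernel on inputs that carry a task index column. It assumes an RBF data kernel and a task covariance of the low-rank-plus-diagonal form B = W W^T + diag(v); the function names and numeric values are hypothetical.

```python
import math


def task_covariance(W, v):
    """B = W W^T + diag(v), with W of shape num_tasks x rank."""
    num_tasks = len(W)
    return [
        [
            sum(a * b for a, b in zip(W[i], W[j])) + (v[i] if i == j else 0.0)
            for j in range(num_tasks)
        ]
        for i in range(num_tasks)
    ]


def icm_kernel(p1, p2, B, task_feature=-1, lengthscale=1.0):
    """k((x, i), (x', j)) = k_data(x, x') * B[i][j]."""
    col = task_feature % len(p1)  # resolve a negative column index
    x1 = [a for k, a in enumerate(p1) if k != col]
    x2 = [a for k, a in enumerate(p2) if k != col]
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    k_data = math.exp(-0.5 * sq_dist / lengthscale**2)  # RBF, an assumption
    return k_data * B[int(p1[col])][int(p2[col])]


B = task_covariance(W=[[1.0], [0.5]], v=[0.1, 0.2])  # rank 1, two tasks
k = icm_kernel([0.3, 0.7, 0.0], [0.3, 0.7, 1.0], B)  # same x, tasks 0 and 1
```

With identical x, the covariance reduces to the cross-task entry B[0][1]; a smaller rank restricts how much cross-task structure W can express.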

Return type

None

Example

>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
>>>     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
>>> ])
>>> train_Y = torch.cat([f1(X1), f2(X2)]).unsqueeze(-1)
>>> model = MultiTaskGP(train_X, train_Y, task_feature=-1)
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

x (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

classmethod get_all_tasks(train_X, task_feature, output_tasks=None)[source]
Parameters
  • train_X (torch.Tensor) –

  • task_feature (int) –

  • output_tasks (Optional[List[int]]) –

Return type

Tuple[List[int], int, int]

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs

    Additional options for the model that pertain to the training data, including:

    • task_features: Indices of the input columns containing the task features (expected list of length 1),

    • task_covar_prior: A GPyTorch Prior object to use as prior on the cross-task covariance matrix,

    • prior_config: A dict representing a prior config, should only be used if prior is not passed directly. Should contain: use_LKJ_prior (whether to use LKJ prior) and eta (eta value, float),

    • rank: The rank of the cross-task covariance matrix.

Return type

Dict[str, Any]

class botorch.models.multitask.FixedNoiseMultiTaskGP(train_X, train_Y, train_Yvar, task_feature, covar_module=None, task_covar_prior=None, output_tasks=None, rank=None, input_transform=None)[source]

Bases: botorch.models.multitask.MultiTaskGP

Multi-Task GP model using an ICM kernel, with known observation noise.

Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).

This model requires observation noise data (specified in train_Yvar).

Parameters
  • train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).

  • train_Y (Tensor) – A n x 1 or b x n x 1 (batch mode) tensor of training observations.

  • train_Yvar (Tensor) – A n or b x n (batch mode) tensor of observation noise standard errors.

  • task_feature (int) – The index of the task feature (-d <= task_feature <= d).

  • task_covar_prior (Optional[Prior]) – A Prior on the task covariance matrix. Must operate on p.s.d. matrices. A common prior for this is the LKJ prior.

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

  • rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.

  • covar_module (Optional[Module]) –

Return type

None

Example

>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
>>>     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
>>> ], dim=0)
>>> train_Y = torch.cat([f1(X1), f2(X2)])
>>> train_Yvar = 0.1 + 0.1 * torch.rand_like(train_Y)
>>> model = FixedNoiseMultiTaskGP(train_X, train_Y, train_Yvar, -1)
classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • **kwargs

    Additional options for the model that pertain to the training data, including:

    • task_features: Indices of the input columns containing the task features (expected list of length 1),

    • task_covar_prior: A GPyTorch Prior object to use as prior on the cross-task covariance matrix,

    • prior_config: A dict representing a prior config, should only be used if prior is not passed directly. Should contain: use_LKJ_prior (whether to use LKJ prior) and eta (eta value, float),

    • rank: The rank of the cross-task covariance matrix.

Return type

Dict[str, Any]

class botorch.models.multitask.KroneckerMultiTaskGP(train_X, train_Y, likelihood=None, data_covar_module=None, task_covar_prior=None, rank=None, input_transform=None, outcome_transform=None, **kwargs)[source]

Bases: gpytorch.models.exact_gp.ExactGP, botorch.models.gpytorch.GPyTorchModel

Multi-task GP with Kronecker structure, using an ICM kernel.

This model assumes the “block design” case, i.e., it requires that all tasks are observed at all data points.
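
In the block design case, the joint covariance over all (data point, task) pairs factorizes as a Kronecker product of a data covariance and a task covariance. The sketch below shows this with plain lists; the matrices are made-up values, kron is a hypothetical helper, and the ordering of rows (grouped by data point, then by task) is an assumption.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]


K_data = [[1.0, 0.5], [0.5, 1.0]]  # covariance between n = 2 data points
K_task = [[1.0, 0.2], [0.2, 2.0]]  # covariance between m = 2 tasks

# Ordering assumption: entries are grouped by data point, then by task.
K_full = kron(K_data, K_task)

# Covariance between (point 0, task 1) and (point 1, task 0):
entry = K_full[0 * 2 + 1][1 * 2 + 0]
```

The full matrix has (n * m) x (n * m) entries but is determined by the two small factors, which is the structure the model exploits.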

For posterior sampling, this model uses Matheron’s rule [Doucet2010sampl] to compute the posterior over all tasks as in [Maddox2021bohdo] by exploiting Kronecker structure.
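
Matheron's rule turns a joint prior sample into a posterior sample by adding a correction based on the residual at the training inputs. The following is a minimal sketch for one training point and one test point; it does not use any Kronecker structure, and matheron_sample together with all numeric values is hypothetical.

```python
import math
import random


def matheron_sample(k_xx, k_sx, k_ss, noise, y, rng):
    """Draw f(x*) | y for one training point x and one test point x*."""
    # Joint prior draw of (f(x), f(x*)) via a hand-written 2x2 Cholesky factor.
    l11 = math.sqrt(k_xx)
    l21 = k_sx / l11
    l22 = math.sqrt(k_ss - l21**2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    f_x = l11 * z1
    f_s = l21 * z1 + l22 * z2
    eps = rng.gauss(0.0, math.sqrt(noise))  # simulated observation noise
    # Matheron update: shift the prior draw by the weighted residual.
    return f_s + k_sx / (k_xx + noise) * (y - (f_x + eps))


rng = random.Random(0)
k_xx, k_sx, k_ss, noise, y = 1.0, 0.8, 1.0, 0.1, 2.0
samples = [matheron_sample(k_xx, k_sx, k_ss, noise, y, rng) for _ in range(20000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)
exact_mean = k_sx / (k_xx + noise) * y
exact_var = k_ss - k_sx**2 / (k_xx + noise)
```

The empirical mean and variance of the samples should be close to the closed-form posterior moments exact_mean and exact_var.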

Parameters
  • train_X (Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (Tensor) – A batch_shape x n x m tensor of training observations.

  • likelihood (Optional[MultitaskGaussianLikelihood]) – A MultitaskGaussianLikelihood. If omitted, uses a MultitaskGaussianLikelihood with a GammaPrior(1.1, 0.05) noise prior.

  • data_covar_module (Optional[Module]) – The module computing the covariance (Kernel) matrix in data space. If omitted, use a MaternKernel.

  • task_covar_prior (Optional[Prior]) – A Prior on the task covariance matrix. Must operate on p.s.d. matrices. A common prior for this is the LKJ prior. If omitted, uses LKJCovariancePrior with eta parameter as specified in the keyword arguments (if not specified, use eta=1.5).

  • rank (Optional[int]) – The rank of the ICM kernel. If omitted, use a full rank kernel.

  • kwargs (Any) –

    Additional arguments to override default settings of priors, including:

    • eta: The eta parameter on the default LKJ task_covar_prior. A value of 1.0 is uninformative, values <1.0 favor stronger correlations (in magnitude), correlations vanish as eta -> inf.

    • sd_prior: A scalar prior over nonnegative numbers, which is used for the default LKJCovariancePrior task_covar_prior.

    • likelihood_rank: The rank of the task covariance matrix to fit. Defaults to 0 (which corresponds to a diagonal covariance matrix).

  • input_transform (Optional[InputTransform]) –

  • outcome_transform (Optional[OutcomeTransform]) –

Return type

None

Example

>>> train_X = torch.rand(10, 2)
>>> train_Y = torch.cat([f_1(X), f_2(X)], dim=-1)
>>> model = KroneckerMultiTaskGP(train_X, train_Y)
forward(X)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

X (torch.Tensor) –

Return type

gpytorch.distributions.multitask_multivariate_normal.MultitaskMultivariateNormal

property train_full_covar
property predictive_mean_cache
posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • output_indices (Optional[List[int]]) –

  • kwargs (Any) –

Returns

A GPyTorchPosterior object, representing a batch of b joint distributions over q points. Includes observation noise if specified.

Return type

botorch.posteriors.multitask.MultitaskGPPosterior

train(val=True, *args, **kwargs)[source]

Puts the model in train mode and reverts to the original inputs.

Parameters

val – A boolean denoting whether to put the model in train or eval mode. If False, the model is put in eval mode.

Higher Order GP Models

References

Zhe2019hogp

S. Zhe, W. Xing, and R. M. Kirby. Scalable high-order gaussian process regression. Proceedings of Machine Learning Research, volume 89, Apr 2019.

class botorch.models.higher_order_gp.FlattenedStandardize(output_shape, batch_shape=None, min_stdv=1e-08)[source]

Bases: botorch.models.transforms.outcome.Standardize

Standardize outcomes in a structured multi-output setting by reshaping the batched output dimensions into a vector. Specifically, an output dimension of [a x b x c] is flattened into a vector of size [a * b * c].

Standardize outcomes (zero mean, unit variance).

Parameters
  • m – The output dimension.

  • outputs – Which of the outputs to standardize. If omitted, all outputs will be standardized.

  • batch_shape (torch.Size) – The batch_shape of the training targets.

  • min_stdv – The minimum standard deviation for which to perform standardization (if lower, only de-mean the data).

  • output_shape (torch.Size) –

  • min_stdv (float) –
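As a rough illustration of the reshaping described above, the sketch below flattens each structured outcome into a vector and then standardizes every flattened entry across the n training points. It uses plain Python lists instead of tensors so that it runs without dependencies. The helper names flatten and flattened_standardize are hypothetical and not part of BoTorch, and the use of the sample standard deviation (n - 1 denominator, so n >= 2) is an assumption.

```python
import math

def flatten(nested):
    """Recursively flatten a nested list, e.g. an a x b x c outcome, into a vector."""
    if not isinstance(nested, list):
        return [nested]
    return [value for item in nested for value in flatten(item)]

def flattened_standardize(Y, min_stdv=1e-8):
    """Standardize n structured outcomes entry-wise after flattening.

    Y is a list of n outcomes that all share the same nested shape.
    Entries whose standard deviation is below min_stdv are only de-meaned.
    """
    rows = [flatten(y) for y in Y]  # n rows of length a * b * c
    n = len(rows)
    size = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(size)]
    stdvs = []
    for j in range(size):
        var = sum((row[j] - means[j]) ** 2 for row in rows) / (n - 1)
        stdv = math.sqrt(var)
        # Dividing by 1.0 leaves the entry de-meaned but unscaled.
        stdvs.append(stdv if stdv >= min_stdv else 1.0)
    return [[(row[j] - means[j]) / stdvs[j] for j in range(size)] for row in rows]

# Three outcomes of shape 2 x 2; the last entry is constant across outcomes.
Y = [
    [[1.0, 10.0], [0.0, 5.0]],
    [[2.0, 20.0], [0.5, 5.0]],
    [[3.0, 30.0], [1.0, 5.0]],
]
Z = flattened_standardize(Y)
```

Each row of Z has 2 * 2 = 4 entries. The constant entry has zero standard deviation, so it is only de-meaned.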

forward(Y, Yvar=None)[source]

Standardize outcomes.

If the module is in train mode, this updates the module state (i.e. the mean/std normalizing constants). If the module is in eval mode, simply applies the normalization using the module state.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

A two-tuple with the transformed outcomes:

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

untransform(Y, Yvar=None)[source]

Un-standardize outcomes.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of standardized targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of standardized observation noises associated with the targets (if applicable).

Returns

A two-tuple with the un-standardized outcomes:

  • The un-standardized outcome observations.

  • The un-standardized observation noise (if applicable).
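The train/eval behaviour of forward and the inverse mapping performed by untransform can be sketched for scalar outcomes as follows. ToyStandardize is a hypothetical stand-in written in plain Python; it is not the BoTorch class, and it omits batching and the min_stdv safeguard.

```python
import math

class ToyStandardize:
    """Hypothetical scalar-outcome standardizer mimicking train/eval behaviour."""

    def __init__(self):
        self.training = True
        self.mean = 0.0
        self.stdv = 1.0

    def forward(self, Y, Yvar=None):
        if self.training:
            # Train mode: refresh the normalizing constants from the data.
            n = len(Y)
            self.mean = sum(Y) / n
            self.stdv = math.sqrt(sum((y - self.mean) ** 2 for y in Y) / (n - 1))
        Y_tf = [(y - self.mean) / self.stdv for y in Y]
        # Noise variances scale with the square of the standard deviation.
        Yvar_tf = None if Yvar is None else [v / self.stdv**2 for v in Yvar]
        return Y_tf, Yvar_tf

    def untransform(self, Y, Yvar=None):
        Y_utf = [self.mean + self.stdv * y for y in Y]
        Yvar_utf = None if Yvar is None else [v * self.stdv**2 for v in Yvar]
        return Y_utf, Yvar_utf

tf = ToyStandardize()
Y, Yvar = [2.0, 4.0, 6.0], [0.4, 0.4, 0.4]
Y_tf, Yvar_tf = tf.forward(Y, Yvar)   # train mode: fits the mean and stdv
tf.training = False                   # eval mode: constants are frozen
Y_new_tf, _ = tf.forward([8.0])       # reuses the stored mean and stdv
Y_back, Yvar_back = tf.untransform(Y_tf, Yvar_tf)
```

Applying untransform to the output of forward recovers the original outcomes and noise variances up to floating point error.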

untransform_posterior(posterior)[source]

Un-standardize the posterior.

Parameters

posterior (botorch.posteriors.higher_order.HigherOrderGPPosterior) – A posterior in the standardized space.

Returns

The un-standardized posterior. If the input posterior is an MVN, the transformed posterior is again an MVN.

Return type

botorch.posteriors.transformed.TransformedPosterior

training: bool
class botorch.models.higher_order_gp.HigherOrderGP(train_X, train_Y, likelihood=None, covar_modules=None, num_latent_dims=None, learn_latent_pars=True, latent_init='default', outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel, gpytorch.models.exact_gp.ExactGP

A higher-order Gaussian process (HOGP) model, whose predictions are matrices/tensors, as described in [Zhe2019hogp]. The posterior uses Matheron’s rule [Doucet2010sampl] as described in [Maddox2021bohdo].

A HigherOrderGP model for high-dim output regression.

Parameters
  • train_X (Tensor) – A batch_shape x n x d-dim tensor of training inputs.

  • train_Y (Tensor) – A batch_shape x n x output_shape-dim tensor of training targets.

  • likelihood (Optional[Likelihood]) – Gaussian likelihood for the model.

  • covar_modules (Optional[List[Kernel]]) – List of kernels for each output structure.

  • num_latent_dims (Optional[List[int]]) – Sizes for the latent dimensions.

  • learn_latent_pars (bool) – If true, learn the latent parameters.

  • latent_init (str) – [default or gp] how to initialize the latent parameters.

  • outcome_transform (Optional[OutcomeTransform]) –

  • input_transform (Optional[InputTransform]) –

forward(X)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

X (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

get_fantasy_model(inputs, targets, **kwargs)[source]

Returns a new GP model that incorporates the specified inputs and targets as new training data.

Using this method is more efficient than updating with set_train_data when the number of inputs is relatively small, because any computed test-time caches will be updated in linear time rather than computed from scratch.

Note

If targets is a batch (e.g. b x m), then the GP returned from this method will be a batch mode GP. If inputs is of the same (or lesser) dimension as targets, then it is assumed that the fantasy points are the same for each target batch.

Parameters
  • inputs (torch.Tensor) – (b1 x … x bk x m x d or f x b1 x … x bk x m x d) Locations of fantasy observations.

  • targets (torch.Tensor) – (b1 x … x bk x m or f x b1 x … x bk x m) Labels of fantasy observations.

Returns

An ExactGP model with n + m training examples, where the m fantasy examples have been added and all test-time caches have been updated.

Return type

ExactGP

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • Y (torch.Tensor) – A batch_shape’ x n’ x m_d-dim Tensor, where m_d is the shape of the model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.

  • kwargs (Any) –

Returns

A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).

Return type

botorch.models.higher_order_gp.HigherOrderGP

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (Union[bool, torch.Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q x m).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes observation noise if specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

make_posterior_variances(joint_covariance_matrix)[source]

Computes the posterior variances given the data points X. As currently implemented, it performs another forward call with the stacked data to obtain the joint covariance across all data points.

Parameters

joint_covariance_matrix (gpytorch.lazy.lazy_tensor.LazyTensor) –

Return type

torch.Tensor

Pairwise GP Models

Preference Learning with Gaussian Process

Chu2005preference

Wei Chu, and Zoubin Ghahramani. Preference learning with Gaussian processes. Proceedings of the 22nd international conference on Machine learning. 2005.

Brochu2010tutorial

Eric Brochu, Vlad M. Cora, and Nando De Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010).

class botorch.models.pairwise_gp.PairwiseGP(datapoints, comparisons, covar_module=None, input_transform=None, **kwargs)[source]

Bases: botorch.models.model.Model, gpytorch.models.gp.GP

Probit GP for preference learning with Laplace approximation

Implementation is based on [Chu2005preference]. Also see [Brochu2010tutorial] for additional reference.

Note that in [Chu2005preference] the likelihood of a pairwise comparison is \(\Phi\left(\frac{f(x_1) - f(x_2)}{\sqrt{2}\sigma}\right)\), where \(\Phi\) is the standard normal CDF, i.e. a scale \(\sigma\) is used in the denominator. To maintain consistency with the usage of kernels elsewhere in BoTorch, the code does not include \(\sigma\) (implicitly setting it to 1) and instead uses a ScaleKernel to scale the function.
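To make the pairwise likelihood concrete, here is a minimal sketch that applies the standard normal CDF to the scaled utility difference. The function names are hypothetical, and the code works on plain floats rather than on the tensors used by PairwiseGP.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def preference_likelihood(f1, f2, sigma=1.0):
    """Probability that x_1 is preferred over x_2 given utilities f1 and f2."""
    return std_normal_cdf((f1 - f2) / (math.sqrt(2.0) * sigma))

p_equal = preference_likelihood(0.3, 0.3)   # equal utilities: no preference
p_better = preference_likelihood(1.0, 0.0)  # the higher utility is favoured
p_worse = preference_likelihood(0.0, 1.0)   # same pair, opposite order
```

Swapping the two arguments gives the complementary probability, so p_better and p_worse sum to one.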

A probit-likelihood GP with Laplace approximation model that learns via pairwise comparison data. By default it uses a scaled RBF kernel.

Parameters
  • datapoints (Tensor) – A batch_shape x n x d tensor of training features.

  • comparisons (Tensor) – A batch_shape x m x 2 tensor of training comparisons; comparisons[i] is a noisy indicator suggesting that the utility value of the comparisons[i, 0]-th datapoint is greater than that of the comparisons[i, 1]-th datapoint.

  • covar_module (Optional[Module]) – Covariance module.

  • input_transform (Optional[InputTransform]) – An input transform that is applied in the model’s forward pass.
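The layout of comparisons can be illustrated with a small sketch that builds winner-first index pairs from known utilities. The helper make_comparisons and the toy utility are hypothetical. The pairs here are noise-free, whereas real comparison data are noisy indicators, and a model would receive them as an m x 2 tensor rather than a list.

```python
from itertools import combinations

def make_comparisons(utilities):
    """Return [winner, loser] index pairs for every pair of datapoints."""
    pairs = []
    for i, j in combinations(range(len(utilities)), 2):
        # Put the index with the larger utility first.
        pairs.append([i, j] if utilities[i] > utilities[j] else [j, i])
    return pairs

datapoints = [[0.1, 0.9], [0.4, 0.4], [0.8, 0.2]]  # n = 3 points, d = 2
utilities = [x[0] for x in datapoints]             # hypothetical latent utility
comparisons = make_comparisons(utilities)          # m = 3 rows of 2 indices
```

Every row refers to datapoints by index, with the preferred point in the first column.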

Return type

None

property num_outputs: int

The number of outputs of the model.

property batch_shape: torch.Size

The batch shape of the model.

This is a batch shape from an I/O perspective, independent of the internal representation of the model (as e.g. in BatchedMultiOutputGPyTorchModel). For a model with m outputs, a test_batch_shape x q x d-shaped input X to the posterior method returns a Posterior object over an output of shape broadcast(test_batch_shape, model.batch_shape) x q x m.
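The output shape stated above can be worked out with a few lines of shape arithmetic. This is only a sketch of the broadcasting rule. Both helper names are hypothetical, and shapes are plain tuples rather than torch.Size objects.

```python
def broadcast_shapes(shape_a, shape_b):
    """Broadcast two shapes following the usual right-aligned rules."""
    result = []
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
        result.append(max(a, b))
    return tuple(reversed(result))

def posterior_output_shape(X_shape, model_batch_shape, m):
    """Output shape for a test_batch_shape x q x d input and m outputs."""
    *test_batch_shape, q, _d = X_shape
    return broadcast_shapes(tuple(test_batch_shape), model_batch_shape) + (q, m)

# test_batch_shape = (5, 1), q = 4, d = 2, model batch shape (3,), m = 1
shape = posterior_output_shape((5, 1, 4, 2), model_batch_shape=(3,), m=1)
```

Here the test batch shape (5, 1) broadcasts against the model batch shape (3,) to give (5, 3), followed by q and m.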

set_train_data(datapoints=None, comparisons=None, strict=False, update_model=True)[source]

Set datapoints and comparisons and update model properties if needed

Parameters
  • datapoints (Optional[torch.Tensor]) – A batch_shape x n x d dimension tensor X. If there are input transformations, assume the datapoints are not transformed

  • comparisons (Optional[torch.Tensor]) – A tensor of size batch_shape x m x 2. (i, j) means f_i is preferred over f_j.

  • strict (bool) – strict argument as in gpytorch.models.exact_gp for compatibility when using fit_gpytorch_model with input_transform.

  • update_model (bool) – True if we want to refit the model (see _update) after re-setting the data.

Return type

None

load_state_dict(state_dict, strict=False)[source]

Removes data related buffers from the state_dict and calls super().load_state_dict with strict=False.

Parameters
  • state_dict (Dict[str, torch.Tensor]) – The state dict.

  • strict (Optional[bool]) – A boolean denoting whether to error out if all keys are not present in the state_dict. Since we remove data related buffers from the state_dict, this will lead to an error whenever strict=True. Instead, we overwrite it with strict=False, and raise a warning explaining this if strict=True is passed.

Returns

A named tuple _IncompatibleKeys, containing the missing_keys and unexpected_keys. Note that the buffers we remove from the state_dict may be listed under missing_keys.

Return type

torch.nn.modules.module._IncompatibleKeys
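The strict handling described above amounts to filtering the state dict and forcing strict=False with a warning. The sketch below mimics that pattern on a plain dictionary. The buffer names in DATA_BUFFERS and the helper prepare_state_dict are assumptions made for illustration and do not reflect the actual buffers removed by PairwiseGP.

```python
import warnings

# Assumed names of data-related buffers, chosen for illustration only.
DATA_BUFFERS = ("datapoints", "comparisons")

def prepare_state_dict(state_dict, strict=False):
    """Drop data-related entries and return the strict flag actually used."""
    if strict:
        warnings.warn("strict=True is not supported; using strict=False instead.")
    filtered = {k: v for k, v in state_dict.items() if k not in DATA_BUFFERS}
    return filtered, False

state = {
    "covar_module.raw_outputscale": 0.3,
    "datapoints": [[0.0, 1.0]],
    "comparisons": [[0, 0]],
}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    filtered, used_strict = prepare_state_dict(state, strict=True)
```

Because the data entries are removed before loading, they can show up as missing keys, which is why strict loading would fail.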

forward(datapoints)[source]

Calculate a posterior or prior prediction.

In training mode, forward is implemented solely for gradient-based hyperparameter optimization. It re-calculates the utility f using its analytical form at f_map so that gradients with respect to the hyperparameters can be obtained.

Parameters

datapoints (torch.Tensor) – A batch_shape x n x d Tensor, should be the same as self.datapoints during training

Returns

A MultivariateNormal object, being one of the following:

  1. Posterior centered at MAP points for training data (training mode)

  2. Prior predictions (prior mode)

  3. Predictive posterior (eval mode)

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A batch_shape x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • output_indices (Optional[List[int]]) – As defined in parent Model class, not used for this model.

  • observation_noise (bool) – Ignored (since noise is not identifiable from scale in probit models).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –
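The remark that noise is not identifiable from scale can be checked numerically: rescaling the utilities and the noise scale by the same positive constant leaves the probit likelihood unchanged. This is a plain-Python sketch with a hypothetical probit helper, not code from the library.

```python
import math

def probit(f1, f2, sigma):
    """Probit preference likelihood with noise scale sigma."""
    z = (f1 - f2) / (math.sqrt(2.0) * sigma)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

f1, f2, sigma = 1.2, 0.5, 1.0
c = 3.0  # arbitrary positive rescaling factor
p_original = probit(f1, f2, sigma)
p_rescaled = probit(c * f1, c * f2, c * sigma)
```

Since only the ratio of utility differences to the noise scale enters the likelihood, the data cannot distinguish the two parameterizations.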

Returns

A Posterior object, representing joint distributions over q points.

Return type

botorch.posteriors.posterior.Posterior

condition_on_observations(X, Y, **kwargs)[source]

Condition the model on new observations.

Note that unlike other BoTorch models, PairwiseGP requires Y to be pairwise comparisons

Parameters
  • X (torch.Tensor) – A batch_shape x n x d dimension tensor X

  • Y (torch.Tensor) – A tensor of size batch_shape x m x 2. (i, j) means f_i is preferred over f_j

  • kwargs (Any) –

Returns

A (deepcopied) Model object of the same type, representing the original model conditioned on the new observations (X, Y).

Return type

botorch.models.model.Model

class botorch.models.pairwise_gp.PairwiseLaplaceMarginalLogLikelihood(model)[source]

Bases: gpytorch.mlls.marginal_log_likelihood.MarginalLogLikelihood

Laplace-approximated marginal log likelihood/evidence for PairwiseGP

See (12) from [Chu2005preference].

Parameters

model (PairwiseGP) – A model using Laplace approximation (currently only supports PairwiseGP)

Return type

None

forward(post, comp)[source]

Calculate approximated log evidence, i.e., log(P(D|theta))

Parameters
Returns

The approximated evidence, i.e., the marginal log likelihood

Return type

torch.Tensor

training: bool

Contextual GP Models with Aggregate Rewards

class botorch.models.contextual.SACGP(train_X, train_Y, train_Yvar, decomposition)[source]

Bases: botorch.models.gp_regression.FixedNoiseGP

The GP uses the Structural Additive Contextual (SAC) kernel.

Parameters
  • train_X (torch.Tensor) – (n x d) X training data.

  • train_Y (torch.Tensor) – (n x 1) Y training data.

  • train_Yvar (torch.Tensor) – (n x 1) Noise variances of each training Y.

  • decomposition (Dict[str, List[int]]) – Keys are context names. Values are the indexes of the parameters belonging to the context. The parameter indexes are in the same order across contexts.
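A decomposition for a four-dimensional input with two contexts might look like the sketch below. The context names and the helper check_decomposition are hypothetical. The checks that every context owns the same number of parameters and that indices lie in 0..d-1 are assumptions about sensible inputs, not a statement of what the constructor validates.

```python
def check_decomposition(decomposition, d):
    """Sanity-check a context decomposition for a d-dimensional input."""
    sizes = {len(indices) for indices in decomposition.values()}
    if len(sizes) != 1:
        raise ValueError("every context must own the same number of parameters")
    for name, indices in decomposition.items():
        if any(not 0 <= i < d for i in indices):
            raise ValueError(f"context {name!r} has an index outside 0..{d - 1}")
    return len(decomposition), sizes.pop()

# Columns 0-1 of train_X belong to one context, columns 2-3 to the other,
# with the parameters listed in the same order in both contexts.
decomposition = {"morning": [0, 1], "evening": [2, 3]}
num_contexts, params_per_context = check_decomposition(decomposition, d=4)
```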

Return type

None

A single-task exact GP model using fixed noise levels.

Parameters
  • train_X (torch.Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (torch.Tensor) – A batch_shape x n x m tensor of training observations.

  • train_Yvar (torch.Tensor) – A batch_shape x n x m tensor of observed measurement noise.

  • covar_module – The module computing the covariance (Kernel) matrix. If omitted, use a MaternKernel.

  • mean_module – The mean function to be used. If omitted, use a ConstantMean.

  • outcome_transform – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform – An input transform that is applied in the model’s forward pass.

  • decomposition (Dict[str, List[int]]) –

Return type

None

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.2)
>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar)
class botorch.models.contextual.LCEAGP(train_X, train_Y, train_Yvar, decomposition, train_embedding=True, cat_feature_dict=None, embs_feature_dict=None, embs_dim_list=None, context_weight_dict=None)[source]

Bases: botorch.models.gp_regression.FixedNoiseGP

The GP with Latent Context Embedding Additive (LCE-A) Kernel. Note that the model does not support batch training. Input training data sets should have dim = 2.

Parameters
  • train_X (torch.Tensor) – (n x d) X training data.

  • train_Y (torch.Tensor) – (n x 1) Y training data.

  • train_Yvar (torch.Tensor) – (n x 1) Noise variance of Y.

  • decomposition (Dict[str, List[int]]) – Keys are context names. Values are the indexes of the parameters belonging to the context. The parameter indexes are in the same order across contexts.

  • cat_feature_dict (Optional[Dict]) – Keys are context names and values are lists of categorical features, i.e. {“context_name” : [cat_0, …, cat_k]}, where k equals the number of categorical variables. If None, the context names in the decomposition are used as the only categorical feature, i.e. k = 1.

  • embs_feature_dict (Optional[Dict]) – Pre-trained continuous embedding features of each context.

  • embs_dim_list (Optional[List[int]]) – Embedding dimension for each categorical variable. The length equals the number of categorical features k. If None, the embedding dimension is set to 1 for each categorical variable.

  • context_weight_dict (Optional[Dict]) – Known population weights of each context.

  • train_embedding (bool) –

Return type

None

A single-task exact GP model using fixed noise levels.

Parameters
  • train_X (torch.Tensor) – A batch_shape x n x d tensor of training features.

  • train_Y (torch.Tensor) – A batch_shape x n x m tensor of training observations.

  • train_Yvar (torch.Tensor) – A batch_shape x n x m tensor of observed measurement noise.

  • covar_module – The module computing the covariance (Kernel) matrix. If omitted, use a MaternKernel.

  • mean_module – The mean function to be used. If omitted, use a ConstantMean.

  • outcome_transform – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform – An input transform that is applied in the model’s forward pass.

  • decomposition (Dict[str, List[int]]) –

  • train_embedding (bool) –

  • cat_feature_dict (Optional[Dict]) –

  • embs_feature_dict (Optional[Dict]) –

  • embs_dim_list (Optional[List[int]]) –

  • context_weight_dict (Optional[Dict]) –

Return type

None

Example

>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.2)
>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar)

Contextual GP Models with Context Rewards

class botorch.models.contextual_multioutput.LCEMGP(train_X, train_Y, task_feature, context_cat_feature=None, context_emb_feature=None, embs_dim_list=None, output_tasks=None, input_transform=None, outcome_transform=None)[source]

Bases: botorch.models.multitask.MultiTaskGP

The Multi-Task GP with the latent context embedding multioutput (LCE-M) kernel.

Parameters
  • train_X (torch.Tensor) – (n x d) X training data.

  • train_Y (torch.Tensor) – (n x 1) Y training data.

  • task_feature (int) – column index of train_X to get context indices.

  • context_cat_feature (Optional[torch.Tensor]) – (n_contexts x k) one-hot encoded context features. Rows are ordered by context indices. k equals the number of categorical variables. If None, task indices will be used and k = 1.

  • context_emb_feature (Optional[torch.Tensor]) – (n_contexts x m) pre-given continuous embedding features. Rows are ordered by context indices.

  • embs_dim_list (Optional[List[int]]) – Embedding dimension for each categorical variable. The length equals k. If None, the embedding dimension is set to 1 for each categorical variable.

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

  • input_transform (Optional[botorch.models.transforms.input.InputTransform]) –

  • outcome_transform (Optional[botorch.models.transforms.outcome.OutcomeTransform]) –

Return type

None

Multi-Task GP model using an ICM kernel, inferring observation noise.

Parameters
  • train_X (torch.Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).

  • train_Y (torch.Tensor) – A n x 1 or b x n x 1 (batch mode) tensor of training observations.

  • task_feature (int) – The index of the task feature (-d <= task_feature <= d).

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

  • rank – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.

  • task_covar_prior – A Prior on the task covariance matrix. Must operate on p.s.d. matrices. A common prior for this is the LKJ prior.

  • input_transform (Optional[botorch.models.transforms.input.InputTransform]) – An input transform that is applied in the model’s forward pass.

  • context_cat_feature (Optional[torch.Tensor]) –

  • context_emb_feature (Optional[torch.Tensor]) –

  • embs_dim_list (Optional[List[int]]) –

  • outcome_transform (Optional[botorch.models.transforms.outcome.OutcomeTransform]) –
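How task_feature picks out the task column of train_X, including negative indices, can be sketched as follows. The helper split_task_feature is hypothetical and operates on nested lists. The model itself works on tensors, and its internal handling may differ.

```python
def split_task_feature(train_X, task_feature):
    """Separate the task-index column from the remaining d feature columns."""
    num_cols = len(train_X[0])  # d + 1 columns in total
    d = num_cols - 1
    if not -d <= task_feature <= d:
        raise ValueError(f"task_feature must satisfy -{d} <= task_feature <= {d}")
    col = task_feature % num_cols  # map negative indices to positive ones
    features = [row[:col] + row[col + 1:] for row in train_X]
    tasks = [int(row[col]) for row in train_X]
    return features, tasks

# Two features plus a trailing task column (d = 2).
train_X = [
    [0.1, 0.2, 0.0],
    [0.3, 0.4, 1.0],
    [0.5, 0.6, 1.0],
]
features, tasks = split_task_feature(train_X, task_feature=-1)
```

With task_feature=-1 the last column holds the task indices, which matches the layout used in the example for this class.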

Return type

None

Example

>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
...     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
... ])
>>> train_Y = torch.cat([f1(X1), f2(X2)]).unsqueeze(-1)
>>> model = MultiTaskGP(train_X, train_Y, task_feature=-1)
task_covar_matrix(task_idcs)[source]

Compute the covariance matrix for a list of given contexts.

Parameters

task_idcs (torch.Tensor) – (n x 1) or (b x n x 1) task indices tensor

Return type

torch.Tensor

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

x (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

class botorch.models.contextual_multioutput.FixedNoiseLCEMGP(train_X, train_Y, train_Yvar, task_feature, context_cat_feature=None, context_emb_feature=None, embs_dim_list=None, output_tasks=None)[source]

Bases: botorch.models.contextual_multioutput.LCEMGP

The Multi-Task GP with the latent context embedding multioutput (LCE-M) kernel, with known observation noise.

Parameters
  • train_X (torch.Tensor) – (n x d) X training data.

  • train_Y (torch.Tensor) – (n x 1) Y training data.

  • train_Yvar (torch.Tensor) – (n x 1) Noise variances of each training Y.

  • task_feature (int) – column index of train_X to get context indices.

  • context_cat_feature (Optional[torch.Tensor]) – (n_contexts x k) one-hot encoded context features. Rows are ordered by context indices. k equals the number of categorical variables. If None, task indices will be used and k = 1.

  • context_emb_feature (Optional[torch.Tensor]) – (n_contexts x m) pre-given continuous embedding features. Rows are ordered by context indices.

  • embs_dim_list (Optional[List[int]]) – Embedding dimension for each categorical variable. The length equals k. If None, the embedding dimension is set to 1 for each categorical variable.

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

Return type

None

Multi-Task GP model using an ICM kernel, inferring observation noise.

Parameters
  • train_X (torch.Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).

  • train_Y (torch.Tensor) – A n x 1 or b x n x 1 (batch mode) tensor of training observations.

  • task_feature (int) – The index of the task feature (-d <= task_feature <= d).

  • output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.

  • rank – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.

  • task_covar_prior – A Prior on the task covariance matrix. Must operate on p.s.d. matrices. A common prior for this is the LKJ prior.

  • input_transform – An input transform that is applied in the model’s forward pass.

  • train_Yvar (torch.Tensor) –

  • context_cat_feature (Optional[torch.Tensor]) –

  • context_emb_feature (Optional[torch.Tensor]) –

  • embs_dim_list (Optional[List[int]]) –

Return type

None

Example

>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
...     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
... ])
>>> train_Y = torch.cat([f1(X1), f2(X2)]).unsqueeze(-1)
>>> model = MultiTaskGP(train_X, train_Y, task_feature=-1)

Variational GP Models

References

burt2020svgp

David R. Burt and Carl Edward Rasmussen and Mark van der Wilk, Convergence of Sparse Variational Inference in Gaussian Process Regression, Journal of Machine Learning Research, 2020, http://jmlr.org/papers/v21/19-1015.html.

chen2018dpp

Laming Chen and Guoxin Zhang and Hanning Zhou, Fast greedy MAP inference for determinantal point process to improve recommendation diversity, Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018, https://arxiv.org/abs/1709.05135.

hensman2013svgp

James Hensman and Nicolo Fusi and Neil D. Lawrence, Gaussian Processes for Big Data, Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence, 2013, https://arxiv.org/abs/1309.6835.

class botorch.models.approximate_gp.ApproximateGPyTorchModel(model=None, likelihood=None, num_outputs=1, *args, **kwargs)[source]

Bases: botorch.models.gpytorch.GPyTorchModel

BoTorch wrapper class for various (variational) approximate GP models in GPyTorch. These include stochastic variational GPs (SVGPs) and variational implementations of weight-space approximate GPs.

Parameters
  • model (Optional[ApproximateGP]) – Instance of gpytorch.approximate GP models. If omitted, constructs a _SingleTaskVariationalGP.

  • likelihood (Optional[Likelihood]) – Instance of a GPyTorch likelihood. If omitted, uses either a GaussianLikelihood (if num_outputs=1) or a MultitaskGaussianLikelihood (if num_outputs>1).

  • num_outputs (int) – Number of outputs expected for the GP model.

  • args – Optional positional arguments passed to the _SingleTaskVariationalGP constructor if no model is provided.

  • kwargs – Optional keyword arguments passed to the _SingleTaskVariationalGP constructor if no model is provided.

Return type

None

property num_outputs

The number of outputs of the model.

posterior(X, output_indices=None, observation_noise=False, *args, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • observation_noise – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q).

  • posterior_transform – An optional PosteriorTransform.

Returns

A GPyTorchPosterior object, representing a batch of b joint distributions over q points. Includes observation noise if specified.

Return type

botorch.posteriors.gpytorch.GPyTorchPosterior

forward(X, *args, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

fantasize(X, sampler=<class 'botorch.sampling.samplers.MCSampler'>, observation_noise=True, *args, **kwargs)[source]

Construct a fantasy model.

Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (including observation noise if observation_noise=True). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.
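The three steps above can be sketched on a deliberately simple model: a single unknown constant with a Gaussian belief and Gaussian observation noise. ToyGaussianModel and its methods are hypothetical and independent of the input X. The sketch only illustrates the posterior, sample, condition sequence and is not the BoTorch implementation.

```python
import random

class ToyGaussianModel:
    """Hypothetical model of one unknown constant with a Gaussian belief."""

    def __init__(self, mean, var, noise_var):
        self.mean, self.var, self.noise_var = mean, var, noise_var

    def posterior(self, observation_noise=False):
        # (1) Predictive mean and variance, optionally including noise.
        var = self.var + (self.noise_var if observation_noise else 0.0)
        return self.mean, var

    def condition_on_observation(self, y):
        # (3) Conjugate Gaussian update with a single observation y.
        gain = self.var / (self.var + self.noise_var)
        new_mean = self.mean + gain * (y - self.mean)
        new_var = (1.0 - gain) * self.var
        return ToyGaussianModel(new_mean, new_var, self.noise_var)

    def fantasize(self, rng, observation_noise=True):
        mean, var = self.posterior(observation_noise=observation_noise)
        fake_y = rng.gauss(mean, var ** 0.5)  # (2) sample a fake observation
        return self.condition_on_observation(fake_y)

model = ToyGaussianModel(mean=0.0, var=1.0, noise_var=0.25)
fantasy = model.fantasize(random.Random(0))
```

The fantasy model is a new object with reduced variance, while the original model is left untouched.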

Parameters
  • X – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • sampler – The sampler used for sampling from the posterior at X.

  • observation_noise – If True, include observation noise.

Returns

The constructed fantasy model.

class botorch.models.approximate_gp.SingleTaskVariationalGP(train_X, train_Y=None, likelihood=None, num_outputs=1, learn_inducing_points=True, covar_module=None, mean_module=None, variational_distribution=None, variational_strategy=<class 'gpytorch.variational.variational_strategy.VariationalStrategy'>, inducing_points=None, outcome_transform=None, input_transform=None)[source]

Bases: botorch.models.approximate_gp.ApproximateGPyTorchModel

A single-task variational GP model following [hensman2013svgp] with pivoted Cholesky initialization following [chen2018dpp] and [burt2020svgp].

A single-task variational GP using relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).

This model works in batch mode (each batch having its own hyperparameters). When the training observations include multiple outputs, this model will use batching to model outputs independently. However, batches of multi-output models are not supported at this time; if you need those, please use a ModelListGP.

Use this model if you have a lot of data or if your responses are non-Gaussian.

To train this model, you should use gpytorch.mlls.VariationalELBO and not the exact marginal log likelihood. Example mll:

mll = VariationalELBO(model.likelihood, model.model, num_data=train_X.shape[-2])

A single-task stochastic variational Gaussian process model (SVGP) as described by [hensman2013svgp]. We use pivoted Cholesky initialization [burt2020svgp] to initialize the inducing points of the model.

Parameters
  • train_X (Tensor) – Training inputs (due to the ability of the SVGP to sub-sample this does not have to be all of the training inputs).

  • train_Y (Optional[Tensor]) – Training targets (optional).

  • likelihood (Optional[Likelihood]) – Instance of a GPyTorch likelihood. If omitted, uses either a GaussianLikelihood (if num_outputs=1) or a MultitaskGaussianLikelihood (if num_outputs>1).

  • num_outputs (int) – Number of output responses per input (default: 1).

  • covar_module (Optional[Kernel]) – Kernel function. If omitted, uses a MaternKernel.

  • mean_module (Optional[Mean]) – Mean of GP model. If omitted, uses a ConstantMean.

  • variational_distribution (Optional[_VariationalDistribution]) – Type of variational distribution to use (default: CholeskyVariationalDistribution), the properties of the variational distribution will encourage scalability or ease of optimization.

  • variational_strategy (Type[_VariationalStrategy]) – Type of variational strategy to use (default: VariationalStrategy). The default setting uses “whitening” of the variational distribution to make training easier.

  • inducing_points (Optional[Union[Tensor, int]]) – The number or specific locations of the inducing points.

  • learn_inducing_points (bool) –

  • outcome_transform (Optional[OutcomeTransform]) –

  • input_transform (Optional[InputTransform]) –

Return type

None

init_inducing_points(inputs)[source]

Reinitialize the inducing point locations in-place with the current kernel applied to inputs. The variational distribution and variational strategy caches are reset.

Parameters

inputs (torch.Tensor) – (*batch_shape, n, d)-dim input data tensor.

Returns

(*batch_shape, m, d)-dim tensor of selected inducing point locations.

Return type

torch.Tensor
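
The idea behind a pivoted Cholesky initialization can be sketched as a greedy selection: repeatedly pick the input with the largest residual variance under the current kernel, then discount the variance explained by that pick. The block below is a dependency-free illustration of that idea only; the function names, the squared-exponential kernel, and the stopping tolerance are assumptions made for the sketch, and the routine used by init_inducing_points may differ in its details.

```python
import math


def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two points given as lists (illustrative choice)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / lengthscale**2)


def select_inducing_points(points, m, kernel=rbf):
    """Greedily pick up to m pivot indices by largest residual variance."""
    n = len(points)
    # Diagonal of the residual kernel matrix: the variance not yet explained.
    residual = [kernel(p, p) for p in points]
    columns = []  # columns of the partial Cholesky factor, one per pivot
    pivots = []
    for _ in range(m):
        j = max(range(n), key=lambda i: residual[i])
        if residual[j] <= 1e-12:  # nothing left to explain
            break
        pivots.append(j)
        scale = math.sqrt(residual[j])
        # New Cholesky column: residual covariance with the pivot, rescaled.
        column = [
            (kernel(points[i], points[j]) - sum(c[i] * c[j] for c in columns)) / scale
            for i in range(n)
        ]
        columns.append(column)
        residual = [r - v * v for r, v in zip(residual, column)]
    return pivots
```

Near-duplicate inputs are picked late, because selecting one of them removes almost all of the residual variance of the other.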

Fully Bayesian GP Models

Gaussian Process Regression models with fully Bayesian inference.

We use a lightweight PyTorch implementation of a Matern-5/2 kernel as there are some performance issues with running NUTS on top of standard GPyTorch models. The resulting hyperparameter samples are loaded into a batched GPyTorch model after fitting.

References:

Eriksson2021saasbo(1,2)

D. Eriksson, M. Jankowiak. High-Dimensional Bayesian Optimization with Sparse Axis-Aligned Subspaces. Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021.

botorch.models.fully_bayesian.matern52_kernel(X, lengthscale)[source]

Matern-5/2 kernel.

Parameters
  • X (torch.Tensor) –

  • lengthscale (torch.Tensor) –

Return type

torch.Tensor
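
For reference, the standard Matern-5/2 formula is k(r) = (1 + sqrt(5) r + 5 r^2 / 3) exp(-sqrt(5) r), where r is the lengthscale-scaled Euclidean distance. The following is a minimal plain-Python sketch of that formula for a single pair of points; the function name and the list-based inputs are illustrative and do not mirror the tensor-based signature of matern52_kernel.

```python
import math


def matern52(x1, x2, lengthscale):
    """Matern-5/2 kernel value for two points with per-dimension lengthscales."""
    # Scaled distance r = sqrt(sum_i ((x1_i - x2_i) / l_i)^2).
    r = math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x1, x2, lengthscale)))
    s = math.sqrt(5.0) * r
    # (1 + sqrt(5) r + 5 r^2 / 3) * exp(-sqrt(5) r), using s^2 = 5 r^2.
    return (1.0 + s + s * s / 3.0) * math.exp(-s)
```

A larger lengthscale in a dimension shrinks that dimension's contribution to r, so the kernel varies more slowly along it.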

botorch.models.fully_bayesian.compute_dists(X, lengthscale)[source]

Compute kernel distances.

Parameters
  • X (torch.Tensor) –

  • lengthscale (torch.Tensor) –

Return type

torch.Tensor

botorch.models.fully_bayesian.reshape_and_detach(target, new_value)[source]

Detach and reshape new_value to match target.

Parameters
  • target (torch.Tensor) –

  • new_value (torch.Tensor) –

Return type

None

class botorch.models.fully_bayesian.PyroModel[source]

Bases: object

Abstract base class for a Pyro model.

set_inputs(train_X, train_Y, train_Yvar=None)[source]

Set the training data.

Parameters
  • train_X (torch.Tensor) – Training inputs (n x d)

  • train_Y (torch.Tensor) – Training targets (n x 1)

  • train_Yvar (Optional[torch.Tensor]) – Observed noise variance (n x 1). Inferred if None.

abstract sample()[source]

Sample from the model.

Return type

None

abstract postprocess_mcmc_samples(mcmc_samples, **kwargs)[source]

Post-process the final MCMC samples.

Parameters
  • mcmc_samples (Dict[str, torch.Tensor]) –

  • kwargs (Any) –

Return type

Dict[str, torch.Tensor]

abstract load_mcmc_samples(mcmc_samples)[source]
Parameters

mcmc_samples (Dict[str, torch.Tensor]) –

Return type

Tuple[gpytorch.means.mean.Mean, gpytorch.kernels.kernel.Kernel, gpytorch.likelihoods.likelihood.Likelihood]

class botorch.models.fully_bayesian.SaasPyroModel[source]

Bases: botorch.models.fully_bayesian.PyroModel

Implementation of the sparse axis-aligned subspace priors (SAAS) model.

The SAAS model uses sparsity-inducing priors to identify the most important parameters. This model is suitable for high-dimensional BO with potentially hundreds of tunable parameters. See [Eriksson2021saasbo] for more details.

sample()[source]

Sample from the SAAS model.

This samples the mean, noise variance, outputscale, and lengthscales according to the SAAS prior.

Return type

None

sample_outputscale(concentration=2.0, rate=0.15, **tkwargs)[source]

Sample the outputscale.

Parameters
  • concentration (float) –

  • rate (float) –

  • tkwargs (Any) –

Return type

torch.Tensor

sample_mean(**tkwargs)[source]

Sample the mean constant.

Parameters

tkwargs (Any) –

Return type

torch.Tensor

sample_noise(**tkwargs)[source]

Sample the noise variance.

Parameters

tkwargs (Any) –

Return type

torch.Tensor

sample_lengthscale(dim, alpha=0.1, **tkwargs)[source]

Sample the lengthscale.

Parameters
  • dim (int) –

  • alpha (float) –

  • tkwargs (Any) –

Return type

torch.Tensor

postprocess_mcmc_samples(mcmc_samples)[source]

Post-process the MCMC samples.

This computes the true lengthscales and removes the inverse lengthscales and tausq (global shrinkage).

Parameters

mcmc_samples (Dict[str, torch.Tensor]) –

Return type

Dict[str, torch.Tensor]

load_mcmc_samples(mcmc_samples)[source]

Load the MCMC samples into the mean_module, covar_module, and likelihood.

Parameters

mcmc_samples (Dict[str, torch.Tensor]) –

Return type

Tuple[gpytorch.means.mean.Mean, gpytorch.kernels.kernel.Kernel, gpytorch.likelihoods.likelihood.Likelihood]

class botorch.models.fully_bayesian.SaasFullyBayesianSingleTaskGP(train_X, train_Y, train_Yvar=None, outcome_transform=None, input_transform=None, pyro_model=None)[source]

Bases: botorch.models.gp_regression.SingleTaskGP

A fully Bayesian single-task GP model with the SAAS prior.

This model assumes that the inputs have been normalized to [0, 1]^d and that the output has been standardized to have zero mean and unit variance. You can either normalize and standardize the data before constructing the model or use an input_transform and outcome_transform. The SAAS model [Eriksson2021saasbo] with a Matern-5/2 kernel is used by default.

You are expected to use fit_fully_bayesian_model_nuts to fit this model as it isn’t compatible with fit_gpytorch_model.

Example

>>> saas_gp = SaasFullyBayesianSingleTaskGP(train_X, train_Y)
>>> fit_fully_bayesian_model_nuts(saas_gp)
>>> posterior = saas_gp.posterior(test_X)

Initialize the fully Bayesian single-task GP model.

Parameters
  • train_X (torch.Tensor) – Training inputs (n x d)

  • train_Y (torch.Tensor) – Training targets (n x 1)

  • train_Yvar (Optional[torch.Tensor]) – Observed noise variance (n x 1). Inferred if None.

  • outcome_transform (Optional[botorch.models.transforms.outcome.OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).

  • input_transform (Optional[botorch.models.transforms.input.InputTransform]) – An input transform that is applied in the model’s forward pass.

  • pyro_model (Optional[botorch.models.fully_bayesian.PyroModel]) – Optional PyroModel, defaults to SaasPyroModel.

Return type

None

property median_lengthscale: torch.Tensor

Median lengthscales across the MCMC samples.

property num_mcmc_samples: int

Number of MCMC samples in the model.

fantasize(X, sampler, observation_noise=True, **kwargs)[source]

Construct a fantasy model.

Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (including observation noise if observation_noise=True). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.

Parameters
  • X (torch.Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).

  • sampler (botorch.sampling.samplers.MCSampler) – The sampler used for sampling from the posterior at X.

  • observation_noise (Union[bool, torch.Tensor]) – If True, include observation noise.

  • kwargs (Any) –

Returns

The constructed fantasy model.

Return type

botorch.models.gp_regression.FixedNoiseGP

train(mode=True)[source]

Puts the model in train mode.

Parameters

mode (bool) –

Return type

None

load_mcmc_samples(mcmc_samples)[source]

Load the MCMC hyperparameter samples into the model.

This method will be called by fit_fully_bayesian_model_nuts when the model has been fitted in order to create a batched SingleTaskGP model.

Parameters

mcmc_samples (Dict[str, torch.Tensor]) –

Return type

None

forward(X)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters

X (torch.Tensor) –

Return type

gpytorch.distributions.multivariate_normal.MultivariateNormal

posterior(X, output_indices=None, observation_noise=False, posterior_transform=None, **kwargs)[source]

Computes the posterior over model outputs at the provided points.

Parameters
  • X (torch.Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.

  • output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.

  • observation_noise (bool) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q x m).

  • posterior_transform (Optional[botorch.acquisition.objective.PosteriorTransform]) – An optional PosteriorTransform.

  • kwargs (Any) –

Returns

A FullyBayesianPosterior object. Includes observation noise if specified.

Return type

botorch.posteriors.fully_bayesian.FullyBayesianPosterior

classmethod construct_inputs(training_data, **kwargs)[source]

Construct kwargs for the Model from TrainingData and other options.

Parameters
  • training_data (botorch.utils.containers.TrainingData) – TrainingData container with data for single outcome or for multiple outcomes for batched multi-output case.

  • kwargs (Any) – None expected for this class.

Return type

Dict[str, Any]

Model Components

Kernels

class botorch.models.kernels.categorical.CategoricalKernel(ard_num_dims=None, batch_shape=torch.Size([]), active_dims=None, lengthscale_prior=None, lengthscale_constraint=None, eps=1e-06, **kwargs)[source]

Bases: gpytorch.kernels.kernel.Kernel

A Kernel for categorical features.

Computes exp(-dist(x1, x2) / lengthscale), where dist(x1, x2) is zero if x1 == x2 and one if x1 != x2. If the last dimension is not a batch dimension, then the mean is considered.

Note: This kernel is NOT differentiable w.r.t. the inputs.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Parameters
  • ard_num_dims (Optional[int]) –

  • batch_shape (Optional[torch.Size]) –

  • active_dims (Optional[Tuple[int, ...]]) –

  • lengthscale_prior (Optional[gpytorch.priors.prior.Prior]) –

  • lengthscale_constraint (Optional[gpytorch.constraints.constraints.Interval]) –

  • eps (Optional[float]) –
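
The kernel formula above can be sketched for a single pair of points as follows. This is an illustration under stated assumptions: "the mean is considered" is read here as averaging the 0/1 distances over the feature dimensions, a single shared lengthscale is used, and the function name is hypothetical.

```python
import math


def categorical_kernel(x1, x2, lengthscale=1.0):
    """exp(-dist / lengthscale) with a 0/1 distance per categorical feature."""
    # 0 where the categories match, 1 where they differ, averaged over features.
    dist = sum(0.0 if a == b else 1.0 for a, b in zip(x1, x2)) / len(x1)
    return math.exp(-dist / lengthscale)
```

Because the distance only changes when a category changes, the kernel is piecewise constant in its inputs, which is why it is not differentiable with respect to them.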

class botorch.models.kernels.downsampling.DownsamplingKernel(power_prior=None, offset_prior=None, power_constraint=None, offset_constraint=None, **kwargs)[source]

Bases: gpytorch.kernels.kernel.Kernel

GPyTorch Downsampling Kernel.

Computes a covariance matrix based on the down sampling kernel between inputs x_1 and x_2 (we expect d = 1):

K(x_1, x_2) = c + (1 - x_1)^(1 + delta) * (1 - x_2)^(1 + delta).

where c is an offset parameter, and delta is a power parameter.
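
A scalar sketch of this formula, assuming inputs scaled to [0, 1] (the function and argument names are illustrative, not part of the API):

```python
def downsampling_kernel(x1, x2, offset, power):
    """c + (1 - x1)^(1 + delta) * (1 - x2)^(1 + delta) for scalar inputs in [0, 1]."""
    return offset + (1.0 - x1) ** (1.0 + power) * (1.0 - x2) ** (1.0 + power)
```

When either input equals 1, the product term vanishes and the covariance reduces to the offset c.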

Parameters
  • power_constraint (Optional[Interval]) – Constraint to place on power parameter. Default is Positive.

  • power_prior (Optional[Prior]) – Prior over the power parameter.

  • offset_constraint (Optional[Interval]) – Constraint to place on offset parameter. Default is Positive.

  • active_dims – List of data dimensions to operate on. len(active_dims) should equal num_dimensions.

  • offset_prior (Optional[Prior]) –

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class botorch.models.kernels.exponential_decay.ExponentialDecayKernel(power_prior=None, offset_prior=None, power_constraint=None, offset_constraint=None, **kwargs)[source]

Bases: gpytorch.kernels.kernel.Kernel

GPyTorch Exponential Decay Kernel.

Computes a covariance matrix based on the exponential decay kernel between inputs x_1 and x_2 (we expect d = 1):

K(x_1, x_2) = w + beta^alpha / (x_1 + x_2 + beta)^alpha.

where w is an offset parameter, beta is a lengthscale parameter, and alpha is a power parameter.
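
A scalar sketch of this formula, assuming non-negative inputs (the function and argument names are illustrative, not part of the API):

```python
def exponential_decay_kernel(x1, x2, offset, lengthscale, power):
    """w + beta^alpha / (x1 + x2 + beta)^alpha for non-negative scalar inputs."""
    return offset + lengthscale**power / (x1 + x2 + lengthscale) ** power
```

The covariance equals w + 1 at x_1 = x_2 = 0 and decays toward the offset w as x_1 + x_2 grows.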

Parameters
  • lengthscale_constraint – Constraint to place on lengthscale parameter. Default is Positive.

  • lengthscale_prior – Prior over the lengthscale parameter.

  • power_constraint (Optional[Interval]) – Constraint to place on power parameter. Default is Positive.

  • power_prior (Optional[Prior]) – Prior over the power parameter.

  • offset_constraint (Optional[Interval]) – Constraint to place on offset parameter. Default is Positive.

  • active_dims – List of data dimensions to operate on. len(active_dims) should equal num_dimensions.

  • offset_prior (Optional[Prior]) –

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class botorch.models.kernels.linear_truncated_fidelity.LinearTruncatedFidelityKernel(fidelity_dims, dimension=None, power_prior=None, power_constraint=None, nu=2.5, lengthscale_prior_unbiased=None, lengthscale_prior_biased=None, lengthscale_constraint_unbiased=None, lengthscale_constraint_biased=None, covar_module_unbiased=None, covar_module_biased=None, **kwargs)[source]

Bases: gpytorch.kernels.kernel.Kernel

GPyTorch Linear Truncated Fidelity Kernel.

Computes a covariance matrix based on the linear truncated kernel between inputs x_1 and x_2 for up to two fidelity parameters:

K(x_1, x_2) = k_0 + c_1(x_1, x_2)k_1 + c_2(x_1,x_2)k_2 + c_3(x_1,x_2)k_3

where

  • k_i (i=0,1,2,3) are Matern kernels calculated between non-fidelity parameters of x_1 and x_2 with different priors.

  • c_1 = (1 - x_1[f_1])(1 - x_2[f_1])(1 + x_1[f_1] x_2[f_1])^p is the kernel of the bias term, which can be decomposed into a deterministic part and a polynomial kernel. Here f_1 is the first fidelity dimension and p is the order of the polynomial kernel.

  • c_3 is the same as c_1 but is calculated for the second fidelity dimension f_2.

  • c_2 is the interaction term with four deterministic terms and the polynomial kernel between x_1[…, [f_1, f_2]] and x_2[…, [f_1, f_2]].
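
The role of the bias coefficient c_1 can be illustrated with scalars. The sketch below assumes a single fidelity parameter scaled to [0, 1], so that only the k_0 and c_1 k_1 terms are present; the function names are hypothetical and the Matern values k_0 and k_1 are passed in as plain numbers.

```python
def bias_coefficient(s1, s2, power):
    """c_1 for fidelity values s1 = x_1[f_1] and s2 = x_2[f_1], both in [0, 1]."""
    return (1.0 - s1) * (1.0 - s2) * (1.0 + s1 * s2) ** power


def single_fidelity_kernel(k0, k1, s1, s2, power):
    """k_0 + c_1 * k_1, given precomputed Matern values k0 and k1."""
    return k0 + bias_coefficient(s1, s2, power) * k1
```

At the highest fidelity (a fidelity value of 1 for either input) the coefficient is zero, so the biased kernel drops out and only the unbiased kernel k_0 remains.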

Parameters
  • fidelity_dims (List[int]) – A list containing either one or two indices specifying the fidelity parameters of the input.

  • dimension (Optional[int]) – The dimension of x. Unused if active_dims is specified.

  • power_prior (Optional[Prior]) – Prior for the power parameter of the polynomial kernel. Default is None.

  • power_constraint (Optional[Interval]) – Constraint on the power parameter of the polynomial kernel. Default is Positive.

  • nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Unused if both covar_module_unbiased and covar_module_biased are specified.

  • lengthscale_prior_unbiased (Optional[Prior]) – Prior on the lengthscale parameter of Matern kernel k_0. Default is Gamma(1.1, 1/20).

  • lengthscale_constraint_unbiased (Optional[Interval]) – Constraint on the lengthscale parameter of the Matern kernel k_0. Default is Positive.

  • lengthscale_prior_biased (Optional[Prior]) – Prior on the lengthscale parameter of Matern kernels k_i(i>0). Default is Gamma(5, 1/20).

  • lengthscale_constraint_biased (Optional[Interval]) – Constraint on the lengthscale parameter of the Matern kernels k_i(i>0). Default is Positive.

  • covar_module_unbiased (Optional[Kernel]) – Specify a custom kernel for k_0. If omitted, use a MaternKernel.

  • covar_module_biased (Optional[Kernel]) – Specify a custom kernel for the biased parts k_i(i>0). If omitted, use a MaternKernel.

  • batch_shape – If specified, use a separate lengthscale for each batch of input data. If x1 is a batch_shape x n x d tensor, this should be batch_shape.

  • active_dims – Compute the covariance of a subset of input dimensions. The numbers correspond to the indices of the dimensions.

  • kwargs (Any) –

Return type

None

Example

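>>> import torch
>>> from botorch.models.kernels.linear_truncated_fidelity import LinearTruncatedFidelityKernel
>>> x = torch.rand(10, 5)  # inputs in [0, 1]; the last column (index 4) is the fidelity parameter
>>> # Non-batch: Simple option (fidelity_dims is a required argument)
>>> covar_module = LinearTruncatedFidelityKernel(fidelity_dims=[4], dimension=5)
>>> covar = covar_module(x)  # Output: lazy covariance of size (10 x 10)
>>>
>>> batch_x = torch.rand(2, 10, 5)
>>> # Batch: Simple option
>>> covar_module = LinearTruncatedFidelityKernel(fidelity_dims=[4], dimension=5, batch_shape=torch.Size([2]))
>>> covar = covar_module(batch_x)  # Output: lazy covariance of size (2 x 10 x 10)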

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class botorch.models.kernels.contextual_lcea.LCEAKernel(decomposition, batch_shape, train_embedding=True, cat_feature_dict=None, embs_feature_dict=None, embs_dim_list=None, context_weight_dict=None, device=None)[source]

Bases: gpytorch.kernels.kernel.Kernel

The Latent Context Embedding Additive (LCE-A) Kernel.

This kernel is similar to the SACKernel, and is used when context breakdowns are unobservable. It assumes the same additive structure and a spatial kernel shared across contexts. Rather than assuming independence, LCEAKernel models the correlation in the latent functions for each context through learning context embeddings.

Parameters
  • decomposition (Dict[str, List[int]]) – Keys index context names. Values are the indexes of parameters belonging to the context. The parameter indexes are in the same order across contexts.

  • batch_shape (torch.Size) – Batch shape as usual for gpytorch kernels. Model does not support batch training. When batch_shape is non-empty, it is used for loading hyper-parameter values generated from MCMC sampling.

  • train_embedding (bool) – A boolean indicator of whether to learn context embeddings.

  • cat_feature_dict (Optional[Dict]) – Keys are context names and values are lists of categorical features, i.e. {“context_name” : [cat_0, …, cat_k]}, where k equals the number of categorical variables. If None, we use the context names in the decomposition as the only categorical feature, i.e. k = 1.

  • embs_feature_dict (Optional[Dict]) – Pre-trained continuous embedding features of each context.

  • embs_dim_list (Optional[List[int]]) – Embedding dimension for each categorical variable. The length equals the number of categorical features k. If None, the embedding dimension is set to 1 for each categorical variable.

  • context_weight_dict (Optional[Dict]) – Known population weights of each context.

  • device (Optional[torch.device]) –

Return type

None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class botorch.models.kernels.contextual_sac.SACKernel(decomposition, batch_shape, device=None)[source]

Bases: gpytorch.kernels.kernel.Kernel

The structural additive contextual (SAC) kernel.

The kernel is used for contextual BO without observing context breakdowns. There are d parameters and M contexts. In total, the dimension of the parameter space is d*M and the input x can be written as x=[x_11, …, x_1d, x_21, …, x_2d, …, x_M1, …, x_Md].

The kernel uses the parameter decomposition and assumes an additive structure across contexts. Each context component is assumed to be independent.

\[\begin{equation*} k(\mathbf{x}, \mathbf{x'}) = k_1(\mathbf{x}_{(1)}, \mathbf{x'}_{(1)}) + \cdots + k_M(\mathbf{x}_{(M)}, \mathbf{x'}_{(M)}) \end{equation*}\]

where M is the number of partitions of the parameter space. Each partition contains the same number of parameters d. Each kernel k_i acts only on the d parameters of the ith partition, i.e. \mathbf{x}_{(i)}. Each kernel k_i is a scaled Matern kernel with the same lengthscales but different outputscales.
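
The additive structure can be sketched in plain Python for a single pair of points. This is an illustration under assumptions: the function names are hypothetical, a Matern-5/2 base kernel is chosen for concreteness, the d lengthscales are shared by all contexts, and each context has its own outputscale.

```python
import math


def matern52(x1, x2, lengthscale):
    """Matern-5/2 kernel value with per-dimension lengthscales."""
    r = math.sqrt(sum(((a - b) / l) ** 2 for a, b, l in zip(x1, x2, lengthscale)))
    s = math.sqrt(5.0) * r
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def sac_kernel(x1, x2, decomposition, lengthscale, outputscales):
    """Sum of scaled Matern kernels, one per context partition."""
    total = 0.0
    for context, idx in decomposition.items():
        # k_i only sees the d parameters that belong to context i.
        part1 = [x1[i] for i in idx]
        part2 = [x2[i] for i in idx]
        total += outputscales[context] * matern52(part1, part2, lengthscale)
    return total
```

Changing the parameters of one context only affects that context's term in the sum, which is the independence assumption stated above.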

Parameters
  • decomposition (Dict[str, List[int]]) – Keys are context names. Values are the indexes of parameters belonging to the context. The parameter indexes are in the same order across contexts.

  • batch_shape (torch.Size) – Batch shape as usual for gpytorch kernels.

  • device (Optional[torch.device]) –

Return type

None

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Likelihoods

Transforms

Outcome Transforms

Outcome transformations for automatically transforming and un-transforming model outputs. Outcome transformations are typically part of a Model and applied (i) within the model constructor to transform the train observations to the model space, and (ii) in the Model.posterior call to untransform the model posterior back to the original space.

class botorch.models.transforms.outcome.OutcomeTransform[source]

Bases: torch.nn.modules.module.Module, abc.ABC

Abstract base class for outcome transforms.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Return type

None

abstract forward(Y, Yvar=None)[source]

Transform the outcomes in a model’s training targets

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

Return type

A two-tuple with the transformed outcomes

subset_output(idcs)[source]

Subset the transform along the output dimension.

This functionality is used to properly treat outcome transformations in the subset_model functionality.

Parameters

idcs (List[int]) – The output indices to subset the transform to.

Returns

The current outcome transform, subset to the specified output indices.

Return type

botorch.models.transforms.outcome.OutcomeTransform

untransform(Y, Yvar=None)[source]

Un-transform previously transformed outcomes

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of transformed training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of transformed observation noises associated with the training targets (if applicable).

Returns

  • The un-transformed outcome observations.

  • The un-transformed observation noise (if applicable).

Return type

A two-tuple with the un-transformed outcomes

untransform_posterior(posterior)[source]

Un-transform a posterior

Parameters

posterior (botorch.posteriors.posterior.Posterior) – A posterior in the transformed space.

Returns

The un-transformed posterior.

Return type

botorch.posteriors.posterior.Posterior

training: bool
class botorch.models.transforms.outcome.ChainedOutcomeTransform(**transforms)[source]

Bases: botorch.models.transforms.outcome.OutcomeTransform, torch.nn.modules.container.ModuleDict

An outcome transform representing the chaining of individual transforms

Chaining of outcome transforms.

Parameters

transforms (OutcomeTransform) – The transforms to chain. Internally, the names of the kwargs are used as the keys for accessing the individual transforms on the module.

Return type

None

forward(Y, Yvar=None)[source]

Transform the outcomes in a model’s training targets

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

Return type

A two-tuple with the transformed outcomes

subset_output(idcs)[source]

Subset the transform along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the transform to.

Returns

The current outcome transform, subset to the specified output indices.

Return type

botorch.models.transforms.outcome.OutcomeTransform

untransform(Y, Yvar=None)[source]

Un-transform previously transformed outcomes

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of transformed training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of transformed observation noises associated with the training targets (if applicable).

Returns

  • The un-transformed outcome observations.

  • The un-transformed observation noise (if applicable).

Return type

A two-tuple with the un-transformed outcomes

untransform_posterior(posterior)[source]

Un-transform a posterior

Parameters

posterior (botorch.posteriors.posterior.Posterior) – A posterior in the transformed space.

Returns

The un-transformed posterior.

Return type

botorch.posteriors.posterior.Posterior

training: bool
class botorch.models.transforms.outcome.Standardize(m, outputs=None, batch_shape=torch.Size([]), min_stdv=1e-08)[source]

Bases: botorch.models.transforms.outcome.OutcomeTransform

Standardize outcomes (zero mean, unit variance).

This module is stateful: If in train mode, calling forward updates the module state (i.e. the mean/std normalizing constants). If in eval mode, calling forward simply applies the standardization using the current module state.

Standardize outcomes (zero mean, unit variance).

Parameters
  • m (int) – The output dimension.

  • outputs (Optional[List[int]]) – Which of the outputs to standardize. If omitted, all outputs will be standardized.

  • batch_shape (torch.Size) – The batch_shape of the training targets.

  • min_stdv (float) – The minimum standard deviation for which to perform standardization (if lower, only de-mean the data).

Return type

None
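
The stateful behavior described above can be sketched with a small plain-Python class for a single output. This is an illustration, not the BoTorch implementation: the class name is hypothetical, the sample standard deviation is an assumed choice, and scaling the noise variance by the squared standard deviation is the usual rule for an affine transform.

```python
import statistics


class StandardizeSketch:
    """Single-output standardization that updates its state only in train mode."""

    def __init__(self, min_stdv=1e-8):
        self.training = True
        self.min_stdv = min_stdv
        self.mean = 0.0
        self.stdv = 1.0

    def forward(self, y, yvar=None):
        if self.training:
            # Train mode: refresh the normalizing constants from the data.
            self.mean = statistics.mean(y)
            stdv = statistics.stdev(y)
            # Below min_stdv, only de-mean the data.
            self.stdv = stdv if stdv >= self.min_stdv else 1.0
        y_tf = [(v - self.mean) / self.stdv for v in y]
        yvar_tf = None if yvar is None else [v / self.stdv**2 for v in yvar]
        return y_tf, yvar_tf

    def untransform(self, y, yvar=None):
        y_orig = [self.mean + self.stdv * v for v in y]
        yvar_orig = None if yvar is None else [v * self.stdv**2 for v in yvar]
        return y_orig, yvar_orig
```

In eval mode, new targets are standardized with the constants learned during training, so a later forward call on different data does not shift the scale the model was fitted on.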

forward(Y, Yvar=None)[source]

Standardize outcomes.

If the module is in train mode, this updates the module state (i.e. the mean/std normalizing constants). If the module is in eval mode, simply applies the normalization using the module state.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

Return type

A two-tuple with the transformed outcomes

subset_output(idcs)[source]

Subset the transform along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the transform to.

Returns

The current outcome transform, subset to the specified output indices.

Return type

botorch.models.transforms.outcome.OutcomeTransform

untransform(Y, Yvar=None)[source]

Un-standardize outcomes.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of standardized targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of standardized observation noises associated with the targets (if applicable).

Returns

  • The un-standardized outcome observations.

  • The un-standardized observation noise (if applicable).

Return type

A two-tuple with the un-standardized outcomes

untransform_posterior(posterior)[source]

Un-standardize the posterior.

Parameters

posterior (botorch.posteriors.posterior.Posterior) – A posterior in the standardized space.

Returns

The un-standardized posterior. If the input posterior is a MVN, the transformed posterior is again an MVN.

Return type

botorch.posteriors.posterior.Posterior

training: bool
class botorch.models.transforms.outcome.Log(outputs=None)[source]

Bases: botorch.models.transforms.outcome.OutcomeTransform

Log-transform outcomes.

Useful if the targets are modeled using a (multivariate) log-Normal distribution. This means that we can use a standard GP model on the log-transformed outcomes and un-transform the model posterior of that GP.

Log-transform outcomes.

Parameters

outputs (Optional[List[int]]) – Which of the outputs to log-transform. If omitted, all outputs will be log-transformed.

Return type

None
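
Un-transforming a posterior is not the same as exponentiating its mean. If log Y is Normal with mean mu and variance sigma^2, then Y is log-Normal with the standard moments computed below. The helper name is hypothetical, and this scalar sketch is not necessarily how untransform_posterior is implemented.

```python
import math


def lognormal_moments(mu, sigma2):
    """Mean and variance of Y when log(Y) ~ Normal(mu, sigma2)."""
    mean = math.exp(mu + 0.5 * sigma2)
    var = (math.exp(sigma2) - 1.0) * math.exp(2.0 * mu + sigma2)
    return mean, var
```

For any positive variance, the mean of Y exceeds exp(mu), which is the median of Y.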

subset_output(idcs)[source]

Subset the transform along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the transform to.

Returns

The current outcome transform, subset to the specified output indices.

Return type

botorch.models.transforms.outcome.OutcomeTransform

forward(Y, Yvar=None)[source]

Log-transform outcomes.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

Return type

A two-tuple with the transformed outcomes

untransform(Y, Yvar=None)[source]

Un-transform log-transformed outcomes

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of log-transformed targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of log-transformed observation noises associated with the training targets (if applicable).

Returns

  • The exponentiated outcome observations.

  • The exponentiated observation noise (if applicable).

Return type

A two-tuple with the un-transformed outcomes

untransform_posterior(posterior)[source]

Un-transform the log-transformed posterior.

Parameters

posterior (botorch.posteriors.posterior.Posterior) – A posterior in the log-transformed space.

Returns

The un-transformed posterior.

Return type

botorch.posteriors.posterior.Posterior

training: bool
class botorch.models.transforms.outcome.Power(power, outputs=None)[source]

Bases: botorch.models.transforms.outcome.OutcomeTransform

Power-transform outcomes.

Useful if the targets are modeled using a (multivariate) power transform of a Normal distribution. This means that we can use a standard GP model on the power-transformed outcomes and un-transform the model posterior of that GP.

Power-transform outcomes.

Parameters
  • outputs (Optional[List[int]]) – Which of the outputs to power-transform. If omitted, all outputs will be power-transformed.

  • power (float) –

Return type

None
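
A minimal round-trip sketch of the transform pair, assuming positive targets and a non-zero power (the function names are illustrative and operate on plain lists rather than tensors):

```python
def power_forward(y, power):
    """Raise each (positive) target to the given power."""
    return [v**power for v in y]


def power_untransform(y, power):
    """Invert power_forward by raising to the reciprocal power."""
    return [v ** (1.0 / power) for v in y]
```

Applying power_untransform to the output of power_forward with the same power recovers the original targets up to floating-point error.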

subset_output(idcs)[source]

Subset the transform along the output dimension.

Parameters

idcs (List[int]) – The output indices to subset the transform to.

Returns

The current outcome transform, subset to the specified output indices.

Return type

botorch.models.transforms.outcome.OutcomeTransform

forward(Y, Yvar=None)[source]

Power-transform outcomes.

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of training targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).

Returns

  • The transformed outcome observations.

  • The transformed observation noise (if applicable).

Return type

A two-tuple with the transformed outcomes

untransform(Y, Yvar=None)[source]

Un-transform power-transformed outcomes

Parameters
  • Y (torch.Tensor) – A batch_shape x n x m-dim tensor of power-transformed targets.

  • Yvar (Optional[torch.Tensor]) – A batch_shape x n x m-dim tensor of power-transformed observation noises associated with the training targets (if applicable).

Returns

  • The un-power transformed outcome observations.

  • The un-power transformed observation noise (if applicable).

Return type

A two-tuple with the un-transformed outcomes

untransform_posterior(posterior)[source]

Un-transform the power-transformed posterior.

Parameters

posterior (botorch.posteriors.posterior.Posterior) – A posterior in the power-transformed space.

Returns

The un-transformed posterior.

Return type

botorch.posteriors.posterior.Posterior

training: bool

Input Transforms

Input Transformations.

These classes implement a variety of transformations for input parameters including: learned input warping functions, rounding functions, and log transformations. The input transformation is typically part of a Model and applied within the model.forward() method.

class botorch.models.transforms.input.InputTransform[source]

Bases: abc.ABC

Abstract base class for input transforms.

Note: Input transforms must inherit from torch.nn.Module. This is deferred to the subclasses to avoid any potential conflict between gpytorch.module.Module and torch.nn.Module in Warp.

Properties:
  • transform_on_train: A boolean indicating whether to apply the transform in train() mode.

  • transform_on_eval: A boolean indicating whether to apply the transform in eval() mode.

  • transform_on_fantasize: A boolean indicating whether to apply the transform when called from within a fantasize call.

transform_on_eval: bool
transform_on_train: bool
transform_on_fantasize: bool
forward(X)[source]

Transform the inputs to a model.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n’ x d-dim tensor of transformed inputs.

Return type

torch.Tensor
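
The three flags decide whether forward applies transform or returns the inputs unchanged. The block below is a minimal plain-Python sketch of such a dispatch, not the library's implementation: the class SketchTransform, its training and fantasizing attributes, and the toy doubling transform are hypothetical stand-ins, and consulting the fantasize flag only in eval mode is an assumption.

```python
class SketchTransform:
    """Hypothetical stand-in for an input transform; not a BoTorch class."""

    def __init__(self, transform_on_train=True, transform_on_eval=True,
                 transform_on_fantasize=True):
        self.transform_on_train = transform_on_train
        self.transform_on_eval = transform_on_eval
        self.transform_on_fantasize = transform_on_fantasize
        self.training = True      # mirrors the train()/eval() mode of a module
        self.fantasizing = False  # assumed flag: True inside a fantasize call

    def transform(self, X):
        # Toy transform: double every entry of a list of floats.
        return [2.0 * x for x in X]

    def forward(self, X):
        if self.training:
            if self.transform_on_train:
                return self.transform(X)
        elif self.transform_on_eval:
            # Assumption: the fantasize flag is only consulted in eval mode.
            if not self.fantasizing or self.transform_on_fantasize:
                return self.transform(X)
        return X  # pass-through when the active mode is switched off


tf = SketchTransform(transform_on_train=False)
train_out = tf.forward([1.0, 2.0])  # train mode with the flag off: unchanged
tf.training = False
eval_out = tf.forward([1.0, 2.0])   # eval mode with the flag on: transformed
```

With transform_on_train=False, the sketch leaves training inputs untouched and only transforms inputs seen in eval mode, which is the default pattern used by AppendFeatures and InputPerturbation.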

abstract transform(X)[source]

Transform the inputs to a model.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of transformed inputs.

Return type

torch.Tensor

untransform(X)[source]

Un-transform the inputs to a model.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of transformed inputs.

Returns

A batch_shape x n x d-dim tensor of un-transformed inputs.

Return type

torch.Tensor

equals(other)[source]

Check if another input transform is equivalent.

Note: The reason that a custom equals method is defined rather than defining an __eq__ method is because defining an __eq__ method sets the __hash__ method to None. Hashing modules is currently used in pytorch. See https://github.com/pytorch/pytorch/issues/7733.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool
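
The note on __eq__ and __hash__ reflects standard Python behavior, which can be checked without any library code. In the sketch below, WithEq and WithEquals are hypothetical toy classes: the first overrides __eq__ and thereby loses hashability, while the second keeps the default hash by exposing a separate equals method.

```python
class WithEq:
    """Overrides __eq__ only, so Python sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.value == other.value


class WithEquals:
    """Keeps the default identity-based hash and compares via equals()."""

    def __init__(self, value):
        self.value = value

    def equals(self, other):
        return isinstance(other, WithEquals) and self.value == other.value


eq_hash_is_none = WithEq.__hash__ is None

try:
    hash(WithEq(1))
    eq_is_hashable = True
except TypeError:
    eq_is_hashable = False

a, b = WithEquals(1), WithEquals(1)
equals_is_hashable = isinstance(hash(a), int)  # usable as dict key or set member
same_content = a.equals(b)
```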

preprocess_transform(X)[source]

Apply transforms for preprocessing inputs.

The main use cases for this method are 1) to preprocess training data before calling set_train_data and 2) preprocess X_baseline for noisy acquisition functions so that X_baseline is “preprocessed” with the same transformations as the cached training inputs.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of (transformed) inputs.

Return type

torch.Tensor

class botorch.models.transforms.input.ChainedInputTransform(**transforms)[source]

Bases: botorch.models.transforms.input.InputTransform, torch.nn.modules.container.ModuleDict

An input transform representing the chaining of individual transforms.

Chaining of input transforms.

Parameters

transforms (InputTransform) – The transforms to chain. Internally, the names of the kwargs are used as the keys for accessing the individual transforms on the module.

Return type

None

Example

>>> tf1 = Normalize(d=2)
>>> tf2 = Normalize(d=2)
>>> tf = ChainedInputTransform(tf1=tf1, tf2=tf2)
>>> list(tf.keys())
['tf1', 'tf2']
>>> tf["tf1"]
Normalize()
transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
transform(X)[source]

Transform the inputs to a model.

Individual transforms are applied in sequence.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of transformed inputs.

Return type

torch.Tensor

untransform(X)[source]

Un-transform the inputs to a model.

Un-transforms of the individual transforms are applied in reverse sequence.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of transformed inputs.

Returns

A batch_shape x n x d-dim tensor of un-transformed inputs.

Return type

torch.Tensor
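
The ordering matters: if the chain applies tf1 and then tf2, undoing it requires inverting tf2 first. A minimal plain-Python sketch of this behavior follows; Shift, Scale, and Chain are hypothetical helpers operating on lists of floats, not BoTorch classes.

```python
class Shift:
    """Toy transform that adds a constant offset."""

    def __init__(self, offset):
        self.offset = offset

    def transform(self, X):
        return [x + self.offset for x in X]

    def untransform(self, X):
        return [x - self.offset for x in X]


class Scale:
    """Toy transform that multiplies by a constant factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, X):
        return [x * self.factor for x in X]

    def untransform(self, X):
        return [x / self.factor for x in X]


class Chain:
    """Applies transforms in keyword order and inverts them in reverse order."""

    def __init__(self, **transforms):
        self.transforms = dict(transforms)  # keyword order is preserved

    def transform(self, X):
        for tf in self.transforms.values():
            X = tf.transform(X)
        return X

    def untransform(self, X):
        for tf in reversed(list(self.transforms.values())):
            X = tf.untransform(X)
        return X


chain = Chain(tf1=Shift(1.0), tf2=Scale(2.0))
Y = chain.transform([0.0, 1.5])  # computes (x + 1) * 2 per entry
X_back = chain.untransform(Y)    # divides by 2 first, then subtracts 1
```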

equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

preprocess_transform(X)[source]

Apply transforms for preprocessing inputs.

The main use cases for this method are 1) to preprocess training data before calling set_train_data and 2) preprocess X_baseline for noisy acquisition functions so that X_baseline is “preprocessed” with the same transformations as the cached training inputs.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of (transformed) inputs.

Return type

torch.Tensor

class botorch.models.transforms.input.ReversibleInputTransform[source]

Bases: botorch.models.transforms.input.InputTransform, abc.ABC

An abstract class for a reversible input transform.

Properties:
  • reverse: A boolean indicating if the functionality of transform and untransform methods should be swapped.

reverse: bool
transform(X)[source]

Transform the inputs.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of transformed inputs.

Return type

torch.Tensor

untransform(X)[source]

Un-transform the inputs.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of un-transformed inputs.

Return type

torch.Tensor
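
The effect of the reverse flag can be illustrated with a small plain-Python sketch. SketchScale and its private _transform and _untransform helpers are hypothetical names chosen for this illustration; the point is only that reverse=True swaps which of the two operations each public method performs.

```python
class SketchScale:
    """Hypothetical reversible transform that multiplies inputs by `factor`."""

    def __init__(self, factor, reverse=False):
        self.factor = factor
        self.reverse = reverse

    def _transform(self, X):
        return [x * self.factor for x in X]

    def _untransform(self, X):
        return [x / self.factor for x in X]

    def transform(self, X):
        # With reverse=True the public transform performs the inverse map.
        return self._untransform(X) if self.reverse else self._transform(X)

    def untransform(self, X):
        return self._transform(X) if self.reverse else self._untransform(X)


forward_tf = SketchScale(4.0)
reversed_tf = SketchScale(4.0, reverse=True)

scaled = forward_tf.transform([1.0, 2.0])  # multiplies by 4
restored = reversed_tf.transform(scaled)   # divides by 4 again
```

This is what allows a reversed transform to act as an "unnormalize" step at the start of a chain, with a regular instance re-applying the forward map at the end.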

equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

class botorch.models.transforms.input.Normalize(d, indices=None, bounds=None, batch_shape=torch.Size([]), transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True, reverse=False, min_range=1e-08)[source]

Bases: botorch.models.transforms.input.ReversibleInputTransform, torch.nn.modules.module.Module

Normalize the inputs to the unit cube.

If no explicit bounds are provided this module is stateful: If in train mode, calling forward updates the module state (i.e. the normalizing bounds). If in eval mode, calling forward simply applies the normalization using the current module state.

Parameters
  • d (int) – The dimension of the input space.

  • indices (Optional[List[int]]) – The indices of the inputs to normalize. If omitted, take all dimensions of the inputs into account.

  • bounds (Optional[Tensor]) – If provided, use these bounds to normalize the inputs. If omitted, learn the bounds in train mode.

  • batch_shape (torch.Size) – The batch shape of the inputs (assuming input tensors of shape batch_shape x n x d). If provided, perform individual normalization per batch, otherwise use a single normalization.

  • transform_on_train (bool) – A boolean indicating whether to apply the transform in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

  • reverse (bool) – A boolean indicating whether the forward pass should untransform the inputs.

  • min_range (float) – Amount of noise to add to the range to ensure no division by zero errors.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
reverse: bool
property bounds: torch.Tensor

The bounds used for normalizing the inputs.
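
As a rough illustration of what normalizing to the unit cube means, the plain-Python sketch below learns column-wise bounds from data and applies the affine map and its inverse. The helpers learn_bounds, normalize, and unnormalize are hypothetical, the (lower, upper) layout of the bounds is an assumption, and the sketch deliberately ignores min_range: a column with zero width would divide by zero here, which is the situation that parameter guards against.

```python
def learn_bounds(X):
    """Column-wise minimum and maximum of X, a list of n rows of length d."""
    columns = list(zip(*X))
    return [min(c) for c in columns], [max(c) for c in columns]


def normalize(X, bounds):
    """Map every column of X to [0, 1] given (lower, upper) bounds."""
    lower, upper = bounds
    return [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(row, lower, upper)]
        for row in X
    ]


def unnormalize(X_unit, bounds):
    """Inverse map, i.e. what a transform with reverse=True would apply."""
    lower, upper = bounds
    return [
        [lo + x * (hi - lo) for x, lo, hi in zip(row, lower, upper)]
        for row in X_unit
    ]


X = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
bounds = learn_bounds(X)              # lower = [0.0, 10.0], upper = [10.0, 30.0]
X_unit = normalize(X, bounds)         # every column now spans [0, 1]
X_back = unnormalize(X_unit, bounds)  # recovers the raw inputs
```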

equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

class botorch.models.transforms.input.InputStandardize(d, indices=None, batch_shape=torch.Size([]), transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True, reverse=False, min_std=1e-08)[source]

Bases: botorch.models.transforms.input.ReversibleInputTransform, torch.nn.modules.module.Module

Standardize inputs (zero mean, unit variance).

In train mode, calling forward updates the module state (i.e. the mean/std normalizing constants). If in eval mode, calling forward simply applies the standardization using the current module state.

Parameters
  • d (int) – The dimension of the input space.

  • indices (Optional[List[int]]) – The indices of the inputs to standardize. If omitted, take all dimensions of the inputs into account.

  • batch_shape (torch.Size) – The batch shape of the inputs (assuming input tensors of shape batch_shape x n x d). If provided, perform individual standardization per batch, otherwise use a single standardization.

  • transform_on_train (bool) – A boolean indicating whether to apply the transform in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

  • reverse (bool) – A boolean indicating whether the forward pass should untransform the inputs.

  • min_std (float) – Amount of noise to add to the standard deviation to ensure no division by zero errors.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
reverse: bool
equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

class botorch.models.transforms.input.Round(indices, transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True, approximate=True, tau=0.001)[source]

Bases: botorch.models.transforms.input.InputTransform, torch.nn.modules.module.Module

A rounding transformation for integer inputs.

This will typically be used in conjunction with normalization as follows:

In eval() mode (i.e. after training), the inputs passed to the transform would typically be normalized to the unit cube (e.g. during candidate optimization). The rounding then proceeds in three steps:

  1. The inputs are unnormalized back to the raw input space.

  2. The integers are rounded.

  3. All values are normalized to the unit cube.

In train() mode, the inputs can either (a) be normalized to the unit cube or (b) provided using their raw values. In the case of (a) transform_on_train should be set to True, so that the normalized inputs are unnormalized before rounding. In the case of (b) transform_on_train should be set to False, so that the raw inputs are rounded and then normalized to the unit cube.

This transformation uses differentiable approximate rounding by default. The rounding function is approximated with a piece-wise function where each piece is a hyperbolic tangent function.

Example

>>> unnormalize_tf = Normalize(
...     d=d,
...     bounds=bounds,
...     transform_on_eval=True,
...     transform_on_train=True,
...     reverse=True,
... )
>>> round_tf = Round(integer_indices)
>>> normalize_tf = Normalize(d=d, bounds=bounds)
>>> tf = ChainedInputTransform(
...     tf1=unnormalize_tf, tf2=round_tf, tf3=normalize_tf
... )
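
The arithmetic performed by such an unnormalize, round, normalize chain can be traced in plain Python. The function round_integers_in_unit_cube below is a hypothetical helper for a single input row; it uses exact rounding via the built-in round (which rounds ties to the nearest even integer) and assumes bounds given as (lower, upper) lists.

```python
def round_integers_in_unit_cube(row, bounds, integer_indices):
    """Snap the integer dimensions of a unit-cube point to valid integers."""
    lower, upper = bounds
    # Step 1: unnormalize back to the raw input space.
    raw = [lo + x * (hi - lo) for x, lo, hi in zip(row, lower, upper)]
    # Step 2: round only the integer-valued dimensions.
    for i in integer_indices:
        raw[i] = float(round(raw[i]))
    # Step 3: normalize everything back to the unit cube.
    return [(r - lo) / (hi - lo) for r, lo, hi in zip(raw, lower, upper)]


bounds = ([0.0, 0.0], [10.0, 1.0])  # first dimension is an integer in 0..10
snapped = round_integers_in_unit_cube([0.33, 0.33], bounds, integer_indices=[0])
# raw value 3.3 rounds to 3, so the first coordinate becomes 0.3;
# the continuous second coordinate is left at 0.33
```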

Initialize transform.

Parameters
  • indices (List[int]) – The indices of the integer inputs.

  • transform_on_train (bool) – A boolean indicating whether to apply the transforms in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

  • approximate (bool) – A boolean indicating whether approximate or exact rounding should be used. Default: True (approximate rounding).

  • tau (float) – The temperature parameter for approximate rounding.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
transform(X)[source]

Round the inputs.

Parameters

X (torch.Tensor) – A batch_shape x n x d-dim tensor of inputs.

Returns

A batch_shape x n x d-dim tensor of rounded inputs.

Return type

torch.Tensor
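
One way to build a differentiable rounding function from hyperbolic tangent pieces is sketched below in plain Python. This is an illustration of the idea under stated assumptions, not necessarily the exact formula used by the library: the function name approx_round is hypothetical, and each unit interval is assumed to carry a tanh step centered at its midpoint whose sharpness is set by tau.

```python
import math


def approx_round(x, tau=1e-3):
    """Smooth stand-in for rounding: one tanh step per unit interval."""
    offset = math.floor(x)
    # Signed distance from the midpoint of the interval, scaled by temperature.
    scaled_remainder = (x - offset - 0.5) / tau
    # Maps to (0, 1): close to 0 below the midpoint and close to 1 above it.
    step = (math.tanh(scaled_remainder) + 1.0) / 2.0
    return offset + step


low = approx_round(2.3)             # very close to 2.0 for small tau
high = approx_round(2.7)            # very close to 3.0 for small tau
mid = approx_round(2.5)             # exactly halfway at the midpoint
soft = approx_round(2.7, tau=1.0)   # a large tau gives a much softer step
```

A smaller tau gives a sharper step that is closer to exact rounding, at the cost of gradients that vanish almost everywhere except near the midpoints.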

equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

class botorch.models.transforms.input.Log10(indices, transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True, reverse=False)[source]

Bases: botorch.models.transforms.input.ReversibleInputTransform, torch.nn.modules.module.Module

A base-10 log transformation.

Initialize transform.

Parameters
  • indices (List[int]) – The indices of the inputs to log transform.

  • transform_on_train (bool) – A boolean indicating whether to apply the transforms in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

  • reverse (bool) – A boolean indicating whether the forward pass should untransform the inputs.

Return type

None
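
As a small illustration of the indices and reverse arguments, the plain-Python sketch below applies a base-10 logarithm to selected columns of a single input row and inverts it again. The helper log10_transform is hypothetical and only mirrors the documented semantics.

```python
import math


def log10_transform(row, indices, reverse=False):
    """Apply log10 (or its inverse, 10**x) to the selected entries of a row."""
    out = list(row)
    for i in indices:
        out[i] = 10.0 ** out[i] if reverse else math.log10(out[i])
    return out


logged = log10_transform([100.0, 7.0], indices=[0])  # only column 0 changes
restored = log10_transform(logged, indices=[0], reverse=True)
```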

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
reverse: bool
class botorch.models.transforms.input.Warp(indices, transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True, reverse=False, eps=1e-07, concentration1_prior=None, concentration0_prior=None, batch_shape=None)[source]

Bases: botorch.models.transforms.input.ReversibleInputTransform, gpytorch.module.Module

A transform that uses learned input warping functions.

Each specified input dimension is warped using the CDF of a Kumaraswamy distribution. Typically, MAP estimates of the parameters of the Kumaraswamy distribution, for each input dimension, are learned jointly with the GP hyperparameters.

TODO: implement support using independent warping functions for each output in batched multi-output and multi-task models.

For now, ModelListGPs should be used to learn independent warping functions for each output.

Initialize transform.

Parameters
  • indices (List[int]) – The indices of the inputs to warp.

  • transform_on_train (bool) – A boolean indicating whether to apply the transforms in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

  • reverse (bool) – A boolean indicating whether the forward pass should untransform the inputs.

  • eps (float) – A small value used to clip values to be in the interval (0, 1).

  • concentration1_prior (Optional[Prior]) – A prior distribution on the concentration1 parameter of the Kumaraswamy distribution.

  • concentration0_prior (Optional[Prior]) – A prior distribution on the concentration0 parameter of the Kumaraswamy distribution.

  • batch_shape (Optional[torch.Size]) – The batch shape.

Return type

None
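
For reference, the Kumaraswamy CDF with parameters a and b is F(x) = 1 - (1 - x**a)**b on the unit interval. The plain-Python sketch below evaluates it for a single value. The function name kumaraswamy_cdf is hypothetical, treating concentration1 as a and concentration0 as b follows the convention of torch.distributions.Kumaraswamy as far as the author of this sketch is aware, and clamping to [eps, 1 - eps] is an assumption about how eps keeps inputs inside (0, 1).

```python
def kumaraswamy_cdf(x, concentration1, concentration0, eps=1e-7):
    """Warp a value in [0, 1] with the Kumaraswamy CDF."""
    # Keep x strictly inside (0, 1) so powers and gradients stay well behaved.
    x = min(max(x, eps), 1.0 - eps)
    return 1.0 - (1.0 - x ** concentration1) ** concentration0


identity = kumaraswamy_cdf(0.3, 1.0, 1.0)  # a = b = 1 leaves inputs unchanged
warped = kumaraswamy_cdf(0.5, 2.0, 1.0)    # a = 2, b = 1 gives x**2
```

With both concentrations equal to one the warping is the identity, so a prior centered there expresses a preference for leaving an input dimension unwarped unless the data suggest otherwise.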

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
reverse: bool
class botorch.models.transforms.input.AppendFeatures(feature_set, transform_on_train=False, transform_on_eval=True, transform_on_fantasize=False)[source]

Bases: botorch.models.transforms.input.InputTransform, torch.nn.modules.module.Module

A transform that appends the input with a given set of features.

As an example, this can be used with RiskMeasureMCObjective to optimize risk measures as described in [Cakmak2020risk]. A tutorial notebook implementing the rhoKG acquisition function introduced in [Cakmak2020risk] can be found at https://botorch.org/tutorials/risk_averse_bo_with_environmental_variables.

The steps for using this to obtain samples of a risk measure are as follows:

  • Train a model on (x, w) inputs and the corresponding observations;

  • Pass in an instance of AppendFeatures with the feature_set denoting the samples of W as the input_transform to the trained model;

  • Call posterior(…).rsample(…) on the model with x inputs only to get the joint posterior samples over the (x, w) pairs, where the w values come from the feature_set;

  • Pass these posterior samples through the RiskMeasureMCObjective of choice to get the samples of the risk measure.

Note: The samples of the risk measure obtained this way are in general biased since the feature_set does not fully represent the distribution of the environmental variable.

Example

>>> # We consider 1D `x` and 1D `w`, with `W` having a
>>> # uniform distribution over [0, 1]
>>> model = SingleTaskGP(
...     train_X=torch.rand(10, 2),
...     train_Y=torch.randn(10, 1),
...     input_transform=AppendFeatures(feature_set=torch.rand(10, 1))
... )
>>> mll = ExactMarginalLogLikelihood(model.likelihood, model)
>>> fit_gpytorch_model(mll)
>>> test_x = torch.rand(3, 1)
>>> # `posterior_samples` is a `10 x 30 x 1`-dim tensor
>>> posterior_samples = model.posterior(test_x).rsample(torch.Size([10]))
>>> risk_measure = VaR(alpha=0.8, n_w=10)
>>> # `risk_measure_samples` is a `10 x 3`-dim tensor of samples of the
>>> # risk measure VaR
>>> risk_measure_samples = risk_measure(posterior_samples)

Append feature_set to each input.

Parameters
  • feature_set (Tensor) – An n_f x d_f-dim tensor denoting the features to be appended to the inputs.

  • transform_on_train (bool) – A boolean indicating whether to apply the transforms in train() mode. Default: False.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: False.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
transform(X)[source]

Transform the inputs by appending feature_set to each input.

For each 1 x d-dim element in the input tensor, this will produce an n_f x (d + d_f)-dim tensor with feature_set appended as the last d_f dimensions. For a generic batch_shape x q x d-dim X, this translates to a batch_shape x (q * n_f) x (d + d_f)-dim output, where the values corresponding to X[…, i, :] are found in output[…, i * n_f: (i + 1) * n_f, :].

Note: Adding the feature_set on the q-batch dimension is necessary to avoid introducing additional bias by evaluating the inputs on independent GP sample paths.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim tensor of inputs.

Returns

A batch_shape x (q * n_f) x (d + d_f)-dim tensor of appended inputs.

Return type

torch.Tensor
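
The output layout described above can be reproduced with nested lists. The sketch below handles a single q x d input without batch dimensions; append_features is a hypothetical helper and not the library routine.

```python
def append_features(X, feature_set):
    """Pair every row of X (q x d) with every row of feature_set (n_f x d_f)."""
    return [list(x) + list(f) for x in X for f in feature_set]


X = [[0.1], [0.2], [0.3]]              # q = 3, d = 1
feature_set = [[1.0], [2.0]]           # n_f = 2, d_f = 1
out = append_features(X, feature_set)  # (q * n_f) x (d + d_f) = 6 x 2

n_f = len(feature_set)
rows_for_second_input = out[1 * n_f:(1 + 1) * n_f]  # rows derived from X[1]
```

Because the expansion happens along the q dimension, all q * n_f points are evaluated jointly, so each posterior sample is drawn from a single sample path across every (x, w) pair.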

class botorch.models.transforms.input.FilterFeatures(feature_indices, transform_on_train=True, transform_on_eval=True, transform_on_fantasize=True)[source]

Bases: botorch.models.transforms.input.InputTransform, torch.nn.modules.module.Module

A transform that filters the input with a given set of feature indices.

As an example, this can be used in a multiobjective optimization with ModelListGP in which the specific models only share subsets of features (feature selection). A reason could be that it is known that specific features do not have any impact on a specific objective but they need to be included in the model for another one.

Filter features from a model.

Parameters
  • feature_indices (Tensor) – A one-dim tensor denoting the indices of the features to be kept and fed to the model.

  • transform_on_train (bool) – A boolean indicating whether to apply the transform in train() mode. Default: True.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: True.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
transform(X)[source]

Transform the inputs by keeping only the features specified in feature_indices and filtering out the others.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim tensor of inputs.

Returns

A batch_shape x q x e-dim tensor of filtered inputs, where e is the length of feature_indices.

Return type

torch.Tensor

equals(other)[source]

Check if another input transform is equivalent.

Parameters

other (botorch.models.transforms.input.InputTransform) – Another input transform.

Returns

A boolean indicating if the other transform is equivalent.

Return type

bool

class botorch.models.transforms.input.InputPerturbation(perturbation_set, bounds=None, multiplicative=False, transform_on_train=False, transform_on_eval=True, transform_on_fantasize=False)[source]

Bases: botorch.models.transforms.input.InputTransform, torch.nn.modules.module.Module

A transform that adds the set of perturbations to the given input.

Similar to AppendFeatures, this can be used with RiskMeasureMCObjective to optimize risk measures. See AppendFeatures for additional discussion on optimizing risk measures.

A tutorial notebook using this with qNoisyExpectedImprovement can be found at https://botorch.org/tutorials/risk_averse_bo_with_input_perturbations.

Add perturbation_set to each input.

Parameters
  • perturbation_set (Union[Tensor, Callable[[Tensor], Tensor]]) – An n_p x d-dim tensor denoting the perturbations to be added to the inputs. Alternatively, this can be a callable that returns a batch x n_p x d-dim tensor of perturbations for an input of shape batch x d. This is useful for heteroscedastic perturbations.

  • bounds (Optional[Tensor]) – A 2 x d-dim tensor of lower and upper bounds for each column of the input. If given, the perturbed inputs will be clamped to these bounds.

  • multiplicative (bool) – A boolean indicating whether the input perturbations are additive or multiplicative. If True, inputs will be multiplied with the perturbations.

  • transform_on_train (bool) – A boolean indicating whether to apply the transforms in train() mode. Default: False.

  • transform_on_eval (bool) – A boolean indicating whether to apply the transform in eval() mode. Default: True.

  • transform_on_fantasize (bool) – A boolean indicating whether to apply the transform when called from within a fantasize call. Default: False.

Return type

None

transform_on_train: bool
transform_on_eval: bool
transform_on_fantasize: bool
transform(X)[source]

Transform the inputs by adding perturbation_set to each input.

For each 1 x d-dim element in the input tensor, this will produce an n_p x d-dim tensor with the perturbation_set added to the input. For a generic batch_shape x q x d-dim X, this translates to a batch_shape x (q * n_p) x d-dim output, where the values corresponding to X[…, i, :] are found in output[…, i * n_p: (i + 1) * n_p, :].

Note: Adding the perturbation_set on the q-batch dimension is necessary to avoid introducing additional bias by evaluating the inputs on independent GP sample paths.

Parameters

X (torch.Tensor) – A batch_shape x q x d-dim tensor of inputs.

Returns

A batch_shape x (q * n_p) x d-dim tensor of perturbed inputs.

Return type

torch.Tensor
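
A plain-Python sketch of the same expansion for a single q x d input is given below. The helper perturb is hypothetical; it covers a fixed perturbation set (not the callable variant), the additive and multiplicative modes, and optional clamping, with bounds assumed to be a (lower, upper) pair of lists.

```python
def perturb(X, perturbation_set, bounds=None, multiplicative=False):
    """Combine every row of X (q x d) with every perturbation (n_p x d)."""
    out = []
    for x in X:
        for p in perturbation_set:
            if multiplicative:
                row = [xi * pi for xi, pi in zip(x, p)]
            else:
                row = [xi + pi for xi, pi in zip(x, p)]
            if bounds is not None:
                lower, upper = bounds
                # Clamp each coordinate to its [lower, upper] interval.
                row = [min(max(v, lo), hi) for v, lo, hi in zip(row, lower, upper)]
            out.append(row)
    return out


X = [[0.5], [0.9]]                           # q = 2, d = 1
perturbation_set = [[-0.25], [0.0], [0.25]]  # n_p = 3
out = perturb(X, perturbation_set, bounds=([0.0], [1.0]))
# rows 0..2 belong to X[0] and rows 3..5 to X[1]; 0.9 + 0.25 is clamped to 1.0
```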

Transform Utilities

botorch.models.transforms.utils.lognorm_to_norm(mu, Cov)[source]

Compute mean and covariance of a MVN from those of the associated log-MVN

If Y is log-normal with mean mu_ln and covariance Cov_ln, then X = log(Y) ~ N(mu_n, Cov_n) with

Cov_n_{ij} = log(1 + Cov_ln_{ij} / (mu_ln_{i} * mu_ln_{j}))

mu_n_{i} = log(mu_ln_{i}) - 0.5 * log(1 + Cov_ln_{ii} / mu_ln_{i}**2)

Parameters
  • mu (torch.Tensor) – A batch_shape x n mean vector of the log-Normal distribution.

  • Cov (torch.Tensor) – A batch_shape x n x n covariance matrix of the log-Normal distribution.

Returns

A two-tuple containing:

  • The batch_shape x n mean vector of the Normal distribution.

  • The batch_shape x n x n covariance matrix of the Normal distribution.

botorch.models.transforms.utils.norm_to_lognorm(mu, Cov)[source]

Compute mean and covariance of a log-MVN from its MVN sufficient statistics

If X ~ N(mu, Cov) and Y = exp(X), then Y is log-normal with

mu_ln_{i} = exp(mu_{i} + 0.5 * Cov_{ii})

Cov_ln_{ij} = exp(mu_{i} + mu_{j} + 0.5 * (Cov_{ii} + Cov_{jj})) * (exp(Cov_{ij}) - 1)

Parameters
  • mu (torch.Tensor) – A batch_shape x n mean vector of the Normal distribution.

  • Cov (torch.Tensor) – A batch_shape x n x n covariance matrix of the Normal distribution.

Returns

A two-tuple containing:

  • The batch_shape x n mean vector of the log-Normal distribution.

  • The batch_shape x n x n covariance matrix of the log-Normal distribution.
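
The moment formulas for the two directions are inverses of each other, which can be verified numerically. The sketch below is a plain-Python version for unbatched inputs given as nested lists, not the tensor implementation; the function names mirror the documented utilities purely for readability. It relies on the identity exp(mu_{i} + mu_{j} + 0.5 * (Cov_{ii} + Cov_{jj})) = mu_ln_{i} * mu_ln_{j}, and the inverse uses mu_ln_{i} * mu_ln_{j} in the denominator of the covariance.

```python
import math


def norm_to_lognorm(mu, cov):
    """Moments of Y = exp(X) for X ~ N(mu, cov)."""
    n = len(mu)
    mu_ln = [math.exp(mu[i] + 0.5 * cov[i][i]) for i in range(n)]
    cov_ln = [
        [mu_ln[i] * mu_ln[j] * (math.exp(cov[i][j]) - 1.0) for j in range(n)]
        for i in range(n)
    ]
    return mu_ln, cov_ln


def lognorm_to_norm(mu_ln, cov_ln):
    """Moments of X = log(Y) given the mean and covariance of log-normal Y."""
    n = len(mu_ln)
    cov = [
        [math.log(1.0 + cov_ln[i][j] / (mu_ln[i] * mu_ln[j])) for j in range(n)]
        for i in range(n)
    ]
    mu = [math.log(mu_ln[i]) - 0.5 * cov[i][i] for i in range(n)]
    return mu, cov


mu = [0.1, -0.3]
cov = [[0.5, 0.2], [0.2, 0.4]]
mu_ln, cov_ln = norm_to_lognorm(mu, cov)
mu_back, cov_back = lognorm_to_norm(mu_ln, cov_ln)  # round trip recovers mu, cov
```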

botorch.models.transforms.utils.norm_to_lognorm_mean(mu, var)[source]

Compute mean of a log-MVN from its MVN marginals

Parameters
  • mu (torch.Tensor) – A batch_shape x n mean vector of the Normal distribution.

  • var (torch.Tensor) – A batch_shape x n variance vector of the Normal distribution.

Returns

The batch_shape x n mean vector of the log-Normal distribution

Return type

torch.Tensor

botorch.models.transforms.utils.norm_to_lognorm_variance(mu, var)[source]

Compute variance of a log-MVN from its MVN marginals

Parameters
  • mu (torch.Tensor) – A batch_shape x n mean vector of the Normal distribution.

  • var (torch.Tensor) – A batch_shape x n variance vector of the Normal distribution.

Returns

The batch_shape x n variance vector of the log-Normal distribution.

Return type

torch.Tensor

botorch.models.transforms.utils.expand_and_copy_tensor(X, batch_shape)[source]

Expand and copy X according to batch_shape.

Parameters
  • X (torch.Tensor) – An input_batch_shape x n x d-dim tensor of inputs.

  • batch_shape (torch.Size) – The new batch shape

Returns

An input_batch_shape x batch_shape x n x d-dim tensor of inputs.

Return type

torch.Tensor

Utilities

Dataset Parsing

Model Conversion

Utilities for converting between different models.

botorch.models.converter.model_list_to_batched(model_list)[source]

Convert a ModelListGP to a BatchedMultiOutputGPyTorchModel.

Parameters

model_list (botorch.models.model_list_gp_regression.ModelListGP) – The ModelListGP to be converted to the appropriate BatchedMultiOutputGPyTorchModel. All sub-models must be of the same type and have the same shape (batch shape and number of training inputs).

Returns

The model converted into a BatchedMultiOutputGPyTorchModel.

Return type

botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel

Example

>>> list_gp = ModelListGP(gp1, gp2)
>>> batch_gp = model_list_to_batched(list_gp)
botorch.models.converter.batched_to_model_list(batch_model)[source]

Convert a BatchedMultiOutputGPyTorchModel to a ModelListGP.

Parameters

batch_model (botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel) – The BatchedMultiOutputGPyTorchModel to be converted to a ModelListGP.

Returns

The model converted into a ModelListGP.

Return type

botorch.models.model_list_gp_regression.ModelListGP

Example

>>> train_X = torch.rand(5, 2)
>>> train_Y = torch.rand(5, 2)
>>> batch_gp = SingleTaskGP(train_X, train_Y)
>>> list_gp = batched_to_model_list(batch_gp)
botorch.models.converter.batched_multi_output_to_single_output(batch_mo_model)[source]

Convert a model from batched multi-output to a batched single-output.

Note: the underlying GPyTorch GP does not change. The GPyTorch GP’s batch_shape (referred to as _aug_batch_shape) is still _input_batch_shape x num_outputs. The only things that change are the attributes of the BatchedMultiOutputGPyTorchModel that are responsible for the internal accounting of the number of outputs, namely num_outputs, _input_batch_shape, and _aug_batch_shape. Initially, for the batched MO model, these are: num_outputs = m, _input_batch_shape = train_X.batch_shape, and _aug_batch_shape = train_X.batch_shape + torch.Size([num_outputs]). In the new SO model, these are: num_outputs = 1, _input_batch_shape = train_X.batch_shape + torch.Size([m]), and _aug_batch_shape = train_X.batch_shape + torch.Size([m]).

This is a (hopefully) temporary measure until multi-output MVNs with independent outputs have better support in GPyTorch (see https://github.com/cornellius-gp/gpytorch/pull/1083).
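
The bookkeeping described in the note can be written out with plain tuples for a concrete case. The helper output_accounting below is hypothetical and only restates the attribute values listed above; it does not touch any model.

```python
def output_accounting(train_X_batch_shape, m):
    """Attribute values before and after the conversion, as plain tuples."""
    batch = tuple(train_X_batch_shape)
    multi_output = {
        "num_outputs": m,
        "_input_batch_shape": batch,
        "_aug_batch_shape": batch + (m,),
    }
    single_output = {
        "num_outputs": 1,
        "_input_batch_shape": batch + (m,),  # outputs folded into the batch
        "_aug_batch_shape": batch + (m,),    # unchanged by the conversion
    }
    return multi_output, single_output


# Example: training inputs with batch shape (3,) and m = 2 outputs.
mo, so = output_accounting(train_X_batch_shape=(3,), m=2)
```

In this example the augmented batch shape is (3, 2) in both cases; only the interpretation of the trailing dimension moves from "outputs" to "input batch".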

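Parameters

batch_mo_model (botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel) – The batched multi-output model to be converted into a batched single-output model.
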
Returns

The model converted into a batch single-output model.

Return type

botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel

Example

>>> train_X = torch.rand(5, 2)
>>> train_Y = torch.rand(5, 2)
>>> batch_mo_gp = SingleTaskGP(train_X, train_Y)
>>> batch_so_gp = batched_multi_output_to_single_output(batch_mo_gp)

Other Utilities