botorch.models¶
Abstract Model API¶
Model¶

class
botorch.models.model.
Model
[source]¶ Abstract base class for BoTorch models.

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
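The broadcasting requirement on the batch shape of Y can be checked before calling condition_on_observations. The following is a minimal sketch in plain Python; is_broadcastable_to is a hypothetical helper written for illustration and is not part of BoTorch. It assumes the usual right-aligned broadcasting rule, where a dimension is compatible if it is equal to the target dimension or is 1.

```python
from itertools import zip_longest


def is_broadcastable_to(src_shape, dst_shape):
    """Return True if src_shape can be broadcast to dst_shape.

    Hypothetical helper (not a BoTorch function). Shapes are compared
    from the trailing dimension backwards; missing leading dimensions
    of src_shape are treated as size 1.
    """
    if len(src_shape) > len(dst_shape):
        return False
    pairs = zip_longest(reversed(src_shape), reversed(dst_shape), fillvalue=1)
    return all(s == 1 or s == d for s, d in pairs)


# Batch shape of Y (left) against batch shape of X (right)
print(is_broadcastable_to((3,), (2, 3)))    # fewer batch dimensions than X
print(is_broadcastable_to((2, 1), (2, 3)))  # size-1 dimension is expanded
print(is_broadcastable_to((4,), (2, 3)))    # mismatch
```

Only the batch dimensions are passed in here; the trailing n’ x m dimensions of Y are not part of the check.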

fantasize
(X, sampler, observation_noise=True, **kwargs)[source]¶ Construct a fantasy model.
Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (including observation noise if observation_noise=True). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
sampler (MCSampler) – The sampler used for sampling from the posterior at X.
observation_noise (bool) – If True, include observation noise.
 Return type
 Returns
The constructed fantasy model.
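The three fantasizing steps can be illustrated on a toy problem. The sketch below is not BoTorch code: ToyGaussianModel is a hypothetical stand-in that models a single unknown scalar with a Normal belief and known noise variance, so it has no input X and uses Python's random module in place of an MCSampler. It only shows the posterior, sample, condition pattern.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyGaussianModel:
    """Hypothetical toy model: Normal belief over one unknown scalar."""

    mean: float       # current belief mean
    var: float        # current belief variance
    noise_var: float  # known observation noise variance

    def posterior(self, observation_noise=False):
        # Predictive (mean, variance); optionally add observation noise.
        extra = self.noise_var if observation_noise else 0.0
        return self.mean, self.var + extra

    def condition_on_observations(self, y):
        # Conjugate Normal update with known noise variance.
        new_var = 1.0 / (1.0 / self.var + 1.0 / self.noise_var)
        new_mean = new_var * (self.mean / self.var + y / self.noise_var)
        return ToyGaussianModel(new_mean, new_var, self.noise_var)

    def fantasize(self, rng, num_samples, observation_noise=True):
        # (1) compute the posterior, (2) sample fake observations,
        # (3) condition a copy of the model on each fake observation.
        mean, var = self.posterior(observation_noise=observation_noise)
        fakes = [rng.gauss(mean, var ** 0.5) for _ in range(num_samples)]
        return [self.condition_on_observations(y) for y in fakes]


model = ToyGaussianModel(mean=0.0, var=1.0, noise_var=0.5)
fantasies = model.fantasize(random.Random(0), num_samples=4)
print(len(fantasies), fantasies[0].var < model.var)
```

Each fantasy model has the same reduced variance but a different mean, because the variance update does not depend on the sampled value.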

abstract
posterior
(X, output_indices=None, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A Posterior object, representing a batch of b joint distributions over q points and m outputs each.

GPyTorchModel¶

class
botorch.models.gpytorch.
GPyTorchModel
[source]¶ Abstract base class for models based on GPyTorch models.
The easiest way to use this is to subclass a model from a GPyTorch model class (e.g. an ExactGP) and this GPyTorchModel. See e.g. SingleTaskGP.

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)

posterior
(X, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A GPyTorchPosterior object, representing a batch of b joint distributions over q points. Includes observation noise if observation_noise=True.

BatchedMultiOutputGPyTorchModel¶

class
botorch.models.gpytorch.
BatchedMultiOutputGPyTorchModel
[source]¶ Base class for batched multi-output GPyTorch models with independent outputs.
This model should be used when the same training data is used for all outputs. Outputs are modeled independently by using a different batch for each output.
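The idea of treating each output as its own batch can be pictured with plain nested lists. This is only a conceptual sketch under stated assumptions: split_outputs_into_batches is a hypothetical helper, and the actual tensor layout used internally by the class is not shown in this reference.

```python
def split_outputs_into_batches(train_X, train_Y):
    """Pair the shared inputs with each output column of train_Y.

    Hypothetical helper for illustration. train_X is a list of n feature
    rows and train_Y is a list of n rows with m outputs each. The result
    has m entries, one per output, all pointing at the same train_X.
    """
    num_outputs = len(train_Y[0])
    return [
        (train_X, [row[j] for row in train_Y])
        for j in range(num_outputs)
    ]


train_X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]        # n = 3, d = 2
train_Y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]     # n = 3, m = 2
batches = split_outputs_into_batches(train_X, train_Y)
print(len(batches))     # one batch per output
print(batches[1][1])    # observations of the second output
```

Because every batch shares the same inputs, this layout only applies when all outputs are observed at the same training points.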

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.stack(
...     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
... )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)

posterior
(X, output_indices=None, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes observation noise if observation_noise=True.

ModelListGPyTorchModel¶

class
botorch.models.gpytorch.
ModelListGPyTorchModel
[source]¶ Abstract base class for models based on multi-output GPyTorch models.
This is meant to be used with a gpytorch ModelList wrapper for independent evaluation of submodels.

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)

abstract property
num_outputs
¶ The number of outputs of the model.
 Return type
int

posterior
(X, output_indices=None, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes measurement noise if observation_noise=True.

MultiTaskGPyTorchModel¶

class
botorch.models.gpytorch.
MultiTaskGPyTorchModel
[source]¶ Abstract base class for multi-task models based on GPyTorch models.
This class provides the posterior method to models that implement a “long-format” multi-task GP in the style of MultiTaskGP.

posterior
(X, output_indices=None, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A q x d or batch_shape x q x d (batch mode) tensor, where d is the dimension of the feature space (not including task indices) and q is the number of points considered jointly.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs are required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices. Includes measurement noise if observation_noise=True.

GPyTorch Regression Models¶
SingleTaskGP¶

class
botorch.models.gp_regression.
SingleTaskGP
(train_X, train_Y, likelihood=None, covar_module=None)[source]¶ A single-task exact GP model.
A single-task exact GP using relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model works in batch mode (each batch having its own hyperparameters). When the training observations include multiple outputs, this model will use batching to model outputs independently.
Use this model when you have independent output(s) and all outputs use the same training data. If outputs are independent but have different training data, use the ModelListGP. When modeling correlations between outputs, use the MultiTaskGP.
 Parameters
train_X (Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.
likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.
covar_module (Optional[Module]) – The covariance (kernel) matrix. If omitted, use the MaternKernel.
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> model = SingleTaskGP(train_X, train_Y)
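The description of this model recommends covariates normalized to the unit cube and outcomes standardized to zero mean and unit variance. A minimal sketch of that preprocessing in plain Python follows. Both function names are hypothetical helpers written for this illustration, the bounds are assumed to be known per feature, and the sample standard deviation is an assumed choice.

```python
from statistics import mean, stdev


def normalize_to_unit_cube(X, bounds):
    """Rescale each feature from its (lower, upper) bound to [0, 1].

    Hypothetical helper: X is a list of rows, bounds is one
    (lower, upper) pair per feature column.
    """
    return [
        [(x - lo) / (hi - lo) for x, (lo, hi) in zip(row, bounds)]
        for row in X
    ]


def standardize(y):
    """Shift and scale outcomes to zero mean and unit sample variance."""
    mu, sd = mean(y), stdev(y)
    return [(v - mu) / sd for v in y]


X = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
y = [1.0, 2.0, 3.0]
X_unit = normalize_to_unit_cube(X, bounds=[(0.0, 10.0), (10.0, 30.0)])
y_std = standardize(y)
print(X_unit)
print(y_std)
```

Predictions made on the standardized scale need the inverse transform (multiply by the standard deviation, add the mean) before they are reported in the original units.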
FixedNoiseGP¶

class
botorch.models.gp_regression.
FixedNoiseGP
(train_X, train_Y, train_Yvar)[source]¶ A single-task exact GP model using fixed noise levels.
A single-task exact GP that uses fixed observation noise levels. This model also uses relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model works in batch mode (each batch having its own hyperparameters).
 Parameters
train_X (Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.
train_Yvar (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of observed measurement noise.
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.2)
>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar)

fantasize
(X, sampler, observation_noise=True, **kwargs)[source]¶ Construct a fantasy model.
Constructs a fantasy model in the following fashion: (1) compute the model posterior at X (if observation_noise=True, this includes observation noise, which is taken as the mean across the observation noise in the training data). (2) sample from this posterior (using sampler) to generate “fake” observations. (3) condition the model on the new fake observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
sampler (MCSampler) – The sampler used for sampling from the posterior at X.
observation_noise (bool) – If True, include the mean across the observation noise in the training data as observation noise in the posterior from which the samples are drawn.
 Return type
 Returns
The constructed fantasy model.
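The noise rule described for this method (use the mean of the training noise as the observation noise at the new points) amounts to a small piece of arithmetic. The sketch below shows it on plain lists; add_mean_training_noise is a hypothetical helper, and treating the noise as an additive variance term is an assumption made for the illustration.

```python
from statistics import mean


def add_mean_training_noise(posterior_var, train_Yvar):
    """Add the mean training noise variance to each posterior variance.

    Hypothetical helper: posterior_var holds noise-free predictive
    variances at the new points, train_Yvar holds the observed noise
    variances of the training data.
    """
    noise = mean(train_Yvar)
    return [v + noise for v in posterior_var]


print(add_mean_training_noise([1.0, 2.0], train_Yvar=[0.5, 1.5]))
```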

forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
 Return type
MultivariateNormal
HeteroskedasticSingleTaskGP¶

class
botorch.models.gp_regression.
HeteroskedasticSingleTaskGP
(train_X, train_Y, train_Yvar)[source]¶ A single-task exact GP model using a heteroskedastic noise model.
This model internally wraps another GP (a SingleTaskGP) to model the observation noise. This allows the likelihood to make out-of-sample predictions for the observation noise levels.
 Parameters
train_X (Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.
train_Yvar (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of observed measurement noise.
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> se = torch.norm(train_X, dim=1, keepdim=True)
>>> train_Yvar = 0.1 + se * torch.rand_like(train_Y)
>>> model = HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar)

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.stack(
...     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
... )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
ModelListGP¶

class
botorch.models.model_list_gp_regression.
ModelListGP
(*gp_models)[source]¶ A multi-output GP model with independent GPs for the outputs.
This model supports different-shaped training inputs for each of its sub-models. It can be used with any BoTorch models.
Internally, this model is just a list of individual models, but it implements the same input/output interface as all other BoTorch models. This makes it very flexible and convenient to work with. The sequential evaluation comes at a performance cost though: if you are using a block design (i.e. the same number of training examples for each output, and a similar model structure), you should consider using a batched GP model instead.
 Parameters
*gp_models – A variable number of single-output BoTorch models.
Example
>>> model1 = SingleTaskGP(train_X1, train_Y1)
>>> model2 = SingleTaskGP(train_X2, train_Y2)
>>> model = ModelListGP(model1, model2)
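The "list of individual models behind one interface" design can be sketched without any GP machinery. In the toy code below, MeanModel and ToyModelList are hypothetical classes written for this illustration only. Each sub-model is fit on its own data (of different sizes), and the list evaluates them one after another and combines the results into one row per input point.

```python
class MeanModel:
    """Hypothetical single-output model that predicts its training mean."""

    def __init__(self, train_Y):
        self.prediction = sum(train_Y) / len(train_Y)

    def predict(self, X):
        return [self.prediction for _ in X]


class ToyModelList:
    """Hypothetical wrapper exposing several models as one multi-output model."""

    def __init__(self, *models):
        self.models = list(models)

    @property
    def num_outputs(self):
        return len(self.models)

    def predict(self, X):
        # Sequential evaluation: one sub-model at a time.
        per_model = [m.predict(X) for m in self.models]
        # Regroup into one row per input point, one column per output.
        return [list(row) for row in zip(*per_model)]


model = ToyModelList(MeanModel([1.0, 2.0, 3.0]), MeanModel([10.0] * 5))
print(model.num_outputs)
print(model.predict([[0.0], [1.0]]))
```

The loop over sub-models is the source of the performance cost mentioned above: nothing is shared or vectorized across outputs.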

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
 Returns
A ModelListGPyTorchModel representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs). Here the i-th model has n_i + n’ training examples, where the n’ training examples have been added and all test-time caches have been updated.
MultiTaskGP¶

class
botorch.models.multitask.
MultiTaskGP
(train_X, train_Y, task_feature, output_tasks=None, rank=None)[source]¶ Multi-Task GP model using an ICM kernel, inferring observation noise.
Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model infers the noise level. WARNING: It currently does not support different noise levels for the different tasks. If you have known observation noise, please use FixedNoiseMultiTaskGP instead.
 Parameters
train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).
train_Y (Tensor) – A n or b x n (batch mode) tensor of training observations.
task_feature (int) – The index of the task feature (-d <= task_feature <= d).
output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.
rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.
Example
>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
...     torch.cat([X1, i1], 1), torch.cat([X2, i2], 1),
... ])
>>> train_Y = torch.cat([f1(X1), f2(X2)])
>>> model = MultiTaskGP(train_X, train_Y, task_feature=-1)
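The data layout used in this example stacks the rows of all tasks and stores the task index in an extra column. A minimal sketch of that layout on plain lists follows. Both helpers are hypothetical and not part of BoTorch; resolve_task_feature assumes that the valid range is -d to d and that negative values count from the last column, as in Python indexing.

```python
def to_long_format(per_task_X):
    """Stack per-task feature rows and append the task index as last column.

    Hypothetical helper: per_task_X[i] is the list of feature rows
    observed for task i.
    """
    rows = []
    for task_index, X in enumerate(per_task_X):
        for x in X:
            rows.append(list(x) + [float(task_index)])
    return rows


def resolve_task_feature(task_feature, d):
    """Map a possibly negative column index to a position in 0..d.

    Assumes d feature columns plus one task column, and the range
    -d <= task_feature <= d (an assumption, see the lead-in).
    """
    if not -d <= task_feature <= d:
        raise ValueError(f"task_feature must be in [{-d}, {d}]")
    return task_feature % (d + 1)


rows = to_long_format([[[0.1, 0.2]], [[0.3, 0.4], [0.5, 0.6]]])
print(rows)
print(resolve_task_feature(-1, d=2))  # last column holds the task index
```

With the task index appended last, task_feature=-1 and task_feature=d refer to the same column.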

forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
 Return type
MultivariateNormal
FixedNoiseMultiTaskGP¶

class
botorch.models.multitask.
FixedNoiseMultiTaskGP
(train_X, train_Y, train_Yvar, task_feature, output_tasks=None, rank=None)[source]¶ Multi-Task GP model using an ICM kernel, with known observation noise.
Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model requires observation noise data (specified in train_Yvar).
 Parameters
train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).
train_Y (Tensor) – A n or b x n (batch mode) tensor of training observations.
train_Yvar (Tensor) – A n or b x n (batch mode) tensor of observation noise standard errors.
task_feature (int) – The index of the task feature (-d <= task_feature <= d).
output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.
rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.
Example
>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> train_X = torch.cat([
...     torch.cat([X1, i1], 1), torch.cat([X2, i2], 1),
... ], dim=0)
>>> train_Y = torch.cat([f1(X1), f2(X2)])
>>> train_Yvar = 0.1 + 0.1 * torch.rand_like(train_Y)
>>> model = FixedNoiseMultiTaskGP(train_X, train_Y, train_Yvar, task_feature=-1)