botorch.models
Model APIs
Abstract Model API
Abstract base module for all BoTorch models.

class botorch.models.model.Model
Bases: torch.nn.modules.module.Module, abc.ABC
Abstract base class for BoTorch models.
Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract posterior(X, output_indices=None, observation_noise=False, **kwargs)
Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs is required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
Posterior
 Returns
A Posterior object, representing a batch of b joint distributions over q points and m outputs each.

property num_outputs
The number of outputs of the model.
 Return type
int

subset_output(idcs)
Subset the model along the output dimension.
 Parameters
idcs (List[int]) – The output indices to subset the model to.
 Return type
Model
 Returns
A Model object of the same type and with the same parameters as the current model, subset to the specified output indices.

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
Model
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
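The broadcasting requirement on the batch shapes of X and Y can be checked on plain shape tuples. The following is a minimal sketch, not part of BoTorch: the helper name is_broadcastable_to is hypothetical, and it only encodes the standard rule that shapes are aligned from the trailing dimension and that a source dimension must either match the target or be 1.

```python
from itertools import zip_longest


def is_broadcastable_to(source_shape, target_shape):
    """Return True if source_shape can be broadcast to target_shape.

    Hypothetical helper for illustration. Shapes are aligned from the
    trailing dimension; missing leading source dimensions count as 1.
    """
    if len(source_shape) > len(target_shape):
        return False
    pairs = zip_longest(
        reversed(source_shape), reversed(target_shape), fillvalue=1
    )
    return all(s == t or s == 1 for s, t in pairs)


# Batch shape of Y (batch_shape') against batch shape of X (batch_shape).
print(is_broadcastable_to((), (3, 2)))      # Y without batch dimensions
print(is_broadcastable_to((1, 2), (3, 2)))  # size-1 dimension is expanded
print(is_broadcastable_to((3,), (3, 2)))    # trailing dimensions disagree
```

In this reading, a Y with no batch dimensions is shared across every batch of X, which matches the statement that missing batch dimensions are assumed to be the same for all Y.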

fantasize(X, sampler, observation_noise=True, **kwargs)
Construct a fantasy model.
Constructs a fantasy model in the following fashion:
(1) compute the model posterior at X (including observation noise if observation_noise=True).
(2) sample from this posterior (using sampler) to generate “fake” observations.
(3) condition the model on the new fake observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
sampler (MCSampler) – The sampler used for sampling from the posterior at X.
observation_noise (bool) – If True, include observation noise.
 Return type
Model
 Returns
The constructed fantasy model.
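The three fantasization steps can be illustrated on a toy model that needs no tensors. The sketch below is an assumption-laden stand-in, not BoTorch code: ToyGaussianModel is a hypothetical class that models a single unknown scalar with a Gaussian belief and known noise variance, so its posterior does not depend on X, and Python's random.Random plays the role of the sampler.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyGaussianModel:
    """Hypothetical toy model: Gaussian belief over one unknown scalar."""

    mean: float
    var: float
    noise_var: float

    def posterior(self, observation_noise=False):
        # Step (1): predictive mean and variance, optionally with noise.
        extra = self.noise_var if observation_noise else 0.0
        return self.mean, self.var + extra

    def condition_on_observations(self, y):
        # Step (3): standard conjugate Gaussian update for one observation.
        new_var = 1.0 / (1.0 / self.var + 1.0 / self.noise_var)
        new_mean = new_var * (self.mean / self.var + y / self.noise_var)
        return ToyGaussianModel(new_mean, new_var, self.noise_var)

    def fantasize(self, rng, observation_noise=True):
        mean, var = self.posterior(observation_noise=observation_noise)
        # Step (2): draw one "fake" observation from the posterior.
        fake_y = rng.gauss(mean, var ** 0.5)
        return self.condition_on_observations(fake_y)


model = ToyGaussianModel(mean=0.0, var=1.0, noise_var=0.5)
fantasy = model.fantasize(random.Random(0))
print(fantasy.var < model.var)  # conditioning shrinks the variance
```

The fantasy model's mean depends on the sampled value, while its variance depends only on how many observations were added, which is why the variance comparison holds for any seed.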

GPyTorch Model API
Abstract model class for all GPyTorch-based BoTorch models.
To implement your own, simply inherit from both the provided classes and a GPyTorch Model class such as an ExactGP.

class botorch.models.gpytorch.GPyTorchModel
Bases: botorch.models.model.Model, abc.ABC
Abstract base class for models based on GPyTorch models.
The easiest way to use this is to subclass a model from a GPyTorch model class (e.g. an ExactGP) and this GPyTorchModel. See e.g. SingleTaskGP.

property num_outputs
The number of outputs of the model.
 Return type
int

posterior(X, observation_noise=False, **kwargs)
Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.
observation_noise (Union[bool, Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q).
 Return type
GPyTorchPosterior
 Returns
A GPyTorchPosterior object, representing a batch of b joint distributions over q points. Includes observation noise if specified.

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
Model
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = (torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])).unsqueeze(-1)
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = (torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])).unsqueeze(-1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)


class botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel
Bases: botorch.models.gpytorch.GPyTorchModel
Base class for batched multi-output GPyTorch models with independent outputs.
This model should be used when the same training data is used for all outputs. Outputs are modeled independently by using a different batch for each output.

static get_batch_dimensions(train_X, train_Y)
Get the raw batch shape and output-augmented batch shape of the inputs.
 Parameters
train_X (Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.
 Return type
Tuple[Size, Size]
 Returns
2-element tuple containing
The input_batch_shape
The output-augmented batch shape: input_batch_shape x (m)
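The relation between the two returned shapes can be sketched on plain shape tuples. This is an illustration under stated assumptions rather than the library's implementation: the function name batch_dimensions_from_shapes is hypothetical, and the parentheses in "x (m)" are read as meaning that the output dimension is appended only when there is more than one output.

```python
def batch_dimensions_from_shapes(train_X_shape, train_Y_shape):
    """Derive (input_batch_shape, output-augmented batch shape) from shapes.

    Hypothetical helper working on tuples instead of tensors.
    """
    # Everything before the trailing n x d dimensions is the batch shape.
    input_batch_shape = tuple(train_X_shape[:-2])
    num_outputs = train_Y_shape[-1]
    aug_batch_shape = input_batch_shape
    # Assumption: m is appended only for m > 1 (reading "(m)" as optional).
    if num_outputs > 1:
        aug_batch_shape = input_batch_shape + (num_outputs,)
    return input_batch_shape, aug_batch_shape


print(batch_dimensions_from_shapes((20, 2), (20, 3)))
print(batch_dimensions_from_shapes((4, 20, 2), (4, 20, 1)))
```

Under this reading, each output of a multi-output model gets its own batch entry, which is how the outputs end up modeled independently.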

posterior(X, output_indices=None, observation_noise=False, **kwargs)
Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A (batch_shape) x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs is required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (Union[bool, Tensor]) – If True, add the observation noise from the likelihood to the posterior. If a Tensor, use it directly as the observation noise (must be of shape (batch_shape) x q x m).
 Return type
GPyTorchPosterior
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes observation noise if specified.

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
BatchedMultiOutputGPyTorchModel
 Returns
A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.stack(
...     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
... )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)


class botorch.models.gpytorch.ModelListGPyTorchModel
Bases: botorch.models.gpytorch.GPyTorchModel, abc.ABC
Abstract base class for models based on multi-output GPyTorch models.
This is meant to be used with a gpytorch ModelList wrapper for independent evaluation of submodels.

posterior(X, output_indices=None, observation_noise=False, **kwargs)
Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A b x q x d-dim Tensor, where d is the dimension of the feature space, q is the number of points considered jointly, and b is the batch dimension.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs is required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (Union[bool, Tensor]) – If True, add the observation noise from the respective likelihoods to the posterior. If a Tensor of shape (batch_shape) x q x m, use it directly as the observation noise (with observation_noise[…,i] added to the posterior of the i-th model).
 Return type
GPyTorchPosterior
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices each. Includes measurement noise if observation_noise is specified.

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
Model
 Returns
A Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = (torch.sin(train_X[:, 0]) + torch.cos(train_X[:, 1])).unsqueeze(-1)
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = (torch.sin(new_X[:, 0]) + torch.cos(new_X[:, 1])).unsqueeze(-1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)


class botorch.models.gpytorch.MultiTaskGPyTorchModel
Bases: botorch.models.gpytorch.GPyTorchModel, abc.ABC
Abstract base class for multi-task models based on GPyTorch models.
This class provides the posterior method to models that implement a “long-format” multi-task GP in the style of MultiTaskGP.

posterior(X, output_indices=None, observation_noise=False, **kwargs)
Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A q x d or batch_shape x q x d (batch mode) tensor, where d is the dimension of the feature space (not including task indices) and q is the number of points considered jointly.
output_indices (Optional[List[int]]) – A list of indices, corresponding to the outputs over which to compute the posterior (if the model is multi-output). Can be used to speed up computation if only a subset of the model’s outputs is required for optimization. If omitted, computes the posterior over all model outputs.
observation_noise (Union[bool, Tensor]) – If True, add observation noise from the respective likelihoods. If a Tensor, specifies the observation noise levels to add.
 Return type
GPyTorchPosterior
 Returns
A GPyTorchPosterior object, representing batch_shape joint distributions over q points and the outputs selected by output_indices. Includes measurement noise if observation_noise is specified.

Deterministic Model API
Deterministic Models. Simple wrappers that allow the usage of deterministic mappings via the BoTorch Model and Posterior APIs. Useful e.g. for defining known cost functions for cost-aware acquisition utilities.

class botorch.models.deterministic.DeterministicModel
Bases: botorch.models.model.Model, abc.ABC
Abstract base class for deterministic models.

abstract forward(X)
Compute the (deterministic) model output at X.
 Parameters
X (Tensor) – A batch_shape x n x d-dim input tensor X.
 Return type
Tensor
 Returns
A batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit if m=1).

property num_outputs
The number of outputs of the model.
 Return type
int


class botorch.models.deterministic.GenericDeterministicModel(f, num_outputs=1)
Bases: botorch.models.deterministic.DeterministicModel
A generic deterministic model constructed from a callable.
 Parameters
f (Callable[[Tensor], Tensor]) – A callable mapping a batch_shape x n x d-dim input tensor X to a batch_shape x n x m-dimensional output tensor (the outcome dimension m must be explicit, even if m=1).
num_outputs (int) – The number of outputs m.

class botorch.models.deterministic.AffineDeterministicModel(a, b=0.01)
Bases: botorch.models.deterministic.DeterministicModel
An affine deterministic model.
Affine deterministic model from weights and offset terms.
A simple model of the form
y[…, m] = b[m] + sum_{i=1}^d a[i, m] * X[…, i]
 Parameters
a (Tensor) – A d x m-dim tensor of linear weights, where m is the number of outputs (must be explicit if m=1).
b (Union[Tensor, float]) – The affine (offset) term. Either a float (for single-output models or if the offset is shared), or an m-dim tensor (with different offset values for the m different outputs).
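The affine formula can be written out directly. The following is a minimal sketch on nested Python lists instead of tensors; the function name affine_forward is hypothetical and the code only illustrates the arithmetic of y[…, m] = b[m] + sum_i a[i, m] * X[…, i], not the class itself.

```python
def affine_forward(X, a, b=0.01):
    """Evaluate y[n][m] = b[m] + sum_i a[i][m] * X[n][i] for each point.

    Hypothetical helper: X is a list of n points with d features each,
    a is a d x m nested list, and b is a float (shared offset) or a list
    of m offsets.
    """
    num_outputs = len(a[0])
    # A scalar offset is shared by all m outputs.
    offsets = [b] * num_outputs if isinstance(b, (int, float)) else list(b)
    return [
        [
            offsets[m] + sum(a[i][m] * x[i] for i in range(len(x)))
            for m in range(num_outputs)
        ]
        for x in X
    ]


X = [[1.0, 2.0], [0.0, 1.0]]   # n = 2 points, d = 2 features
a = [[1.0, 0.0], [0.5, 2.0]]   # d x m weights with m = 2 outputs
print(affine_forward(X, a, b=[0.0, 1.0]))  # → [[2.0, 5.0], [0.5, 3.0]]
```

Note that the result keeps an explicit output dimension even for m=1, in line with the requirement stated for a.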
Models
Cost Models (for cost-aware optimization)
Cost models to be used with multi-fidelity optimization.

class botorch.models.cost.AffineFidelityCostModel(fidelity_weights=None, fixed_cost=0.01)
Bases: botorch.models.deterministic.DeterministicModel
Affine cost model operating on fidelity parameters.
For each (q-batch) element of a candidate set X, this module computes a cost of the form
cost = fixed_cost + sum_j weights[j] * X[fidelity_dims[j]]
 Parameters
fidelity_weights (Optional[Dict[int, float]]) – A dictionary mapping a subset of columns of X (the fidelity parameters) to their associated weights in the affine cost expression. If omitted, assumes that the last column of X is the fidelity parameter with a weight of 1.0.
fixed_cost (float) – The fixed cost of running a single candidate point (i.e. an element of a q-batch).

forward(X)
Evaluate the cost on a candidate set X.
Computes a cost of the form
cost = fixed_cost + sum_j weights[j] * X[fidelity_dims[j]]
for each element of the q-batch.
 Parameters
X (Tensor) – A batch_shape x q x d’-dim tensor of candidate points.
 Return type
Tensor
 Returns
A batch_shape x q x 1-dim tensor of costs.
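The cost expression can be sketched for a single q-batch using nested lists. This is an illustration only: the function name affine_fidelity_cost is hypothetical, and the default branch assumes that "last column with weight 1.0" can be expressed as the mapping {-1: 1.0}.

```python
def affine_fidelity_cost(X, fidelity_weights=None, fixed_cost=0.01):
    """Compute fixed_cost + sum_j weights[j] * x[fidelity_dims[j]] per point.

    Hypothetical helper: X is a list of q points; the result is a q x 1
    nested list of costs.
    """
    if fidelity_weights is None:
        # Assumed encoding of the documented default: last column, weight 1.0.
        fidelity_weights = {-1: 1.0}
    return [
        [fixed_cost + sum(w * x[dim] for dim, w in fidelity_weights.items())]
        for x in X
    ]


X = [[0.2, 0.4, 1.0], [0.3, 0.1, 0.5]]  # last column is the fidelity
print(affine_fidelity_cost(X))
print(affine_fidelity_cost(X, fidelity_weights={1: 2.0, 2: 0.5}, fixed_cost=1.0))
```

With the default weights, the cost of each point is its fidelity value plus the fixed cost, so higher-fidelity evaluations are charged proportionally more.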
GP Regression Models
Gaussian Process Regression models based on GPyTorch models.

class botorch.models.gp_regression.SingleTaskGP(train_X, train_Y, likelihood=None, covar_module=None, outcome_transform=None)
Bases: botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel, gpytorch.models.exact_gp.ExactGP
A single-task exact GP model.
A single-task exact GP using relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model works in batch mode (each batch having its own hyperparameters). When the training observations include multiple outputs, this model will use batching to model outputs independently.
Use this model when you have independent output(s) and all outputs use the same training data. If outputs are independent and outputs have different training data, use the ModelListGP. When modeling correlations between outputs, use the MultiTaskGP.
 Parameters
train_X (Tensor) – A batch_shape x n x d tensor of training features.
train_Y (Tensor) – A batch_shape x n x m tensor of training observations.
likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.
covar_module (Optional[Module]) – The module computing the covariance (Kernel) matrix. If omitted, use a MaternKernel.
outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> model = SingleTaskGP(train_X, train_Y)

forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
 Return type
MultivariateNormal

class botorch.models.gp_regression.FixedNoiseGP(train_X, train_Y, train_Yvar, covar_module=None, outcome_transform=None)
Bases: botorch.models.gpytorch.BatchedMultiOutputGPyTorchModel, gpytorch.models.exact_gp.ExactGP
A single-task exact GP model using fixed noise levels.
A single-task exact GP that uses fixed observation noise levels. This model also uses relatively strong priors on the Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model works in batch mode (each batch having its own hyperparameters).
 Parameters
train_X (Tensor) – A batch_shape x n x d tensor of training features.
train_Y (Tensor) – A batch_shape x n x m tensor of training observations.
train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.
outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.2)
>>> model = FixedNoiseGP(train_X, train_Y, train_Yvar)

fantasize(X, sampler, observation_noise=True, **kwargs)
Construct a fantasy model.
Constructs a fantasy model in the following fashion:
(1) compute the model posterior at X (if observation_noise=True, this includes observation noise taken as the mean across the observation noise in the training data. If observation_noise is a Tensor, use it directly as the observation noise to add).
(2) sample from this posterior (using sampler) to generate “fake” observations.
(3) condition the model on the new fake observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
sampler (MCSampler) – The sampler used for sampling from the posterior at X.
observation_noise (Union[bool, Tensor]) – If True, include the mean across the observation noise in the training data as observation noise in the posterior from which the samples are drawn. If a Tensor, use it directly as the specified measurement noise.
 Return type
Model
 Returns
The constructed fantasy model.

forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
 Return type
MultivariateNormal

class botorch.models.gp_regression.HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar, outcome_transform=None)
Bases: botorch.models.gp_regression.SingleTaskGP
A single-task exact GP model using a heteroskedastic noise model.
This model internally wraps another GP (a SingleTaskGP) to model the observation noise. This allows the likelihood to make out-of-sample predictions for the observation noise levels.
 Parameters
train_X (Tensor) – A batch_shape x n x d tensor of training features.
train_Y (Tensor) – A batch_shape x n x m tensor of training observations.
train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.
outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale). Note that the noise model internally log-transforms the variances, which will happen after this transform is applied.
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.sin(train_X).sum(dim=1, keepdim=True)
>>> se = torch.norm(train_X, dim=1, keepdim=True)
>>> train_Yvar = 0.1 + se * torch.rand_like(train_Y)
>>> model = HeteroskedasticSingleTaskGP(train_X, train_Y, train_Yvar)

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
BatchedMultiOutputGPyTorchModel
 Returns
A BatchedMultiOutputGPyTorchModel object of the same type with n + n’ training examples, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).
Example
>>> train_X = torch.rand(20, 2)
>>> train_Y = torch.stack(
...     [torch.sin(train_X[:, 0]), torch.cos(train_X[:, 1])], -1
... )
>>> model = SingleTaskGP(train_X, train_Y)
>>> new_X = torch.rand(5, 2)
>>> new_Y = torch.stack([torch.sin(new_X[:, 0]), torch.cos(new_X[:, 1])], -1)
>>> model = model.condition_on_observations(X=new_X, Y=new_Y)
Multi-Fidelity GP Regression Models
Gaussian Process Regression models based on GPyTorch models.
 [Wu2019mf]
J. Wu, S. Toscano-Palmerin, P. I. Frazier, and A. G. Wilson. Practical multi-fidelity Bayesian optimization for hyperparameter tuning. ArXiv 2019.

class botorch.models.gp_regression_fidelity.SingleTaskMultiFidelityGP(train_X, train_Y, iteration_fidelity=None, data_fidelity=None, linear_truncated=True, nu=2.5, likelihood=None, outcome_transform=None)
Bases: botorch.models.gp_regression.SingleTaskGP
A single-task multi-fidelity GP model.
A SingleTaskGP model using a DownsamplingKernel for the data fidelity parameter (if present) and an ExponentialDecayKernel for the iteration fidelity parameter (if present).
This kernel is described in [Wu2019mf].
 Parameters
train_X (Tensor) – A batch_shape x n x (d + s) tensor of training features, where s is the dimension of the fidelity parameters (either one or two).
train_Y (Tensor) – A batch_shape x n x m tensor of training observations.
iteration_fidelity (Optional[int]) – The column index for the training iteration fidelity parameter (optional).
data_fidelity (Optional[int]) – The column index for the downsampling fidelity parameter (optional).
linear_truncated (bool) – If True, use a LinearTruncatedFidelityKernel instead of the default kernel.
nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Only used when linear_truncated=True.
likelihood (Optional[Likelihood]) – A likelihood. If omitted, use a standard GaussianLikelihood with inferred noise level.
outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).
Example
>>> train_X = torch.rand(20, 4)
>>> train_Y = train_X.pow(2).sum(dim=1, keepdim=True)
>>> model = SingleTaskMultiFidelityGP(train_X, train_Y, data_fidelity=3)

class botorch.models.gp_regression_fidelity.FixedNoiseMultiFidelityGP(train_X, train_Y, train_Yvar, iteration_fidelity=None, data_fidelity=None, linear_truncated=True, nu=2.5, outcome_transform=None)
Bases: botorch.models.gp_regression.FixedNoiseGP
A single-task multi-fidelity GP model using fixed noise levels.
A FixedNoiseGP model analogous to SingleTaskMultiFidelityGP, using a DownsamplingKernel for the data fidelity parameter (if present) and an ExponentialDecayKernel for the iteration fidelity parameter (if present).
This kernel is described in [Wu2019mf].
 Parameters
train_X (Tensor) – A batch_shape x n x (d + s) tensor of training features, where s is the dimension of the fidelity parameters (either one or two).
train_Y (Tensor) – A batch_shape x n x m tensor of training observations.
train_Yvar (Tensor) – A batch_shape x n x m tensor of observed measurement noise.
iteration_fidelity (Optional[int]) – The column index for the training iteration fidelity parameter (optional).
data_fidelity (Optional[int]) – The column index for the downsampling fidelity parameter (optional).
linear_truncated (bool) – If True, use a LinearTruncatedFidelityKernel instead of the default kernel.
nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Only used when linear_truncated=True.
outcome_transform (Optional[OutcomeTransform]) – An outcome transform that is applied to the training data during instantiation and to the posterior during inference (that is, the Posterior obtained by calling .posterior on the model will be on the original scale).
Example
>>> train_X = torch.rand(20, 4)
>>> train_Y = train_X.pow(2).sum(dim=1, keepdim=True)
>>> train_Yvar = torch.full_like(train_Y, 0.01)
>>> model = FixedNoiseMultiFidelityGP(
...     train_X,
...     train_Y,
...     train_Yvar,
...     data_fidelity=3,
... )
Model List GP Regression Models
Model List GP Regression models.

class botorch.models.model_list_gp_regression.ModelListGP(*gp_models)
Bases: gpytorch.models.model_list.IndependentModelList, botorch.models.gpytorch.ModelListGPyTorchModel
A multi-output GP model with independent GPs for the outputs.
This model supports different-shaped training inputs for each of its sub-models. It can be used with any BoTorch models.
Internally, this model is just a list of individual models, but it implements the same input/output interface as all other BoTorch models. This makes it very flexible and convenient to work with. The sequential evaluation comes at a performance cost, though: if you are using a block design (i.e. the same number of training examples for each output, and a similar model structure), you should consider using a batched GP model instead.
 Parameters
*gp_models – A variable number of single-output BoTorch models. If models have input/output transforms, these are honored individually for each model.
Example
>>> model1 = SingleTaskGP(train_X1, train_Y1)
>>> model2 = SingleTaskGP(train_X2, train_Y2)
>>> model = ModelListGP(model1, model2)

condition_on_observations(X, Y, **kwargs)
Condition the model on new observations.
 Parameters
X (Tensor) – A batch_shape x n’ x d-dim Tensor, where d is the dimension of the feature space, n’ is the number of points per batch, and batch_shape is the batch shape (must be compatible with the batch shape of the model).
Y (Tensor) – A batch_shape’ x n’ x m-dim Tensor, where m is the number of model outputs, n’ is the number of points per batch, and batch_shape’ is the batch shape of the observations. batch_shape’ must be broadcastable to batch_shape using standard broadcasting semantics. If Y has fewer batch dimensions than X, it is assumed that the missing batch dimensions are the same for all Y.
 Return type
ModelListGPyTorchModel
 Returns
A ModelListGPyTorchModel representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs). Here the i-th model has n_i + n’ training examples, where the n’ training examples have been added and all test-time caches have been updated.
Multitask GP Models
Multi-Task GP models.

class
botorch.models.multitask.
MultiTaskGP
(train_X, train_Y, task_feature, output_tasks=None, rank=None)[source]¶ Bases:
gpytorch.models.exact_gp.ExactGP
,botorch.models.gpytorch.MultiTaskGPyTorchModel
MultiTask GP model using an ICM kernel, inferring observation noise.
Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model infers the noise level. WARNING: It currently does not support different noise levels for the different tasks. If you have known observation noise, please use FixedNoiseMultiTaskGP instead.
 Parameters
train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).
train_Y (Tensor) – A n or b x n (batch mode) tensor of training observations.
task_feature (int) – The index of the task feature (-d <= task_feature <= d).
output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.
rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.
Example
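>>> import torch
>>> from botorch.models.multitask import MultiTaskGP
>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> # append the task index as the last column of the inputs
>>> train_X = torch.cat([
...     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
... ])
>>> # f1 and f2 stand for the user-supplied objective functions of the two tasks
>>> train_Y = torch.cat([f1(X1), f2(X2)]).unsqueeze(-1)
>>> model = MultiTaskGP(train_X, train_Y, task_feature=-1)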

forward
(x)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
 Return type
MultivariateNormal

class
botorch.models.multitask.
FixedNoiseMultiTaskGP
(train_X, train_Y, train_Yvar, task_feature, output_tasks=None, rank=None)[source]¶ Bases:
botorch.models.multitask.MultiTaskGP
MultiTask GP model using an ICM kernel, with known observation noise.
Multi-task exact GP that uses a simple ICM kernel. Can be single-output or multi-output. This model uses relatively strong priors on the base Kernel hyperparameters, which work best when covariates are normalized to the unit cube and outcomes are standardized (zero mean, unit variance).
This model requires observation noise data (specified in train_Yvar).
MultiTask GP model using an ICM kernel and known observation noise.
 Parameters
train_X (Tensor) – A n x (d + 1) or b x n x (d + 1) (batch mode) tensor of training data. One of the columns should contain the task features (see task_feature argument).
train_Y (Tensor) – A n or b x n (batch mode) tensor of training observations.
train_Yvar (Tensor) – A n or b x n (batch mode) tensor of observation noise standard errors.
task_feature (int) – The index of the task feature (-d <= task_feature <= d).
output_tasks (Optional[List[int]]) – A list of task indices for which to compute model outputs. If omitted, return outputs for all task indices.
rank (Optional[int]) – The rank to be used for the index kernel. If omitted, use a full rank (i.e. number of tasks) kernel.
Example
>>> X1, X2 = torch.rand(10, 2), torch.rand(20, 2)
>>> i1, i2 = torch.zeros(10, 1), torch.ones(20, 1)
>>> # append the task index as the last column of the inputs
>>> train_X = torch.cat([
...     torch.cat([X1, i1], -1), torch.cat([X2, i2], -1),
... ], dim=0)
>>> # f1 and f2 stand for the user-supplied objective functions of the two tasks
>>> train_Y = torch.cat([f1(X1), f2(X2)])
>>> train_Yvar = 0.1 + 0.1 * torch.rand_like(train_Y)
>>> model = FixedNoiseMultiTaskGP(train_X, train_Y, train_Yvar, -1)
Pairwise GP Models¶
Preference Learning with Gaussian Process
[Chu2005preference] Wei Chu and Zoubin Ghahramani. Preference learning with Gaussian processes. Proceedings of the 22nd International Conference on Machine Learning. 2005.
[Brochu2010tutorial] Eric Brochu, Vlad M. Cora, and Nando De Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010).

class
botorch.models.pairwise_gp.
PairwiseGP
(datapoints, comparisons, covar_module=None, **kwargs)[source]¶ Bases:
botorch.models.model.Model
,gpytorch.models.gp.GP
Probit GP for preference learning with Laplace approximation
Implementation is based on [Chu2005preference]. Also see [Brochu2010tutorial] for additional reference.
Initializes internal Module state, shared by both nn.Module and ScriptModule.

property
std_noise
¶  Return type
Tensor

property
num_outputs
¶ The number of outputs of the model.
 Return type
int

set_train_data
(datapoints, comparisons, update_model=True)[source]¶ Set datapoints and comparisons and update model properties if needed
 Parameters
datapoints (Tensor) – A batch_shape x n x d dimension tensor X.
comparisons (Tensor) – A tensor of size batch_shape x m x 2. (i, j) means f_i is preferred over f_j.
update_model (bool) – True if we want to refit the model (see _update) after resetting the data.
 Return type
None
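To make the comparisons layout concrete, here is a minimal plain-Python sketch (no BoTorch required) that builds an m x 2 list of index pairs from latent utility values. The helper name make_comparisons and the utility values are hypothetical; only the (i, j) convention, where the item at index i is preferred over the item at index j, is taken from the description above.

```python
from itertools import combinations


def make_comparisons(utilities):
    """Return [i, j] index pairs where item i is preferred over item j.

    Hypothetical helper: a higher utility means "preferred"; ties are skipped.
    """
    pairs = []
    for a, b in combinations(range(len(utilities)), 2):
        if utilities[a] > utilities[b]:
            pairs.append([a, b])
        elif utilities[b] > utilities[a]:
            pairs.append([b, a])
    return pairs


# Three datapoints with made-up latent utilities.
comparisons = make_comparisons([0.2, 0.9, 0.5])
print(comparisons)  # [[1, 0], [2, 0], [1, 2]]
```

In practice the pairs would be converted to an integer tensor of shape m x 2 before being passed as comparisons.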

forward
(datapoints)[source]¶ Calculate a posterior or prior prediction.
During training mode, forward is implemented solely for gradient-based hyperparameter optimization. It recalculates the utility f using its analytical form at f_map so that gradients with respect to the hyperparameters can be obtained. It takes only one argument, datapoints, without the comparisons, for compatibility with other GPyTorch/BoTorch APIs, and assumes that datapoints is the same as self.datapoints. That is what “Must train on training data” means.
 Parameters
datapoints (Tensor) – A batch_shape x n x d Tensor, should be the same as self.datapoints.
 Returns
A MultivariateNormal object, being one of the following:
Posterior centered at MAP points for training data (training mode)
Prior predictions (prior mode)
Predictive posterior (eval mode)

posterior
(X, output_indices=None, observation_noise=False, **kwargs)[source]¶ Computes the posterior over model outputs at the provided points.
 Parameters
X (Tensor) – A batch_shape x q x d-dim Tensor, where d is the dimension of the feature space and q is the number of points considered jointly.
output_indices (Optional[List[int]]) – As defined in parent Model class, not used for this model.
observation_noise (bool) – If True, add observation noise to the posterior.
 Return type
 Returns
A Posterior object, representing joint distributions over q points. Includes observation noise if specified.

condition_on_observations
(X, Y, **kwargs)[source]¶ Condition the model on new observations.
Note that unlike other BoTorch models, PairwiseGP requires Y to be pairwise comparisons
 Parameters
X (Tensor) – A batch_shape x n x d dimension tensor X.
Y (Tensor) – A tensor of size batch_shape x m x 2. (i, j) means f_i is preferred over f_j.
 Return type
 Returns
A (deep-copied) Model object of the same type, representing the original model conditioned on the new observations (X, Y) (and possibly noise observations passed in via kwargs).


class
botorch.models.pairwise_gp.
PairwiseLaplaceMarginalLogLikelihood
(model)[source]¶ Bases:
gpytorch.mlls.marginal_log_likelihood.MarginalLogLikelihood
Laplace-approximated marginal log likelihood/evidence for PairwiseGP
See (12) from [Chu2005preference].
 Parameters
model (PairwiseGP) – A model using Laplace approximation (currently only supports PairwiseGP)

forward
(post, comp)[source]¶ Calculate approximated log evidence, i.e., log(P(D|theta))
 Parameters
post (Posterior) – Training posterior distribution from self.model.
comp (Tensor) – Comparison pairs, see PairwiseGP.__init__ for more details.
 Return type
Tensor
 Returns
The approximated evidence, i.e., the marginal log likelihood
Model Components¶
Kernels¶

class
botorch.models.kernels.downsampling.
DownsamplingKernel
(power_prior=None, offset_prior=None, power_constraint=None, offset_constraint=None, **kwargs)[source]¶ Bases:
gpytorch.kernels.kernel.Kernel
GPyTorch Downsampling Kernel.
Computes a covariance matrix based on the down sampling kernel between inputs x_1 and x_2 (we expect d = 1):
K(\mathbf{x_1}, \mathbf{x_2}) = c + (1 - x_1)^{1 + \delta} (1 - x_2)^{1 + \delta}
where c is an offset parameter, and delta is a power parameter.
 Parameters
power_constraint (Optional[Interval]) – Constraint to place on power parameter. Default is Positive.
power_prior (Optional[Prior]) – Prior over the power parameter.
offset_constraint (Optional[Interval]) – Constraint to place on offset parameter. Default is Positive.
active_dims – List of data dimensions to operate on. len(active_dims) should equal num_dimensions.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
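The downsampling kernel formula above can be sketched for scalar inputs in plain Python. This is an illustrative stand-in, not the GPyTorch implementation: it assumes the (1 - x) form of the formula, and the function name and the offset/power values are made up for the example.

```python
def downsampling_kernel(x1, x2, offset, power):
    """Evaluate c + (1 - x1)^(1 + delta) * (1 - x2)^(1 + delta) for scalars.

    `offset` plays the role of c and `power` the role of delta.
    """
    return offset + (1.0 - x1) ** (1.0 + power) * (1.0 - x2) ** (1.0 + power)


# At x1 = x2 = 0 the product term is 1, so the value is offset + 1.
k_start = downsampling_kernel(0.0, 0.0, offset=0.5, power=1.0)
# At x1 = 1 the product term vanishes and only the offset remains.
k_end = downsampling_kernel(1.0, 0.3, offset=0.5, power=1.0)
```

The covariance therefore shrinks toward the offset as either input approaches 1.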

class
botorch.models.kernels.exponential_decay.
ExponentialDecayKernel
(power_prior=None, offset_prior=None, power_constraint=None, offset_constraint=None, **kwargs)[source]¶ Bases:
gpytorch.kernels.kernel.Kernel
GPyTorch Exponential Decay Kernel.
Computes a covariance matrix based on the exponential decay kernel between inputs x_1 and x_2 (we expect d = 1):
K(x_1, x_2) = w + beta^alpha / (x_1 + x_2 + beta)^alpha.
where w is an offset parameter, beta is a lengthscale parameter, and alpha is a power parameter.
 Parameters
lengthscale_constraint – Constraint to place on lengthscale parameter. Default is Positive.
lengthscale_prior – Prior over the lengthscale parameter.
power_constraint (Optional[Interval]) – Constraint to place on power parameter. Default is Positive.
power_prior (Optional[Prior]) – Prior over the power parameter.
offset_constraint (Optional[Interval]) – Constraint to place on offset parameter. Default is Positive.
active_dims – List of data dimensions to operate on. len(active_dims) should equal num_dimensions.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
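As a rough illustration of the exponential decay formula above, the following plain-Python sketch evaluates it for scalar, nonnegative inputs. The function name and the parameter values are hypothetical; the real kernel operates on tensors and learns w, beta and alpha.

```python
def exponential_decay_kernel(x1, x2, offset, lengthscale, power):
    """Evaluate w + beta^alpha / (x1 + x2 + beta)^alpha for nonnegative scalars."""
    return offset + lengthscale ** power / (x1 + x2 + lengthscale) ** power


# At x1 = x2 = 0 the ratio is 1, so the value is offset + 1.
k0 = exponential_decay_kernel(0.0, 0.0, offset=0.1, lengthscale=2.0, power=3.0)
# For large inputs the ratio decays and the value approaches the offset.
k_far = exponential_decay_kernel(50.0, 50.0, offset=0.1, lengthscale=2.0, power=3.0)
```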

class
botorch.models.kernels.linear_truncated_fidelity.
LinearTruncatedFidelityKernel
(fidelity_dims, dimension=None, power_prior=None, power_constraint=None, nu=2.5, lengthscale_prior_unbiased=None, lengthscale_prior_biased=None, lengthscale_constraint_unbiased=None, lengthscale_constraint_biased=None, covar_module_unbiased=None, covar_module_biased=None, **kwargs)[source]¶ Bases:
gpytorch.kernels.kernel.Kernel
GPyTorch Linear Truncated Fidelity Kernel.
Computes a covariance matrix based on the linear truncated kernel between inputs x_1 and x_2 for up to two fidelity parameters:
K(x_1, x_2) = k_0 + c_1(x_1, x_2)k_1 + c_2(x_1,x_2)k_2 + c_3(x_1,x_2)k_3
where
k_i (i=0,1,2,3) are Matern kernels calculated between the non-fidelity parameters of x_1 and x_2 with different priors.
c_1 = (1 - x_1[f_1])(1 - x_2[f_1])(1 + x_1[f_1] x_2[f_1])^p is the kernel of the bias term, which can be decomposed into a deterministic part and a polynomial kernel. Here f_1 is the first fidelity dimension and p is the order of the polynomial kernel.
c_3 is the same as c_1 but is calculated for the second fidelity dimension f_2.
c_2 is the interaction term with four deterministic terms and the polynomial kernel between x_1[…, [f_1, f_2]] and x_2[…, [f_1, f_2]].
 Parameters
fidelity_dims (List[int]) – A list containing either one or two indices specifying the fidelity parameters of the input.
dimension (Optional[int]) – The dimension of x. Unused if active_dims is specified.
power_prior (Optional[Prior]) – Prior for the power parameter of the polynomial kernel. Default is None.
power_constraint (Optional[Interval]) – Constraint on the power parameter of the polynomial kernel. Default is Positive.
nu (float) – The smoothness parameter for the Matern kernel: either 1/2, 3/2, or 5/2. Unused if both covar_module_unbiased and covar_module_biased are specified.
lengthscale_prior_unbiased (Optional[Prior]) – Prior on the lengthscale parameter of Matern kernel k_0. Default is Gamma(1.1, 1/20).
lengthscale_constraint_unbiased (Optional[Interval]) – Constraint on the lengthscale parameter of the Matern kernel k_0. Default is Positive.
lengthscale_prior_biased (Optional[Prior]) – Prior on the lengthscale parameter of Matern kernels k_i(i>0). Default is Gamma(5, 1/20).
lengthscale_constraint_biased (Optional[Interval]) – Constraint on the lengthscale parameter of the Matern kernels k_i(i>0). Default is Positive.
covar_module_unbiased (Optional[Kernel]) – Specify a custom kernel for k_0. If omitted, use a MaternKernel.
covar_module_biased (Optional[Kernel]) – Specify a custom kernel for the biased parts k_i(i>0). If omitted, use a MaternKernel.
batch_shape – If specified, use a separate lengthscale for each batch of input data. If x1 is a batch_shape x n x d tensor, this should be batch_shape.
active_dims – Compute the covariance of a subset of input dimensions. The numbers correspond to the indices of the dimensions.
Example
>>> x = torch.randn(10, 5)
>>> # Non-batch: Simple option (fidelity_dims and dimension are illustrative choices)
>>> covar_module = LinearTruncatedFidelityKernel(fidelity_dims=[3, 4], dimension=5)
>>> covar = covar_module(x)  # Output: LazyVariable of size (10 x 10)
>>>
>>> batch_x = torch.randn(2, 10, 5)
>>> # Batch: Simple option
>>> covar_module = LinearTruncatedFidelityKernel(
...     fidelity_dims=[3, 4], dimension=5, batch_shape=torch.Size([2])
... )
>>> covar = covar_module(batch_x)  # Output: LazyVariable of size (2 x 10 x 10)
Initializes internal Module state, shared by both nn.Module and ScriptModule.
Transforms¶
Outcome Transforms¶

class
botorch.models.transforms.outcome.
OutcomeTransform
[source]¶ Bases:
torch.nn.modules.module.Module
,abc.ABC
Abstract base class for outcome transforms.
Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract
forward
(Y, Yvar=None)[source]¶ Transform the outcomes in a model’s training targets
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the transformed outcomes:
The transformed outcome observations.
The transformed observation noise (if applicable).

untransform
(Y, Yvar=None)[source]¶ Untransform previously transformed outcomes
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of transformed training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of transformed observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the untransformed outcomes:
The untransformed outcome observations.
The untransformed observation noise (if applicable).


class
botorch.models.transforms.outcome.
ChainedOutcomeTransform
(**transforms)[source]¶ Bases:
botorch.models.transforms.outcome.OutcomeTransform
,torch.nn.modules.container.ModuleDict
An outcome transform representing the chaining of individual transforms
Chaining of outcome transforms.
 Parameters
transforms (
OutcomeTransform
) – The transforms to chain. Internally, the names of the kwargs are used as the keys for accessing the individual transforms on the module.

forward
(Y, Yvar=None)[source]¶ Transform the outcomes in a model’s training targets
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the transformed outcomes:
The transformed outcome observations.
The transformed observation noise (if applicable).

untransform
(Y, Yvar=None)[source]¶ Untransform previously transformed outcomes
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of transformed training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of transformed observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the untransformed outcomes:
The untransformed outcome observations.
The untransformed observation noise (if applicable).

class
botorch.models.transforms.outcome.
Standardize
(m, outputs=None, batch_shape=torch.Size([]), min_stdv=1e-08)[source]¶ Bases:
botorch.models.transforms.outcome.OutcomeTransform
Standardize outcomes (zero mean, unit variance).
This module is stateful: If in train mode, calling forward updates the module state (i.e. the mean/std normalizing constants). If in eval mode, calling forward simply applies the standardization using the current module state.
 Parameters
m (int) – The output dimension.
outputs (Optional[List[int]]) – Which of the outputs to standardize. If omitted, all outputs will be standardized.
batch_shape (Size) – The batch_shape of the training targets.
min_stdv – The minimum standard deviation for which to perform standardization (if lower, only de-mean the data).

forward
(Y, Yvar=None)[source]¶ Standardize outcomes.
If the module is in train mode, this updates the module state (i.e. the mean/std normalizing constants). If the module is in eval mode, simply applies the normalization using the module state.
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the transformed outcomes:
The transformed outcome observations.
The transformed observation noise (if applicable).

untransform
(Y, Yvar=None)[source]¶ Unstandardize outcomes.
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of standardized targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of standardized observation noises associated with the targets (if applicable).
 Returns
A two-tuple with the unstandardized outcomes:
The unstandardized outcome observations.
The unstandardized observation noise (if applicable).
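The stateful train/eval behavior described for Standardize can be illustrated with a small single-output sketch in plain Python. The class SimpleStandardize is hypothetical and works on flat lists rather than batch_shape x n x m tensors; it also assumes the sample standard deviation, which the description above does not specify.

```python
import statistics


class SimpleStandardize:
    """Single-output standardization sketch with train/eval behavior."""

    def __init__(self, min_stdv=1e-8):
        self.training = True
        self.min_stdv = min_stdv
        self.mean = 0.0
        self.stdv = 1.0

    def forward(self, y):
        if self.training:
            # Train mode: refresh the normalizing constants from the data.
            self.mean = statistics.mean(y)
            stdv = statistics.stdev(y)
            # Below the threshold, only de-mean (i.e. divide by 1).
            self.stdv = stdv if stdv >= self.min_stdv else 1.0
        return [(v - self.mean) / self.stdv for v in y]

    def untransform(self, y_tf):
        return [self.mean + self.stdv * v for v in y_tf]


tf = SimpleStandardize()
y = [1.0, 2.0, 3.0, 4.0]
y_tf = tf.forward(y)        # train mode: updates mean and stdv
tf.training = False
y_new = tf.forward([2.5])   # eval mode: reuses the stored mean and stdv
```

Switching to eval mode before transforming new targets keeps the constants learned from the training targets fixed.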

class
botorch.models.transforms.outcome.
Log
(outputs=None)[source]¶ Bases:
botorch.models.transforms.outcome.OutcomeTransform
Log-transform outcomes.
Useful if the targets are modeled using a (multivariate) log-Normal distribution. This means that we can use a standard GP model on the log-transformed outcomes and untransform the model posterior of that GP.
 Parameters
outputs (Optional[List[int]]) – Which of the outputs to log-transform. If omitted, all outputs will be log-transformed.

forward
(Y, Yvar=None)[source]¶ Log-transform outcomes.
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of training targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the transformed outcomes:
The transformed outcome observations.
The transformed observation noise (if applicable).

untransform
(Y, Yvar=None)[source]¶ Untransform log-transformed outcomes
 Parameters
Y (Tensor) – A batch_shape x n x m-dim tensor of log-transformed targets.
Yvar (Optional[Tensor]) – A batch_shape x n x m-dim tensor of log-transformed observation noises associated with the training targets (if applicable).
 Returns
A two-tuple with the untransformed outcomes:
The exponentiated outcome observations.
The exponentiated observation noise (if applicable).
Input Transforms¶

class
botorch.models.transforms.input.
InputTransform
[source]¶ Bases:
torch.nn.modules.module.Module
,abc.ABC
Abstract base class for input transforms.
Initializes internal Module state, shared by both nn.Module and ScriptModule.

class
botorch.models.transforms.input.
ChainedInputTransform
(**transforms)[source]¶ Bases:
botorch.models.transforms.input.InputTransform
,torch.nn.modules.container.ModuleDict
An input transform representing the chaining of individual transforms
Chaining of input transforms.
 Parameters
transforms (
InputTransform
) – The transforms to chain. Internally, the names of the kwargs are used as the keys for accessing the individual transforms on the module.

forward
(X)[source]¶ Transform the inputs to a model.
Individual transforms are applied in sequence.
 Parameters
X (Tensor) – A batch_shape x n x d-dim tensor of inputs.
 Return type
Tensor
 Returns
A batch_shape x n x d-dim tensor of transformed inputs.

untransform
(X)[source]¶ Untransform the inputs to a model.
Untransforms of the individual transforms are applied in reverse sequence.
 Parameters
X (Tensor) – A batch_shape x n x d-dim tensor of transformed inputs.
 Return type
Tensor
 Returns
A batch_shape x n x d-dim tensor of untransformed inputs.
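The ordering rule for chained transforms (forward in sequence, untransform in reverse sequence) can be sketched in plain Python. The Shift, Scale and Chained classes below are hypothetical stand-ins that operate on lists of floats; they only mirror the keyword-argument construction and the ordering described above.

```python
class Shift:
    """Hypothetical invertible transform: add a constant."""

    def __init__(self, offset):
        self.offset = offset

    def forward(self, x):
        return [v + self.offset for v in x]

    def untransform(self, x):
        return [v - self.offset for v in x]


class Scale:
    """Hypothetical invertible transform: multiply by a nonzero factor."""

    def __init__(self, factor):
        self.factor = factor

    def forward(self, x):
        return [v * self.factor for v in x]

    def untransform(self, x):
        return [v / self.factor for v in x]


class Chained:
    """Apply named transforms in insertion order; undo them in reverse."""

    def __init__(self, **transforms):
        self.transforms = dict(transforms)

    def forward(self, x):
        for tf in self.transforms.values():
            x = tf.forward(x)
        return x

    def untransform(self, x):
        for tf in reversed(list(self.transforms.values())):
            x = tf.untransform(x)
        return x


chain = Chained(shift=Shift(1.0), scale=Scale(2.0))
x_tf = chain.forward([0.0, 1.0])    # computes (x + 1) * 2
x_back = chain.untransform(x_tf)    # computes x / 2 - 1
```

Reversing the order on the way back is what makes untransform the exact inverse of forward when the individual transforms do not commute.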

class
botorch.models.transforms.input.
Normalize
(d, bounds=None, batch_shape=torch.Size([]))[source]¶ Bases:
botorch.models.transforms.input.InputTransform
Normalize the inputs to the unit cube.
If no explicit bounds are provided this module is stateful: If in train mode, calling forward updates the module state (i.e. the normalizing bounds). If in eval mode, calling forward simply applies the normalization using the current module state.
 Parameters
d (int) – The dimension of the input space.
bounds (Optional[Tensor]) – If provided, use these bounds to normalize the inputs. If omitted, learn the bounds in train mode.
batch_shape (Size) – The batch shape of the inputs (assuming input tensors of shape batch_shape x n x d). If provided, perform individual normalization per batch, otherwise uses a single normalization.

forward
(X)[source]¶ Normalize the inputs.
If no explicit bounds are provided, this is stateful: In train mode, calling forward updates the module state (i.e. the normalizing bounds). In eval mode, calling forward simply applies the normalization using the current module state.
 Parameters
X (Tensor) – A batch_shape x n x d-dim tensor of inputs.
 Return type
Tensor
 Returns
A batch_shape x n x d-dim tensor of inputs normalized to the module’s bounds.

untransform
(X)[source]¶ Unnormalize the inputs.
 Parameters
X (Tensor) – A batch_shape x n x d-dim tensor of normalized inputs.
 Return type
Tensor
 Returns
A batch_shape x n x d-dim tensor of unnormalized inputs.

property
bounds
¶ The bounds used for normalizing the inputs.
 Return type
Tensor
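A one-dimensional sketch of the behavior described for Normalize, written in plain Python: without explicit bounds, the bounds are learned in train mode and reused in eval mode. The class SimpleNormalize is hypothetical and handles a single input dimension given as a list of floats.

```python
class SimpleNormalize:
    """One-dimensional min-max normalization sketch with optional fixed bounds."""

    def __init__(self, bounds=None):
        self.training = True
        self.learn_bounds = bounds is None
        self.bounds = bounds if bounds is not None else (0.0, 1.0)

    def forward(self, x):
        if self.learn_bounds and self.training:
            # Train mode without explicit bounds: learn them from the inputs.
            self.bounds = (min(x), max(x))
        lower, upper = self.bounds
        return [(v - lower) / (upper - lower) for v in x]

    def untransform(self, x):
        lower, upper = self.bounds
        return [lower + v * (upper - lower) for v in x]


norm = SimpleNormalize()
x_tf = norm.forward([2.0, 4.0, 6.0])   # learns bounds (2.0, 6.0)
norm.training = False
x_new = norm.forward([8.0])            # reuses the learned bounds
```

Note that in eval mode a point outside the learned bounds maps outside the unit interval, as x_new does here.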
Transform Utilities¶

botorch.models.transforms.utils.
lognorm_to_norm
(mu, Cov)[source]¶ Compute mean and covariance of a MVN from those of the associated log-MVN
If Y is log-normal with mean mu_ln and covariance Cov_ln, then X ~ N(mu_n, Cov_n) with
Cov_n_{ij} = log(1 + Cov_ln_{ij} / (mu_ln_{i} * mu_ln_{j}))
mu_n_{i} = log(mu_ln_{i}) - 0.5 * log(1 + Cov_ln_{ii} / mu_ln_{i}**2)
 Parameters
mu (Tensor) – A batch_shape x n mean vector of the log-Normal distribution.
Cov (Tensor) – A batch_shape x n x n covariance matrix of the log-Normal distribution.
 Returns
A two-tuple containing:
The batch_shape x n mean vector of the Normal distribution.
The batch_shape x n x n covariance matrix of the Normal distribution.

botorch.models.transforms.utils.
norm_to_lognorm
(mu, Cov)[source]¶ Compute mean and covariance of a log-MVN from its MVN sufficient statistics
If X ~ N(mu, Cov) and Y = exp(X), then Y is log-normal with
mu_ln_{i} = exp(mu_{i} + 0.5 * Cov_{ii})
Cov_ln_{ij} = exp(mu_{i} + mu_{j} + 0.5 * (Cov_{ii} + Cov_{jj})) * (exp(Cov_{ij}) - 1)
 Parameters
mu (Tensor) – A batch_shape x n mean vector of the Normal distribution.
Cov (Tensor) – A batch_shape x n x n covariance matrix of the Normal distribution.
 Returns
A two-tuple containing:
The batch_shape x n mean vector of the log-Normal distribution.
The batch_shape x n x n covariance matrix of the log-Normal distribution.
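The two conversions above are inverses of each other, which can be checked with a small plain-Python sketch on nested lists. The function names carry a _sketch suffix to make clear they are illustrative re-implementations of the formulas, not the BoTorch functions (which operate on batched tensors); the numeric inputs are made up.

```python
import math


def norm_to_lognorm_sketch(mu, cov):
    """Moments of Y = exp(X) for X ~ N(mu, cov), using nested lists."""
    n = len(mu)
    mu_ln = [math.exp(mu[i] + 0.5 * cov[i][i]) for i in range(n)]
    # exp(mu_i + mu_j + 0.5 * (cov_ii + cov_jj)) equals mu_ln_i * mu_ln_j.
    cov_ln = [
        [mu_ln[i] * mu_ln[j] * (math.exp(cov[i][j]) - 1.0) for j in range(n)]
        for i in range(n)
    ]
    return mu_ln, cov_ln


def lognorm_to_norm_sketch(mu_ln, cov_ln):
    """Invert norm_to_lognorm_sketch: recover the Normal mean and covariance."""
    n = len(mu_ln)
    cov = [
        [math.log(1.0 + cov_ln[i][j] / (mu_ln[i] * mu_ln[j])) for j in range(n)]
        for i in range(n)
    ]
    mu = [math.log(mu_ln[i]) - 0.5 * cov[i][i] for i in range(n)]
    return mu, cov


mu, cov = [0.0, 1.0], [[0.5, 0.1], [0.1, 0.2]]
mu_ln, cov_ln = norm_to_lognorm_sketch(mu, cov)
mu_back, cov_back = lognorm_to_norm_sketch(mu_ln, cov_ln)
```

Up to floating-point error, mu_back and cov_back reproduce mu and cov.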

botorch.models.transforms.utils.
norm_to_lognorm_mean
(mu, var)[source]¶ Compute mean of a log-MVN from its MVN marginals
 Parameters
mu (Tensor) – A batch_shape x n mean vector of the Normal distribution.
var (Tensor) – A batch_shape x n variance vector of the Normal distribution.
 Return type
Tensor
 Returns
The batch_shape x n mean vector of the log-Normal distribution.

botorch.models.transforms.utils.
norm_to_lognorm_variance
(mu, var)[source]¶ Compute variance of a log-MVN from its MVN marginals
 Parameters
mu (Tensor) – A batch_shape x n mean vector of the Normal distribution.
var (Tensor) – A batch_shape x n variance vector of the Normal distribution.
 Return type
Tensor
 Returns
The batch_shape x n variance vector of the log-Normal distribution.
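The two marginal helpers above do not state their formulas. Assuming they use the standard log-normal moments (the diagonal case of the norm_to_lognorm formulas), an elementwise sketch in plain Python looks as follows; the function names are hypothetical and operate on lists instead of tensors.

```python
import math


def lognorm_mean_sketch(mu, var):
    """E[exp(X)] for X ~ N(mu, var), elementwise over lists."""
    return [math.exp(m + 0.5 * v) for m, v in zip(mu, var)]


def lognorm_variance_sketch(mu, var):
    """Var[exp(X)] for X ~ N(mu, var), elementwise over lists."""
    return [(math.exp(v) - 1.0) * math.exp(2.0 * m + v) for m, v in zip(mu, var)]


means = lognorm_mean_sketch([0.0, 1.0], [0.0, 0.5])
variances = lognorm_variance_sketch([0.0, 1.0], [0.0, 0.5])
```

With zero variance the log-normal collapses to a point mass at exp(mu), so the first entry has mean 1 and variance 0.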
Utilities¶
Model Conversion¶
Utilities for converting between different models.

botorch.models.converter.
model_list_to_batched
(model_list)[source]¶ Convert a ModelListGP to a BatchedMultiOutputGPyTorchModel.
 Parameters
model_list (ModelListGP) – The ModelListGP to be converted to the appropriate BatchedMultiOutputGPyTorchModel. All sub-models must be of the same type and have the same shape (batch shape and number of training inputs).
 Return type
 Returns
The model converted into a BatchedMultiOutputGPyTorchModel.
Example
>>> list_gp = ModelListGP(gp1, gp2)
>>> batch_gp = model_list_to_batched(list_gp)

botorch.models.converter.
batched_to_model_list
(batch_model)[source]¶ Convert a BatchedMultiOutputGPyTorchModel to a ModelListGP.
 Parameters
batch_model – The BatchedMultiOutputGPyTorchModel to be converted to a ModelListGP.
 Return type
 Returns
The model converted into a ModelListGP.
Example
>>> train_X = torch.rand(5, 2)
>>> train_Y = torch.rand(5, 2)
>>> batch_gp = SingleTaskGP(train_X, train_Y)
>>> list_gp = batched_to_model_list(batch_gp)
Other Utilities¶
Utility functions for models.

botorch.models.utils.
multioutput_to_batch_mode_transform
(train_X, train_Y, num_outputs, train_Yvar=None)[source]¶ Transforms training inputs for a multi-output model.
Used for multi-output models that internally are represented by a batched single output model, where each output is modeled as an independent batch.
 Parameters
train_X (Tensor) – A n x d or input_batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or target_batch_shape x n x m (batch mode) tensor of training observations.
num_outputs (int) – The number of outputs.
train_Yvar (Optional[Tensor]) – A n x m or target_batch_shape x n x m tensor of observed measurement noise.
 Return type
Tuple[Tensor, Tensor, Optional[Tensor]]
 Returns
3-element tuple containing
A input_batch_shape x m x n x d tensor of training features.
A target_batch_shape x m x n tensor of training observations.
A target_batch_shape x m x n tensor of observed measurement noise.
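The shape bookkeeping in the Returns section can be sketched with plain tuples, without creating any tensors. The helper batch_mode_shapes is hypothetical and only computes output shapes from input shapes; how the library actually rearranges the data is not shown here.

```python
def batch_mode_shapes(x_shape, y_shape, num_outputs):
    """Map input shapes to the shapes listed in the Returns section.

    x_shape: input_batch_shape + (n, d); y_shape: target_batch_shape + (n, m).
    """
    *x_batch, n, d = x_shape
    *y_batch, n_y, m = y_shape
    if n != n_y or m != num_outputs:
        raise ValueError("Inconsistent shapes for train_X and train_Y.")
    # The output dimension m becomes a batch dimension placed before n.
    new_x_shape = (*x_batch, m, n, d)
    new_y_shape = (*y_batch, m, n)
    return new_x_shape, new_y_shape


# Non-batch case: 10 points, 3 features, 2 outputs.
shapes = batch_mode_shapes((10, 3), (10, 2), num_outputs=2)
# Batch case with a leading batch dimension of size 4.
batched = batch_mode_shapes((4, 10, 3), (4, 10, 2), num_outputs=2)
```

When train_Yvar is provided, it has the same shape as train_Y and follows the same shape mapping.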

botorch.models.utils.
add_output_dim
(X, original_batch_shape)[source]¶ Insert the output dimension at the correct location.
The trailing batch dimensions of X must match the original batch dimensions of the training inputs, but can also include extra batch dimensions.
 Parameters
X (Tensor) – A (new_batch_shape) x (original_batch_shape) x n x d tensor of features.
original_batch_shape (Size) – The batch shape of the model’s training inputs.
 Return type
Tuple[Tensor, int]
 Returns
2-element tuple containing
A (new_batch_shape) x (original_batch_shape) x m x n x d tensor of features.
The index corresponding to the output dimension.

botorch.models.utils.
check_no_nans
(Z)[source]¶ Check that tensor does not contain NaN values.
Raises an InputDataError if Z contains NaN values.
 Parameters
Z (Tensor) – The input tensor.
 Return type
None

botorch.models.utils.
check_min_max_scaling
(X, strict=False, atol=0.01, raise_on_fail=False)[source]¶ Check that tensor is normalized to the unit cube.
 Parameters
X (Tensor) – A batch_shape x n x d input tensor. Typically the training inputs of a model.
strict (bool) – If True, require X to be scaled to the unit cube (rather than just to be contained within the unit cube).
atol (float) – The tolerance for the boundary check. Only used if strict=True.
raise_on_fail (bool) – If True, raise an exception instead of a warning.
 Return type
None

botorch.models.utils.
check_standardization
(Y, atol_mean=0.01, atol_std=0.01, raise_on_fail=False)[source]¶ Check that tensor is standardized (zero mean, unit variance).
 Parameters
Y (Tensor) – The input tensor of shape batch_shape x n x m. Typically the train targets of a model. Standardization is checked across the n-dimension.
atol_mean (float) – The tolerance for the mean check.
atol_std (float) – The tolerance for the std check.
raise_on_fail (bool) – If True, raise an exception instead of a warning.
 Return type
None
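The standardization check can be approximated for a single output column in plain Python. This sketch is an assumption about the logic, not the library code: it returns a boolean instead of warning or raising, works on a flat list, and uses the sample standard deviation.

```python
import statistics


def is_standardized(y, atol_mean=0.01, atol_std=0.01):
    """Return True if the values have mean near 0 and std near 1."""
    mean_ok = abs(statistics.mean(y)) <= atol_mean
    std_ok = abs(statistics.stdev(y) - 1.0) <= atol_std
    return mean_ok and std_ok


raw = [10.0, 12.0, 14.0, 16.0]
mean, stdv = statistics.mean(raw), statistics.stdev(raw)
# Standardize across the n-dimension (here: the only dimension).
standardized = [(v - mean) / stdv for v in raw]
```

Here is_standardized(raw) is False, while is_standardized(standardized) is True.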

botorch.models.utils.
validate_input_scaling
(train_X, train_Y, train_Yvar=None, raise_on_fail=False)[source]¶ Helper function to validate input data to models.
 Parameters
train_X (Tensor) – A n x d or batch_shape x n x d (batch mode) tensor of training features.
train_Y (Tensor) – A n x m or batch_shape x n x m (batch mode) tensor of training observations.
train_Yvar (Optional[Tensor]) – A n x m or batch_shape x n x m (batch mode) tensor of observed measurement noise.
raise_on_fail (bool) – If True, raise an error instead of emitting a warning (only for normalization/standardization checks, an error is always raised if NaN values are present).
This function is typically called inside the constructor of standard BoTorch models. It validates the following:
(i) none of the inputs contain NaN values
(ii) the training data (train_X) is normalized to the unit cube
(iii) the training targets (train_Y) are standardized (zero mean, unit var)
No checks (other than the NaN check) are performed for observed variances (train_Yvar) at this point.
 Return type
None

botorch.models.utils.
mod_batch_shape
(module, names, b)[source]¶ Recursive helper to modify gpytorch modules’ batch shape attribute.
Modifies the module in-place.
 Parameters
module (Module) – The module to be modified.
names (List[str]) – The list of names to access the attribute. If the full name of the module is “module.sub_module.leaf_module”, this will be [“sub_module”, “leaf_module”].
b (int) – The new size of the last element of the module’s batch_shape attribute.
 Return type
None