botorch.optim

Optimization

Core

Core abstractions and generic optimizers.

class botorch.optim.core.OptimizationStatus(value)[source]

Bases: int, Enum

An enumeration.

RUNNING = 1
SUCCESS = 2
FAILURE = 3
STOPPED = 4
class botorch.optim.core.OptimizationResult(step: 'int', fval: 'float | int', status: 'OptimizationStatus', runtime: 'float | None' = None, message: 'str | None' = None)[source]

Bases: object

Parameters:
  • step (int)

  • fval (float | int)

  • status (OptimizationStatus)

  • runtime (float | None)

  • message (str | None)

step: int
fval: float | int
status: OptimizationStatus
runtime: float | None = None
message: str | None = None
botorch.optim.core.scipy_minimize(closure, parameters, bounds=None, callback=None, x0=None, method='L-BFGS-B', options=None, timeout_sec=None)[source]

Generic scipy.optimize.minimize-based optimization routine.

Parameters:
  • closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | NdarrayOptimizationClosure) – A callable that returns a tensor and an iterable of gradient tensors, or an NdarrayOptimizationClosure instance.

  • parameters (dict[str, Tensor]) – A dictionary of tensors to be optimized.

  • bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary mapping parameter names to lower and upper bounds.

  • callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – A callable taking parameters and an OptimizationResult as arguments.

  • x0 (ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None) – An optional initialization vector passed to scipy.optimize.minimize.

  • method (str) – Solver type, passed along to scipy.optimize.minimize.

  • options (dict[str, Any] | None) – Dictionary of solver options, passed along to scipy.optimize.minimize.

  • timeout_sec (float | None) – Timeout in seconds to wait before aborting the optimization loop if not converged (will return the best found solution thus far).

Returns:

An OptimizationResult summarizing the final state of the run.

Return type:

OptimizationResult

botorch.optim.core.torch_minimize(closure, parameters, bounds=None, callback=None, optimizer=<class 'torch.optim.adam.Adam'>, scheduler=None, step_limit=None, timeout_sec=None, stopping_criterion=None)[source]

Generic torch.optim-based optimization routine.

Parameters:
  • closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]]) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting relevant parameters’ grad attributes.

  • parameters (dict[str, Tensor]) – A dictionary of tensors to be optimized.

  • bounds (dict[str, tuple[float | None, float | None]] | None) – An optional dictionary of bounds for elements of parameters.

  • callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – A callable taking parameters and an OptimizationResult as arguments.

  • optimizer (Optimizer | Callable[[list[Tensor]], Optimizer]) – A torch.optim.Optimizer instance or a factory that takes a list of parameters and returns an Optimizer instance.

  • scheduler (LRScheduler | Callable[[Optimizer], LRScheduler] | None) – A torch.optim.lr_scheduler._LRScheduler instance or a factory that takes an Optimizer instance and returns an _LRScheduler instance.

  • step_limit (int | None) – Integer specifying a maximum number of optimization steps. One of step_limit, stopping_criterion, or timeout_sec must be passed.

  • timeout_sec (float | None) – Timeout in seconds before terminating the optimization loop. One of step_limit, stopping_criterion, or timeout_sec must be passed.

  • stopping_criterion (Callable[[Tensor], bool] | None) – A StoppingCriterion for the optimization loop.

Returns:

An OptimizationResult summarizing the final state of the run.

Return type:

OptimizationResult

Acquisition Function Optimization

Methods for optimizing acquisition functions.

class botorch.optim.optimize.OptimizeAcqfInputs(acq_function, bounds, q, num_restarts, raw_samples, options, inequality_constraints, equality_constraints, nonlinear_inequality_constraints, fixed_features, post_processing_func, batch_initial_conditions, return_best_only, gen_candidates, sequential, ic_generator=None, timeout_sec=None, return_full_tree=False, retry_on_optimization_warning=True, ic_gen_kwargs=<factory>)[source]

Bases: object

Container for inputs to optimize_acqf.

See docstring for optimize_acqf for explanation of parameters.

Parameters:
  • acq_function (AcquisitionFunction)

  • bounds (Tensor)

  • q (int)

  • num_restarts (int)

  • raw_samples (int | None)

  • options (dict[str, bool | float | int | str] | None)

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None)

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None)

  • nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None)

  • fixed_features (dict[int, float] | None)

  • post_processing_func (Callable[[Tensor], Tensor] | None)

  • batch_initial_conditions (Tensor | None)

  • return_best_only (bool)

  • gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]])

  • sequential (bool)

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None)

  • timeout_sec (float | None)

  • return_full_tree (bool)

  • retry_on_optimization_warning (bool)

  • ic_gen_kwargs (dict)

acq_function: AcquisitionFunction
bounds: Tensor
q: int
num_restarts: int
raw_samples: int | None
options: dict[str, bool | float | int | str] | None
inequality_constraints: list[tuple[Tensor, Tensor, float]] | None
equality_constraints: list[tuple[Tensor, Tensor, float]] | None
nonlinear_inequality_constraints: list[tuple[Callable, bool]] | None
fixed_features: dict[int, float] | None
post_processing_func: Callable[[Tensor], Tensor] | None
batch_initial_conditions: Tensor | None
return_best_only: bool
gen_candidates: Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]]
sequential: bool
ic_generator: Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None = None
timeout_sec: float | None = None
return_full_tree: bool = False
retry_on_optimization_warning: bool = True
ic_gen_kwargs: dict
property full_tree: bool
get_ic_generator()[source]
Return type:

Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None]

botorch.optim.optimize.optimize_acqf(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, return_best_only=True, gen_candidates=None, sequential=False, *, ic_generator=None, timeout_sec=None, return_full_tree=False, retry_on_optimization_warning=True, **ic_gen_kwargs)[source]

Generate a set of candidates via multi-start optimization.

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

  • q (int) – The number of candidates.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int | None) – The number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • options (dict[str, bool | float | int | str] | None) – Options for candidate generation.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example. When q=1, or when applying the same constraint to each candidate in the batch (intra-point constraint), indices should be a 1-d tensor. For inter-point constraints, in which the constraint is applied to the whole batch of candidates, indices must be a 2-d tensor, where in each row indices[i] =(k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs. See the docstring of make_scipy_linear_constraints for an example.

  • nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean indicating whether it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation. All indices should be non-negative.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.

  • return_best_only (bool) – If False, outputs the solutions corresponding to all random restart initializations of the optimization.

  • gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., gen_candidates_scipy and gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy

  • sequential (bool) – If False, uses joint optimization, otherwise uses sequential optimization.

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.

  • timeout_sec (float | None) – Max amount of time optimization can run for.

  • return_full_tree (bool) – Return the full tree of optimizers of the previous iteration.

  • retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.

  • ic_gen_kwargs (Any) – Additional keyword arguments passed to function specified by ic_generator

Returns:

A two-element tuple containing

  • A tensor of generated candidates. The shape is q x d if return_best_only is True (default), and num_restarts x q x d if return_best_only is False.

  • A tensor of associated acquisition values. If sequential=False, this is a (num_restarts)-dim tensor of joint acquisition values (with explicit restart dimension if return_best_only=False). If sequential=True, this is a q-dim tensor of expected acquisition values conditional on having observed candidates 0,1,…,i-1.

Return type:

tuple[Tensor, Tensor]

Example

>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> # generate `q=2` candidates jointly using 20 random restarts
>>> # and 512 raw samples
>>> candidates, acq_value = optimize_acqf(qEI, bounds, 2, 20, 512)
>>> # generate `q=3` candidates sequentially using 15 random restarts
>>> # and 256 raw samples
>>> candidates, acq_value_list = optimize_acqf(
>>>     qEI, bounds, 3, 15, 256, sequential=True
>>> )
botorch.optim.optimize.optimize_acqf_cyclic(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, cyclic_options=None, *, ic_generator=None, timeout_sec=None, return_full_tree=False, retry_on_optimization_warning=True, **ic_gen_kwargs)[source]

Generate a set of q candidates via cyclic optimization.

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

  • q (int) – The number of candidates.

  • num_restarts (int) – Number of starting points for multistart acquisition function optimization.

  • raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • options (dict[str, bool | float | int | str] | None) – Options for candidate generation.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation. All indices should be non-negative.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. If no initial conditions are provided, the default initialization will be used.

  • cyclic_options (dict[str, bool | float | int | str] | None) – Options for stopping criterion for outer cyclic optimization.

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.

  • timeout_sec (float | None) – Max amount of time optimization can run for.

  • return_full_tree (bool) – Return the full tree of optimizers of the previous iteration.

  • retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.

  • ic_gen_kwargs (Any) – Additional keyword arguments passed to the function specified by ic_generator.

Returns:

A two-element tuple containing

  • a q x d-dim tensor of generated candidates.

  • a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i.

Return type:

tuple[Tensor, Tensor]

Example

>>> # generate `q=3` candidates cyclically using 15 random restarts
>>> # 256 raw samples, and 4 cycles
>>>
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> candidates, acq_value_list = optimize_acqf_cyclic(
>>>     qEI, bounds, 3, 15, 256, cyclic_options={"maxiter": 4}
>>> )
botorch.optim.optimize.optimize_acqf_list(acq_function_list, bounds, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, fixed_features_list=None, post_processing_func=None, ic_generator=None, ic_gen_kwargs=None)[source]

Generate a list of candidates from a list of acquisition functions.

The acquisition functions are optimized in sequence, with previous candidates set as X_pending. This is also known as sequential greedy optimization.

Parameters:
  • acq_function_list (list[AcquisitionFunction]) – A list of acquisition functions.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

  • num_restarts (int) – Number of starting points for multistart acquisition function optimization.

  • raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • options (dict[str, bool | float | int | str] | None) – Options for candidate generation.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

  • nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean indicating whether it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation. All indices (feature_index) should be non-negative.

  • fixed_features_list (list[dict[int, float]] | None) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization. If fixed_features_list is provided, optimize_acqf_mixed is invoked. All indices (feature_index) should be non-negative.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.

  • ic_gen_kwargs (dict | None) – Additional keyword arguments passed to the function specified by ic_generator.

Returns:

A two-element tuple containing

  • a q x d-dim tensor of generated candidates.

  • a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i.

Return type:

tuple[Tensor, Tensor]

botorch.optim.optimize.optimize_acqf_mixed(acq_function, bounds, q, num_restarts, fixed_features_list, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, post_processing_func=None, batch_initial_conditions=None, return_best_only=True, gen_candidates=None, ic_generator=None, timeout_sec=None, retry_on_optimization_warning=True, ic_gen_kwargs=None)[source]

Optimize over a list of fixed_features and return the best solution.

This is useful for optimizing over mixed continuous and discrete domains. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

  • q (int) – The number of candidates.

  • num_restarts (int) – Number of starting points for multistart acquisition function optimization.

  • raw_samples (int | None) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • fixed_features_list (list[dict[int, float]]) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization. All indices (feature_index) should be non-negative.

  • options (dict[str, bool | float | int | str] | None) – Options for candidate generation.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

  • nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean indicating whether it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.

  • return_best_only (bool) – If False, outputs the solutions corresponding to all random restart initializations of the optimization. Setting this keyword to False is only allowed for q=1. Defaults to True.

  • gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., gen_candidates_scipy and gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.

  • timeout_sec (float | None) – Max amount of time optimization can run for.

  • retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.

  • ic_gen_kwargs (dict | None) – Additional keyword arguments passed to the function specified by ic_generator.

Returns:

A two-element tuple containing

  • A tensor of generated candidates. The shape is q x d if return_best_only is True (default), and num_restarts x q x d if return_best_only is False.

  • A tensor of associated acquisition values of dim num_restarts if return_best_only=False, else a scalar acquisition value.

Return type:

tuple[Tensor, Tensor]
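One way to build fixed_features_list for a mixed search space is to enumerate every combination of the discrete features. The sketch below assumes a hypothetical space in which feature 2 is binary and feature 3 takes three levels; the indices and values are illustrative only.

```python
from itertools import product

# Hypothetical discrete features: {feature_index: allowed values}.
discrete_values = {2: [0.0, 1.0], 3: [0.0, 0.5, 1.0]}

# One {feature_index: value} map per combination of discrete values.
fixed_features_list = [
    dict(zip(discrete_values.keys(), combo))
    for combo in product(*discrete_values.values())
]

print(len(fixed_features_list))  # 6
print(fixed_features_list[0])    # {2: 0.0, 3: 0.0}
```

The number of maps grows multiplicatively with the number of levels per feature, and each map corresponds to one optimization over the remaining continuous features.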

botorch.optim.optimize.optimize_acqf_discrete(acq_function, q, choices, max_batch_size=2048, unique=True, X_avoid=None, inequality_constraints=None)[source]

Optimize over a discrete set of points using batch evaluation.

For q > 1 this function generates candidates by means of sequential conditioning (rather than joint optimization), since for all but the smallest number of choices the set choices^q of discrete points to evaluate quickly explodes.

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction.

  • q (int) – The number of candidates.

  • choices (Tensor) – A num_choices x d tensor of possible choices.

  • max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.

  • unique (bool) – If True, return unique choices; otherwise choices may be repeated (only relevant if q > 1).

  • X_avoid (Tensor | None) – An n x d tensor of candidates that we aren’t allowed to pick. These will be removed from the set of choices.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs. Infeasible points will be removed from the set of choices.

Returns:

A two-element tuple containing

  • a q x d-dim tensor of generated candidates.

  • an associated acquisition value.

Return type:

tuple[Tensor, Tensor]
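To make the roles of X_avoid and inequality_constraints concrete, the following sketch filters a small set of choices in the way the parameter descriptions suggest. The helper filter_choices is hypothetical and written only for illustration; it is not the library's implementation.

```python
import torch

choices = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
X_avoid = torch.tensor([[1.0, 1.0]])
# Encodes x[0] + x[1] >= 1 as (indices, coefficients, rhs).
inequality_constraints = [(torch.tensor([0, 1]), torch.tensor([1.0, 1.0]), 1.0)]


def filter_choices(choices, X_avoid=None, inequality_constraints=None):
    """Hypothetical helper: drop avoided and infeasible rows of `choices`."""
    keep = torch.ones(choices.shape[0], dtype=torch.bool)
    if X_avoid is not None:
        # A choice is avoided if it matches any row of X_avoid exactly.
        matches = (choices.unsqueeze(1) == X_avoid.unsqueeze(0)).all(dim=-1)
        keep &= ~matches.any(dim=-1)
    for indices, coefficients, rhs in inequality_constraints or []:
        # sum_i (X[indices[i]] * coefficients[i]) >= rhs, applied row-wise.
        keep &= (choices[:, indices] * coefficients).sum(dim=-1) >= rhs
    return choices[keep]


remaining = filter_choices(choices, X_avoid, inequality_constraints)
print(remaining)  # rows [0, 1] and [1, 0] remain
```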

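botorch.optim.optimize.optimize_acqf_discrete_local_search(acq_function, discrete_choices, q, num_restarts=20, raw_samples=4096, inequality_constraints=None, X_avoid=None, batch_initial_conditions=None, max_batch_size=2048, max_tries=100, unique=True)[source]

Optimize acquisition function over a lattice.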

This is useful when d is large and enumeration of the search space isn’t possible. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).

NOTE: While this method supports arbitrary lattices, it has only been thoroughly tested for {0, 1}^d. Consider it to be in alpha stage for the more general case.

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction

  • discrete_choices (list[Tensor]) – A list of possible discrete choices for each dimension. Each element in the list is expected to be a torch tensor.

  • q (int) – The number of candidates.

  • num_restarts (int) – Number of starting points for multistart acquisition function optimization.

  • raw_samples (int) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

  • X_avoid (Tensor | None) – An n x d tensor of candidates that we aren’t allowed to pick.

  • batch_initial_conditions (Tensor | None) – A tensor of size n x 1 x d to specify the initial conditions. Set this if you do not want to use the default initialization strategy.

  • max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.

  • max_tries (int) – Maximum number of iterations to try when generating initial conditions.

  • unique (bool) – If True, return unique choices; otherwise choices may be repeated (only relevant if q > 1).

Returns:

A two-element tuple containing

  • a q x d-dim tensor of generated candidates.

  • an associated acquisition value.

Return type:

tuple[Tensor, Tensor]
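The sketch below illustrates why enumeration breaks down for large d and what a single local move on the lattice could look like. The dimension d = 40 and the helper one_coordinate_neighbors are assumptions made for illustration; the helper is not the library's search routine.

```python
import math

import torch

d = 40
# One tensor of allowed values per dimension, here the binary lattice {0, 1}^d.
discrete_choices = [torch.tensor([0.0, 1.0]) for _ in range(d)]

# Size of the full lattice: far too many points to enumerate.
num_points = math.prod(len(values) for values in discrete_choices)
print(num_points)  # 1099511627776


def one_coordinate_neighbors(x, discrete_choices):
    """Hypothetical helper: all lattice points differing from x in one coordinate."""
    neighbors = []
    for j, values in enumerate(discrete_choices):
        for v in values:
            if v != x[j]:
                neighbor = x.clone()
                neighbor[j] = v
                neighbors.append(neighbor)
    return torch.stack(neighbors)


neighbors = one_coordinate_neighbors(torch.zeros(d), discrete_choices)
print(neighbors.shape)  # torch.Size([40, 40])
```

A neighborhood of this kind has only d points on the binary lattice, which is what makes a local search tractable where full enumeration is not.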

Model Fitting Optimization

Tools for model fitting.

botorch.optim.fit.fit_gpytorch_mll_scipy(mll, parameters=None, bounds=None, closure=None, closure_kwargs=None, method='L-BFGS-B', options=None, callback=None, timeout_sec=None)[source]

Generic scipy.optimize-based fitting routine for GPyTorch MLLs.

The model and likelihood in mll must already be in train mode.

Parameters:
  • mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.

  • parameters (dict[str, Tensor] | None) – Optional dictionary of parameters to be optimized. Defaults to all parameters of mll that require gradients.

  • bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary of user-specified bounds for parameters. Used to update default parameter bounds obtained from mll.

  • closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | None) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting the grad attributes of parameters. If no closure is provided, one will be obtained by calling get_loss_closure_with_grads.

  • closure_kwargs (dict[str, Any] | None) – Keyword arguments passed to closure.

  • method (str) – Solver type, passed along to scipy.optimize.minimize.

  • options (dict[str, Any] | None) – Dictionary of solver options, passed along to scipy.optimize.minimize.

  • callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – Optional callback taking parameters and an OptimizationResult as its sole arguments.

  • timeout_sec (float | None) – Timeout in seconds after which to terminate the fitting loop (note that timing out can result in bad fits!).

Returns:

The final OptimizationResult.

Return type:

OptimizationResult

botorch.optim.fit.fit_gpytorch_mll_torch(mll, parameters=None, bounds=None, closure=None, closure_kwargs=None, step_limit=None, stopping_criterion=<class 'botorch.utils.types.DEFAULT'>, optimizer=<class 'torch.optim.adam.Adam'>, scheduler=None, callback=None, timeout_sec=None)[source]

Generic torch.optim-based fitting routine for GPyTorch MLLs.

Parameters:
  • mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.

  • parameters (dict[str, Tensor] | None) – Optional dictionary of parameters to be optimized. Defaults to all parameters of mll that require gradients.

  • bounds (dict[str, tuple[float | None, float | None]] | None) – A dictionary of user-specified bounds for parameters. Used to update default parameter bounds obtained from mll.

  • closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]] | None) – Callable that returns a tensor and an iterable of gradient tensors. Responsible for setting the grad attributes of parameters. If no closure is provided, one will be obtained by calling get_loss_closure_with_grads.

  • closure_kwargs (dict[str, Any] | None) – Keyword arguments passed to closure.

  • step_limit (int | None) – Optional upper bound on the number of optimization steps.

  • stopping_criterion (Callable[[Tensor], bool] | None) – A StoppingCriterion for the optimization loop.

  • optimizer (Optimizer | Callable[[...], Optimizer]) – A torch.optim.Optimizer instance or a factory that takes a list of parameters and returns an Optimizer instance.

  • scheduler (_LRScheduler | Callable[[...], _LRScheduler] | None) – A torch.optim.lr_scheduler._LRScheduler instance or a factory that takes an Optimizer instance and returns an _LRScheduler instance.

  • callback (Callable[[dict[str, Tensor], OptimizationResult], None] | None) – Optional callback taking parameters and an OptimizationResult as its sole arguments.

  • timeout_sec (float | None) – Timeout in seconds after which to terminate the fitting loop (note that timing out can result in bad fits!).

Returns:

The final OptimizationResult.

Return type:

OptimizationResult

Initialization Helpers

References

[Regis]

R. G. Regis, C. A. Shoemaker. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization, Engineering Optimization, 2013.

botorch.optim.initializers.transform_constraints(constraints, q, d)[source]

Transform constraints to sample from a d*q-dimensional space instead of a d-dimensional space.

This function assumes that constraints are the same for each input batch, and broadcasts the constraints accordingly to the input batch shape.

Parameters:
  • constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an (in-)equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. If indices is a 2-d Tensor, this supports specifying constraints across the points in the q-batch (inter-point constraints). If None, this function is a no-op and simply returns None.

  • q (int) – Size of the q-batch.

  • d (int) – Dimensionality of the problem.

Returns:

List of transformed constraints, if there are constraints. Returns None otherwise.

Return type:

List[Tuple[Tensor, Tensor, float]]

botorch.optim.initializers.transform_intra_point_constraint(constraint, d, q)[source]

Transforms an intra-point/pointwise constraint from a d-dimensional space to a d*q-dimensional space.

Parameters:
  • constraint (tuple[Tensor, Tensor, float]) – A tuple (indices, coefficients, rhs) encoding an (in-)equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. Here indices must be one-dimensional, and the constraint is applied to all points within the q-batch.

  • d (int) – Dimensionality of the problem.

  • q (int) – Size of the q-batch.

Raises:

ValueError – If indices in the constraints are larger than the dimensionality d of the problem.

Returns:

List of transformed constraints.

Return type:

List[Tuple[Tensor, Tensor, float]]
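The index arithmetic behind this transformation can be sketched in plain Python. This is an illustration only, not BoTorch's implementation: the helper name expand_intra_point is hypothetical, lists stand in for tensors, and the sketch assumes the q x d batch is flattened row-major so that feature j of point i sits at position i * d + j.

```python
def expand_intra_point(constraint, d, q):
    """Replicate one pointwise constraint for each of the q points.

    Assumption of this sketch: the q x d batch is flattened row-major,
    so feature j of point i maps to flat index i * d + j.
    """
    indices, coefficients, rhs = constraint
    if max(indices) >= d:
        raise ValueError("Constraint indices must be smaller than d.")
    return [
        ([i * d + j for j in indices], list(coefficients), rhs)
        for i in range(q)
    ]


# x[0] + 2 * x[2] >= 1 in d = 3 dimensions, applied to a q = 2 batch.
expanded = expand_intra_point(([0, 2], [1.0, 2.0], 1.0), d=3, q=2)
```

Each point in the batch receives its own copy of the constraint, with indices shifted by a multiple of d.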

botorch.optim.initializers.transform_inter_point_constraint(constraint, d)[source]

Transforms an inter-point constraint from a d-dimensional space to a d*q-dimensional space.

Parameters:
  • constraint (tuple[Tensor, Tensor, float]) – A tuple (indices, coefficients, rhs) encoding an (in-)equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) (>)= rhs. indices must be a 2-d Tensor, where in each row indices[i] = (k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.

  • d (int) – Dimensionality of the problem.

Raises:

ValueError – If indices in the constraints are larger than the dimensionality d of the problem.

Returns:

Transformed constraint.

Return type:

List[Tuple[Tensor, Tensor, float]]
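A minimal sketch of how (k_i, l_i) index pairs could be mapped to positions in a flattened q-batch is shown below. The helper name flatten_inter_point is hypothetical, lists stand in for tensors, and the row-major layout (flat index k * d + l) is an assumption of this sketch rather than a statement about BoTorch internals.

```python
def flatten_inter_point(constraint, d):
    """Map (point, feature) index pairs to flat indices k * d + l."""
    indices, coefficients, rhs = constraint
    flat = []
    for k, l in indices:
        if l >= d:
            raise ValueError("Feature index must be smaller than d.")
        flat.append(k * d + l)
    return flat, list(coefficients), rhs


# Feature 1 of point 0 must be at least feature 1 of point 1:
# X[0, 1] - X[1, 1] >= 0, with d = 3 features per point.
flat_constraint = flatten_inter_point(
    ([(0, 1), (1, 1)], [1.0, -1.0], 0.0), d=3
)
```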

botorch.optim.initializers.sample_q_batches_from_polytope(n, q, bounds, n_burnin, n_thinning, seed=None, inequality_constraints=None, equality_constraints=None)[source]

Samples n q-batches from a polytope of dimension d.

Parameters:
  • n (int) – Number of q-batches to sample.

  • q (int) – Number of samples per q-batch

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

  • n_burnin (int) – The number of burn-in samples for the Markov chain sampler.

  • n_thinning (int) – The amount of thinning. The sampler will return every n_thinning sample (after burn-in).

  • seed (int | None) – The random seed.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

Returns:

A n x q x d-dim tensor of samples.

Return type:

Tensor
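To make the (indices, coefficients, rhs) encoding concrete, the following sketch checks whether a single point satisfies one such constraint. The helper name satisfies and the tolerance tol are hypothetical and not part of BoTorch; plain lists stand in for tensors.

```python
def satisfies(x, constraint, equality=False, tol=1e-8):
    """Check sum_i x[indices[i]] * coefficients[i] (>)= rhs for one point."""
    indices, coefficients, rhs = constraint
    lhs = sum(x[i] * c for i, c in zip(indices, coefficients))
    if equality:
        return abs(lhs - rhs) <= tol
    return lhs >= rhs - tol


x = [0.2, 0.5, 0.3]
sum_to_one = ([0, 1, 2], [1.0, 1.0, 1.0], 1.0)  # x0 + x1 + x2 = 1
lower_bound = ([0, 1], [1.0, 1.0], 0.8)         # x0 + x1 >= 0.8

on_simplex = satisfies(x, sum_to_one, equality=True)
above_bound = satisfies(x, lower_bound)
```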

botorch.optim.initializers.gen_batch_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None, generator=None, fixed_X_fantasies=None)[source]

Generate a batch of initial conditions for random-restart optimization.

TODO: Support t-batches of initial conditions.

Parameters:
  • acq_function (AcquisitionFunction) – The acquisition function to be optimized.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

  • q (int) – The number of candidates to consider.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int) – The number of raw samples to consider in the initialization heuristic. Note: if sample_around_best is True (the default is False), then 2 * raw_samples samples are used.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • options (dict[str, bool | float | int] | None) – Options for initial condition generation. For valid options see initialize_q_batch_topn, initialize_q_batch_nonneg, and initialize_q_batch. If options contains a topn=True then initialize_q_batch_topn will be used. Else if options contains a nonnegative=True entry, then acq_function is assumed to be non-negative (useful when using custom acquisition functions). initialize_q_batch will be used otherwise. In addition, an “init_batch_limit” option can be passed to specify the batch limit for the initialization. This is useful for avoiding memory limits when computing the batch posterior over raw samples.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

  • generator (Callable[[int, int, int | None], Tensor] | None) – Callable for generating samples that are then further processed. It receives n, q and seed as arguments and returns a tensor of shape n x q x d.

  • fixed_X_fantasies (Tensor | None) – A fixed set of fantasy points to concatenate to the q candidates being initialized along the -2 dimension. The shape should be num_pseudo_points x d. E.g., this should be num_fantasies x d for KG and num_fantasies*num_pareto x d for HVKG.

Returns:

A num_restarts x q x d tensor of initial conditions.

Return type:

Tensor

Example

>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> Xinit = gen_batch_initial_conditions(
...     qEI, bounds, q=3, num_restarts=25, raw_samples=500
... )
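The precedence among the options keys described above, and the effect of an init_batch_limit, can be sketched as follows. Both helpers (pick_init_heuristic and chunked) are hypothetical names used only for illustration; the sketch mirrors the documented rules and returns function names as strings rather than calling BoTorch.

```python
def pick_init_heuristic(options=None):
    """Return the name of the heuristic implied by the options dict."""
    options = options or {}
    if options.get("topn"):
        return "initialize_q_batch_topn"
    if options.get("nonnegative"):
        return "initialize_q_batch_nonneg"
    return "initialize_q_batch"


def chunked(raw_samples, init_batch_limit):
    """Yield slices of at most init_batch_limit samples to bound memory use."""
    for start in range(0, len(raw_samples), init_batch_limit):
        yield raw_samples[start:start + init_batch_limit]


heuristic = pick_init_heuristic({"topn": True, "nonnegative": True})
chunk_sizes = [len(c) for c in chunked(list(range(500)), init_batch_limit=128)]
```

When both topn and nonnegative are set, topn takes precedence, matching the "if ... else if" wording of the options description.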
botorch.optim.initializers.gen_one_shot_kg_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]

Generate a batch of smart initializations for qKnowledgeGradient.

This function generates initial conditions for optimizing one-shot KG using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining frac_random fantasy points as well as all q candidate points are chosen according to the standard initialization strategy in gen_batch_initial_conditions.

Parameters:
  • acq_function (qKnowledgeGradient) – The qKnowledgeGradient instance to be optimized.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.

  • q (int) – The number of candidates to consider.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

Returns:

A num_restarts x q’ x d tensor that can be used as initial conditions for optimize_acqf(). Here q’ = q + num_fantasies is the total number of points (candidate points plus fantasy points).

Return type:

Tensor | None

Example

>>> qKG = qKnowledgeGradient(model, num_fantasies=64)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_one_shot_kg_initial_conditions(
...     qKG, bounds, q=3, num_restarts=10, raw_samples=512,
...     options={"frac_random": 0.25},
... )
botorch.optim.initializers.gen_one_shot_hvkg_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]

Generate a batch of smart initializations for qHypervolumeKnowledgeGradient.

This function generates initial conditions for optimizing one-shot HVKG using the hypervolume maximizing set (of fixed size) under the posterior mean. Intuitively, the hypervolume maximizing set of the fantasized posterior mean will often be close to a hypervolume maximizing set under the current posterior mean. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) of the restarts are generated by learning the hypervolume maximizing sets under the current posterior mean, where each hypervolume maximizing set is obtained from maximizing the hypervolume from a different starting point. Given a hypervolume maximizing set, the q candidate points are selected using the standard initialization strategy in gen_batch_initial_conditions, with the hypervolume maximizing set held fixed. For the remaining frac_random fraction of restarts, the fantasy points as well as all q candidate points are chosen according to the standard initialization strategy in gen_batch_initial_conditions.

Parameters:
  • acq_function (qHypervolumeKnowledgeGradient) – The qHypervolumeKnowledgeGradient instance to be optimized.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.

  • q (int) – The number of candidates to consider.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

Returns:

A num_restarts x q’ x d tensor that can be used as initial conditions for optimize_acqf(). Here q’ = q + num_fantasies is the total number of points (candidate points plus fantasy points).

Return type:

Tensor | None

Example

>>> qHVKG = qHypervolumeKnowledgeGradient(model, ref_point)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_one_shot_hvkg_initial_conditions(
...     qHVKG, bounds, q=3, num_restarts=10, raw_samples=512,
...     options={"frac_random": 0.25},
... )
botorch.optim.initializers.gen_value_function_initial_conditions(acq_function, bounds, num_restarts, raw_samples, current_model, fixed_features=None, options=None)[source]

Generate a batch of smart initializations for optimizing the value function of qKnowledgeGradient.

This function generates initial conditions for optimizing the inner problem of KG, i.e. its value function, using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) of raw samples is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining raw samples are generated using draw_sobol_samples. All raw samples are then evaluated, and the initial conditions are selected according to the standard initialization strategy in ‘initialize_q_batch’ individually for each inner problem.

Parameters:
  • acq_function (AcquisitionFunction) – The value function instance to be optimized.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

  • current_model (Model) – The model of the KG acquisition function that was used to generate the fantasy model of the value function.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • options (dict[str, bool | float | int] | None) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).

Returns:

A num_restarts x batch_shape x q x d tensor that can be used as initial conditions for optimize_acqf(). Here batch_shape is the batch shape of value function model.

Return type:

Tensor

Example

>>> fant_X = torch.rand(5, 1, 2)
>>> fantasy_model = model.fantasize(fant_X, SobolQMCNormalSampler(16))
>>> value_function = PosteriorMean(fantasy_model)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_value_function_initial_conditions(
...     value_function, bounds, num_restarts=10, raw_samples=512,
...     current_model=model, options={"frac_random": 0.25},
... )
botorch.optim.initializers.initialize_q_batch(X, acq_vals, n, eta=1.0)[source]

Heuristic for selecting initial conditions for candidate generation.

This heuristic selects points from X (without replacement) with probability proportional to exp(eta * Z), where Z = (acq_vals - mean(acq_vals)) / std(acq_vals) and eta is a temperature parameter.

When using an acquisition function that is non-negative and possibly zero over large areas of the feature space (e.g. qEI), you should use initialize_q_batch_nonneg instead.

Parameters:
  • X (Tensor) – A b x batch_shape x q x d tensor of b - batch_shape samples of q-batches from a d-dim feature space. Typically, these are generated using qMC sampling.

  • acq_vals (Tensor) – A tensor of b x batch_shape outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.

  • n (int) – The number of initial conditions to be generated. Must be less than b.

  • eta (float) – Temperature parameter for weighting samples.

Returns:

  • An n x batch_shape x q x d tensor of n - batch_shape q-batch initial conditions, where each batch of n x q x d samples is selected independently.

  • An n x batch_shape tensor of the corresponding acquisition values.

Return type:

tuple[Tensor, Tensor]

Example

>>> # To get `n=10` starting points of q-batch size `q=3`
>>> # for model with `d=6`:
>>> qUCB = qUpperConfidenceBound(model, beta=0.1)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch(X=X_rnd, acq_vals=qUCB(X_rnd), n=10)
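The weighting rule described above, with selection probability proportional to exp(eta * Z), can be sketched with the standard library alone. This is an illustrative sketch, not BoTorch's implementation, which operates on tensors and may differ in details; the helper name softmax_select, the seed argument, and the uniform fallback for zero variance are assumptions made here.

```python
import math
import random
import statistics


def softmax_select(acq_vals, n, eta=1.0, seed=0):
    """Return n distinct indices, drawn with weights exp(eta * Z)."""
    if n > len(acq_vals):
        raise ValueError("Cannot select more points than are available.")
    mean = statistics.fmean(acq_vals)
    std = statistics.stdev(acq_vals)
    if std == 0:
        # All values equal: standardization is undefined, use uniform weights.
        weights = [1.0] * len(acq_vals)
    else:
        weights = [math.exp(eta * (v - mean) / std) for v in acq_vals]
    rng = random.Random(seed)
    remaining = list(range(len(acq_vals)))
    chosen = []
    for _ in range(n):
        # Draw one index at a time and remove it, i.e. without replacement.
        pick = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


picked = softmax_select([0.1, 0.4, 0.2, 0.9, 0.3], n=3, seed=1)
```

A larger eta concentrates the selection on the highest acquisition values, while eta close to zero approaches uniform sampling.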
botorch.optim.initializers.initialize_q_batch_nonneg(X, acq_vals, n, eta=1.0, alpha=0.0001)[source]

Heuristic for selecting initial conditions for non-neg. acquisition functions.

This function is similar to initialize_q_batch, but designed specifically for acquisition functions that are non-negative and possibly zero over large areas of the feature space (e.g. qEI). All samples for which acq_vals < alpha * max(acq_vals) will be ignored (assuming that acq_vals contains at least one positive value).

Parameters:
  • X (Tensor) – A b x q x d tensor of b samples of q-batches from a d-dim. feature space. Typically, these are generated using qMC.

  • acq_vals (Tensor) – A tensor of b outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.

  • n (int) – The number of initial conditions to be generated. Must be less than b.

  • eta (float) – Temperature parameter for weighting samples.

  • alpha (float) – The threshold (as a fraction of the maximum observed value) under which to ignore samples. All input samples for which acq_vals < alpha * max(acq_vals) will be ignored.

Returns:

  • An n x q x d tensor of n q-batch initial conditions.

  • An n tensor of the corresponding acquisition values.

Return type:

tuple[Tensor, Tensor]

Example

>>> # To get `n=10` starting points of q-batch size `q=3`
>>> # for model with `d=6`:
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch_nonneg(
...     X=X_rnd, acq_vals=qEI(X_rnd), n=10
... )
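The alpha threshold can be illustrated with a small filter over plain Python lists. The helper name filter_by_alpha is hypothetical, and the fallback of keeping every sample when no value is positive is an assumption of this sketch; the documentation above only specifies the behavior when at least one value is positive.

```python
def filter_by_alpha(acq_vals, alpha=1e-4):
    """Return indices of samples with acq_vals >= alpha * max(acq_vals)."""
    max_val = max(acq_vals)
    if max_val <= 0:
        # Assumption of this sketch: with no positive value, keep everything.
        return list(range(len(acq_vals)))
    return [i for i, v in enumerate(acq_vals) if v >= alpha * max_val]


# The threshold is 1e-4 * 2.0 = 2e-4, so the first two samples are dropped.
kept = filter_by_alpha([0.0, 1e-7, 0.5, 2.0], alpha=1e-4)
```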
botorch.optim.initializers.initialize_q_batch_topn(X, acq_vals, n, largest=True, sorted=True)[source]

Take the top n initial conditions for candidate generation.

Parameters:
  • X (Tensor) – A b x q x d tensor of b samples of q-batches from a d-dim. feature space. Typically, these are generated using qMC.

  • acq_vals (Tensor) – A tensor of b outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.

  • n (int) – The number of initial conditions to be generated. Must be less than b.

  • largest (bool)

  • sorted (bool)

Returns:

  • An n x q x d tensor of n q-batch initial conditions.

  • An n tensor of the corresponding acquisition values.

Return type:

tuple[Tensor, Tensor]

Example

>>> # To get `n=10` starting points of q-batch size `q=3`
>>> # for model with `d=6`:
>>> qUCB = qUpperConfidenceBound(model, beta=0.1)
>>> X_rnd = torch.rand(500, 3, 6)
>>> X_init, acq_init = initialize_q_batch_topn(
...     X=X_rnd, acq_vals=qUCB(X_rnd), n=10
... )
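Top-n selection itself is simple enough to sketch without tensors. The helper name top_n is hypothetical; treating largest as "pick the highest values first" is an assumption based on the parameter name, since the parameter is not described above.

```python
def top_n(samples, acq_vals, n, largest=True):
    """Return the n samples with the highest (or lowest) acquisition values."""
    order = sorted(
        range(len(acq_vals)), key=lambda i: acq_vals[i], reverse=largest
    )[:n]
    return [samples[i] for i in order], [acq_vals[i] for i in order]


best, best_vals = top_n(["a", "b", "c", "d"], [0.1, 0.9, 0.5, 0.7], n=2)
```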
botorch.optim.initializers.sample_points_around_best(acq_function, n_discrete_points, sigma, bounds, best_pct=5.0, subset_sigma=0.1, prob_perturb=None)[source]

Find best points and sample nearby points.

Parameters:
  • acq_function (AcquisitionFunction) – The acquisition function.

  • n_discrete_points (int) – The number of points to sample.

  • sigma (float) – The standard deviation of the additive gaussian noise for perturbing the best points.

  • bounds (Tensor) – A 2 x d-dim tensor containing the bounds.

  • best_pct (float) – The percentage of best points to perturb.

  • subset_sigma (float) – The standard deviation of the additive gaussian noise for perturbing a subset of dimensions of the best points.

  • prob_perturb (float | None) – The probability of perturbing each dimension.

Returns:

An optional n_discrete_points x d-dim tensor containing the sampled points. This is None if no baseline points are found.

Return type:

Tensor | None

botorch.optim.initializers.sample_truncated_normal_perturbations(X, n_discrete_points, sigma, bounds, qmc=True)[source]

Sample points around X.

Sample perturbed points around X such that the added perturbations are sampled from N(0, sigma^2 I) and truncated to be within [0,1]^d.

Parameters:
  • X (Tensor) – A n x d-dim tensor of starting points.

  • n_discrete_points (int) – The number of points to sample.

  • sigma (float) – The standard deviation of the additive gaussian noise for perturbing the points.

  • bounds (Tensor) – A 2 x d-dim tensor containing the bounds.

  • qmc (bool) – A boolean indicating whether to use qmc.

Returns:

A n_discrete_points x d-dim tensor containing the sampled points.

Return type:

Tensor
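One simple way to draw truncated Gaussian perturbations is per-coordinate rejection sampling, sketched below for a single starting point. This illustrates the idea only and is not BoTorch's sampler; the helper name perturb_truncated, the seed argument, and the use of rejection sampling are assumptions of this sketch.

```python
import random


def perturb_truncated(x, n_points, sigma, lower, upper, seed=0):
    """Sample n_points around x from N(x, sigma^2 I), truncated to the bounds.

    Assumes x lies within [lower, upper]; otherwise the rejection loop may
    need many draws before a sample is accepted.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        point = []
        for xi, lo, hi in zip(x, lower, upper):
            # Coordinates are independent, so each can be redrawn on its own.
            while True:
                candidate = rng.gauss(xi, sigma)
                if lo <= candidate <= hi:
                    break
            point.append(candidate)
        points.append(point)
    return points


pts = perturb_truncated(
    [0.5, 0.95], n_points=20, sigma=0.1, lower=[0.0, 0.0], upper=[1.0, 1.0]
)
```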

botorch.optim.initializers.sample_perturbed_subset_dims(X, bounds, n_discrete_points, sigma=0.1, qmc=True, prob_perturb=None)[source]

Sample around X by perturbing a subset of the dimensions.

By default, dimensions are perturbed with probability equal to min(20 / d, 1). As shown in [Regis], perturbing a small number of dimensions can be beneficial. The perturbations are sampled from N(0, sigma^2 I) and truncated to be within [0,1]^d.

Parameters:
  • X (Tensor) – A n x d-dim tensor of starting points. X must be normalized to be within [0, 1]^d.

  • bounds (Tensor) – The bounds to sample perturbed values from.

  • n_discrete_points (int) – The number of points to sample.

  • sigma (float) – The standard deviation of the additive gaussian noise for perturbing the points.

  • qmc (bool) – A boolean indicating whether to use qmc.

  • prob_perturb (float | None) – The probability of perturbing each dimension. If omitted, defaults to min(20 / d, 1).

Returns:

A n_discrete_points x d-dim tensor containing the sampled points.

Return type:

Tensor
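The default perturbation probability min(20 / d, 1) can be illustrated by drawing a per-dimension mask. The helper name perturbation_mask and the seed argument are hypothetical, and forcing at least one perturbed dimension is an assumption of this sketch, not something stated above.

```python
import random


def perturbation_mask(d, prob_perturb=None, seed=0):
    """Choose which of the d dimensions to perturb."""
    if prob_perturb is None:
        prob_perturb = min(20 / d, 1.0)
    rng = random.Random(seed)
    mask = [rng.random() < prob_perturb for _ in range(d)]
    if not any(mask):
        # Assumption of this sketch: always perturb at least one dimension.
        mask[rng.randrange(d)] = True
    return mask


low_dim_mask = perturbation_mask(10)     # 20 / 10 >= 1, so every dimension
high_dim_mask = perturbation_mask(100)   # roughly 20% of the dimensions
```

For d <= 20 every dimension is perturbed; for larger d the expected number of perturbed dimensions stays at about 20.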

botorch.optim.initializers.is_nonnegative(acq_function)[source]

Determine whether a given acquisition function is non-negative.

Parameters:

acq_function (AcquisitionFunction) – The AcquisitionFunction instance.

Returns:

True if acq_function is non-negative, False if not, or if the behavior is unknown (for custom acquisition functions).

Return type:

bool

Example

>>> qEI = qExpectedImprovement(model, best_f=0.1)
>>> is_nonnegative(qEI)  # returns True

Stopping Criteria

class botorch.optim.stopping.StoppingCriterion[source]

Bases: ABC

Base class for evaluating optimization convergence.

Stopping criteria are implemented as objects rather than functions, so that they can keep track of past function values between optimization steps.

abstract evaluate(fvals)[source]

Evaluate the stopping criterion.

Parameters:

fvals (Tensor) – tensor containing function values for the current iteration. If fvals contains more than one element, then the stopping criterion is evaluated element-wise and True is returned if the stopping criterion is true for all elements.

Returns:

Stopping indicator (if True, stop the optimization).

Return type:

bool

class botorch.optim.stopping.ExpMAStoppingCriterion(maxiter=10000, minimize=True, n_window=10, eta=1.0, rel_tol=1e-05)[source]

Bases: StoppingCriterion

Exponential moving average stopping criterion.

Computes an exponentially weighted moving average over window length n_window and checks whether the relative decrease in this moving average between steps is less than a provided tolerance level. That is, in iteration i, it computes

v[i,j] := fvals[i - n_window + j] * w[j]

for all j = 0, …, n_window, where w[j] = exp(-eta * (1 - j / n_window)). Letting ma[i] := sum_j(v[i,j]), the criterion evaluates to True whenever

(ma[i-1] - ma[i]) / abs(ma[i-1]) < rel_tol (if minimize=True)

(ma[i] - ma[i-1]) / abs(ma[i-1]) < rel_tol (if minimize=False)

Parameters:
  • maxiter (int) – Maximum number of iterations.

  • minimize (bool) – If True, assume minimization.

  • n_window (int) – The size of the exponential moving average window.

  • eta (float) – The exponential decay factor in the weights.

  • rel_tol (float) – Relative tolerance for termination.

evaluate(fvals)[source]

Evaluate the stopping criterion.

Parameters:

fvals (Tensor) – tensor containing function values for the current iteration. If fvals contains more than one element, then the stopping criterion is evaluated element-wise and True is returned if the stopping criterion is true for all elements.

Returns:

Stopping indicator (if True, stop the optimization).

Return type:

bool

TODO: add support for utilizing gradient information
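The criterion described above can be sketched for scalar function values using only the standard library. This is a simplified illustration under stated assumptions, not BoTorch's tensor-based implementation: the class name ExpMAStop is hypothetical, the weights follow the documented formula for j = 0, ..., n_window - 1 and are normalized to sum to one, and the case ma[i-1] == 0 is not handled.

```python
import math
from collections import deque


class ExpMAStop:
    """Scalar sketch of an exponential moving average stopping rule."""

    def __init__(self, maxiter=10000, minimize=True, n_window=10,
                 eta=1.0, rel_tol=1e-5):
        self.maxiter = maxiter
        self.minimize = minimize
        self.n_window = n_window
        self.rel_tol = rel_tol
        self.iter = 0
        raw = [math.exp(-eta * (1 - j / n_window)) for j in range(n_window)]
        total = sum(raw)
        self.weights = [w / total for w in raw]  # normalization is an assumption
        # Keep n_window + 1 values so two consecutive windows can be compared.
        self.history = deque(maxlen=n_window + 1)

    def evaluate(self, fval):
        self.iter += 1
        if self.iter >= self.maxiter:
            return True
        self.history.append(float(fval))
        if len(self.history) < self.n_window + 1:
            return False
        vals = list(self.history)
        prev_ma = sum(v * w for v, w in zip(vals[:-1], self.weights))
        ma = sum(v * w for v, w in zip(vals[1:], self.weights))
        # Note: prev_ma == 0 would divide by zero; not handled in this sketch.
        rel_delta = (prev_ma - ma) / abs(prev_ma)
        if not self.minimize:
            rel_delta = -rel_delta
        return rel_delta < self.rel_tol


flat = ExpMAStop(n_window=3)
flat_flags = [flat.evaluate(1.0) for _ in range(4)]
```

The object keeps the recent function values between calls, which is why the criterion is a class rather than a plain function.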

Acquisition Function Optimization with Homotopy

botorch.optim.optimize_homotopy.prune_candidates(candidates, acq_values, prune_tolerance)[source]

Prune candidates based on their distance to other candidates.

Parameters:
  • candidates (Tensor) – An n x d tensor of candidates.

  • acq_values (Tensor) – An n tensor of candidate values.

  • prune_tolerance (float) – The minimum distance to prune candidates.

Returns:

An m x d tensor of pruned candidates.

Return type:

Tensor
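Distance-based pruning can be sketched as a greedy pass over the candidates in order of decreasing acquisition value. The helper name prune and the use of Euclidean distance via math.dist are assumptions of this sketch; plain lists stand in for tensors.

```python
import math


def prune(candidates, acq_values, prune_tolerance):
    """Keep a candidate only if it is farther than prune_tolerance
    from every better candidate that has already been kept."""
    order = sorted(
        range(len(candidates)), key=lambda i: acq_values[i], reverse=True
    )
    kept = []
    for i in order:
        if all(math.dist(candidates[i], k) > prune_tolerance for k in kept):
            kept.append(candidates[i])
    return kept


# The first two candidates are 1e-6 apart, so only the better one survives.
pruned = prune(
    [[0.0, 0.0], [0.0, 1e-6], [1.0, 1.0]],
    acq_values=[1.0, 2.0, 0.5],
    prune_tolerance=1e-4,
)
```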

botorch.optim.optimize_homotopy.optimize_acqf_homotopy(acq_function, bounds, q, num_restarts, homotopy, prune_tolerance=0.0001, raw_samples=None, options=None, final_options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, fixed_features_list=None, post_processing_func=None, batch_initial_conditions=None, gen_candidates=None, *, ic_generator=None, timeout_sec=None, retry_on_optimization_warning=True, **ic_gen_kwargs)[source]

Generate a set of candidates via multi-start optimization.

Parameters:
  • acq_function (AcquisitionFunction) – An AcquisitionFunction.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

  • q (int) – The number of candidates.

  • homotopy (Homotopy) – Homotopy object that will make the necessary modifications to the problem when calling step().

  • prune_tolerance (float) – The minimum distance to prune candidates.

  • num_restarts (int) – The number of starting points for multistart acquisition function optimization.

  • raw_samples (int | None) – The number of samples for initialization. This is required if batch_initial_conditions is not specified.

  • options (dict[str, bool | float | int | str] | None) – Options for candidate generation in the initial step of the homotopy.

  • final_options (dict[str, bool | float | int | str] | None) – Options for candidate generation in the final step of the homotopy.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example. When q=1, or when applying the same constraint to each candidate in the batch (intra-point constraint), indices should be a 1-d tensor. For inter-point constraints, in which the constraint is applied to the whole batch of candidates, indices must be a 2-d tensor, where in each row indices[i] = (k_i, l_i) the first index k_i corresponds to the k_i-th element of the q-batch and the second index l_i corresponds to the l_i-th feature of that element.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs. See the docstring of make_scipy_linear_constraints for an example.

  • nonlinear_inequality_constraints (list[tuple[Callable, bool]] | None) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean, indicating if it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • fixed_features_list (list[dict[int, float]] | None) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization. If fixed_features_list is provided, optimize_acqf_mixed is invoked. All indices (feature_index) should be non-negative.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • batch_initial_conditions (Tensor | None) – A tensor to specify the initial conditions. Set this if you do not want to use the default initialization strategy.

  • gen_candidates (Callable[[Tensor, AcquisitionFunction, Any], tuple[Tensor, Tensor]] | None) – A callable for generating candidates (and their associated acquisition values) given a tensor of initial conditions and an acquisition function. Other common inputs include lower and upper bounds and a dictionary of options, but refer to the documentation of specific generation functions (e.g., gen_candidates_scipy and gen_candidates_torch) for method-specific inputs. Default: gen_candidates_scipy.

  • ic_generator (Callable[[qKnowledgeGradient, Tensor, int, int, int, dict[int, float] | None, dict[str, bool | float | int] | None, list[tuple[Tensor, Tensor, float]] | None, list[tuple[Tensor, Tensor, float]] | None], Tensor | None] | None) – Function for generating initial conditions. Not needed when batch_initial_conditions are provided. Defaults to gen_one_shot_kg_initial_conditions for qKnowledgeGradient acquisition functions and gen_batch_initial_conditions otherwise. Must be specified for nonlinear inequality constraints.

  • timeout_sec (float | None) – Max amount of time optimization can run for.

  • retry_on_optimization_warning (bool) – Whether to retry candidate generation with a new set of initial conditions when it fails with an OptimizationWarning.

  • ic_gen_kwargs (Any) – Additional keyword arguments passed to the function specified by ic_generator.

Return type:

tuple[Tensor, Tensor]

Acquisition Function Optimization with Mixed Integer Variables

botorch.optim.optimize_mixed.get_nearest_neighbors(current_x, bounds, discrete_dims)[source]

Generate all 1-Manhattan distance neighbors of a given input. The neighbors are generated for the discrete dimensions only.

NOTE: This assumes that current_x is detached and uses in-place operations, which are known to be incompatible with autograd.

Parameters:
  • current_x (Tensor) – The design to find the neighbors of. A tensor of shape d.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

  • discrete_dims (Tensor) – A tensor of indices corresponding to binary and integer parameters.

Returns:

A tensor of shape num_neighbors x d, denoting all unique 1-Manhattan distance neighbors.

Return type:

Tensor
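The neighbor generation described above can be sketched in plain Python. This is a minimal illustration of the idea, not BoTorch's implementation: it uses lists instead of tensors, copies the input rather than working in place, and the helper name nearest_neighbors is hypothetical.

```python
def nearest_neighbors(current_x, bounds, discrete_dims):
    """Return all in-bounds points that differ from current_x by +/-1
    in exactly one discrete dimension (1-Manhattan distance)."""
    lower, upper = bounds
    neighbors = []
    for dim in discrete_dims:
        for delta in (-1, 1):
            value = current_x[dim] + delta
            # Neighbors that would leave the feasible box are skipped.
            if lower[dim] <= value <= upper[dim]:
                candidate = list(current_x)  # copy; the input is left untouched
                candidate[dim] = value
                neighbors.append(candidate)
    return neighbors


x = [0.5, 2, 0]                      # one continuous and two discrete dims
bounds = ([0.0, 0, 0], [1.0, 3, 1])  # (lower, upper)
result = nearest_neighbors(x, bounds, discrete_dims=[1, 2])
```

Here the continuous dimension 0 is never perturbed, and the neighbor that would set the last dimension to -1 is dropped because it violates the lower bound.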

botorch.optim.optimize_mixed.get_spray_points(X_baseline, cont_dims, discrete_dims, bounds, num_spray_points, std_cont_perturbation=0.1)[source]

Generate spray points by perturbing the Pareto optimal points.

Given the points on the Pareto frontier, we create perturbations (spray points) by adding Gaussian perturbation to the continuous parameters and 1-Manhattan distance neighbors of the discrete (binary and integer) parameters.

Parameters:
  • X_baseline (Tensor) – Tensor of best acquired points across BO run.

  • cont_dims (Tensor) – Indices of continuous parameters/input dimensions.

  • discrete_dims (Tensor) – Indices of binary/integer parameters/input dimensions.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

  • num_spray_points (int) – Number of spray points to return.

  • std_cont_perturbation (float) – Standard deviation of Normal perturbations of continuous dimensions. Default is STD_CONT_PERTURBATION = 0.1.

Returns:

A (num_spray_points x d)-dim tensor of perturbed points.

Return type:

Tensor

botorch.optim.optimize_mixed.sample_feasible_points(opt_inputs, discrete_dims, num_points)[source]

Sample feasible points from the optimization domain.

Feasibility is determined according to the discrete dimensions taking integer values and the inequality constraints being satisfied.

If there are no inequality constraints, Sobol is used to generate the base points. Otherwise, we use the polytope sampler to generate the base points. The base points are then rounded to the nearest integer values for the discrete dimensions, and the infeasible points are filtered out (in case rounding leads to infeasibility).

This method makes up to 10 attempts to generate num_points feasible points and returns the points generated so far. If no points are generated, it raises an error.

Parameters:
  • opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization.

  • discrete_dims (Tensor) – A tensor of indices corresponding to binary and integer parameters.

  • num_points (int) – The number of points to sample.

Returns:

A tensor of shape num_points x d containing the sampled points.

Return type:

Tensor
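The sample, round, and filter loop described above can be sketched as follows. This sketch makes several assumptions: it draws uniform random base points instead of using Sobol or a polytope sampler, it represents the constraints as a single hypothetical is_feasible callable, and the function name and arguments are illustrative rather than part of BoTorch.

```python
import random


def sample_feasible(lower, upper, discrete_dims, is_feasible, num_points,
                    max_attempts=10, seed=0):
    rng = random.Random(seed)
    points = []
    for _ in range(max_attempts):
        for _ in range(num_points):
            # Base point (uniform here; an assumption of this sketch).
            x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            # Round the discrete dimensions to the nearest integer.
            for dim in discrete_dims:
                x[dim] = float(round(x[dim]))
            # Rounding may break feasibility, so filter afterwards.
            if is_feasible(x):
                points.append(x)
        if len(points) >= num_points:
            break
    if not points:
        raise ValueError("No feasible points found.")
    # May contain fewer than num_points entries if attempts ran out.
    return points[:num_points]


points = sample_feasible(
    lower=[0.0, 0.0], upper=[1.0, 3.0], discrete_dims=[1],
    is_feasible=lambda x: x[0] + x[1] >= 1.0, num_points=5,
)
```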

botorch.optim.optimize_mixed.generate_starting_points(opt_inputs, discrete_dims, cont_dims)[source]

Generate initial starting points for the alternating optimization.

This method attempts to generate the initial points using the specified options and completes any missing points using sample_feasible_points.

Parameters:
  • opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, num_restarts, raw_samples, options, fixed_features and constraints from opt_inputs.

  • discrete_dims (Tensor) – A tensor of indices corresponding to integer and binary parameters.

  • cont_dims (Tensor) – A tensor of indices corresponding to continuous parameters.

Returns:

a (num_restarts x d)-dim tensor of starting points and a (num_restarts)-dim tensor of their respective acquisition values. In rare cases, this method may return fewer than num_restarts points.

Return type:

A tuple of two tensors

botorch.optim.optimize_mixed.discrete_step(opt_inputs, discrete_dims, current_x)[source]

Discrete nearest neighbour search.

Parameters:
  • opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, options and constraints from opt_inputs.

  • discrete_dims (Tensor) – A tensor of indices corresponding to binary and integer parameters.

  • current_x (Tensor) – Starting point. A tensor of shape d.

Returns:

A (d)-dim tensor containing the optimized point and a scalar tensor of the corresponding acquisition value.

Return type:

A tuple of two tensors

botorch.optim.optimize_mixed.continuous_step(opt_inputs, discrete_dims, current_x)[source]

Continuous search using L-BFGS-B through optimize_acqf.

Parameters:
  • opt_inputs (OptimizeAcqfInputs) – Common set of arguments for acquisition optimization. This function utilizes acq_function, bounds, options, fixed_features and constraints from opt_inputs.

  • discrete_dims (Tensor) – A tensor of indices corresponding to binary and integer parameters.

  • current_x (Tensor) – Starting point. A tensor of shape d.

Returns:

A (1 x d)-dim tensor of optimized points and a (1)-dim tensor of acquisition values.

Return type:

A tuple of two tensors

botorch.optim.optimize_mixed.optimize_acqf_mixed_alternating(acq_function, bounds, discrete_dims, options=None, q=1, raw_samples=1024, num_restarts=20, post_processing_func=None, sequential=True, fixed_features=None, inequality_constraints=None)[source]

Optimizes an acquisition function over mixed binary and continuous input spaces. Multiple starting points for random restarts are picked by evaluating a large set of initial candidates. From each starting point, alternating discrete local search and continuous optimization (via L-BFGS) is performed for a fixed number of iterations.

NOTE: This method assumes that all discrete variables are integer valued. The discrete dimensions that have more than options.get("max_discrete_values", MAX_DISCRETE_VALUES) values will be optimized using continuous relaxation.

# TODO: Support categorical variables.

Parameters:
  • acq_function (AcquisitionFunction) – BoTorch Acquisition function.

  • bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

  • discrete_dims (list[int]) – A list of indices corresponding to integer and binary parameters.

  • options (dict[str, Any] | None) – Dictionary specifying optimization options. Supports the following:

      • "initialization_strategy" – Strategy used to generate the initial candidates. "random", "continuous_relaxation" or "equally_spaced" (linspace style).

      • "tol" – The algorithm terminates if the absolute improvement in acquisition value of one iteration is smaller than this number.

      • "maxiter_alternating" – Number of alternating steps. Defaults to 64.

      • "maxiter_discrete" – Maximum number of iterations in each discrete step. Defaults to 4.

      • "maxiter_continuous" – Maximum number of iterations in each continuous step. Defaults to 8.

      • "max_discrete_values" – Maximum number of values for a discrete dimension to be optimized using discrete step / local search. The discrete dimensions with more values will be optimized using continuous relaxation.

      • "num_spray_points" – Number of spray points (around X_baseline) to add to the points generated by the initialization strategy. Defaults to 20 if all discrete variables are binary and to 0 otherwise.

      • "std_cont_perturbation" – Standard deviation of the normal perturbations of the continuous variables used to generate the spray points. Defaults to 0.1.

      • "batch_limit" – The maximum batch size for jointly evaluating candidates during optimization.

      • "init_batch_limit" – The maximum batch size for jointly evaluating candidates during initialization. During initialization, candidates are evaluated in a no_grad context, which reduces memory usage. As a result, init_batch_limit can be set to a larger value than batch_limit. Defaults to batch_limit, if given.

  • q (int) – Number of candidates.

  • raw_samples (int) – Number of initial candidates used to select starting points from. Defaults to 1024.

  • num_restarts (int) – Number of random restarts. Defaults to 20.

  • post_processing_func (Callable[[Tensor], Tensor] | None) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

  • sequential (bool) – Whether to use joint or sequential optimization across q-batch. This currently only supports sequential optimization.

  • fixed_features (dict[int, float] | None) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs. indices and coefficients should be torch tensors. See the docstring of make_scipy_linear_constraints for an example.

Returns:

A (q x d)-dim tensor of optimized points and a (q)-dim tensor of their respective acquisition values.

Return type:

A tuple of two tensors
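The alternating scheme can be illustrated on a toy problem with one continuous and one integer variable. This is only a sketch under stated assumptions: the objective acq is made up, the continuous step uses plain projected gradient ascent as a stand-in for L-BFGS-B, and all function names are hypothetical. The iteration limits mirror the documented defaults of "maxiter_alternating", "maxiter_discrete" and "maxiter_continuous".

```python
def acq(x, z):
    # Toy "acquisition function", maximized at x = 0.3, z = 2.
    return -((x - 0.3) ** 2) - (z - 2) ** 2


def discrete_step(x, z, z_bounds, max_iter=4):
    # Local search: move to the best +/-1 neighbor while it improves.
    for _ in range(max_iter):
        candidates = [c for c in (z - 1, z + 1) if z_bounds[0] <= c <= z_bounds[1]]
        best = max(candidates, key=lambda c: acq(x, c))
        if acq(x, best) <= acq(x, z):
            break
        z = best
    return z


def continuous_step(x, z, x_bounds, max_iter=8, lr=0.25):
    # Projected gradient ascent on x with z held fixed.
    for _ in range(max_iter):
        grad = -2.0 * (x - 0.3)
        x = min(max(x + lr * grad, x_bounds[0]), x_bounds[1])
    return x


def alternate(x, z, x_bounds, z_bounds, max_iter=64, tol=1e-8):
    value = acq(x, z)
    for _ in range(max_iter):
        z = discrete_step(x, z, z_bounds)
        x = continuous_step(x, z, x_bounds)
        new_value = acq(x, z)
        # Stop once one full alternation no longer improves by at least tol.
        if new_value - value < tol:
            break
        value = new_value
    return x, z, acq(x, z)


x_opt, z_opt, val = alternate(x=1.0, z=0, x_bounds=(0.0, 1.0), z_bounds=(0, 5))
```

In this sketch a single starting point is used; the documented function repeats the procedure from num_restarts starting points.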

botorch.optim.optimize_mixed.complement_indices_like(indices, d)[source]

Computes a tensor of complement indices: {range(d) \ indices}. Same as complement_indices but returns an integer tensor like indices.

Parameters:
  • indices (Tensor)

  • d (int)

Return type:

Tensor

botorch.optim.optimize_mixed.complement_indices(indices, d)[source]

Computes a list of complement indices: {range(d) \ indices}.

Parameters:
  • indices (list[int]) – a list of integers.

  • d (int) – an integer dimension in which to compute the complement.

Returns:

A list of integer indices.

Return type:

list[int]
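A common use of the complement is deriving the continuous dimensions from the discrete ones. The following one-function sketch shows the set difference in plain Python; the name complement is hypothetical, and returning the indices in sorted order is a choice of this sketch.

```python
def complement(indices, d):
    # All indices in range(d) that do not appear in `indices`.
    excluded = set(indices)
    return [i for i in range(d) if i not in excluded]


discrete_dims = [1, 3]
cont_dims = complement(discrete_dims, d=5)
```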

Closures

Core

Core methods for building closures in torch and interfacing with numpy.

class botorch.optim.closures.core.ForwardBackwardClosure(forward, parameters, backward=<function Tensor.backward>, reducer=<built-in method sum of type object>, callback=None, context_manager=None)[source]

Bases: object

Wrapper for fused forward and backward closures.

Initializes a ForwardBackwardClosure instance.

Parameters:
  • forward (Callable[[], Tensor]) – Callable that returns a tensor.

  • parameters (dict[str, Tensor]) – A dictionary of tensors whose grad fields are to be returned.

  • backward (Callable[[Tensor], None]) – Callable that takes the (reduced) output of forward and sets the grad attributes of tensors in parameters.

  • reducer (Callable[[Tensor], Tensor] | None) – Optional callable used to reduce the output of the forward pass.

  • callback (Callable[[Tensor, Sequence[Tensor | None]], None] | None) – Optional callable that takes the reduced output of forward and the gradients of parameters as positional arguments.

  • context_manager (Callable) – A ContextManager used to wrap each forward-backward call. When passed as None, context_manager defaults to a zero_grad_ctx that zeroes the gradients of parameters upon entry.

class botorch.optim.closures.core.NdarrayOptimizationClosure(closure, parameters, as_array=None, get_state=None, set_state=None, fill_value=0.0, persistent=True)[source]

Bases: object

Adds stateful behavior and a numpy.ndarray-typed API to a closure with an expected return type Tuple[Tensor, Union[Tensor, Sequence[Optional[Tensor]]]].

Initializes a NdarrayOptimizationClosure instance.

Parameters:
  • closure (Callable[[], tuple[Tensor, Sequence[Tensor | None]]]) – A ForwardBackwardClosure instance.

  • parameters (dict[str, Tensor]) – A dictionary of tensors representing the closure’s state. Expected to correspond with the first len(parameters) optional gradient tensors returned by closure.

  • as_array (Callable[[Tensor], npt.NDArray]) – Callable used to convert tensors to ndarrays.

  • get_state (Callable[[], npt.NDArray]) – Callable that returns the closure’s state as an ndarray. When passed as None, defaults to calling get_tensors_as_ndarray_1d on closure.parameters while passing as_array (if given by the user).

  • set_state (Callable[[npt.NDArray], None]) – Callable that takes a 1-dimensional ndarray and sets the closure’s state. When passed as None, set_state defaults to calling set_tensors_from_ndarray_1d with closure.parameters and a given ndarray.

  • fill_value (float) – Fill value for parameters whose gradients are None. In most cases, fill_value should either be zero or NaN.

  • persistent (bool) – Boolean specifying whether an ndarray should be retained as a persistent buffer for gradients.

property state: ndarray[tuple[int, ...], dtype[_ScalarType_co]]

Model Fitting Closures

Utilities for building model-based closures.

botorch.optim.closures.model_closures.get_loss_closure(mll, data_loader=None, **kwargs)[source]

Public API for GetLossClosure dispatcher.

This method, and the dispatcher that powers it, acts as a clearing house for factory functions that define how mll is evaluated.

Users may specify custom evaluation routines by registering a factory function with GetLossClosure. These factories should be registered using the type signature

Type[MarginalLogLikelihood], Type[Likelihood], Type[Model], Type[DataLoader].

The final argument, Type[DataLoader], is optional. Evaluation routines that obtain training data from, e.g., mll.model should register this argument as type(None).

Parameters:
  • mll (MarginalLogLikelihood) – A MarginalLogLikelihood instance whose negative defines the loss.

  • data_loader (DataLoader | None) – An optional DataLoader instance for cases where training data is passed in rather than obtained from mll.model.

  • kwargs (Any)

Returns:

A closure that takes zero positional arguments and returns the negated value of mll.

Return type:

Callable[[], Tensor]

botorch.optim.closures.model_closures.get_loss_closure_with_grads(mll, parameters, data_loader=None, backward=<function Tensor.backward>, reducer=<method 'sum' of 'torch._C.TensorBase' objects>, context_manager=None, **kwargs)[source]

Public API for GetLossClosureWithGrads dispatcher.

In most cases, this method simply adds a backward pass to a loss closure obtained by calling get_loss_closure. For further details, see get_loss_closure.

Parameters:
  • mll (MarginalLogLikelihood) – A MarginalLogLikelihood instance whose negative defines the loss.

  • parameters (dict[str, Tensor]) – A dictionary of tensors whose grad fields are to be returned.

  • reducer (Callable[[Tensor], Tensor] | None) – Optional callable used to reduce the output of the forward pass.

  • data_loader (DataLoader | None) – An optional DataLoader instance for cases where training data is passed in rather than obtained from mll.model.

  • context_manager (Callable | None) – An optional ContextManager used to wrap each forward-backward pass. Defaults to a zero_grad_ctx that zeroes the gradients of parameters upon entry. None may be passed as an alias for nullcontext.

  • backward (Callable[[Tensor], None])

  • kwargs (Any)

Returns:

A closure that takes zero positional arguments and returns the reduced and negated value of mll along with the gradients of parameters.

Return type:

Callable[[], tuple[Tensor, tuple[Tensor, …]]]

Utilities

General Optimization Utilities

General-purpose optimization utilities.

Acquisition Optimization Utilities

Utilities for maximizing acquisition functions.

botorch.optim.utils.acquisition_utils.columnwise_clamp(X, lower=None, upper=None, raise_on_violation=False)[source]

Clamp values of a Tensor in column-wise fashion (with support for t-batches).

This function is useful in conjunction with optimizers from the torch.optim package, which don’t natively handle constraints. If you apply this after a gradient step you can be fancy and call it “projected gradient descent”. This function is also useful for post-processing candidates generated by the scipy optimizer that satisfy bounds only up to numerical accuracy.

Parameters:
  • X (Tensor) – The b x n x d input tensor. If 2-dimensional, b is assumed to be 1.

  • lower (float | Tensor | None) – The column-wise lower bounds. If scalar, apply bound to all columns.

  • upper (float | Tensor | None) – The column-wise upper bounds. If scalar, apply bound to all columns.

  • raise_on_violation (bool) – If True, raise an exception when the elements in X are out of the specified bounds (up to numerical accuracy). This is useful for post-processing candidates generated by optimizers that satisfy imposed bounds only up to numerical accuracy.

Returns:

The clamped tensor.

Return type:

Tensor
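The column-wise semantics (a scalar bound applies to every column, a sequence supplies one bound per column) can be sketched without tensors. This is an illustration only: it works on nested lists, ignores t-batches and raise_on_violation, and the name clamp_columns is hypothetical.

```python
def clamp_columns(rows, lower=None, upper=None):
    def bound_for(bound, col):
        # A scalar bound is shared by all columns; a sequence is indexed per column.
        if bound is None:
            return None
        return bound[col] if isinstance(bound, (list, tuple)) else bound

    clamped = []
    for row in rows:
        new_row = []
        for col, value in enumerate(row):
            lo, hi = bound_for(lower, col), bound_for(upper, col)
            if lo is not None:
                value = max(value, lo)
            if hi is not None:
                value = min(value, hi)
            new_row.append(value)
        clamped.append(new_row)
    return clamped


X = [[-0.2, 0.5], [1.3, 2.5]]
X_clamped = clamp_columns(X, lower=0.0, upper=[1.0, 2.0])
```

Applying such a projection after each unconstrained gradient step is what the description above refers to as projected gradient descent.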

botorch.optim.utils.acquisition_utils.fix_features(X, fixed_features=None)[source]

Fix feature values in a Tensor.

The fixed features will have zero gradient in downstream calculations.

Parameters:
  • X (Tensor) – input Tensor with shape … x p, where p is the number of features

  • fixed_features (Mapping[int, float | None] | None) – A mapping with keys as column indices and values equal to what the feature should be set to in X. If the value is None, that column is just considered fixed. Keys should be in the range [0, p - 1].

Returns:

The tensor X with fixed features.

Return type:

Tensor

botorch.optim.utils.acquisition_utils.get_X_baseline(acq_function)[source]

Extract X_baseline from an acquisition function.

This tries to find the baseline set of points. First, this checks if the acquisition function has an X_baseline attribute. If it does not, then this method attempts to use the model’s train_inputs as X_baseline.

Parameters:

acq_function (AcquisitionFunction) – The acquisition function.

Return type:

Tensor | None

Returns:

An optional n x d-dim tensor of baseline points. This is None if no baseline points are found.

Model Fitting Utilities

Utilities for fitting and manipulating models.

class botorch.optim.utils.model_utils.TorchAttr(shape, dtype, device)[source]

Bases: NamedTuple

Create new instance of TorchAttr(shape, dtype, device)

Parameters:
  • shape (Size)

  • dtype (dtype)

  • device (device)

shape: Size

Alias for field number 0

dtype: dtype

Alias for field number 1

device: device

Alias for field number 2

botorch.optim.utils.model_utils.get_data_loader(model, batch_size=1024, **kwargs)[source]
Parameters:
Return type:

DataLoader

botorch.optim.utils.model_utils.get_parameters(module, requires_grad=None, name_filter=None)[source]

Helper method for obtaining a module’s parameters.

Parameters:
  • module (Module) – The target module from which parameters are to be extracted.

  • requires_grad (bool | None) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user provided value.

  • name_filter (Callable[[str], bool] | None) – Optional Boolean function used to filter parameters by name.

Returns:

A dictionary of parameters.

Return type:

dict[str, Tensor]

botorch.optim.utils.model_utils.get_parameters_and_bounds(module, requires_grad=None, name_filter=None, default_bounds=(-inf, inf))[source]

Helper method for obtaining a module’s parameters and their respective ranges.

Parameters:
  • module (Module) – The target module from which parameters are to be extracted.

  • name_filter (Callable[[str], bool] | None) – Optional Boolean function used to filter parameters by name.

  • requires_grad (bool | None) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user provided value.

  • default_bounds (tuple[float, float]) – Default lower and upper bounds for constrained parameters with None typed bounds.

Returns:

A dictionary of parameters and a dictionary of parameter bounds.

Return type:

tuple[dict[str, Tensor], dict[str, tuple[float | None, float | None]]]

botorch.optim.utils.model_utils.get_name_filter(patterns)[source]

Returns a binary function that filters strings (or iterables whose first element is a string) according to a bank of excluded patterns. Typically used in conjunction with generators such as module.named_parameters().

Parameters:

patterns (Iterator[Pattern | str]) – A collection of regular expressions or strings that define the set of names to be excluded.

Returns:

A binary function indicating whether or not an item should be filtered.

Return type:

Callable[[str | tuple[str, Any, …]], bool]
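A filter of this kind can be sketched with the standard re module. The sketch below makes two assumptions that the description above does not settle: plain strings are treated as exact names while compiled patterns are searched as regular expressions, and the returned function yields True for items to keep, so that it can be passed to the built-in filter(). The name make_name_filter is hypothetical.

```python
import re


def make_name_filter(patterns):
    patterns = list(patterns)
    exact = {p for p in patterns if isinstance(p, str)}          # exact names (assumption)
    regexes = [p for p in patterns if isinstance(p, re.Pattern)]  # regex patterns

    def keep(item):
        # Accept either a bare name or a (name, value, ...) tuple.
        name = item if isinstance(item, str) else item[0]
        return name not in exact and not any(r.search(name) for r in regexes)

    return keep


named_params = [
    ("likelihood.noise", 0.1),
    ("mean.constant", 0.0),
    ("kernel.lengthscale", 1.0),
]
keep = make_name_filter([re.compile(r"^kernel\."), "mean.constant"])
kept = [name for name, _ in filter(keep, named_params)]
```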

botorch.optim.utils.model_utils.sample_all_priors(model, max_retries=100)[source]

Sample from hyperparameter priors (in-place).

Parameters:
Return type:

None

Numpy - Torch Conversion Tools

Utilities for interfacing Numpy and Torch.

botorch.optim.utils.numpy_utils.as_ndarray(values, dtype=None, inplace=True)[source]

Helper for going from torch.Tensor to numpy.ndarray.

Parameters:
  • values (Tensor) – Tensor to be converted to ndarray.

  • dtype (dtype | None) – Optional numpy.dtype for the converted tensor.

  • inplace (bool) – Boolean indicating whether memory should be shared if possible.

Returns:

An ndarray with the same data as values.

Return type:

ndarray[tuple[int, …], dtype[_ScalarType_co]]

botorch.optim.utils.numpy_utils.get_tensors_as_ndarray_1d(tensors, out=None, dtype=None, as_array=<function as_ndarray>)[source]
Parameters:
  • tensors (Iterator[Tensor] | dict[str, Tensor])

  • out (ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None)

  • dtype (dtype | str | None)

  • as_array (Callable[[Tensor], ndarray[tuple[int, ...], dtype[_ScalarType_co]]])

Return type:

ndarray[tuple[int, …], dtype[_ScalarType_co]]

botorch.optim.utils.numpy_utils.set_tensors_from_ndarray_1d(tensors, array)[source]

Sets the values of one or more tensors based off of a vector of assignments.

Parameters:
  • tensors (Iterator[Tensor] | dict[str, Tensor])

  • array (ndarray[tuple[int, ...], dtype[_ScalarType_co]])

Return type:

None
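The round trip between a collection of named tensors and a single flat vector can be sketched with plain lists. This is an illustration of the idea only: the helpers flatten and unflatten_into are hypothetical, each "tensor" is a one-dimensional list, and the layout of the flat vector is assumed to follow the dictionary's iteration order.

```python
def flatten(tensors):
    # Concatenate all values into one flat vector, in dictionary order.
    flat = []
    for values in tensors.values():
        flat.extend(values)
    return flat


def unflatten_into(tensors, array):
    # Write consecutive slices of `array` back into each entry, in place.
    offset = 0
    for values in tensors.values():
        size = len(values)
        values[:] = array[offset:offset + size]
        offset += size


params = {"lengthscale": [1.0, 2.0], "noise": [0.1]}
state = flatten(params)
unflatten_into(params, [10.0, 20.0, 0.5])
```

Optimizers that operate on a single vector, such as those in scipy, can then read and update every parameter through this one flat state.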

botorch.optim.utils.numpy_utils.get_bounds_as_ndarray(parameters, bounds)[source]

Helper method for converting bounds into an ndarray.

Parameters:
  • parameters (dict[str, Tensor]) – A dictionary of parameters.

  • bounds (dict[str, tuple[float | Tensor | None, float | Tensor | None]]) – A dictionary of (optional) lower and upper bounds.

Returns:

An ndarray of bounds.

Return type:

ndarray[tuple[int, …], dtype[_ScalarType_co]] | None

Optimization with Timeouts

botorch.optim.utils.timeout.minimize_with_timeout(fun, x0, args=(), method=None, jac=None, hess=None, hessp=None, bounds=None, constraints=(), tol=None, callback=None, options=None, timeout_sec=None)[source]

Wrapper around scipy.optimize.minimize to support timeout.

This method calls scipy.optimize.minimize with all arguments forwarded verbatim. The only difference is that if a timeout_sec argument is provided, it will automatically stop the optimization after the timeout is reached.

Internally, this is achieved by automatically constructing a wrapper callback method that is injected to the scipy.optimize.minimize call and that keeps track of the runtime and the optimization variables at the current iteration.

Parameters:
  • fun (Callable[[ndarray[tuple[int, ...], dtype[_ScalarType_co]], ...], float])

  • x0 (ndarray[tuple[int, ...], dtype[_ScalarType_co]])

  • args (tuple[Any, ...])

  • method (str | None)

  • jac (str | Callable | bool | None)

  • hess (str | Callable | HessianUpdateStrategy | None)

  • hessp (Callable | None)

  • bounds (Sequence[tuple[float, float]] | Bounds | None)

  • tol (float | None)

  • callback (Callable | None)

  • options (dict[str, Any] | None)

  • timeout_sec (float | None)

Return type:

OptimizeResult
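The callback-injection idea can be sketched independently of scipy. In the sketch below, gradient_descent is a toy stand-in for the solver, and the wrapper raises a private exception from the injected callback to leave the solver loop while keeping the latest iterate. All names are hypothetical, and raising an exception is only one possible way to stop the solver.

```python
import time


class _Timeout(Exception):
    def __init__(self, x):
        self.x = x  # iterate at the time the deadline was hit


def gradient_descent(grad, x0, callback=None, lr=0.1, maxiter=1000):
    # Toy solver that calls `callback` once per iteration.
    x = x0
    for _ in range(maxiter):
        x = x - lr * grad(x)
        if callback is not None:
            callback(x)
    return x


def minimize_with_deadline(grad, x0, timeout_sec=None, callback=None):
    start = time.monotonic()

    def wrapped(x):
        if callback is not None:
            callback(x)  # the user's callback still runs
        if timeout_sec is not None and time.monotonic() - start >= timeout_sec:
            raise _Timeout(x)

    try:
        return gradient_descent(grad, x0, callback=wrapped), "finished"
    except _Timeout as stop:
        return stop.x, "timed out"


grad = lambda x: 2.0 * (x - 3.0)  # gradient of (x - 3)^2
x_full, status_full = minimize_with_deadline(grad, 0.0)
x_cut, status_cut = minimize_with_deadline(grad, 0.0, timeout_sec=0.0)
```

With timeout_sec=0.0 the deadline has already passed at the first callback, so the second call returns after a single iteration.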

Parameter Constraint Utilities

Utility functions for constrained optimization.

botorch.optim.parameter_constraints.make_scipy_bounds(X, lower_bounds=None, upper_bounds=None)[source]

Creates a scipy Bounds object for optimization.

Parameters:
  • X (Tensor) – … x d tensor

  • lower_bounds (float | Tensor | None) – Lower bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.

  • upper_bounds (float | Tensor | None) – Upper bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.

Returns:

A scipy Bounds object if either lower_bounds or upper_bounds is not None, and None otherwise.

Return type:

Bounds | None

Example

>>> X = torch.rand(5, 2)
>>> scipy_bounds = make_scipy_bounds(X, 0.1, 0.8)

botorch.optim.parameter_constraints.make_scipy_linear_constraints(shapeX, inequality_constraints=None, equality_constraints=None)[source]

Generate scipy constraints from torch representation.

Parameters:
  • shapeX (Size) – The shape of the torch.Tensor to optimize over (i.e. (b) x q x d)

  • inequality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs, where indices is a single-dimensional index tensor (long dtype) containing indices into the last dimension of X, coefficients is a single-dimensional tensor of coefficients of the same length, and rhs is a scalar.

  • equality_constraints (list[tuple[Tensor, Tensor, float]] | None) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) == rhs (with indices and coefficients of the same form as in inequality_constraints).

Returns:

A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.minimize.

Return type:

list[dict[str, str | Callable[[ndarray], float] | Callable[[ndarray], ndarray]]]

This function assumes that constraints are the same for each input batch, and broadcasts the constraints accordingly to the input batch shape. This function does support constraints across elements of a q-batch if the indices are a 2-d Tensor.

Example

The following will enforce that x[1] + 0.5 x[3] >= -0.1 for each x in both elements of the q-batch, and each of the 3 t-batches:

>>> constraints = make_scipy_linear_constraints(
>>>     torch.Size([3, 2, 4]),
>>>     [(torch.tensor([1, 3]), torch.tensor([1.0, 0.5]), -0.1)],
>>> )

The following will enforce that x[0, 1] + 0.5 x[1, 3] >= -0.1 where x[0, :] is the first element of the q-batch and x[1, :] is the second element of the q-batch, for each of the 3 t-batches:

>>> constraints = make_scipy_linear_constraints(
>>>     torch.Size([3, 2, 4]),
>>>     [(torch.tensor([[0, 1], [1, 3]]), torch.tensor([1.0, 0.5]), -0.1)],
>>> )

botorch.optim.parameter_constraints.eval_lin_constraint(x, flat_idxr, coeffs, rhs)[source]

Evaluate a single linear constraint.

Parameters:
  • x (ndarray[tuple[int, ...], dtype[_ScalarType_co]]) – The input array.

  • flat_idxr (list[int]) – The indices in x to consider.

  • coeffs (ndarray[tuple[int, ...], dtype[_ScalarType_co]]) – The coefficients corresponding to the indices.

  • rhs (float) – The right-hand-side of the constraint.

Returns:

sum_i (coeffs[i] * x[flat_idxr[i]]) - rhs

Return type:

The evaluated constraint

botorch.optim.parameter_constraints.lin_constraint_jac(x, flat_idxr, coeffs, n)[source]

Return the Jacobian associated with a linear constraint.

Parameters:
  • x (ndarray[tuple[int, ...], dtype[_ScalarType_co]]) – The input array.

  • flat_idxr (list[int]) – The indices for the elements of x that appear in the constraint.

  • coeffs (ndarray[tuple[int, ...], dtype[_ScalarType_co]]) – The coefficients corresponding to the indices.

  • n (int) – number of elements

Returns:

The Jacobian.

Return type:

ndarray[tuple[int, …], dtype[_ScalarType_co]]
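The two helpers above can be sketched in plain Python for a flattened candidate. This is an illustration under stated assumptions: the functions eval_constraint and constraint_jac are hypothetical list-based stand-ins (the Jacobian sketch omits the x argument because a linear constraint has a constant Jacobian), and the flat index k * d + l assumes a single q x d candidate flattened in row-major order.

```python
def eval_constraint(x, flat_idxr, coeffs, rhs):
    # Value of sum_i coeffs[i] * x[flat_idxr[i]] - rhs; non-negative means
    # an inequality constraint of the form "... >= rhs" is satisfied.
    return sum(c * x[i] for i, c in zip(flat_idxr, coeffs)) - rhs


def constraint_jac(flat_idxr, coeffs, n):
    # Gradient with respect to the flat vector: the coefficients scattered
    # into a length-n vector of zeros.
    jac = [0.0] * n
    for i, c in zip(flat_idxr, coeffs):
        jac[i] += c
    return jac


# Inter-point constraint x[0, 1] + 0.5 * x[1, 3] >= -0.1 with q = 2, d = 4.
q, d = 2, 4
pairs = [(0, 1), (1, 3)]                   # rows (k_i, l_i) of a 2-d index tensor
flat_idxr = [k * d + l for k, l in pairs]  # positions in the flattened candidate
x = [0.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, -0.4]
value = eval_constraint(x, flat_idxr, [1.0, 0.5], rhs=-0.1)
jac = constraint_jac(flat_idxr, [1.0, 0.5], n=q * d)
```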

botorch.optim.parameter_constraints.nonlinear_constraint_is_feasible(nonlinear_inequality_constraint, is_intrapoint, x)[source]

Checks if a nonlinear inequality constraint is fulfilled.

Parameters:
  • nonlinear_inequality_constraint (Callable) – Callable to evaluate the constraint.

  • is_intrapoint (bool) – If True, the constraint is an intra-point constraint that is applied pointwise and is broadcast over the q-batch. Otherwise, the constraint is an inter-point constraint that has to be evaluated over the whole q-batch.

  • x (Tensor) – Tensor of shape (b x q x d).

Returns:

True if the constraint is fulfilled, else False.

Return type:

bool
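The difference between the two constraint kinds can be made concrete with a list-based sketch. The function is_feasible and both example constraints are hypothetical; the input is a nested list standing in for a b x q x d tensor.

```python
def is_feasible(constraint, is_intrapoint, batch):
    """batch: list of b q-batches, each a list of q points of length d."""
    for q_batch in batch:
        if is_intrapoint:
            # Intra-point: the callable sees one point (shape d) at a time.
            if any(constraint(point) < 0 for point in q_batch):
                return False
        elif constraint(q_batch) < 0:
            # Inter-point: the callable sees the whole q x d batch at once.
            return False
    return True


def unit_ball(point):
    # Intra-point: each point must lie in the unit ball.
    return 1.0 - sum(v ** 2 for v in point)


def shared_budget(points):
    # Inter-point: the first feature, summed over the q-batch, must not exceed 1.
    return 1.0 - sum(p[0] for p in points)


batch = [[[0.5, 0.5], [0.9, 0.1]]]  # b = 1, q = 2, d = 2
intra_ok = is_feasible(unit_ball, True, batch)
inter_ok = is_feasible(shared_budget, False, batch)
```

Both points satisfy the pointwise constraint, but together they exceed the shared budget, so only the intra-point check passes.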

botorch.optim.parameter_constraints.make_scipy_nonlinear_inequality_constraints(nonlinear_inequality_constraints, f_np_wrapper, x0, shapeX)[source]

Generate Scipy nonlinear inequality constraints from callables.

Parameters:
  • nonlinear_inequality_constraints (list[tuple[Callable, bool]]) – A list of tuples representing the nonlinear inequality constraints. The first element in the tuple is a callable representing a constraint of the form callable(x) >= 0. In case of an intra-point constraint, callable() takes in a one-dimensional tensor of shape d and returns a scalar. In case of an inter-point constraint, callable() takes a two-dimensional tensor of shape q x d and again returns a scalar. The second element is a boolean indicating whether it is an intra-point or inter-point constraint (True for intra-point, False for inter-point). For more information on intra-point vs inter-point constraints, see the docstring of the inequality_constraints argument to optimize_acqf(). The constraints will later be passed to the scipy solver.

  • f_np_wrapper (Callable) – A wrapper function that given a constraint evaluates the value and gradient (using autograd) of a numpy input and returns both the objective and the gradient.

  • x0 (Tensor) – The starting point for SLSQP. We return this starting point in (rare) cases where SLSQP fails and thus require it to be feasible.

  • shapeX (Size) – Shape of the three-dimensional batch X that is to be optimized.

Returns:

A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.optimize.minimize.

Return type:

list[dict]
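The returned structure can be pictured with the following rough sketch, which builds dictionaries with the "type", "fun", and "jac" keys that scipy.optimize.minimize accepts for constraints. It is an illustration under stated assumptions, not the library's code: it handles a single q-batch, works on plain Python lists, and approximates the Jacobian with central finite differences instead of the autograd-based f_np_wrapper. The name make_constraint_dicts_sketch and its q, d, and eps arguments are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def make_constraint_dicts_sketch(
    constraint: Callable,
    is_intrapoint: bool,
    q: int,
    d: int,
    eps: float = 1e-6,
) -> List[Dict]:
    """Build SciPy-style "ineq" dicts over a flattened vector of length q * d."""

    def unflatten(x_flat: Sequence[float]) -> List[List[float]]:
        # Recover the q x d layout from the flat vector the solver works on.
        return [list(x_flat[j * d:(j + 1) * d]) for j in range(q)]

    if is_intrapoint:
        # One scalar constraint per point in the q-batch.
        funs = [
            lambda x_flat, j=j: constraint(unflatten(x_flat)[j])
            for j in range(q)
        ]
    else:
        # A single scalar constraint over the whole q-batch.
        funs = [lambda x_flat: constraint(unflatten(x_flat))]

    def jac_of(fun: Callable) -> Callable:
        # Assumption: central differences stand in for autograd here.
        def jac(x_flat: Sequence[float]) -> List[float]:
            grad = []
            for i in range(len(x_flat)):
                up, down = list(x_flat), list(x_flat)
                up[i] += eps
                down[i] -= eps
                grad.append((fun(up) - fun(down)) / (2 * eps))
            return grad

        return jac

    return [{"type": "ineq", "fun": f, "jac": jac_of(f)} for f in funs]


def sum_at_most_one(point: Sequence[float]) -> float:
    """Intra-point constraint x_0 + x_1 <= 1, written as callable(x) >= 0."""
    return 1.0 - sum(point)


constraints = make_constraint_dicts_sketch(sum_at_most_one, True, q=2, d=2)
x_flat = [0.2, 0.3, 0.6, 0.6]  # two points: (0.2, 0.3) and (0.6, 0.6)
values = [c["fun"](x_flat) for c in constraints]
first_jac = constraints[0]["jac"](x_flat)
```

In this sketch an intra-point constraint expands into one dictionary per point, while an inter-point constraint yields a single dictionary. A negative entry in values marks a violated constraint.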

Homotopy Utilities

class botorch.optim.homotopy.FixedHomotopySchedule(values)[source]

Bases: object

Homotopy schedule with a fixed list of values.

Initialize FixedHomotopySchedule.

Parameters:

values (list[float]) – A list of values used in the homotopy.

property num_steps: int
property value: float
property should_stop: bool
restart()[source]
Return type:

None

step()[source]
Return type:

None
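The properties and methods listed above suggest the following usage pattern. The sketch uses a stand-in class, FixedScheduleSketch, that exposes the same public interface; the internal index and the exact stopping rule (stop once the schedule has stepped past its last value) are assumptions made for illustration.

```python
from typing import List


class FixedScheduleSketch:
    """Stand-in exposing the same public interface as the documented schedule."""

    def __init__(self, values: List[float]) -> None:
        self._values = values
        self._idx = 0  # assumed internal position, not part of the documented API

    @property
    def num_steps(self) -> int:
        return len(self._values)

    @property
    def value(self) -> float:
        return self._values[self._idx]

    @property
    def should_stop(self) -> bool:
        # Assumption: exhausted once we have stepped past the last value.
        return self._idx >= len(self._values)

    def restart(self) -> None:
        self._idx = 0

    def step(self) -> None:
        self._idx += 1


schedule = FixedScheduleSketch([1.0, 0.1, 0.01])
visited = []
while not schedule.should_stop:
    visited.append(schedule.value)  # a subproblem would be solved at this value
    schedule.step()
```

Calling restart() afterwards rewinds the schedule so the same sequence can be replayed.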

class botorch.optim.homotopy.LinearHomotopySchedule(start, end, num_steps)[source]

Bases: FixedHomotopySchedule

Linear homotopy schedule.

Initialize LinearHomotopySchedule.

Parameters:
  • start (float) – start value of homotopy

  • end (float) – end value of homotopy

  • num_steps (int) – number of steps in the homotopy schedule.
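As a minimal sketch of what a linear schedule contains, assuming the values are evenly spaced and include both endpoints, the list could be generated as follows. The helper name linear_values_sketch is hypothetical.

```python
from typing import List


def linear_values_sketch(start: float, end: float, num_steps: int) -> List[float]:
    """Return num_steps evenly spaced values from start to end, inclusive."""
    if num_steps == 1:
        return [start]
    step = (end - start) / (num_steps - 1)
    return [start + i * step for i in range(num_steps)]


values = linear_values_sketch(0.0, 1.0, 5)
```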

class botorch.optim.homotopy.LogLinearHomotopySchedule(start, end, num_steps)[source]

Bases: FixedHomotopySchedule

Log-linear homotopy schedule.

Initialize LogLinearHomotopySchedule.

Parameters:
  • start (float) – start value of homotopy

  • end (float) – end value of homotopy

  • num_steps (int) – number of steps in the homotopy schedule.
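A log-linear schedule can be sketched the same way, assuming "log-linear" means the values are evenly spaced in log10 space, so that consecutive values share a constant ratio. Under this assumption start and end must both be positive. The helper name log_linear_values_sketch is hypothetical.

```python
import math
from typing import List


def log_linear_values_sketch(start: float, end: float, num_steps: int) -> List[float]:
    """Return num_steps values from start to end, evenly spaced in log10 space."""
    if num_steps == 1:
        return [start]
    log_start, log_end = math.log10(start), math.log10(end)
    step = (log_end - log_start) / (num_steps - 1)
    return [10.0 ** (log_start + i * step) for i in range(num_steps)]


values = log_linear_values_sketch(1.0, 0.001, 4)
```

Compared with a linear schedule over the same range, this spends more steps near the small end, which matters when the parameter spans several orders of magnitude.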

class botorch.optim.homotopy.HomotopyParameter(parameter, schedule)[source]

Bases: object

Homotopy parameter.

The parameter is expected to either be a torch parameter or a torch tensor which may correspond to a buffer of a module. The parameter has a corresponding schedule.

Parameters:
  • parameter (Parameter | Tensor)

  • schedule (FixedHomotopySchedule)

parameter: Parameter | Tensor
schedule: FixedHomotopySchedule
class botorch.optim.homotopy.Homotopy(homotopy_parameters, callbacks=None)[source]

Bases: object

Generic homotopy class.

This class is designed to be used in optimize_acqf_homotopy. Given a set of homotopy parameters and corresponding schedules, we step through the homotopies until we have solved the final problem. We additionally support passing in a list of callbacks that are executed whenever step, reset, or restart is called.

Initialize the homotopy.

Parameters:
  • homotopy_parameters (list[HomotopyParameter]) – List of homotopy parameters

  • callbacks (list[Callable] | None) – Optional list of callbacks that are executed whenever restart, reset, or step is called. These may be used, e.g., to reinitialize the acquisition function, which is needed when using qNEHVI.

property should_stop: bool

Returns True if all schedules have reached the end.

restart()[source]

Restart the homotopy to use the initial value in the schedule.

Return type:

None

reset()[source]

Reset the homotopy parameters to their original values.

Return type:

None

step()[source]

Take a step according to the schedules.

Return type:

None
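The restart / step / reset cycle described for this class can be sketched in plain Python. Everything below is a stand-in written for illustration: ScheduleStandIn and HomotopySketch are hypothetical names, a one-element list plays the role of a tensor parameter, and details such as leaving the parameter unchanged once its schedule is exhausted are assumptions rather than documented behavior.

```python
from typing import Callable, List, Optional


class ScheduleStandIn:
    """Minimal fixed schedule (hypothetical stand-in)."""

    def __init__(self, values: List[float]) -> None:
        self.values = values
        self.idx = 0

    @property
    def value(self) -> float:
        return self.values[self.idx]

    @property
    def should_stop(self) -> bool:
        return self.idx >= len(self.values)


class HomotopySketch:
    """Drives several (parameter, schedule) pairs and fires callbacks."""

    def __init__(
        self,
        params: List[List[float]],
        schedules: List[ScheduleStandIn],
        callbacks: Optional[List[Callable[[], None]]] = None,
    ) -> None:
        self._params = params
        self._schedules = schedules
        self._callbacks = callbacks or []
        self._original = [p[0] for p in params]  # remembered for reset()

    def _run_callbacks(self) -> None:
        for callback in self._callbacks:
            callback()

    @property
    def should_stop(self) -> bool:
        return all(s.should_stop for s in self._schedules)

    def restart(self) -> None:
        # Rewind every schedule and load its first value into the parameter.
        for p, s in zip(self._params, self._schedules):
            s.idx = 0
            p[0] = s.value
        self._run_callbacks()

    def reset(self) -> None:
        # Restore the values the parameters had before the homotopy started.
        for p, original in zip(self._params, self._original):
            p[0] = original
        self._run_callbacks()

    def step(self) -> None:
        for p, s in zip(self._params, self._schedules):
            s.idx += 1
            if not s.should_stop:  # assumed: keep last value when exhausted
                p[0] = s.value
        self._run_callbacks()


param = [5.0]  # mutable scalar with original value 5.0
calls: List[float] = []
homotopy = HomotopySketch(
    [param], [ScheduleStandIn([1.0, 0.1])], [lambda: calls.append(param[0])]
)
homotopy.restart()
trace = []
while not homotopy.should_stop:
    trace.append(param[0])  # an optimizer would run at this parameter value
    homotopy.step()
homotopy.reset()
```

The callback fires once per restart, step, and reset call, which is the hook a caller could use to rebuild state that depends on the current parameter values.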