# botorch.optim

## Optimization

### Acquisition Function Optimization

Methods for optimizing acquisition functions.

botorch.optim.optimize.optimize_acqf(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, nonlinear_inequality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, return_best_only=True, sequential=False, **kwargs)[source]

Generate a set of candidates via multi-start optimization.

Parameters:
• acq_function (AcquisitionFunction) – An AcquisitionFunction.

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

• q (int) – The number of candidates.

• num_restarts (int) – The number of starting points for multistart acquisition function optimization.

• raw_samples (Optional[int]) – The number of samples for initialization. This is required if batch_initial_conditions is not specified.

• options (Optional[Dict[str, Union[bool, float, int, str]]]) – Options for candidate generation.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs

• nonlinear_inequality_constraints (Optional[List[Callable]]) – A list of callables that represent non-linear inequality constraints of the form callable(x) >= 0. Each callable is expected to take a (num_restarts) x q x d-dim tensor as an input and return a (num_restarts) x q-dim tensor with the constraint values. The constraints will later be passed to SLSQP. You need to pass in batch_initial_conditions in this case. Using non-linear inequality constraints also requires that batch_limit is set to 1, which will be done automatically if not specified in options.

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• post_processing_func (Optional[Callable[[Tensor], Tensor]]) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

• batch_initial_conditions (Optional[Tensor]) – A tensor to specify the initial conditions. Set this if you do not want to use default initialization strategy.

• return_best_only (bool) – If False, outputs the solutions corresponding to all random restart initializations of the optimization.

• sequential (bool) – If False, uses joint optimization, otherwise uses sequential optimization.

• kwargs (Any) – Additional keyword arguments.

Returns:

A two-element tuple containing

• a (num_restarts) x q x d-dim tensor of generated candidates.

• a tensor of associated acquisition values. If sequential=False, this is a (num_restarts)-dim tensor of joint acquisition values (with explicit restart dimension if return_best_only=False). If sequential=True, this is a q-dim tensor of expected acquisition values conditional on having observed candidates 0,1,…,i-1.

Return type:

Tuple[Tensor, Tensor]

Example

>>> # generate q=2 candidates jointly using 20 random restarts
>>> # and 512 raw samples
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> candidates, acq_value = optimize_acqf(qEI, bounds, 2, 20, 512)

>>> # generate q=3 candidates sequentially using 15 random restarts
>>> # and 256 raw samples
>>> candidates, acq_value_list = optimize_acqf(
>>>     qEI, bounds, 3, 15, 256, sequential=True
>>> )
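The `(indices, coefficients, rhs)` encoding of linear constraints can be checked by hand. The following is a minimal pure-Python sketch, assuming a single point `x` given as a plain list (BoTorch itself expects tensors); the helper names `constraint_value`, `satisfies_inequality`, and `satisfies_equality` are hypothetical and not part of the library.

```python
def constraint_value(x, indices, coefficients):
    """Return sum_i x[indices[i]] * coefficients[i] for a single point x."""
    return sum(x[i] * c for i, c in zip(indices, coefficients))


def satisfies_inequality(x, constraint, tol=1e-9):
    """Check sum_i x[indices[i]] * coefficients[i] >= rhs (up to tol)."""
    indices, coefficients, rhs = constraint
    return constraint_value(x, indices, coefficients) >= rhs - tol


def satisfies_equality(x, constraint, tol=1e-9):
    """Check sum_i x[indices[i]] * coefficients[i] = rhs (up to tol)."""
    indices, coefficients, rhs = constraint
    return abs(constraint_value(x, indices, coefficients) - rhs) <= tol


# x0 + x1 >= 1 is encoded directly.
at_least_one = ([0, 1], [1.0, 1.0], 1.0)
# x0 + x1 <= 1 has to be rewritten as -x0 - x1 >= -1.
at_most_one = ([0, 1], [-1.0, -1.0], -1.0)
# x0 + x2 = 1.15 as an equality constraint.
sum_fixed = ([0, 2], [1.0, 1.0], 1.15)

x = [0.25, 0.5, 0.9]
```

Because only the `>=` form is supported for inequalities, an upper-bound constraint is expressed by negating both the coefficients and the right-hand side.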

botorch.optim.optimize.optimize_acqf_cyclic(acq_function, bounds, q, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, fixed_features=None, post_processing_func=None, batch_initial_conditions=None, cyclic_options=None, **kwargs)[source]

Generate a set of q candidates via cyclic optimization.

Parameters:
• acq_function (AcquisitionFunction) – An AcquisitionFunction

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

• q (int) – The number of candidates.

• num_restarts (int) – Number of starting points for multistart acquisition function optimization.

• raw_samples (Optional[int]) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

• options (Optional[Dict[str, Union[bool, float, int, str]]]) – Options for candidate generation.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• post_processing_func (Optional[Callable[[Tensor], Tensor]]) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

• batch_initial_conditions (Optional[Tensor]) – A tensor to specify the initial conditions. If no initial conditions are provided, the default initialization will be used.

• cyclic_options (Optional[Dict[str, Union[bool, float, int, str]]]) – Options for stopping criterion for outer cyclic optimization.

Returns:

A two-element tuple containing

• a q x d-dim tensor of generated candidates.

• a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i.

Return type:

Tuple[Tensor, Tensor]

Example

>>> # generate q=3 candidates cyclically using 15 random restarts
>>> # 256 raw samples, and 4 cycles
>>>
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> candidates, acq_value_list = optimize_acqf_cyclic(
>>>     qEI, bounds, 3, 15, 256, cyclic_options={"maxiter": 4}
>>> )
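The idea of cyclic optimization can be illustrated without BoTorch. The sketch below is a toy under stated assumptions, not the library's implementation: candidates are scalars picked from a finite grid, `joint_value` is a made-up stand-in for a joint acquisition value, and `cyclic_optimize` is a hypothetical helper. Each cycle re-optimizes one candidate at a time while holding the others fixed, and the outer loop stops after `maxiter` cycles or when a full cycle brings no improvement.

```python
def cyclic_optimize(joint_value, grid, x_init, maxiter=4, tol=1e-9):
    """Re-optimize one candidate at a time while holding the others fixed."""
    x = list(x_init)
    best = joint_value(x)
    for _ in range(maxiter):
        improved = False
        for i in range(len(x)):
            for g in grid:
                trial = x[:i] + [g] + x[i + 1:]
                value = joint_value(trial)
                if value > best + tol:
                    x, best, improved = trial, value, True
        if not improved:
            break  # a full cycle without improvement: stop early
    return x, best


# Toy joint value: reward candidate sets that cover three target locations.
targets = [0.1, 0.5, 0.9]


def joint_value(xs):
    return -sum(min((x - t) ** 2 for x in xs) for t in targets)


grid = [k / 10 for k in range(11)]
candidates, value = cyclic_optimize(joint_value, grid, [0.0, 0.0, 0.0], maxiter=4)
```

In this toy, a single pass (the analogue of sequential greedy selection) leaves one target uncovered; the second cycle moves the first candidate and covers all three.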

botorch.optim.optimize.optimize_acqf_list(acq_function_list, bounds, num_restarts, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, fixed_features=None, post_processing_func=None)[source]

Generate a list of candidates from a list of acquisition functions.

The acquisition functions are optimized in sequence, with previous candidates set as X_pending. This is also known as sequential greedy optimization.

Parameters:
• acq_function_list (List[AcquisitionFunction]) – A list of acquisition functions.

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

• num_restarts (int) – Number of starting points for multistart acquisition function optimization.

• raw_samples (Optional[int]) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

• options (Optional[Dict[str, Union[bool, float, int, str]]]) – Options for candidate generation.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• post_processing_func (Optional[Callable[[Tensor], Tensor]]) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

Returns:

A two-element tuple containing

• a q x d-dim tensor of generated candidates.

• a q-dim tensor of expected acquisition values, where the value at index i is the acquisition value conditional on having observed all candidates except candidate i.

Return type:

Tuple[Tensor, Tensor]
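Sequential greedy optimization over a list of acquisition functions can be sketched in a few lines. This is a toy illustration under assumptions: each acquisition function is modeled as a plain callable `acq(x, pending)` over scalar points on a grid, whereas BoTorch sets `X_pending` on the acquisition function object. The names `optimize_list_greedy` and `make_acq` are hypothetical.

```python
def optimize_list_greedy(acq_functions, grid):
    """Optimize each acquisition function in turn; earlier picks become pending."""
    pending, values = [], []
    for acq in acq_functions:
        best_x = max(grid, key=lambda x: acq(x, pending))
        values.append(acq(best_x, pending))
        pending.append(best_x)
    return pending, values


def make_acq(center):
    """Toy acquisition: peaks at `center`, penalized close to pending points."""
    def acq(x, pending):
        penalty = sum(max(0.0, 0.2 - abs(x - p)) for p in pending)
        return -abs(x - center) - 5.0 * penalty
    return acq


grid = [k / 10 for k in range(11)]
picks, pick_values = optimize_list_greedy([make_acq(0.5), make_acq(0.6)], grid)
```

The second function would prefer 0.6 on its own, but with 0.5 already pending it settles on 0.7, which shows how conditioning on earlier candidates changes later picks.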

botorch.optim.optimize.optimize_acqf_mixed(acq_function, bounds, q, num_restarts, fixed_features_list, raw_samples=None, options=None, inequality_constraints=None, equality_constraints=None, post_processing_func=None, batch_initial_conditions=None, **kwargs)[source]

Optimize over a list of fixed_features and return the best solution.

This is useful for optimizing over mixed continuous and discrete domains. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).

Parameters:
• acq_function (AcquisitionFunction) – An AcquisitionFunction

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X (if inequality_constraints is provided, these bounds can be -inf and +inf, respectively).

• q (int) – The number of candidates.

• num_restarts (int) – Number of starting points for multistart acquisition function optimization.

• raw_samples (Optional[int]) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

• fixed_features_list (List[Dict[int, float]]) – A list of maps {feature_index: value}. The i-th item represents the fixed_feature for the i-th optimization.

• options (Optional[Dict[str, Union[bool, float, int, str]]]) – Options for candidate generation.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs

• post_processing_func (Optional[Callable[[Tensor], Tensor]]) – A function that post-processes an optimization result appropriately (i.e., according to round-trip transformations).

• batch_initial_conditions (Optional[Tensor]) – A tensor to specify the initial conditions. Set this if you do not want to use default initialization strategy.

• kwargs (Any) –

Returns:

A two-element tuple containing

• a q x d-dim tensor of generated candidates.

• an associated acquisition value.

Return type:

Tuple[Tensor, Tensor]
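For q = 1, the strategy of optimizing once per entry of fixed_features_list and keeping the best result can be sketched as follows. This is a simplified stand-in under assumptions: an exhaustive grid search replaces the continuous multi-start optimizer, points are plain lists, and `optimize_mixed` and `toy_acq` are hypothetical names.

```python
import itertools


def optimize_mixed(acq, grid, d, fixed_features_list):
    """Run one optimization per fixed-feature map and keep the best result."""
    best_x, best_value = None, float("-inf")
    for fixed in fixed_features_list:
        free = [i for i in range(d) if i not in fixed]
        # Grid search over the free (continuous) features only.
        for free_values in itertools.product(grid, repeat=len(free)):
            x = [0.0] * d
            for i, v in fixed.items():
                x[i] = v
            for i, v in zip(free, free_values):
                x[i] = v
            value = acq(x)
            if value > best_value:
                best_x, best_value = x, value
    return best_x, best_value


def toy_acq(x):
    # Feature 0 is continuous; feature 1 only takes the values 0, 1, or 2.
    return -(x[0] - 0.3) ** 2 - (x[1] - 1.4) ** 2


grid = [k / 10 for k in range(11)]
fixed_features_list = [{1: 0.0}, {1: 1.0}, {1: 2.0}]
best_x, best_value = optimize_mixed(toy_acq, grid, 2, fixed_features_list)
```

Each map in `fixed_features_list` enumerates one setting of the discrete feature, so the number of inner optimizations grows with the number of discrete combinations.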

botorch.optim.optimize.optimize_acqf_discrete(acq_function, q, choices, max_batch_size=2048, unique=True, **kwargs)[source]

Optimize over a discrete set of points using batch evaluation.

For q > 1 this function generates candidates by means of sequential conditioning (rather than joint optimization), since for all but the smallest number of choices the set choices^q of discrete points to evaluate quickly explodes.

Parameters:
• acq_function (AcquisitionFunction) – An AcquisitionFunction.

• q (int) – The number of candidates.

• choices (Tensor) – A num_choices x d tensor of possible choices.

• max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.

• unique (bool) – If True return unique choices, otherwise choices may be repeated (only relevant if q > 1).

• kwargs (Any) –

Returns:

A two-element tuple containing

• a q x d-dim tensor of generated candidates.

• an associated acquisition value.

Return type:

Tuple[Tensor, Tensor]
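The roles of max_batch_size and unique can be illustrated with a small pure-Python sketch. It assumes scalar choices, a toy callable `acq(choice, selected)`, and q no larger than the number of choices when unique=True; `optimize_discrete` is a hypothetical helper, not the BoTorch function. Choices are scored in chunks of at most `max_batch_size`, and candidates are picked one at a time instead of scoring all len(choices)^q joint combinations.

```python
def optimize_discrete(acq, choices, q, max_batch_size=2048, unique=True):
    """Pick q choices one at a time, scoring choices in bounded-size chunks."""
    remaining = list(choices)
    selected, values = [], []
    for _ in range(q):
        best_choice, best_value = None, float("-inf")
        for start in range(0, len(remaining), max_batch_size):
            chunk = remaining[start:start + max_batch_size]
            chunk_values = [acq(c, selected) for c in chunk]
            for c, v in zip(chunk, chunk_values):
                if v > best_value:
                    best_choice, best_value = c, v
        selected.append(best_choice)
        values.append(best_value)
        if unique:
            remaining.remove(best_choice)  # a choice can be used only once
    return selected, values


def toy_acq(choice, selected):
    # Ignores `selected`, so repeated picks are possible when unique=False.
    return -(choice - 0.5) ** 2


choices = [0.0, 0.25, 0.5, 0.75, 1.0]
sel_unique, _ = optimize_discrete(toy_acq, choices, q=2, max_batch_size=2)
sel_repeat, _ = optimize_discrete(toy_acq, choices, q=2, max_batch_size=2, unique=False)
num_joint = len(choices) ** 2  # size of the joint search space for q=2
```

With 5 choices and q=2 the joint space has 25 points; with 1000 choices and q=4 it would already have 10^12, which is why sequential selection is used.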

Optimize acquisition function over a lattice.

This is useful when d is large and enumeration of the search space isn’t possible. For q > 1 this function always performs sequential greedy optimization (with proper conditioning on generated candidates).

NOTE: While this method supports arbitrary lattices, it has only been thoroughly tested for {0, 1}^d. Consider it to be in alpha stage for the more general case.

Parameters:
• acq_function (AcquisitionFunction) – An AcquisitionFunction

• discrete_choices (List[Tensor]) – A list of possible discrete choices for each dimension. Each element in the list is expected to be a torch tensor.

• q (int) – The number of candidates.

• num_restarts (int) – Number of starting points for multistart acquisition function optimization.

• raw_samples (int) – Number of samples for initialization. This is required if batch_initial_conditions is not specified.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs

• X_avoid (Optional[Tensor]) – An n x d tensor of candidates that we aren’t allowed to pick.

• batch_initial_conditions (Optional[Tensor]) – A tensor of size n x 1 x d to specify the initial conditions. Set this if you do not want to use default initialization strategy.

• max_batch_size (int) – The maximum number of choices to evaluate in batch. A large limit can cause excessive memory usage if the model has a large training set.

• unique (bool) – If True return unique choices, otherwise choices may be repeated (only relevant if q > 1).

• kwargs (Any) –

Returns:

A two-element tuple containing

• a q x d-dim tensor of generated candidates.

• an associated acquisition value.

Return type:

Tuple[Tensor, Tensor]

### Model Fitting Optimization

Tools for model fitting.

botorch.optim.fit.fit_gpytorch_scipy(mll, bounds=None, method='L-BFGS-B', options=None, track_iterations=False, approx_mll=False, scipy_objective=<function _scipy_objective_and_grad>, module_to_array_func=<function module_to_array>, module_from_array_func=<function set_params_with_array>)[source]

Fit a gpytorch model by maximizing MLL with a scipy optimizer.

The model and likelihood in mll must already be in train mode. This method requires that the model has train_inputs and train_targets.

Parameters:
• mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.

• bounds (Optional[Dict[str, Tuple[Optional[float], Optional[float]]]]) – A dictionary mapping parameter names to tuples of lower and upper bounds.

• method (str) – Solver type, passed along to scipy.optimize.minimize.

• options (Optional[Dict[str, Any]]) – Dictionary of solver options, passed along to scipy.optimize.minimize.

• track_iterations (bool) – Track the function values and wall time for each iteration.

• approx_mll (bool) – If True, use gpytorch’s approximate MLL computation. This is disabled by default since the stochasticity is an issue for deterministic optimizers. Enabling this is only recommended when working with large training data sets (n>2000).

• scipy_objective (Callable[[ndarray, MarginalLogLikelihood, Dict[str, TorchAttr]], Tuple[float, ndarray]]) –

• module_to_array_func (Callable[[Module, Optional[Dict[str, Tuple[Optional[float], Optional[float]]]], Optional[Set[str]]], Tuple[ndarray, Dict[str, TorchAttr], Optional[ndarray]]]) –

• module_from_array_func (Callable[[Module, ndarray, Dict[str, TorchAttr]], Module]) –

Returns:

A two-element tuple containing

• the MarginalLogLikelihood with parameters optimized in-place.

• a dictionary with the following key/values: “fopt”: best mll value; “wall_time”: wall time of fitting; “iterations”: list of OptimizationIteration objects with information on each iteration (empty if track_iterations is False); “OptimizeResult”: the result returned by scipy.optimize.minimize.

Return type:

Tuple[MarginalLogLikelihood, Dict[str, Union[float, List[OptimizationIteration]]]]

Example

>>> gp = SingleTaskGP(train_X, train_Y)
>>> mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
>>> mll.train()
>>> fit_gpytorch_scipy(mll)
>>> mll.eval()

botorch.optim.fit.fit_gpytorch_torch(mll, bounds=None, optimizer_cls=<class 'torch.optim.adam.Adam'>, options=None, track_iterations=False, approx_mll=False)[source]

Fit a gpytorch model by maximizing MLL with a torch optimizer.

The model and likelihood in mll must already be in train mode. Note: this method requires that the model has train_inputs and train_targets.

Parameters:
• mll (MarginalLogLikelihood) – MarginalLogLikelihood to be maximized.

• bounds (Optional[Dict[str, Tuple[Optional[float], Optional[float]]]]) – A ParameterBounds dictionary mapping parameter names to tuples of lower and upper bounds. Bounds specified here take precedence over bounds on the same parameters specified in the constraints registered with the module.

• optimizer_cls (Optimizer) – Torch optimizer to use. Must not require a closure.

• options (Optional[Dict[str, Any]]) – options for model fitting. Relevant options will be passed to the optimizer_cls. Additionally, options can include: “disp” to specify whether to display model fitting diagnostics and “maxiter” to specify the maximum number of iterations.

• track_iterations (bool) – Track the function values and wall time for each iteration.

• approx_mll (bool) – If True, use gpytorch’s approximate MLL computation (according to the gpytorch defaults based on the training data size). Unlike for the deterministic algorithms used in fit_gpytorch_scipy, this is not an issue for stochastic optimizers.

Returns:

A two-element tuple containing

• mll with parameters optimized in-place.

• a dictionary with the following key/values: “fopt”: best mll value; “wall_time”: wall time of fitting; “iterations”: list of OptimizationIteration objects with information on each iteration (empty if track_iterations is False).

Return type:

Tuple[MarginalLogLikelihood, Dict[str, Union[float, List[OptimizationIteration]]]]

Example

>>> gp = SingleTaskGP(train_X, train_Y)
>>> mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
>>> mll.train()
>>> fit_gpytorch_torch(mll)
>>> mll.eval()
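The shape of the fitting loop and of the returned info dictionary can be sketched without torch. This is a dependency-free illustration under assumptions: a plain gradient-descent step stands in for the torch optimizer, a toy quadratic stands in for the negative MLL, plain tuples stand in for OptimizationIteration objects, and `fit_toy`, `quadratic`, and the "lr" option are hypothetical. Only the "maxiter" and "disp" options and the "fopt", "wall_time", and "iterations" keys come from the description above.

```python
import time


def fit_toy(params, loss_and_grad, options=None, track_iterations=False):
    """Minimize a loss by gradient descent and report fitting diagnostics."""
    options = options or {}
    maxiter = options.get("maxiter", 100)
    lr = options.get("lr", 0.1)  # hypothetical option for this sketch
    disp = options.get("disp", False)
    iterations = []
    start = time.monotonic()
    loss, grad = loss_and_grad(params)
    for i in range(maxiter):
        params = [p - lr * g for p, g in zip(params, grad)]
        loss, grad = loss_and_grad(params)
        if disp:
            print(f"Iter {i + 1}/{maxiter}: {loss}")
        if track_iterations:
            iterations.append((i, loss, time.monotonic() - start))
    info = {
        "fopt": loss,
        "wall_time": time.monotonic() - start,
        "iterations": iterations,  # stays empty if track_iterations is False
    }
    return params, info


def quadratic(params):
    """Toy objective with its minimum at 2.0 in every coordinate."""
    loss = sum((p - 2.0) ** 2 for p in params)
    grad = [2.0 * (p - 2.0) for p in params]
    return loss, grad


fitted, info = fit_toy(
    [0.0, 5.0], quadratic, options={"maxiter": 50}, track_iterations=True
)
```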


### Initialization Helpers

References

[Regis]

R. G. Regis, C. A. Shoemaker. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization, Engineering Optimization, 2013.

botorch.optim.initializers.gen_batch_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]

Generate a batch of initial conditions for random-restart optimization.

TODO: Support t-batches of initial conditions.

Parameters:
• acq_function (AcquisitionFunction) – The acquisition function to be optimized.

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of X.

• q (int) – The number of candidates to consider.

• num_restarts (int) – The number of starting points for multistart acquisition function optimization.

• raw_samples (int) – The number of raw samples to consider in the initialization heuristic. Note: if sample_around_best is True (the default is False), then 2 * raw_samples samples are used.

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• options (Optional[Dict[str, Union[bool, float, int]]]) – Options for initial condition generation. For valid options see initialize_q_batch and initialize_q_batch_nonneg. If options contains a nonnegative=True entry, then acq_function is assumed to be non-negative (useful when using custom acquisition functions). In addition, an “init_batch_limit” option can be passed to specify the batch limit for the initialization. This is useful for avoiding memory limits when computing the batch posterior over raw samples.

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

Returns:

A num_restarts x q x d tensor of initial conditions.

Return type:

Tensor

Example

>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> bounds = torch.tensor([[0.], [1.]])
>>> Xinit = gen_batch_initial_conditions(
>>>     qEI, bounds, q=3, num_restarts=25, raw_samples=500
>>> )

botorch.optim.initializers.gen_one_shot_kg_initial_conditions(acq_function, bounds, q, num_restarts, raw_samples, fixed_features=None, options=None, inequality_constraints=None, equality_constraints=None)[source]

Generate a batch of smart initializations for qKnowledgeGradient.

This function generates initial conditions for optimizing one-shot KG using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining frac_random fantasy points as well as all q candidate points are chosen according to the standard initialization strategy in gen_batch_initial_conditions.

Parameters:
• acq_function (qKnowledgeGradient) – The qKnowledgeGradient instance to be optimized.

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.

• q (int) – The number of candidates to consider.

• num_restarts (int) – The number of starting points for multistart acquisition function optimization.

• raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• options (Optional[Dict[str, Union[bool, float, int]]]) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs.

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) = rhs.

Returns:

A num_restarts x q’ x d tensor that can be used as initial conditions for optimize_acqf(). Here q’ = q + num_fantasies is the total number of points (candidate points plus fantasy points).

Return type:

Optional[Tensor]

Example

>>> qKG = qKnowledgeGradient(model, num_fantasies=64)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_one_shot_kg_initial_conditions(
>>>     qKG, bounds, q=3, num_restarts=10, raw_samples=512,
>>>     options={"frac_random": 0.25},
>>> )

botorch.optim.initializers.gen_value_function_initial_conditions(acq_function, bounds, num_restarts, raw_samples, current_model, fixed_features=None, options=None)[source]

Generate a batch of smart initializations for optimizing the value function of qKnowledgeGradient.

This function generates initial conditions for optimizing the inner problem of KG, i.e. its value function, using the maximizer of the posterior objective. Intuitively, the maximizer of the fantasized posterior will often be close to a maximizer of the current posterior. This function uses that fact to generate the initial conditions for the fantasy points. Specifically, a fraction of 1 - frac_random (see options) of raw samples is generated by sampling from the set of maximizers of the posterior objective (obtained via random restart optimization) according to a softmax transformation of their respective values. This means that this initialization strategy internally solves an acquisition function maximization problem. The remaining raw samples are generated using draw_sobol_samples. All raw samples are then evaluated, and the initial conditions are selected according to the standard initialization strategy in ‘initialize_q_batch’ individually for each inner problem.

Parameters:
• acq_function (AcquisitionFunction) – The value function instance to be optimized.

• bounds (Tensor) – A 2 x d tensor of lower and upper bounds for each column of task features.

• num_restarts (int) – The number of starting points for multistart acquisition function optimization.

• raw_samples (int) – The number of raw samples to consider in the initialization heuristic.

• current_model (Model) – The model of the KG acquisition function that was used to generate the fantasy model of the value function.

• fixed_features (Optional[Dict[int, float]]) – A map {feature_index: value} for features that should be fixed to a particular value during generation.

• options (Optional[Dict[str, Union[bool, float, int]]]) – Options for initial condition generation. These contain all settings for the standard heuristic initialization from gen_batch_initial_conditions. In addition, they contain frac_random (the fraction of fully random fantasy points), num_inner_restarts and raw_inner_samples (the number of random restarts and raw samples for solving the posterior objective maximization problem, respectively) and eta (temperature parameter for sampling heuristic from posterior objective maximizers).

Returns:

A num_restarts x batch_shape x q x d tensor that can be used as initial conditions for optimize_acqf(). Here batch_shape is the batch shape of value function model.

Return type:

Tensor

Example

>>> fant_X = torch.rand(5, 1, 2)
>>> fantasy_model = model.fantasize(fant_X, SobolQMCNormalSampler(16))
>>> value_function = PosteriorMean(fantasy_model)
>>> bounds = torch.tensor([[0., 0.], [1., 1.]])
>>> Xinit = gen_value_function_initial_conditions(
>>>     value_function, bounds, num_restarts=10, raw_samples=512,
>>>     options={"frac_random": 0.25},
>>> )

botorch.optim.initializers.initialize_q_batch(X, Y, n, eta=1.0)[source]

Heuristic for selecting initial conditions for candidate generation.

This heuristic selects points from X (without replacement) with probability proportional to exp(eta * Z), where Z = (Y - mean(Y)) / std(Y) and eta is a temperature parameter.

When using an acquisition function that is non-negative and possibly zero over large areas of the feature space (e.g. qEI), you should use initialize_q_batch_nonneg instead.

Parameters:
• X (Tensor) – A b x batch_shape x q x d tensor of b - batch_shape samples of q-batches from a d-dim feature space. Typically, these are generated using qMC sampling.

• Y (Tensor) – A tensor of b x batch_shape outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.

• n (int) – The number of initial conditions to be generated. Must be less than b.

• eta (float) – Temperature parameter for weighting samples.

Returns:

A n x batch_shape x q x d tensor of n - batch_shape q-batch initial conditions, where each batch of n x q x d samples is selected independently.

Return type:

Tensor

Example

>>> # To get n=10 starting points of q-batch size q=3
>>> # for model with d=6:
>>> qUCB = qUpperConfidenceBound(model, beta=0.1)
>>> Xrnd = torch.rand(500, 3, 6)
>>> Xinit = initialize_q_batch(Xrnd, qUCB(Xrnd), 10)
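The selection rule described above (weights proportional to exp(eta * Z) with standardized values Z, drawn without replacement) can be sketched in pure Python. This is an illustration under assumptions, not BoTorch's code: samples are items of a plain list, at least two outcomes are given, the sample standard deviation is used, and `select_initial_conditions` is a hypothetical name.

```python
import math
import random


def select_initial_conditions(X, Y, n, eta=1.0, rng=None):
    """Draw n items of X without replacement, weighted by exp(eta * Z)."""
    if n > len(X):
        raise ValueError("cannot select more points than available")
    rng = rng or random.Random()
    mean = sum(Y) / len(Y)
    std = math.sqrt(sum((y - mean) ** 2 for y in Y) / (len(Y) - 1))
    if std == 0.0:
        return rng.sample(list(X), n)  # all values equal: pick uniformly
    weights = [math.exp(eta * (y - mean) / std) for y in Y]
    remaining = list(range(len(X)))
    chosen = []
    for _ in range(n):
        # One weighted draw among the indices that are still available.
        position = rng.choices(
            range(len(remaining)), weights=[weights[k] for k in remaining]
        )[0]
        chosen.append(remaining.pop(position))
    return [X[k] for k in chosen]


rng = random.Random(0)
X = list(range(50))
Y = [math.sin(x / 5.0) for x in X]
picked = select_initial_conditions(X, Y, n=10, eta=1.0, rng=rng)
```

A larger eta concentrates the draws on the highest-valued samples, while eta = 0 reduces to uniform sampling.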

botorch.optim.initializers.initialize_q_batch_nonneg(X, Y, n, eta=1.0, alpha=0.0001)[source]

Heuristic for selecting initial conditions for non-neg. acquisition functions.

This function is similar to initialize_q_batch, but designed specifically for acquisition functions that are non-negative and possibly zero over large areas of the feature space (e.g. qEI). All samples for which Y < alpha * max(Y) will be ignored (assuming that Y contains at least one positive value).

Parameters:
• X (Tensor) – A b x q x d tensor of b samples of q-batches from a d-dim. feature space. Typically, these are generated using qMC.

• Y (Tensor) – A tensor of b outcomes associated with the samples. Typically, this is the value of the batch acquisition function to be maximized.

• n (int) – The number of initial conditions to be generated. Must be less than b.

• eta (float) – Temperature parameter for weighting samples.

• alpha (float) – The threshold (as a fraction of the maximum observed value) under which to ignore samples. All input samples for which Y < alpha * max(Y) will be ignored.

Returns:

A n x q x d tensor of n q-batch initial conditions.

Return type:

Tensor

Example

>>> # To get n=10 starting points of q-batch size q=3
>>> # for model with d=6:
>>> qEI = qExpectedImprovement(model, best_f=0.2)
>>> Xrnd = torch.rand(500, 3, 6)
>>> Xinit = initialize_q_batch_nonneg(Xrnd, qEI(Xrnd), 10)
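The alpha threshold can be illustrated in isolation. The sketch below only shows the filtering step (dropping samples with Y < alpha * max(Y)), assuming plain lists; `keep_informative_samples` is a hypothetical helper, and returning everything unfiltered when no value is positive is an assumption of this sketch.

```python
def keep_informative_samples(X, Y, alpha=1e-4):
    """Drop samples whose value is below alpha * max(Y)."""
    max_y = max(Y)
    if max_y <= 0:
        # Assumption: with no positive value there is nothing to filter on.
        return list(X), list(Y)
    kept = [(x, y) for x, y in zip(X, Y) if y >= alpha * max_y]
    return [x for x, _ in kept], [y for _, y in kept]


X = ["a", "b", "c", "d"]
Y = [0.0, 1e-6, 0.5, 2.0]
X_kept, Y_kept = keep_informative_samples(X, Y, alpha=1e-4)
```

Here the threshold is 1e-4 * 2.0 = 2e-4, so the two near-zero samples are discarded before any weighted selection takes place.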

botorch.optim.initializers.sample_points_around_best(acq_function, n_discrete_points, sigma, bounds, best_pct=5.0, subset_sigma=0.1, prob_perturb=None)[source]

Find best points and sample nearby points.

Parameters:
• acq_function (AcquisitionFunction) – The acquisition function.

• n_discrete_points (int) – The number of points to sample.

• sigma (float) – The standard deviation of the additive gaussian noise for perturbing the best points.

• bounds (Tensor) – A 2 x d-dim tensor containing the bounds.

• best_pct (float) – The percentage of best points to perturb.

• subset_sigma (float) – The standard deviation of the additive gaussian noise for perturbing a subset of dimensions of the best points.

• prob_perturb (Optional[float]) – The probability of perturbing each dimension.

Returns:

An optional n_discrete_points x d-dim tensor containing the sampled points. This is None if no baseline points are found.

Return type:

Optional[Tensor]

botorch.optim.initializers.sample_truncated_normal_perturbations(X, n_discrete_points, sigma, bounds, qmc=True)[source]

Sample points around X.

Sample perturbed points around X such that the added perturbations are sampled from N(0, sigma^2 I) and truncated to be within [0,1]^d.

Parameters:
• X (Tensor) – An n x d-dim tensor of starting points.

• n_discrete_points (int) – The number of points to sample.

• sigma (float) – The standard deviation of the additive gaussian noise for perturbing the points.

• bounds (Tensor) – A 2 x d-dim tensor containing the bounds.

• qmc (bool) – A boolean indicating whether to use qmc.

Returns:

A n_discrete_points x d-dim tensor containing the sampled points.

Return type:

Tensor
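Truncated Gaussian perturbations can be sketched with rejection sampling. This is a simplified stand-in under assumptions: plain pseudo-random draws replace qMC, starting points are plain lists already inside [0, 1]^d, starting points are reused in round-robin order, and `sample_truncated_perturbations` is a hypothetical name.

```python
import random


def sample_truncated_perturbations(X, n_points, sigma, rng=None, lower=0.0, upper=1.0):
    """Perturb starting points with Gaussian noise truncated to [lower, upper]."""
    rng = rng or random.Random()
    samples = []
    for k in range(n_points):
        base = X[k % len(X)]  # assumption: cycle through the starting points
        point = []
        for value in base:
            # Rejection sampling: redraw until the perturbed value is in range.
            while True:
                candidate = value + rng.gauss(0.0, sigma)
                if lower <= candidate <= upper:
                    break
            point.append(candidate)
        samples.append(point)
    return samples


rng = random.Random(0)
starts = [[0.05, 0.5], [0.95, 0.2]]
perturbed = sample_truncated_perturbations(starts, n_points=6, sigma=0.1, rng=rng)
```

Redrawing rejected values, rather than clipping them to the boundary, keeps the distribution a truncated normal instead of piling probability mass onto the bounds.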

botorch.optim.initializers.sample_perturbed_subset_dims(X, bounds, n_discrete_points, sigma=0.1, qmc=True, prob_perturb=None)[source]

Sample around X by perturbing a subset of the dimensions.

By default, dimensions are perturbed with probability equal to min(20 / d, 1). As shown in [Regis], perturbing a small number of dimensions can be beneficial. The perturbations are sampled from N(0, sigma^2 I) and truncated to be within [0,1]^d.

Parameters:
• X (Tensor) – An n x d-dim tensor of starting points. X must be normalized to be within [0, 1]^d.

• bounds (Tensor) – The bounds to sample perturbed values from.

• n_discrete_points (int) – The number of points to sample.

• sigma (float) – The standard deviation of the additive gaussian noise for perturbing the points.

• qmc (bool) – A boolean indicating whether to use qmc.

• prob_perturb (Optional[float]) – The probability of perturbing each dimension. If omitted, defaults to min(20 / d, 1).

Returns:

A n_discrete_points x d-dim tensor containing the sampled points.

Return type:

Tensor
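The default perturbation probability min(20 / d, 1) means that every dimension is perturbed when d <= 20, and about 20 dimensions are perturbed on average otherwise. The following pure-Python sketch illustrates how a per-dimension mask could be drawn. The helper names are hypothetical, and the rule that at least one dimension is always perturbed is an assumption of this sketch rather than something stated above.

```python
import random


def default_prob_perturb(d):
    """Default per-dimension perturbation probability, min(20 / d, 1)."""
    return min(20 / d, 1.0)


def sample_perturbation_mask(d, prob_perturb=None, rng=None):
    """Return a list of d booleans; True marks a dimension to perturb."""
    rng = rng or random.Random()
    p = default_prob_perturb(d) if prob_perturb is None else prob_perturb
    mask = [rng.random() < p for _ in range(d)]
    if not any(mask):
        # Assumption for this sketch: always perturb at least one dimension.
        mask[rng.randrange(d)] = True
    return mask


low_dim_mask = sample_perturbation_mask(5, rng=random.Random(0))     # p = 1.0
high_dim_mask = sample_perturbation_mask(200, rng=random.Random(0))  # p = 0.1
```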

### Stopping Criteria¶

class botorch.optim.stopping.ExpMAStoppingCriterion(maxiter=10000, minimize=True, n_window=10, eta=1.0, rel_tol=1e-05)[source]

Bases: StoppingCriterion

Exponential moving average stopping criterion.

Computes an exponentially weighted moving average over window length n_window and checks whether the relative decrease in this moving average between steps is less than a provided tolerance level. That is, in iteration i, it computes

v[i,j] := fvals[i - n_window + j] * w[j]

for all j = 0, …, n_window, where w[j] = exp(-eta * (1 - j / n_window)). Letting ma[i] := sum_j(v[i,j]), the criterion evaluates to True whenever

(ma[i-1] - ma[i]) / abs(ma[i-1]) < rel_tol (if minimize=True)

(ma[i] - ma[i-1]) / abs(ma[i-1]) < rel_tol (if minimize=False)

Parameters:
• maxiter (int) – Maximum number of iterations.

• minimize (bool) – If True, assume minimization.

• n_window (int) – The size of the exponential moving average window.

• eta (float) – The exponential decay factor in the weights.

• rel_tol (float) – Relative tolerance for termination.

evaluate(fvals)[source]

Evaluate the stopping criterion.

Parameters:

fvals (Tensor) – tensor containing function values for the current iteration. If fvals contains more than one element, then the stopping criterion is evaluated element-wise and True is returned if the stopping criterion is true for all elements.

Return type:

bool

TODO: add support for utilizing gradient information

Returns:

Stopping indicator (if True, stop the optimization).
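The moving-average test from the class description can be written out directly for a scalar sequence of function values. The sketch below follows the formulas as stated (weights w[j] = exp(-eta * (1 - j / n_window)) for j = 0, …, n_window) in plain Python. `moving_average` and `should_stop` are hypothetical names, the maxiter check is omitted, and the actual class may differ in details such as weight normalization.

```python
import math


def moving_average(fvals, i, n_window, eta):
    """ma[i] = sum_j fvals[i - n_window + j] * w[j] for j = 0, ..., n_window."""
    return sum(
        fvals[i - n_window + j] * math.exp(-eta * (1 - j / n_window))
        for j in range(n_window + 1)
    )


def should_stop(fvals, n_window=10, eta=1.0, rel_tol=1e-5, minimize=True):
    """Apply the relative-change test to the two most recent moving averages."""
    i = len(fvals) - 1
    if i - 1 - n_window < 0:
        return False  # not enough history to form ma[i-1]
    ma_prev = moving_average(fvals, i - 1, n_window, eta)
    ma_curr = moving_average(fvals, i, n_window, eta)
    improvement = ma_prev - ma_curr if minimize else ma_curr - ma_prev
    return improvement / abs(ma_prev) < rel_tol


stalled = should_stop([1.0] * 20)                         # no progress at all
improving = should_stop([100.0 - k for k in range(20)])   # steady decrease
```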

## Utilities¶

### Numpy - Torch Conversion Tools¶

A converter that simplifies using numpy-based optimizers with generic torch nn.Module classes. This enables using a scipy.optimize.minimize optimizer for optimizing module parameters.

class botorch.optim.numpy_converter.TorchAttr(shape, dtype, device)[source]

Bases: tuple

Create new instance of TorchAttr(shape, dtype, device)

Parameters:
• shape (Size) –

• dtype (dtype) –

• device (device) –

shape: Size

Alias for field number 0

dtype: dtype

Alias for field number 1

device: device

Alias for field number 2

botorch.optim.numpy_converter.create_name_filter(patterns)[source]

Returns a binary function that filters strings (or iterables whose first element is a string) according to a bank of excluded patterns. Typically used in conjunction with generators such as module.named_parameters().

Parameters:

patterns (Iterator[Union[Pattern, str]]) – A collection of regular expressions or strings that define the set of names to be excluded.

Returns:

A binary function indicating whether or not an item should be filtered.

Return type:

Callable[[Union[str, Tuple[str, Any, …]]], bool]

botorch.optim.numpy_converter.get_parameters_and_bounds(module, name_filter=None, requires_grad=None, default_bounds=(- inf, inf))[source]

Helper method for extracting parameters and feasible ranges thereof.

Parameters:
• module (Module) – The target module from which parameters are to be extracted.

• name_filter (Optional[Callable[[str], bool]]) – Optional Boolean function used to filter parameters by name.

• requires_grad (Optional[bool]) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user provided value.

• default_bounds (Tuple[float, float]) – Default lower and upper bounds for constrained parameters with None typed bounds.

Returns:

A 2-element tuple containing

• 0: Dictionary mapping names to Parameters.

• 1: Dictionary mapping names of constrained parameters to ParameterBounds.

Return type:

Tuple[Dict, Dict]

botorch.optim.numpy_converter.module_to_array(module, bounds=None, exclude=None)[source]

Extract named parameters from a module into a numpy array.

Only extracts parameters with requires_grad, since it is meant for optimizing.

Parameters:
• module (Module) – A module with parameters. May specify parameter constraints in a named_parameters_and_constraints method.

• bounds (Optional[Dict[str, Tuple[Optional[float], Optional[float]]]]) – A ParameterBounds dictionary mapping parameter names to tuples of lower and upper bounds. Bounds specified here take precedence over bounds on the same parameters specified in the constraints registered with the module.

• exclude (Optional[Set[str]]) – A set of parameter names that are to be excluded from extraction.

Returns:

3-element tuple containing

• The parameter values as a numpy array.

• An ordered dictionary with the name and tensor attributes of each parameter.

• A 2 x n_params numpy array with lower and upper bounds if at least one constraint is finite, and None otherwise.

Return type:

Tuple[ndarray, Dict[str, TorchAttr], Optional[ndarray]]

Example

>>> mll = ExactMarginalLogLikelihood(model.likelihood, model)
>>> parameter_array, property_dict, bounds_out = module_to_array(mll)

botorch.optim.numpy_converter.set_params_with_array(module, x, property_dict)[source]

Set module parameters with values from numpy array.

Parameters:
• module (Module) – Module with parameters to be set

• x (ndarray) – Numpy array with parameter values

• property_dict (Dict[str, TorchAttr]) – Dictionary of parameter names and torch attributes as returned by module_to_array.

Returns:

module with parameters updated in-place.

Return type:

Module

Example

>>> mll = ExactMarginalLogLikelihood(model.likelihood, model)
>>> parameter_array, property_dict, bounds_out = module_to_array(mll)
>>> parameter_array += 0.1  # perturb parameters (for example only)
>>> mll = set_params_with_array(mll, parameter_array, property_dict)
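The round trip between named parameters and a single flat array relies on remembering, per name, how many entries each parameter occupies. The sketch below shows that bookkeeping with plain Python lists. `to_flat` and `from_flat` are hypothetical stand-ins, and the per-name size recorded here is only a simplified analogue of the shape, dtype, and device stored in a TorchAttr.

```python
from collections import OrderedDict


def to_flat(params):
    """Flatten {name: list of floats} into one list plus per-parameter sizes."""
    flat, sizes = [], OrderedDict()
    for name, values in params.items():
        sizes[name] = len(values)  # simplified analogue of the stored attributes
        flat.extend(values)
    return flat, sizes


def from_flat(flat, sizes):
    """Rebuild the named parameters from the flat list, in the recorded order."""
    params, start = {}, 0
    for name, n in sizes.items():
        params[name] = list(flat[start:start + n])
        start += n
    return params


params = {"lengthscale": [1.0, 2.0], "noise": [0.5]}
flat, sizes = to_flat(params)
flat = [v + 0.1 for v in flat]  # perturb parameters (for example only)
updated = from_flat(flat, sizes)
```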


### Parameter Constraint Utilities¶

Utility functions for constrained optimization.

botorch.optim.parameter_constraints.make_scipy_bounds(X, lower_bounds=None, upper_bounds=None)[source]

Creates a scipy Bounds object for optimization.

Parameters:
• X (Tensor) – … x d tensor

• lower_bounds (Optional[Union[float, Tensor]]) – Lower bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.

• upper_bounds (Optional[Union[float, Tensor]]) – Upper bounds on each column (last dimension) of X. If this is a single float, then all columns have the same bound.

Returns:

A scipy Bounds object if either lower_bounds or upper_bounds is not None, and None otherwise.

Return type:

Optional[Bounds]

Example

>>> X = torch.rand(5, 2)
>>> scipy_bounds = make_scipy_bounds(X, 0.1, 0.8)

botorch.optim.parameter_constraints.make_scipy_linear_constraints(shapeX, inequality_constraints=None, equality_constraints=None)[source]

Generate scipy constraints from torch representation.

Parameters:
• shapeX (Size) – The shape of the torch.Tensor to optimize over (i.e. b x q x d)

• inequality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an inequality constraint of the form sum_i (X[indices[i]] * coefficients[i]) >= rhs, where indices is a single-dimensional index tensor (long dtype) containing indices into the last dimension of X, coefficients is a single-dimensional tensor of coefficients of the same length, and rhs is a scalar.

• equality_constraints (Optional[List[Tuple[Tensor, Tensor, float]]]) – A list of tuples (indices, coefficients, rhs), with each tuple encoding an equality constraint of the form sum_i (X[indices[i]] * coefficients[i]) == rhs (with indices and coefficients of the same form as in inequality_constraints).

Returns:

A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.optimize.minimize.

Return type:

List[Dict[str, Union[str, Callable[[ndarray], float], Callable[[ndarray], ndarray]]]]

This function assumes that constraints are the same for each input batch, and broadcasts the constraints accordingly to the input batch shape. This function does support constraints across elements of a q-batch if the indices are a 2-d Tensor.

Example

The following will enforce that x[1] + 0.5 x[3] >= -0.1 for each x in both elements of the q-batch, and each of the 3 t-batches:

>>> constraints = make_scipy_linear_constraints(
>>>     torch.Size([3, 2, 4]),
>>>     [(torch.tensor([1, 3]), torch.tensor([1.0, 0.5]), -0.1)],
>>> )


The following will enforce that x[0, 1] + 0.5 x[1, 3] >= -0.1 where x[0, :] is the first element of the q-batch and x[1, :] is the second element of the q-batch, for each of the 3 t-batches:

>>> constraints = make_scipy_linear_constraints(
>>>     torch.Size([3, 2, 4]),
>>>     [(torch.tensor([[0, 1], [1, 3]]), torch.tensor([1.0, 0.5]), -0.1)],
>>> )

botorch.optim.parameter_constraints.eval_lin_constraint(x, flat_idxr, coeffs, rhs)[source]

Evaluate a single linear constraint.

Parameters:
• x (ndarray) – The input array.

• flat_idxr (List[int]) – The indices in x to consider.

• coeffs (ndarray) – The coefficients corresponding to the indices.

• rhs (float) – The right-hand-side of the constraint.

Returns:

The evaluated constraint, sum_i (coeffs[i] * x[i]) - rhs.

botorch.optim.parameter_constraints.lin_constraint_jac(x, flat_idxr, coeffs, n)[source]

Return the Jacobian associated with a linear constraint.

Parameters:
• x (ndarray) – The input array.

• flat_idxr (List[int]) – The indices for the elements of x that appear in the constraint.

• coeffs (ndarray) – The coefficients corresponding to the indices.

• n (int) – number of elements

Returns:

The Jacobian.

Return type:

ndarray
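Because scipy optimizes over a flattened vector, a linear constraint on a b x q x d tensor reduces to a sparse dot product over flat indices, and its Jacobian is a vector that is zero except at those indices. The sketch below shows both pieces in plain Python. The function names are hypothetical, row-major flattening is assumed, and the sum is taken over x at the positions listed in flat_idxr.

```python
def flat_index(b, q_idx, k, q, d):
    """Position of X[b, q_idx, k] in a row-major flattened b x q x d array."""
    return b * q * d + q_idx * d + k


def eval_constraint(x, flat_idxr, coeffs, rhs):
    """Value of the constraint; an 'ineq' constraint is satisfied when this is >= 0."""
    return sum(c * x[i] for i, c in zip(flat_idxr, coeffs)) - rhs


def constraint_jac(flat_idxr, coeffs, n):
    """Gradient of the constraint value with respect to all n flat entries."""
    jac = [0.0] * n
    for i, c in zip(flat_idxr, coeffs):
        jac[i] = c
    return jac


# x[1] + 0.5 * x[3] >= -0.1 for a single point (b = 1, q = 1, d = 4).
x = [0.0, 1.0, 2.0, 3.0]
idxr = [flat_index(0, 0, 1, q=1, d=4), flat_index(0, 0, 3, q=1, d=4)]
value = eval_constraint(x, idxr, [1.0, 0.5], rhs=-0.1)
jac = constraint_jac(idxr, [1.0, 0.5], n=len(x))
```

Since the constraint is linear, the Jacobian does not depend on x, which is why only the indices, coefficients, and total length are needed to build it.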

botorch.optim.parameter_constraints.make_scipy_nonlinear_inequality_constraints(nonlinear_inequality_constraints, f_np_wrapper, x0)[source]

Generate Scipy nonlinear inequality constraints from callables.

Parameters:
• nonlinear_inequality_constraints (List[Callable]) – List of callables for the nonlinear inequality constraints. Each callable represents a constraint of the form >= 0 and takes a torch tensor of size (p x q x dim) and returns a torch tensor of size (p x q).

• f_np_wrapper (Callable) – A wrapper function that given a constraint evaluates the value and gradient (using autograd) of a numpy input and returns both the objective and the gradient.

• x0 (Tensor) – The starting point for SLSQP. We return this starting point in (rare) cases where SLSQP fails and thus require it to be feasible.

Returns:

A list of dictionaries containing callables for constraint function values and Jacobians and a string indicating the associated constraint type (“eq”, “ineq”), as expected by scipy.optimize.minimize.

Return type:

List[Dict]

### General Optimization Utilities¶

Utilities for optimization.

botorch.optim.utils.sample_all_priors(model)[source]

Sample from hyperparameter priors (in-place).

Parameters:

model (GPyTorchModel) – A GPyTorchModel.

Return type:

None

botorch.optim.utils.columnwise_clamp(X, lower=None, upper=None, raise_on_violation=False)[source]

Clamp values of a Tensor in column-wise fashion (with support for t-batches).

This function is useful in conjunction with optimizers from the torch.optim package, which don’t natively handle constraints. If you apply this after a gradient step you can be fancy and call it “projected gradient descent”. This function is also useful for post-processing candidates generated by the scipy optimizer that satisfy bounds only up to numerical accuracy.

Parameters:
• X (Tensor) – The b x n x d input tensor. If 2-dimensional, b is assumed to be 1.

• lower (Optional[Union[float, Tensor]]) – The column-wise lower bounds. If scalar, apply bound to all columns.

• upper (Optional[Union[float, Tensor]]) – The column-wise upper bounds. If scalar, apply bound to all columns.

• raise_on_violation (bool) – If True, raise an exception when the elements in X are out of the specified bounds (up to numerical accuracy). This is useful for post-processing candidates generated by optimizers that satisfy imposed bounds only up to numerical accuracy.

Returns:

The clamped tensor.

Return type:

Tensor
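Column-wise clamping amounts to applying a separate lower and upper bound to each column, with scalar bounds broadcast to all columns. The following plain-Python sketch shows the idea on a list of rows. `clamp_columns` and its `tol` argument are hypothetical, and the tolerance used to decide what counts as a violation is an assumption of this sketch.

```python
def clamp_columns(X, lower=None, upper=None, raise_on_violation=False, tol=1e-8):
    """Clamp each column of X (a list of rows) to [lower, upper]."""
    d = len(X[0])

    def expand(bound):
        # A scalar (or None) bound applies to every column.
        if bound is None or isinstance(bound, (int, float)):
            return [bound] * d
        return list(bound)

    lo, hi = expand(lower), expand(upper)
    out = []
    for row in X:
        new_row = []
        for v, l, u in zip(row, lo, hi):
            c = v if l is None else max(v, l)
            c = c if u is None else min(c, u)
            if raise_on_violation and abs(c - v) > tol:
                raise ValueError("Value outside the specified bounds.")
            new_row.append(c)
        out.append(new_row)
    return out


# Scalar lower bound for all columns, one upper bound per column.
clamped = clamp_columns([[-0.5, 2.0], [0.3, 0.7]], lower=0.0, upper=[1.0, 0.5])
```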

botorch.optim.utils.fix_features(X, fixed_features=None)[source]

Fix feature values in a Tensor.

The fixed features will have zero gradient in downstream calculations.

Parameters:
• X (Tensor) – input Tensor with shape … x p, where p is the number of features

• fixed_features (Optional[Dict[int, Optional[float]]]) – A dictionary with keys as column indices and values equal to what the feature should be set to in X. If the value is None, that column is just considered fixed. Keys should be in the range [0, p - 1].

Returns:

The tensor X with fixed features.

Return type:

Tensor
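The effect of the fixed_features dictionary on the values can be illustrated without tensors. In the sketch below, `fix_feature_columns` is a hypothetical helper operating on a list of rows: columns with a given value are overwritten, and columns mapped to None keep their current values. The gradient-related behaviour described above has no counterpart in plain Python and is not modelled here.

```python
def fix_feature_columns(X, fixed_features=None):
    """Return a copy of X (a list of rows) with the given columns overwritten."""
    out = [list(row) for row in X]
    if not fixed_features:
        return out
    for row in out:
        for col, value in fixed_features.items():
            if value is not None:  # None: keep the column's current values
                row[col] = value
    return out


X = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
fixed = fix_feature_columns(X, {1: 0.0, 2: None})
```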

botorch.optim.utils.get_X_baseline(acq_function)[source]

Extract X_baseline from an acquisition function.

This tries to find the baseline set of points. First, this checks if the acquisition function has an X_baseline attribute. If it does not, then this method attempts to use the model’s train_inputs as X_baseline.

Parameters:

acq_function (AcquisitionFunction) – The acquisition function.

Return type:

Optional[Tensor]

Returns:

An optional n x d-dim tensor of baseline points. This is None if no baseline points are found.

botorch.optim.utils.del_attribute_ctx(instance, *attrs, enforce_hasattr=False)[source]

Contextmanager for temporarily deleting attributes.

Parameters:
• instance (object) –

• attrs (str) –

• enforce_hasattr (bool) –

Return type:

Generator[None, None, None]
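A context manager of this kind can be built with contextlib by caching the attributes on entry and restoring them in a finally clause. The sketch below is a generic illustration, not the library source. `temporarily_delete` is a hypothetical name, and the reading of enforce_hasattr as "raise if a named attribute is missing" is an assumption, since its meaning is not described above.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_delete(instance, *attrs, enforce_hasattr=False):
    """Delete the named attributes for the duration of the with-block."""
    cache = {}
    try:
        for name in attrs:
            if hasattr(instance, name):
                cache[name] = getattr(instance, name)
                delattr(instance, name)
            elif enforce_hasattr:
                # Assumed semantics: a missing attribute is an error.
                raise ValueError(f"Attribute {name} is missing.")
        yield
    finally:
        # Restore whatever was removed, even if the block raised.
        for name, value in cache.items():
            setattr(instance, name, value)


class Box:
    def __init__(self):
        self.payload = [1, 2, 3]


box = Box()
with temporarily_delete(box, "payload"):
    present_inside = hasattr(box, "payload")
present_after = hasattr(box, "payload")
```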

botorch.optim.utils.requires_grad_ctx(module, assignments=None)[source]

Contextmanager for temporarily setting the requires_grad field of a module’s parameters.

Parameters:
• module (Module) –

• assignments (Dict[str, bool]) –

Return type:

Generator[None, None, None]

botorch.optim.utils.parameter_rollback_ctx(module, name_filter=None, requires_grad=None, checkpoint=None, **tkwargs)[source]

Contextmanager that exits by rolling back parameter values.

Parameters:
• module (Module) – Module instance.

• name_filter (Optional[Callable[[str], bool]]) – Optional Boolean function used to filter parameters by name.

• requires_grad (Optional[bool]) – Optional Boolean used to filter parameters based on whether or not their requires_grad attribute matches the user provided value.

• checkpoint (Optional[Dict[str, Tuple[Tensor, Dict[str, Union[device, dtype]]]]]) – Optional cache of values and tensor metadata specifying the rollback state for the module (or some subset thereof).

• **tkwargs (Any) – Keyword arguments passed to torch.Tensor.to when copying data from each tensor in module.state_dict() to the internally created checkpoint. Only adhered to when the checkpoint argument is None.

Yields:

A checkpoint dictionary for the module, mapping qualified names to cached values and tensor metadata. Any in-place changes to the checkpoint will be observed at rollback time. If the checkpoint is cleared, no rollback will occur.

Return type:

Generator[Dict[str, Tensor], None, None]

botorch.optim.utils.state_rollback_ctx(module, name_filter=None, checkpoint=None, **tkwargs)[source]

Contextmanager that exits by rolling back a module’s state_dict.

Parameters:
• module (Module) – Module instance.

• name_filter (Optional[Callable[[str], bool]]) – Optional Boolean function used to filter items by name.

• checkpoint (Optional[Dict[str, Tuple[Tensor, Dict[str, Union[device, dtype]]]]]) – Optional cache of values and tensor metadata specifying the rollback state for the module (or some subset thereof).

• **tkwargs (Any) – Keyword arguments passed to torch.Tensor.to when copying data from each tensor in module.state_dict() to the internally created checkpoint. Only adhered to when the checkpoint argument is None.

Yields:

A checkpoint dictionary for the module, mapping qualified names to cached values and tensor metadata. Any in-place changes to the checkpoint will be observed at rollback time. If the checkpoint is cleared, no rollback will occur.

Return type:

Generator[Dict[str, Tuple[Tensor, Dict[str, Union[device, dtype]]]], None, None]
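The rollback behaviour described for both context managers, including the effect of clearing the yielded checkpoint, can be illustrated with an ordinary dictionary standing in for a module's state. The sketch below is a simplified analogue under that assumption. `rollback_ctx` is a hypothetical name, and tensor metadata, name filtering, and **tkwargs are not modelled.

```python
import copy
from contextlib import contextmanager


@contextmanager
def rollback_ctx(state, checkpoint=None):
    """Yield a checkpoint of `state` and restore from it on exit."""
    if checkpoint is None:
        checkpoint = copy.deepcopy(state)
    try:
        yield checkpoint
    finally:
        # Only entries still present in the checkpoint are restored, so an
        # emptied checkpoint results in no rollback.
        state.update(copy.deepcopy(checkpoint))


state = {"lengthscale": 1.0, "noise": 0.1}

with rollback_ctx(state):
    state["lengthscale"] = 5.0  # trial update, discarded on exit
rolled_back = dict(state)

with rollback_ctx(state) as ckpt:
    state["lengthscale"] = 2.0
    ckpt.clear()  # accept the new values: nothing is rolled back
kept = dict(state)
```

Yielding the checkpoint itself is what lets the caller decide, inside the block, whether the trial values should survive.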

botorch.optim.utils.allclose_mll(a, b, transform_a=None, transform_b=None, rtol=1e-05, atol=1e-08)[source]

Convenience method for testing whether the log likelihoods produced by different MarginalLogLikelihood instances, when evaluated on their respective models’ training sets, are allclose.

Parameters:
• a (MarginalLogLikelihood) – A MarginalLogLikelihood instance.

• b (MarginalLogLikelihood) – A second MarginalLogLikelihood instance.

• transform_a (Optional[Callable[[Tensor], Tensor]]) – Optional callable used to post-transform log likelihoods under a.

• transform_b (Optional[Callable[[Tensor], Tensor]]) – Optional callable used to post-transform log likelihoods under b.

• rtol (float) – Relative tolerance.

• atol (float) – Absolute tolerance.

Returns:

Boolean result of the allclose test.

Return type:

bool