Bayesian optimization in PyTorch

Last update: Dec 31, 2022

Overview

BoTorch is a library for Bayesian Optimization built on PyTorch.

BoTorch is currently in beta and under active development!

Why BoTorch?

BoTorch

Provides a modular and easily extensible interface for composing Bayesian optimization primitives, including probabilistic models, acquisition functions, and optimizers.
Harnesses the power of PyTorch, including auto-differentiation, native support for highly parallelized modern hardware (e.g. GPUs) using device-agnostic code, and a dynamic computation graph.
Supports Monte Carlo-based acquisition functions via the reparameterization trick, which makes it straightforward to implement new ideas without having to impose restrictive assumptions about the underlying model.
Enables seamless integration with deep and/or convolutional architectures in PyTorch.
Has first-class support for state-of-the-art probabilistic models in GPyTorch, including support for multi-task Gaussian Processes (GPs), deep kernel learning, deep GPs, and approximate inference.
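
To illustrate the reparameterization trick mentioned in the list above, here is a minimal plain-Python sketch (not BoTorch code). It estimates Expected Improvement at a single point by drawing base samples z ~ N(0, 1) and mapping them to posterior samples y = mean + std * z, then compares the estimate with the closed-form value for a Gaussian posterior. The function names and default values are hypothetical and chosen only for this illustration.

```python
import math
import random


def expected_improvement_mc(mean, std, best_f, num_samples=20000, seed=0):
    """Monte Carlo estimate of E[max(y - best_f, 0)] for y ~ N(mean, std^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(0.0, 1.0)  # base sample, independent of the model
        y = mean + std * z       # reparameterized posterior sample
        total += max(y - best_f, 0.0)
    return total / num_samples


def expected_improvement_analytic(mean, std, best_f):
    """Closed-form Expected Improvement for a Gaussian posterior."""
    u = (mean - best_f) / std
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return std * (pdf + u * cdf)


mc_value = expected_improvement_mc(mean=0.5, std=1.0, best_f=0.2)
exact_value = expected_improvement_analytic(mean=0.5, std=1.0, best_f=0.2)
print(f"MC estimate: {mc_value:.4f}, closed form: {exact_value:.4f}")
```

Because the randomness lives entirely in z, the estimate is a deterministic function of mean and std for fixed base samples. In a tensor library this is what lets gradients flow through the samples.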

Target Audience

The primary audience for hands-on use of BoTorch is researchers and sophisticated practitioners in Bayesian Optimization and AI. We recommend using BoTorch as a low-level API for implementing new algorithms for Ax. Ax has been designed to be an easy-to-use platform for end-users that is, at the same time, flexible enough for Bayesian Optimization researchers to plug into for handling feature transformations, (meta-)data management, storage, etc. We recommend that end-users who are not actively doing research on Bayesian Optimization simply use Ax.

Installation

Installation Requirements

Python >= 3.7
PyTorch >= 1.7.1
gpytorch >= 1.4
scipy

Installing the latest release

The latest release of BoTorch is easily installed either via Anaconda (recommended):

conda install botorch -c pytorch -c gpytorch

or via pip:

pip install botorch

You can customize your PyTorch installation (e.g. CUDA version, CPU-only option) by following the PyTorch installation instructions.

Important note for MacOS users:

Make sure your PyTorch build is linked against MKL (the non-optimized version of BoTorch can be up to an order of magnitude slower in some settings). Setting this up manually on MacOS can be tricky - to ensure this works properly, please follow the PyTorch installation instructions.
If you need CUDA on MacOS, you will need to build PyTorch from source. Please consult the PyTorch installation instructions above.

Installing from latest master

If you would like to try our bleeding edge features (and don't mind potentially running into the occasional bug here or there), you can install the latest master directly from GitHub (this will also require installing the current GPyTorch master):

pip install --upgrade git+https://github.com/cornellius-gp/gpytorch.git
pip install --upgrade git+https://github.com/pytorch/botorch.git

Manual / Dev install

Alternatively, you can do a manual install. For a basic install, run:

git clone https://github.com/pytorch/botorch.git
cd botorch
pip install -e .

To customize the installation, you can also run the following variants of the above:

pip install -e .[dev]: Also installs all tools necessary for development (testing, linting, docs building; see Contributing below).
pip install -e .[tutorials]: Also installs all packages necessary for running the tutorial notebooks.

Getting Started

Here's a quick rundown of the main components of a Bayesian optimization loop. For more details, see our Documentation and the Tutorials.

Fit a Gaussian Process model to data

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_model
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(10, 2)
Y = 1 - (train_X - 0.5).norm(dim=-1, keepdim=True)  # explicit output dimension
Y += 0.1 * torch.rand_like(Y)
train_Y = (Y - Y.mean()) / Y.std()

gp = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_model(mll)

Construct an acquisition function

from botorch.acquisition import UpperConfidenceBound

UCB = UpperConfidenceBound(gp, beta=0.1)
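
As a rough guide to what this acquisition function scores, the sketch below assumes the common form UCB(x) = mean(x) + sqrt(beta) * std(x), written in plain Python for a single point. The helper name is hypothetical and is not part of BoTorch.

```python
import math


def upper_confidence_bound(mean, variance, beta):
    """Score a point by its posterior mean plus a beta-scaled uncertainty bonus."""
    return mean + math.sqrt(beta * variance)


# A larger beta favors uncertain points (exploration); beta = 0 ranks by mean only.
print(upper_confidence_bound(mean=1.0, variance=4.0, beta=0.25))
print(upper_confidence_bound(mean=1.0, variance=4.0, beta=0.0))
```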

Optimize the acquisition function

from botorch.optim import optimize_acqf

bounds = torch.stack([torch.zeros(2), torch.ones(2)])
candidate, acq_value = optimize_acqf(
    UCB, bounds=bounds, q=1, num_restarts=5, raw_samples=20,
)
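
The roles of raw_samples and num_restarts can be pictured with a simplified multi-start scheme: score raw_samples random points, keep the best num_restarts as starting points, refine each locally, and return the best result. The sketch below is an assumption-laden illustration in plain Python. It uses a gradient-free random search, whereas BoTorch's actual initialization heuristic and gradient-based optimizer differ, and the function name and the steps parameter are hypothetical.

```python
import random


def multi_start_maximize(acq, bounds, num_restarts=5, raw_samples=20, steps=200, seed=0):
    """Maximize acq over a box using random restarts and a simple local search."""
    rng = random.Random(seed)
    lower, upper = bounds

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]

    def clip(x):
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    # Score the raw samples and keep the most promising ones as starting points.
    raw = sorted((sample() for _ in range(raw_samples)), key=acq, reverse=True)
    starts = raw[:num_restarts]

    best_x, best_val = None, float("-inf")
    for x in starts:
        val = acq(x)
        step = 0.1
        for _ in range(steps):
            proposal = clip([v + rng.gauss(0.0, step) for v in x])
            proposal_val = acq(proposal)
            if proposal_val > val:
                x, val = proposal, proposal_val  # accept improvements only
            else:
                step *= 0.97  # shrink the search radius after a failed move
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Toy acquisition surface with its maximum (value 0) at (0.3, 0.7).
toy_acq = lambda x: -((x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2)
candidate, value = multi_start_maximize(toy_acq, bounds=([0.0, 0.0], [1.0, 1.0]))
print(candidate, value)
```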

Citing BoTorch

If you use BoTorch, please cite the following paper:

M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. Advances in Neural Information Processing Systems 33, 2020.

@inproceedings{balandat2020botorch,
  title={{BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization}},
  author={Balandat, Maximilian and Karrer, Brian and Jiang, Daniel R. and Daulton, Samuel and Letham, Benjamin and Wilson, Andrew Gordon and Bakshy, Eytan},
  booktitle = {Advances in Neural Information Processing Systems 33},
  year={2020},
  url = {http://arxiv.org/abs/1910.06403}
}

See here for an incomplete selection of peer-reviewed papers that build off of BoTorch.

Contributing

See the CONTRIBUTING file for how to help out.

License

BoTorch is MIT licensed, as found in the LICENSE file.

Comments

Modifying Knowledge Gradient for time-dependent kernels

Issue description

I want to modify KG for time-dependent problems as follows. Given x in X (some compact space) and 0 <= t <= T, I have a GP model with prior GP(mu, k_xt), where k_xt = k_x * k_t, with k_x capturing covariance in 'x' space and k_t in 't' space. At time t I have data D_t = {(x_i, t_i), y_i}, i=1,...,n, and t > t_n. I want to define KG as follows:

a_KG(x, t) = E_x'[max_x' mu(x', T) | {(x, t), y_i}]

where y_i is sampled from GP(mu(x, t), k_xt) | D_t. In other words, my 'fantasy model' is at the current time t; however, my 'inner optimization' problem maximizes the posterior at T predicted via the fantasy model. Also, my acquisition function a_KG is defined at t.
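
One possible reading of this definition in LaTeX is given below. It assumes that the expectation is taken over the fantasized observation y at (x, t), drawn from the posterior predictive given D_t, and that the inner maximization runs over x' at the final time T. The conditioning notation for the posterior mean is introduced here for illustration and does not appear in the original post.

```latex
a_{\mathrm{KG}}(x, t)
  = \mathbb{E}_{y \sim p(y \mid (x, t),\, D_t)}
    \left[ \max_{x' \in X} \mu\!\left(x', T \,\middle|\, D_t \cup \{((x, t), y)\}\right) \right]
```

Under this reading, the fantasy point is placed at the current time t, while the value of information is measured through the posterior mean at time T.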

Question: How should I modify the qKnowledgeGradient class to achieve this, so I can take advantage of the efficient one-shot implementation of qKG? I have provided code for the GP I am using if you want to work with that.

Any help is greatly appreciated! Please let me know if you need more information. Thanks!

(apologies for trying to write equations in Markdown)

import math

import gpytorch  # main GP library
import numpy as np
import torch
from matplotlib import cm
from matplotlib import pyplot as plt

from botorch.fit import fit_gpytorch_model  # wrapper for fitting GPyTorch models in BO
from botorch.models import SingleTaskGP
from botorch.models.gpytorch import GPyTorchModel
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood

def canned_dynamic_gp(train_x, train_y):
    '''
    fits a single-task GP for f(x,t) with a product kernel k_xt = k_x * k_t

    :param train_x:
    :param train_y:
    :return: gp object
    '''

    class ExactGPModel(gpytorch.models.ExactGP):
        num_outputs = 1  # to inform the BoTorch api
        def __init__(self, train_x, train_y, likelihood):
            super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            self.Rbfx_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
            self.Rbft_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

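        def forward(self, x):
            # Slice on the last dimension so that batched inputs also work
            X = x[..., :-1]   # spatial inputs
            t = x[..., -1:]   # time input, kept as a column for the kernel
            mean_x = self.mean_module(x)
            covar_x = self.Rbfx_module(X) * self.Rbft_module(t)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)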

    # initialize likelihood and model
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    gp         = ExactGPModel(train_x, train_y, likelihood)
    mll        = ExactMarginalLogLikelihood(gp.likelihood, gp)
    gp.likelihood.noise_covar.raw_noise_constraint.upper_bound = 1e-3 # constraint on observation noise
    fit_gpytorch_model(mll)

    return gp

# Synthetic time-dependent test function
def quadratic(x, a=-4., s=0.5):
    y = a * np.sum(np.power(x - s, 2), axis=1)  # quadratic
    return y


def f_xt_d(x, p, coeff=None, t=None):
    n, _ = np.shape(x)
    if t is None and x.shape[1] > p:
        t = x[:, -1]
        X = x[:, :-1]
    elif t is not None and x.shape[1] == p:
        X = x
    else:
        raise ValueError('x must have p+1 columns when t is None')

    if len(t) != n and len(t) != 1:
        raise ValueError('t should be of length 1 or n')

    if coeff is None:
        coeff = np.array([1., 1.])

    phi_xt = np.vstack((2 * np.sum(np.multiply(X, np.atleast_2d(np.sin(t)).T ), axis=1),
                        -np.power(np.atleast_2d(np.sin(t)), 2)
                        ))
    f_xt = np.matmul(np.atleast_2d(phi_xt).T, coeff)
    return f_xt

training_size = 100
p  = 1  # x-dimensions
lb = 0. # x lower bound
ub = 1. # x upper bound
T  = 4. # t upper bound

bounds  = torch.stack([torch.zeros(2), torch.tensor([1,T])])
fxt_func= f_xt_d

t          = np.linspace(0, 3.9, training_size)
train_x = lb + (ub-lb) * np.random.uniform(size=[training_size, p])
train_x = np.hstack((train_x, np.atleast_2d(t).T))
train_y = quadratic(np.atleast_2d(train_x[:,0]).T) + fxt_func(train_x, p,)

# convert everything to Torch
train_x = torch.tensor(train_x, dtype=torch.float32)
train_y = torch.tensor(train_y, dtype=torch.float32)

# GP model
gp = canned_dynamic_gp(train_x, train_y)

System Info

Please provide information about your setup, including

BoTorch Version 0.3.1
GPyTorch Version 1.2.0
PyTorch Version 1.6.0
Windows and Linux

opened by r-ashwin 37

Numerical issue with cholesky decomposition (even with normalization)

Issue description

I am consistently running into numerical issues when running fit_gpytorch_model(). I am normalizing the inputs and standardizing the outputs (as described in issue 160)

Code example

fit_gpytorch_model(mll)
EI = ExpectedImprovement(gp, best_f=0.1)

## optimize acquisition function
candidates = joint_optimize(
    acq_function=EI,
    q=1,
    bounds=bounds,
    num_restarts=10,
    raw_samples=500,  # used for initialization heuristic
)

new_x = candidates.detach()
exact_obj = neg_eggholder(new_x)
train_x_ei = torch.cat([train_x_ei, candidates])
train_y_ei = torch.cat([train_y_ei, exact_obj])
gp = SingleTaskGP(
    normalize(train_x_ei, bounds=bounds),
    standardize(train_y_ei),
)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)

Here is the error message:

RuntimeError                              Traceback (most recent call last)
in
      2
      3
----> 4 fit_gpytorch_model(mll)
      5 EI = ExpectedImprovement(gp, best_f=0.1)
      6

C:\Anaconda3\envs\py36_new\lib\site-packages\botorch\fit.py in fit_gpytorch_model(mll, optimizer, **kwargs)
     33     """
     34     mll.train()
---> 35     mll, _ = optimizer(mll, track_iterations=False, **kwargs)
     36     mll.eval()
     37     return mll

C:\Anaconda3\envs\py36_new\lib\site-packages\botorch\optim\fit.py in fit_gpytorch_scipy(mll, bounds, method, options, track_iterations)
    186         jac=True,
    187         options=options,
--> 188         callback=cb,
    189     )
    190     iterations = []

C:\Anaconda3\envs\py36_new\lib\site-packages\scipy\optimize\_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
    599     elif meth == 'l-bfgs-b':
    600         return _minimize_lbfgsb(fun, x0, args, jac, bounds,
--> 601                                 callback=callback, **options)
    602     elif meth == 'tnc':
    603         return _minimize_tnc(fun, x0, args, jac, bounds, callback=callback,

C:\Anaconda3\envs\py36_new\lib\site-packages\scipy\optimize\lbfgsb.py in _minimize_lbfgsb(fun, x0, args, jac, bounds, disp, maxcor, ftol, gtol, eps, maxfun, maxiter, iprint, callback, maxls, **unknown_options)
    333             # until the completion of the current minimization iteration.
    334             # Overwrite f and g:
--> 335             f, g = func_and_grad(x)
    336         elif task_str.startswith(b'NEW_X'):
    337             # new iteration

C:\Anaconda3\envs\py36_new\lib\site-packages\scipy\optimize\lbfgsb.py in func_and_grad(x)
    283     else:
    284         def func_and_grad(x):
--> 285             f = fun(x, *args)
    286             g = jac(x, *args)
    287             return f, g

C:\Anaconda3\envs\py36_new\lib\site-packages\scipy\optimize\optimize.py in function_wrapper(*wrapper_args)
    298     def function_wrapper(*wrapper_args):
    299         ncalls[0] += 1
--> 300         return function(*(wrapper_args + args))
    301
    302     return ncalls, function_wrapper

C:\Anaconda3\envs\py36_new\lib\site-packages\scipy\optimize\optimize.py in __call__(self, x, *args)
     61     def __call__(self, x, *args):
     62         self.x = numpy.asarray(x).copy()
---> 63         fg = self.fun(x, *args)
     64         self.jac = fg[1]
     65         return fg[0]

C:\Anaconda3\envs\py36_new\lib\site-packages\botorch\optim\fit.py in _scipy_objective_and_grad(x, mll, property_dict)
    221     output = mll.model(*train_inputs)
    222     args = [output, train_targets] + _get_extra_mll_args(mll)
--> 223     loss = -mll(*args).sum()
    224     loss.backward()
    225     param_dict = OrderedDict(mll.named_parameters())

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\module.py in __call__(self, *inputs, **kwargs)
     20
     21     def __call__(self, *inputs, **kwargs):
---> 22         outputs = self.forward(*inputs, **kwargs)
     23         if isinstance(outputs, list):
     24             return [_validate_module_outputs(output) for output in outputs]

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\mlls\exact_marginal_log_likelihood.py in forward(self, output, target, *params)
     26         # Get the log prob of the marginal distribution
     27         output = self.likelihood(output, *params)
---> 28         res = output.log_prob(target)
     29
     30         # Add terms for SGPR / when inducing points are learned

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\distributions\multivariate_normal.py in log_prob(self, value)
    127
    128         # Get log determininat and first part of quadratic form
--> 129         inv_quad, logdet = covar.inv_quad_logdet(inv_quad_rhs=diff.unsqueeze(-1), logdet=True)
    130
    131         res = -0.5 * sum([inv_quad, logdet, diff.size(-1) * math.log(2 * math.pi)])

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\lazy\lazy_tensor.py in inv_quad_logdet(self, inv_quad_rhs, logdet, reduce_inv_quad)
    990             from .chol_lazy_tensor import CholLazyTensor
    991
--> 992             cholesky = CholLazyTensor(self.cholesky())
    993             return cholesky.inv_quad_logdet(inv_quad_rhs=inv_quad_rhs, logdet=logdet, reduce_inv_quad=reduce_inv_quad)
    994

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\lazy\lazy_tensor.py in cholesky(self, upper)
    716             (LazyTensor) Cholesky factor (lower triangular)
    717         """
--> 718         res = self._cholesky()
    719         if upper:
    720             res = res.transpose(-1, -2)

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\utils\memoize.py in g(self, *args, **kwargs)
     32         cache_name = name if name is not None else method
     33         if not is_in_cache(self, cache_name):
---> 34             add_to_cache(self, cache_name, method(self, *args, **kwargs))
     35         return get_from_cache(self, cache_name)
     36

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\lazy\lazy_tensor.py in _cholesky(self)
    401             evaluated_mat.register_hook(_ensure_symmetric_grad)
    402
--> 403         cholesky = psd_safe_cholesky(evaluated_mat.double()).to(self.dtype)
    404         return NonLazyTensor(cholesky)
    405

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\utils\cholesky.py in psd_safe_cholesky(A, upper, out, jitter)
     45                 continue
     46
---> 47         raise e
     48
     49

C:\Anaconda3\envs\py36_new\lib\site-packages\gpytorch\utils\cholesky.py in psd_safe_cholesky(A, upper, out, jitter)
     19     """
     20     try:
---> 21         L = torch.cholesky(A, upper=upper, out=out)
     22         # TODO: Remove once fixed in pytorch (#16780)
     23         if A.dim() > 2 and A.is_cuda:

RuntimeError: cholesky_cpu: U(2,2) is zero, singular U.
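
The error says the covariance matrix is numerically singular. One plausible cause (an assumption, not confirmed in this report) is that the optimizer proposes points that nearly coincide with existing training points, which produces almost identical rows in the kernel matrix. The plain-Python sketch below shows the failure on a toy 2 x 2 matrix and the usual remedy of retrying with a small diagonal "jitter", which is the idea behind the psd_safe_cholesky frames in the traceback. The function names and jitter values are hypothetical and not taken from GPyTorch.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric matrix given as nested lists."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError(f"matrix is not positive definite at pivot {i}")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def cholesky_with_jitter(a, jitters=(1e-8, 1e-6, 1e-4)):
    """Try a plain factorization first, then retry with increasing diagonal jitter."""
    try:
        return cholesky(a), 0.0
    except ValueError:
        pass
    n = len(a)
    for jitter in jitters:
        shifted = [[a[i][j] + (jitter if i == j else 0.0) for j in range(n)] for i in range(n)]
        try:
            return cholesky(shifted), jitter
        except ValueError:
            continue
    raise ValueError("matrix is not positive definite even after adding jitter")


# Two identical training points give identical kernel rows, hence a singular matrix.
K = [[1.0, 1.0], [1.0, 1.0]]
factor, used_jitter = cholesky_with_jitter(K)
print(used_jitter)
```

Jitter only masks the symptom. Checking for duplicate or near-duplicate training inputs before refitting is a reasonable first diagnostic.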

System Info

BoTorch 0.1.0
GPyTorch 0.3.2
Torch 1.1.0
Windows 10

opened by michaelyli 36

Setting up a custom GPyTorch model for BoTorch

If you are submitting a bug report or feature request, please use the respective issue template.

Issue description

I am trying to use the MultiTaskGP model from GPyTorch with BoTorch's qMaxValueEntropy. I get an UnsupportedError because the objective kwarg is not supported. See the error below:

---------------------------------------------------------------------------

UnsupportedError                          Traceback (most recent call last)
<ipython-input-9-e910224785b8> in <module>
    223 candidate_set = torch.rand(size=[1000, 1]) # MES requires a candidate set
    224 from botorch.acquisition.objective import ScalarizedObjective
--> 225 qSMES = qScalarizedMES(model, candidate_set=candidate_set, weights=torch.tensor([1.,0.]))

<ipython-input-9-e910224785b8> in __init__(self, model, candidate_set, weights, num_fantasies, num_mv_samples, num_y_samples, use_gumbel, maximize, X_pending)
     65         """
     66         sampler = SobolQMCNormalSampler(num_y_samples)
---> 67         super().__init__(model=model, sampler=sampler)
     68 
     69         # Batch GP models (e.g. fantasized models) are not currently supported

~\Anaconda3\lib\site-packages\botorch\acquisition\monte_carlo.py in __init__(self, model, sampler, objective, X_pending)
     69             if model.num_outputs != 1:
     70                 raise UnsupportedError(
---> 71                     "Must specify an objective when using a multi-output model."
     72                 )
     73             objective = IdentityMCObjective()

UnsupportedError: Must specify an objective when using a multi-output model.
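
The message indicates that a Monte Carlo acquisition function needs a rule for turning each multi-output sample into a single number. A linear scalarization with a weight vector, such as the weights [1., 0.] passed in this issue, is one such rule. The plain-Python sketch below only illustrates that mapping. The helper name is hypothetical, and whether supplying such an objective resolves the reported error is not confirmed here.

```python
def scalarize(samples, weights):
    """Collapse each multi-output sample to one value via a weighted sum."""
    return [sum(w * y for w, y in zip(weights, row)) for row in samples]


# Each row is one posterior sample of (function value, derivative).
samples = [[1.0, 5.0], [2.0, -3.0]]

# Weights [1, 0] keep the function value and ignore the derivative output.
print(scalarize(samples, [1.0, 0.0]))
```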

Code example

See the code below to reproduce the error:

import torch
import gpytorch
import math
from matplotlib import cm
from matplotlib import pyplot as plt
import numpy as np
from botorch.models import MultiTaskGP

def test_1d(X):
    a = 16
    f = 1*X**2 + torch.sin(a*X)
    dfx = 1*2*X + a * torch.cos(a*X)
    return f, dfx
x = torch.linspace(0.15, .65, 5)
f, dfx = test_1d(x)
train_x = x.unsqueeze(-1)
train_y = torch.stack((f, dfx),dim=1)
print(train_x.size())
plt.plot(x.numpy(), f.numpy())
plt.plot(x.numpy(), dfx.numpy(), ls='--', c='gray')

from botorch.posteriors import GPyTorchPosterior
from gpytorch.distributions import MultitaskMultivariateNormal
from botorch.models.gpytorch import GPyTorchModel
from gpytorch.likelihoods import MultitaskGaussianLikelihood

class GPModelWithDerivatives(gpytorch.models.ExactGP, GPyTorchModel):
    num_outputs = 2  # to inform GPyTorchModel API (only to interface with BoTorch)
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMeanGrad()
        self.base_kernel = gpytorch.kernels.RBFKernelGrad(ard_num_dims=1)
        self.covar_module = gpytorch.kernels.ScaleKernel(self.base_kernel)

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)
    
likelihood = MultitaskGaussianLikelihood(num_tasks=2)  # Value + x-derivative (1-D input)
model = GPModelWithDerivatives(train_x, train_y, likelihood)

# this is for running the notebook in our testing framework
import os
smoke_test = ('CI' in os.environ)
training_iter = 2 if smoke_test else 500


# Find optimal model hyperparameters
model.train()
likelihood.train()

# Use the adam optimizer
optimizer = torch.optim.Adam([
    {'params': model.parameters()},  # Includes GaussianLikelihood parameters
], lr=0.05)

# "Loss" for GPs - the marginal log likelihood
# likelihood.noise_covar.raw_noise_constraint.upper_bound = torch.tensor([1e-6, 1e-6])
# Register a single interval constraint: a second register_constraint call on
# "raw_noise" would replace the first one instead of combining with it.
likelihood.noise_covar.register_constraint("raw_noise", gpytorch.constraints.Interval(1e-8, 1e-4))
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(training_iter):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
#     print(loss.item())
    loss.backward()
#     print("Iter %d/%d - Loss: %.3f   lengthscales: %.3f noise: %.8f" % (
#         i + 1, training_iter, loss.item(),
#         model.covar_module.base_kernel.lengthscale.squeeze().item(),
#         model.likelihood.noise.squeeze().item()
#     ))
    optimizer.step()
print(model.likelihood.noise.squeeze())

# Scalarized MES
from typing import Optional

from torch import Tensor

from botorch.acquisition.max_value_entropy_search import qMaxValueEntropy
from botorch.acquisition.monte_carlo import MCAcquisitionFunction
from botorch.acquisition.objective import ScalarizedObjective
from botorch.exceptions import UnsupportedError
from botorch.models.model import Model
from botorch.models.utils import check_no_nans
from botorch.sampling.samplers import MCSampler, SobolQMCNormalSampler
from botorch.utils.transforms import match_batch_shape, t_batch_mode_transform

CLAMP_LB = 1.0e-8

class qScalarizedMES(MCAcquisitionFunction):
    r"""The acquisition function for Max-value Entropy Search.

    This acquisition function computes the mutual information of
    max values and a candidate point X. See [Wang2018mves]_ for
    a detailed discussion.

    The model must be single-outcome.
    q > 1 is supported through cyclic optimization and fantasies.

    Example:
        >>> model = SingleTaskGP(train_X, train_Y)
        >>> candidate_set = torch.rand(1000, bounds.size(1))
        >>> candidate_set = bounds[0] + (bounds[1] - bounds[0]) * candidate_set
        >>> MES = qMaxValueEntropy(model, candidate_set)
        >>> mes = MES(test_X)
    """

    def __init__(
        self,
        model: Model,
        candidate_set: Tensor,
        weights: Tensor,
        num_fantasies: int = 16,
        num_mv_samples: int = 10,
        num_y_samples: int = 128,
        use_gumbel: bool = True,
        maximize: bool = True,
        X_pending: Optional[Tensor] = None,
    ) -> None:
        r"""Single-outcome max-value entropy search acquisition function.

        Args:
            model: A fitted single-outcome model.
            candidate_set: A `n x d` Tensor including `n` candidate points to
                discretize the design space. Max values are sampled from the
                (joint) model posterior over these points.
            num_fantasies: Number of fantasies to generate. The higher this
                number the more accurate the model (at the expense of model
                complexity, wall time and memory). Ignored if `X_pending` is `None`.
            num_mv_samples: Number of max value samples.
            num_y_samples: Number of posterior samples at specific design point `X`.
            use_gumbel: If True, use Gumbel approximation to sample the max values.
            X_pending: A `m x d`-dim Tensor of `m` design points that have been
                submitted for function evaluation but have not yet been evaluated.
            maximize: If True, consider the problem a maximization problem.
        """
        sampler = SobolQMCNormalSampler(num_y_samples)
        super().__init__(model=model, sampler=sampler)

        # Batch GP models (e.g. fantasized models) are not currently supported
        if self.model.train_inputs[0].ndim > 2:
            raise NotImplementedError(
                "Batch GP models (e.g. fantasized models) "
                "are not yet supported by qMaxValueEntropy"
            )

        self._init_model = model  # only used for the `fantasize()` in `set_X_pending()`
        train_inputs = match_batch_shape(model.train_inputs[0], candidate_set)
        self.candidate_set = torch.cat([candidate_set, train_inputs], dim=0)
        self.fantasies_sampler = SobolQMCNormalSampler(num_fantasies)
        self.num_fantasies = num_fantasies
        self.use_gumbel = use_gumbel
        self.num_mv_samples = num_mv_samples
        self.maximize = maximize
        self.weight = 1.0 if maximize else -1.0
        
        self.register_buffer("weights", torch.as_tensor(weights))

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: Tensor) -> Tensor:
        r"""Compute max-value entropy at the design points `X`.

        Args:
            X: A `batch_shape x 1 x d`-dim Tensor of `batch_shape` t-batches
                with `1` `d`-dim design points each.

        Returns:
            A `batch_shape`-dim Tensor of MVE values at the given design points `X`.
        """
        # Compute the posterior, posterior mean, variance and std
        posterior = self.model.posterior(X.unsqueeze(-3), observation_noise=False)
        mean = self.weight * posterior.mean.squeeze(-1).squeeze(-1)
        # batch_shape x num_fantasies
        variance = posterior.variance.clamp_min(CLAMP_LB).view_as(mean)
        check_no_nans(mean)
        check_no_nans(variance)
        
        posterior = self.model.posterior(X)
        samples = self.sampler(posterior)  # n x b x q x o
        scalarized_samples = samples.matmul(self.weights)  # n x b x q
#         mean = posterior.mean  # b x q x o
        scalarized_mean = mean.matmul(self.weights)  # b x q
            
        ig = self._compute_information_gain(
            X=X, mean_M=scalarized_mean, variance_M=variance, covar_mM=variance.unsqueeze(-1)
        )

        return ig.mean(dim=0)  # average over the fantasies
    
    def _compute_information_gain(
        self, X: Tensor, mean_M: Tensor, variance_M: Tensor, covar_mM: Tensor
    ) -> Tensor:
        r"""Computes the information gain at the design points `X`.

        Approximately computes the information gain at the design points `X`,
        for both MES with noisy observations and multi-fidelity MES with noisy
        observation and trace observations.

        The implementation is inspired from the paper on multi-fidelity MES by
        Takeno et. al. [Takeno2019mfmves]_. The notations in the comments in this
        function follows the Appendix A in the paper.

        Args:
            X: A `batch_shape x 1 x d`-dim Tensor of `batch_shape` t-batches
                with `1` `d`-dim design point each.
            mean_M, variance_M: `batch_shape x num_fantasies`-dim Tensors of
                `batch_shape` t-batches with `num_fantasies` fantasies.
                `num_fantasies = 1` for non-fantasized models.
                All are obtained without noise.
            covar_mM: `batch_shape x num_fantasies x (1 + num_trace_observations)`
                -dim Tensor. `num_fantasies = 1` for non-fantasized models.
                All are obtained without noise.

        Returns:
            A `num_fantasies x batch_shape`-dim Tensor of information gains at the
            given design points `X`.
        """

        # compute the std_m, variance_m with noisy observation
        posterior_m = self.model.posterior(X.unsqueeze(-3), observation_noise=True)
        mean_m = self.weight * posterior_m.mean.squeeze(-1)
        # batch_shape x num_fantasies x (1 + num_trace_observations)
        variance_m = posterior_m.mvn.covariance_matrix
        # batch_shape x num_fantasies x (1 + num_trace_observations)^2
        check_no_nans(variance_m)

        # compute mean and std for fM|ym, x, Dt ~ N(u, s^2)
        samples_m = self.weight * self.sampler(posterior_m).squeeze(-1)
        # s_m x batch_shape x num_fantasies x (1 + num_trace_observations)
        L = torch.cholesky(variance_m)
        temp_term = torch.cholesky_solve(covar_mM.unsqueeze(-1), L).transpose(-2, -1)
        # equivalent to torch.matmul(covar_mM.unsqueeze(-2), torch.inverse(variance_m))
        # batch_shape x num_fantasies x 1 x (1 + num_trace_observations)

        mean_pt1 = torch.matmul(temp_term, (samples_m - mean_m).unsqueeze(-1))
        mean_new = mean_pt1.squeeze(-1).squeeze(-1) + mean_M
        # s_m x batch_shape x num_fantasies
        variance_pt1 = torch.matmul(temp_term, covar_mM.unsqueeze(-1))
        variance_new = variance_M - variance_pt1.squeeze(-1).squeeze(-1)
        # batch_shape x num_fantasies
        stdv_new = variance_new.clamp_min(CLAMP_LB).sqrt()
        # batch_shape x num_fantasies

        # define normal distribution to compute cdf and pdf
        normal = torch.distributions.Normal(
            torch.zeros(1, device=X.device, dtype=X.dtype),
            torch.ones(1, device=X.device, dtype=X.dtype),
        )

        # Compute p(fM <= f* | ym, x, Dt)
        view_shape = (
            [self.num_mv_samples] + [1] * (len(X.shape) - 2) + [self.num_fantasies]
        )  # s_M x batch_shape x num_fantasies
        if self.X_pending is None:
            view_shape[-1] = 1
        max_vals = self.posterior_max_values.view(view_shape).unsqueeze(1)
        # s_M x 1 x batch_shape x num_fantasies
        normalized_mvs_new = (max_vals - mean_new) / stdv_new
        # s_M x s_m x batch_shape x num_fantasies =
        # s_M x 1 x batch_shape x num_fantasies - s_m x batch_shape x num_fantasies
        cdf_mvs_new = normal.cdf(normalized_mvs_new).clamp_min(CLAMP_LB)

        # Compute p(fM <= f* | x, Dt)
        stdv_M = variance_M.sqrt()
        normalized_mvs = (max_vals - mean_M) / stdv_M
        # s_M x 1 x batch_shape x num_fantasies =
        # s_M x 1 x 1 x num_fantasies - batch_shape x num_fantasies
        cdf_mvs = normal.cdf(normalized_mvs).clamp_min(CLAMP_LB)
        # s_M x 1 x batch_shape x num_fantasies

        # Compute log(p(ym | x, Dt))
        log_pdf_fm = posterior_m.mvn.log_prob(self.weight * samples_m).unsqueeze(0)
        # 1 x s_m x batch_shape x num_fantasies

        # H0 = H(ym | x, Dt)
        H0 = posterior_m.mvn.entropy()  # batch_shape x num_fantasies

        # regression adjusted H1 estimation, H1_hat = H1_bar - beta * (H0_bar - H0)
        # H1 = E_{f*|x, Dt}[H(ym|f*, x, Dt)]
        Z = cdf_mvs_new / cdf_mvs  # s_M x s_m x batch_shape x num_fantasies
        h1 = -Z * Z.log() - Z * log_pdf_fm  # s_M x s_m x batch_shape x num_fantasies
        check_no_nans(h1)
        dim = [0, 1]  # dimension of fm samples, fM samples
        H1_bar = h1.mean(dim=dim)
        h0 = -log_pdf_fm
        H0_bar = h0.mean(dim=dim)
        cov = ((h1 - H1_bar) * (h0 - H0_bar)).mean(dim=dim)
        beta = cov / (h0.var(dim=dim) * h1.var(dim=dim)).sqrt()
        H1_hat = H1_bar - beta * (H0_bar - H0)
        ig = H0 - H1_hat  # batch_shape x num_fantasies
        ig = ig.permute(-1, *range(ig.dim() - 1))  # num_fantasies x batch_shape
        return ig
    
candidate_set = torch.rand(size=[1000, 1]) # MES requires a candidate set
from botorch.acquisition.objective import ScalarizedObjective
qSMES = qScalarizedMES(model, candidate_set=candidate_set, weights=torch.tensor([1.,0.]))

System Info

Please provide information about your setup, including

BoTorch Version 0.2.5
GPyTorch Version 1.1.1
PyTorch Version 1.5.0+cpu
Computer OS windows

opened by r-ashwin 28

Improving test coverage of UnifiedSkewNormal code

Summary: This commit improves the test coverage of the code located in botorch/utils/probability. For the current coverage without this commit, see here.

Differential Revision: D39556258
CLA Signed fb-exported

opened by SebastianAment 23

Adding proximal acquisition function wrapper

Motivation

The goal of this acquisition function is to bias a GP optimization towards smooth optimization through the input domain. The proximal AF multiplies the base acquisition function by a squared exponential with a user-defined lengthscale, centered at the most recently observed training point (assumed to be model.train_inputs[-1]). If the associated lengthscale is short, the algorithm makes small jumps in input space; if it is long, the algorithm is not strongly biased. This method differs from simply restricting the max travel size in input space by allowing large travel distances if the predicted value is large enough. See https://journals.aps.org/prab/abstract/10.1103/PhysRevAccelBeams.24.062801 for discussion and analysis.
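
A minimal plain-Python sketch of the weighting described above follows. It assumes the squared exponential takes the form exp(-0.5 * sum(((x - x_last) / lengthscale)^2)); the exact normalization used in the PR may differ, and the function names are hypothetical.

```python
import math


def proximal_weight(x, last_x, lengthscales):
    """Squared-exponential weight centered at the most recently observed point."""
    sq_dist = sum(((a - b) / l) ** 2 for a, b, l in zip(x, last_x, lengthscales))
    return math.exp(-0.5 * sq_dist)


def proximal_acquisition(base_value, x, last_x, lengthscales):
    """Base acquisition value damped by the distance from the last observation."""
    return base_value * proximal_weight(x, last_x, lengthscales)


last_x = [0.5, 0.5]
far_x = [0.9, 0.1]

# A short lengthscale strongly penalizes the far point; a long one barely does.
print(proximal_acquisition(1.0, far_x, last_x, [0.1, 0.1]))
print(proximal_acquisition(1.0, far_x, last_x, [10.0, 10.0]))
```

Because the weight multiplies rather than truncates the base value, a far-away point can still win if its base acquisition value is large enough, which matches the behavior described in the motivation.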

This becomes relevant when applying Bayesian optimization to physical systems, where there is a cost associated with changing input parameters.
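The weighting described in this motivation can be sketched in plain Python for a one-dimensional input. This is an illustration only, not the code added by this PR: `proximal_weight`, `biased_acquisition`, and the toy `base_acq` are hypothetical names, and the exact squared-exponential form is assumed from the description.

```python
import math

def proximal_weight(x, x_last, lengthscale):
    """Squared-exponential weight centered at the most recently observed point."""
    return math.exp(-0.5 * ((x - x_last) / lengthscale) ** 2)

def biased_acquisition(base_acq, x, x_last, lengthscale):
    """Multiply a non-negative base acquisition value by the proximal weight."""
    return base_acq(x) * proximal_weight(x, x_last, lengthscale)

def base_acq(x):
    """Toy base acquisition (hypothetical): a small bump nearby, a larger one far away."""
    near = 1.0 * math.exp(-((x - 0.2) ** 2) / 0.01)
    far = 3.0 * math.exp(-((x - 0.9) ** 2) / 0.01)
    return near + far

x_last = 0.1  # most recently observed training input
grid = [i / 100 for i in range(101)]

# Short lengthscale: the maximizer stays close to x_last.
best_short = max(grid, key=lambda x: biased_acquisition(base_acq, x, x_last, 0.1))
# Long lengthscale: the weight is nearly flat, so the larger far-away bump wins.
best_long = max(grid, key=lambda x: biased_acquisition(base_acq, x, x_last, 10.0))
```

With the short lengthscale the selected point stays near the last observation, while the long lengthscale recovers the maximizer of the unweighted acquisition.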

Have you read the Contributing Guidelines on pull requests?

Yes, I've tried my best to satisfy all requirements although there are possibly errors (first time contributing to a major project like this).

Test Plan

Test script test_proximal.py has been added to test/acquisition. I can also provide numerical proof that this works with a simple script, but I was unsure where to include it. The result is shown below.

Please comment if I need to change anything, thanks!

Related PRs

None
CLA Signed Merged

opened by roussel-ryan 23
Add entropy search acquisition functions

Hi,

I have provided some implementations of entropy search acquisition functions used in a recent paper (https://arxiv.org/abs/2210.02905). This PR includes the acquisition function and the necessary utilities. I have included a notebook that describes how to use these methods.

I was not sure what were the best places to add these acquisition functions, so I put them all in the multi-objective folder. Nevertheless, they should work for single-objective optimization as well.

Thanks, Ben
CLA Signed

opened by benmltu 21
Botorch closures
Summary: Changelog:

Enable user-defined loss closures.

fit_gpytorch_torch rewrite

Add fit_gpytorch_mll dispatch for ApproximateGPs

Differential Revision: D39101211
CLA Signed fb-exported
opened by j-wilson 19
Support specifying observation noise explicitly

This adds support for specifying the observation noise in posterior and fantasize.

In addition to using the observation noise from the likelihood by setting observation_noise=True, now observation_noise can be a tensor. In that case, the provided noise levels are used directly as the observation noise in the posterior predictive (not in performing inference).

The primary use case for this is if we have auxiliary noise models that should not be used as the likelihood during posterior computations (e.g. b/c the model is fitted to already smoothed data), or because we have some dependency of the observation noise on parameters that we may control, e.g. the fidelity of the evaluation/sample size.

Note: This depends on https://github.com/cornellius-gp/gpytorch/pull/865

Also, this cleans up some of the boilerplate code in the gpytorch wrappers by defining the gpt_posterior_settings contextmanager that wraps the settings we use for posterior computation.
CLA Signed Merged

opened by Balandat 19
Optimizing over discrete parameter domains

Can BoTorch be used over discrete parameter domains? (If so, then this is a feature inquiry and not a feature "request".)

We have a use case of domains which are partly continuous, partly discrete, like: [{"name": "param1", "type": "continuous", "domain": [-5, 10]}, {"name": "param2", "type": "continuous", "domain": [1, 15]}, {"name": "param3", "type": "discrete", "domain": [1, 1.5, 2, 2.5, 3, 3.5, 4]}]

The functions under "botorch/optim/optimize.py" accept an argument called "bounds", which you define as : "bounds: A 2 x d tensor of lower and upper bounds for each column of X".

These are obviously bounds for a continuous search space. Can BoTorch be used for searching over discrete spaces?
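One generic way to search a space like the one above is to enumerate the discrete values and optimize the continuous parameters separately for each. The sketch below does this in plain Python, with a random search standing in for the continuous optimizer. It is not BoTorch API: `objective`, `search_mixed`, and the sample count are hypothetical.

```python
import random

# Search space copied from the question above.
space = [
    {"name": "param1", "type": "continuous", "domain": [-5, 10]},
    {"name": "param2", "type": "continuous", "domain": [1, 15]},
    {"name": "param3", "type": "discrete", "domain": [1, 1.5, 2, 2.5, 3, 3.5, 4]},
]

def objective(p1, p2, p3):
    """Hypothetical stand-in for an acquisition function; larger is better."""
    return -((p1 - 2.0) ** 2) - (p2 - 7.0) ** 2 - (p3 - 2.5) ** 2

def search_mixed(objective, space, samples_per_choice=2000, seed=0):
    """Fix each discrete value in turn and random-search the continuous dimensions."""
    rng = random.Random(seed)
    (lo1, hi1), (lo2, hi2) = space[0]["domain"], space[1]["domain"]
    best_value, best_point = float("-inf"), None
    for p3 in space[2]["domain"]:
        for _ in range(samples_per_choice):
            p1, p2 = rng.uniform(lo1, hi1), rng.uniform(lo2, hi2)
            value = objective(p1, p2, p3)
            if value > best_value:
                best_value, best_point = value, (p1, p2, p3)
    return best_point, best_value

best_point, best_value = search_mixed(objective, space)
```

The release notes further down list optimize_acqf_discrete_local_search and mixed optimization support, which address this use case inside BoTorch itself.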

Thank you so much for the package! Avi
enhancement

opened by avimit 19
low-rank cholesky updates for NEI

Summary: This uses low-rank Cholesky updates in NEI. With SAA, this allows us to cache the objective values for the in-sample points and compute the objectives only for the new test points. This is much faster when there are lots of baseline points.

However, this makes the acquisition function harder to read, so I am curious to hear what folks think.

Moreover, this is a prototype that I am using for research, but many common components with NEHVI should be refactored into a shared utility or base class.
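The idea of reusing a cached factor can be sketched with a block Cholesky update: given the factor of the baseline covariance, only the rows for the new points are computed. The plain-Python code below is an illustration under that assumption, not the PR's implementation; `cholesky`, `extend_cholesky`, and the RBF toy kernel are hypothetical.

```python
import math

def cholesky(A):
    """Lower-triangular Cholesky factor of a small SPD matrix (list of lists)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def extend_cholesky(L11, K21, K22):
    """Extend chol(K11) to chol([[K11, K21^T], [K21, K22]]) without refactoring K11."""
    n, m = len(L11), len(K21)
    # Solve L21 @ L11^T = K21 row by row (forward substitution).
    L21 = [[0.0] * n for _ in range(m)]
    for r in range(m):
        for j in range(n):
            s = K21[r][j] - sum(L21[r][k] * L11[j][k] for k in range(j))
            L21[r][j] = s / L11[j][j]
    # Schur complement S = K22 - L21 @ L21^T; only this small m x m block is factored.
    S = [
        [K22[a][b] - sum(L21[a][k] * L21[b][k] for k in range(n)) for b in range(m)]
        for a in range(m)
    ]
    L22 = cholesky(S)
    top = [row + [0.0] * m for row in L11]
    bottom = [L21[r] + L22[r] for r in range(m)]
    return top + bottom

def rbf(x, y, lengthscale=0.3):
    """Toy RBF kernel used to build the example covariance matrices."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def gram(xs, ys, jitter=0.0):
    """Kernel matrix between xs and ys, with optional jitter on the diagonal."""
    return [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(ys)]
            for i, a in enumerate(xs)]

baseline = [0.0, 0.5, 1.0]  # in-sample points: factor once and cache
new = [0.25]                # new test point

L_cached = cholesky(gram(baseline, baseline, jitter=1e-6))
L_updated = extend_cholesky(L_cached, gram(new, baseline), gram(new, new, jitter=1e-6))
```

The extended factor matches a full factorization of the joint covariance, but the cost of the update grows with the number of new points rather than with the cube of the total.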

Differential Revision: D32668278
CLA Signed fb-exported

opened by sdaulton 17

[Bug] Possible memory leak in `botorch.optim.optimize_acqf`

🐛 Bug

As far as I can tell botorch.optim.optimize_acqf leaves a tiny bit of memory behind somewhere. It seems worse for q-batched acquisition functions (at least, for qUCB and qEI) than analytic ones, and worse on ubuntu than OSX. Calls to fit_gpytorch_model and the acqf itself seem fine.

To reproduce

Sorry this is a bit long.

import torch
import numpy as np
import gpytorch
from botorch.models.gpytorch import GPyTorchModel
from botorch.fit import fit_gpytorch_model
from botorch.optim import optimize_acqf
from botorch.acquisition import (
    qUpperConfidenceBound,
    ExpectedImprovement,
    qExpectedImprovement,
)

from gpytorch.models import ApproximateGP
from gpytorch.variational import MeanFieldVariationalDistribution, VariationalStrategy

from tqdm import trange

# Haven't checked if this happens with non-variational GPs yet
class GPClassificationModel(ApproximateGP, GPyTorchModel):

    _num_outputs = 1

    def __init__(
        self, inducing_min, inducing_max, inducing_size=10,
    ):

        inducing_points = torch.linspace(
            inducing_min[0], inducing_max[0], inducing_size
        )

        variational_distribution = MeanFieldVariationalDistribution(
            inducing_points.size(0)
        )
        variational_strategy = VariationalStrategy(
            self,
            inducing_points,
            variational_distribution,
            learn_inducing_locations=False,
        )
        super(GPClassificationModel, self).__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(ard_num_dims=1),
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        latent_pred = gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
        return latent_pred

    def set_train_data(self, x, y):
        self.train_inputs = (x,)
        self.train_targets = y


bounds = torch.Tensor(np.r_[-1, 1])[:, None]
ntrials = 1000
restarts = 10
samps = 1000
q = 1
n = 10

# initialize
likelihood = gpytorch.likelihoods.BernoulliLikelihood()
model = GPClassificationModel(inducing_min=bounds[0], inducing_max=bounds[1])

acq = qUpperConfidenceBound(model=model, beta=3.98)

mll = gpytorch.mlls.VariationalELBO(likelihood, model, n)
x = torch.rand(size=(n,))
y = torch.randint_like(x, 0, 2, dtype=torch.long)
model.set_train_data(x, y)
model.train()

# just call something in a tight loop to see if memory grows
for i in trange(ntrials):
    # this call keeps memory steady
    # fit_gpytorch_model(mll)

    # this call keeps memory steady
    # _ = acq(x[:, None])

    # this call grows memory by a little bit every call
    new_x, batch_acq_values = optimize_acqf(
        acq_function=acq, bounds=bounds, q=q, num_restarts=restarts, raw_samples=samps,
    )

Running the above with mprof, here's what no leak looks like: [Figure: No leak]

Here's what a leak on OSX looks like: [Figure: OSX leak]

Here's what a leak on ubuntu looks like: [Figure: ubuntu leak]

Expected Behavior

Expecting no memory leak here -- I'm trying to run some benchmarks, which means that I run many synthetic opt runs and anything long-running gets killed.

System information

botorch version: 0.3.3
gpytorch version: 1.3.0
pytorch version: 1.7.1
OS: OSX (mild apparent leak), ubuntu (worse apparent leak).

bug upstream issue

opened by mshvartsman 17

Fix tensor shapes in unit tests

Summary: As a result of switching from self.assertTrue(torch.allclose(...)) to self.assertAllClose in unit tests, we now also check that the compared tensors have the same shape, not just that they are numerically equal. Some of our current tests were failing; this fixes that by changing the shapes of the compared tensors.

Differential Revision: D42402387
CLA Signed fb-exported

opened by esantorella 2
Loosen tolerances to stop `TestNoisyExpectedImprovement.test_noisy_expected_improvement` from being flaky

Summary: This failed in the GH CI twice this week, for example here: https://github.com/pytorch/botorch/actions/runs/3861688944/jobs/6582809874

Differential Revision: D42402143
CLA Signed fb-exported

opened by esantorella 4
Add BotorchTestCase.assertAllClose
Summary: BotorchTestCase.assertAllClose will print more informative error messages on failure than TestCase.assertTrue(torch.allclose(...)). It uses torch.testing.assert_close.

Old test output: AssertionError: False is not true

New test output:

1) AssertionError: Scalars are not close! Absolute difference: 1.0000034868717194 (up to 0.0001 allowed) Relative difference: 0.8348668001940709 (up to 1e-05 allowed)

This currently replicates the behavior of torch.allclose so that tests remain exactly as strict as they used to be, but in the future we might want to use the behavior of assert_close instead since it uses higher tolerances for single-precision inputs by default and is more configurable.
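For scalars, the closeness rule documented for torch.allclose is |actual - expected| <= atol + rtol * |expected|. The helper below is a plain-Python sketch of such a check with a message in the style of the output above. It is not the BotorchTestCase implementation; the function name is hypothetical, and the default tolerances are simply the ones shown in the example output.

```python
def assert_scalars_close(actual, expected, rtol=1e-5, atol=1e-4):
    """Raise an informative AssertionError if two scalars are not close."""
    abs_diff = abs(actual - expected)
    if abs_diff > atol + rtol * abs(expected):
        rel_diff = abs_diff / abs(expected) if expected != 0 else float("inf")
        raise AssertionError(
            f"Scalars are not close! Absolute difference: {abs_diff} "
            f"(up to {atol} allowed) Relative difference: {rel_diff} "
            f"(up to {rtol} allowed)"
        )

assert_scalars_close(1.00005, 1.0)  # passes: the difference is within atol

try:
    assert_scalars_close(2.0, 1.0)
except AssertionError as err:
    message = str(err)  # names both differences and both tolerances
```

Compared with a bare assertTrue, the failure message states how far apart the values are and which tolerance was exceeded.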

Differential Revision: D42402142
CLA Signed fb-exported
opened by esantorella 3
Normalization for Chebyshev correct?

Hi, I would like to make sure that the objective values get normalized to the correct interval in the get_chebyshev_scalarization function. Right now, values get normalized to [0,1] for maximization. However, this means that any weight vector with a zero component scalarizes every objective vector to zero, since min(0,...)=0 if every element is >=0. Also, for minimization the literature suggests normalizing to [0,1] instead of [-1,0]. Is this intended behavior, or should the normalization interval maybe be flipped?
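The concern can be reproduced in a few lines. The sketch below assumes a scalarization of the form min_i(w_i * y_i) + alpha * sum_i(w_i * y_i); that form, the value of alpha, and the function name are assumptions made for illustration, not a copy of the library code.

```python
def scalarize(weights, y, alpha=0.05):
    """Return (min term, alpha-weighted sum term) of a Chebyshev-style scalarization."""
    products = [w * v for w, v in zip(weights, y)]
    return min(products), alpha * sum(products)

weights = [1.0, 0.0]  # one weight component is zero

# Objectives normalized to [0, 1]: the min term is 0 for every objective vector.
min_terms_unit = [scalarize(weights, y)[0] for y in ([0.1, 0.9], [0.5, 0.5], [0.9, 0.1])]

# Objectives normalized to [-1, 0]: the min term still distinguishes the vectors.
min_terms_neg = [scalarize(weights, y)[0] for y in ([-0.9, -0.1], [-0.5, -0.5], [-0.1, -0.9])]
```

Under these assumptions, only the small alpha-weighted sum term separates the candidates in the [0, 1] case.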

opened by peteole 9

ModelList in combination with qNoisyExpectedImprovement fails

Issue description

When I tried to use a ModelList in combination with qNoisyExpectedImprovement, I got an error regarding the missing attribute distribution of the object PosteriorList. Is this error intended? Because that means that it is currently not possible to use qNoisyExpectedImprovement on problems where for example the output constraint is defined by a deterministic model and the actual objective by a SingleTaskGP.

Code example

This is the MWE:

import torch
from botorch.models import SingleTaskGP, ModelList, ModelListGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils import standardize
from gpytorch.mlls import ExactMarginalLogLikelihood
from botorch.acquisition import qNoisyExpectedImprovement


train_X = torch.rand(10, 2)
Y = 1 - torch.norm(train_X - 0.5, dim=-1, keepdim=True)
Y = Y + 0.1 * torch.randn_like(Y)  # add some noise
train_Y = standardize(Y)

gp = SingleTaskGP(train_X, train_Y)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)

ml = ModelList(gp)


qNoisyExpectedImprovement(model=ml, X_baseline=train_X)

And this is the error trace:

Cell In [22], line 21
     16 fit_gpytorch_mll(mll)
     18 ml = ModelList(gp)
---> 21 qNoisyExpectedImprovement(model=ml, X_baseline=train_X)

File ~/sandbox/botorch/botorch/acquisition/monte_carlo.py:294, in qNoisyExpectedImprovement.__init__(self, model, X_baseline, sampler, objective, posterior_transform, X_pending, prune_baseline, cache_root, **kwargs)
    290 self.register_buffer("baseline_samples", baseline_samples)
    291 self.register_buffer(
    292     "baseline_obj_max_values", baseline_obj.max(dim=-1).values
    293 )
--> 294 self._baseline_L = self._compute_root_decomposition(posterior=posterior)

File ~/sandbox/botorch/botorch/acquisition/cached_cholesky.py:92, in CachedCholeskyMCAcquisitionFunction._compute_root_decomposition(self, posterior)
     71 def _compute_root_decomposition(
     72     self,
     73     posterior: Posterior,
     74 ) -> Tensor:
     75     r"""Cache Cholesky of the posterior covariance over f(X_baseline).
     76 
     77     Because `LinearOperator.root_decomposition` is decorated with LinearOperator's
   (...)
     90         posterior: The posterior over f(X_baseline).
     91     """
...
    181         f"`PosteriorList` does not define the attribute {name}. "
    182         "Consider accessing the attributes of the individual posteriors instead."
    183     )

AttributeError: `PosteriorList` does not define the attribute distribution. Consider accessing the attributes of the individual posteriors instead.

opened by jduerholt 4

Releases(v0.8.1)

v0.8.1(Jan 6, 2023)
[0.8.1] - Jan 5, 2023

Highlights

This release includes changes for compatibility with the newest versions of linear_operator and gpytorch.

Several acquisition functions now have "Log" counterparts, which provide better numerical behavior for improvement-based acquisition functions in areas where the probability of improvement is low. For example, LogExpectedImprovement (#1565) should behave better than ExpectedImprovement. These new acquisition functions are:

LogExpectedImprovement (#1565).

LogNoisyExpectedImprovement (#1577).

LogProbabilityOfImprovement (#1594).

LogConstrainedExpectedImprovement (#1594).
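Why a log-space variant helps can be seen from the analytic form EI = sigma * h(z) with h(z) = phi(z) + z * Phi(z) and z = (mu - best_f) / sigma. Far below the incumbent, h(z) underflows to zero in floating point, whereas log h(z) stays finite. The sketch below illustrates this in plain Python with a first-order tail expansion; it is not how the BoTorch classes are implemented, and the function names and the switch point at z = -20 are assumptions.

```python
import math

def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _Phi(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def ei_scaled(z):
    """h(z) = phi(z) + z * Phi(z), so that EI = sigma * h(z)."""
    return _phi(z) + z * _Phi(z)

def log_ei_scaled(z):
    """Approximate log h(z), switching to a tail expansion where h(z) would underflow."""
    if z > -20.0:
        return math.log(ei_scaled(z))
    # For z -> -inf: h(z) ~ phi(z) / z^2 * (1 - 3 / z^2), evaluated in log space.
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return log_phi - 2.0 * math.log(-z) + math.log1p(-3.0 / (z * z))

naive = ei_scaled(-40.0)         # underflows, so its log and gradient carry no signal
stable = log_ei_scaled(-40.0)    # finite and still ordered correctly between candidates
```

A gradient-based optimizer started in such a region sees a flat zero surface for the plain value, while the log value still decreases smoothly as z moves further from the incumbent.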

Bug fix: Stop ModelListGP.posterior from quietly ignoring Log, Power, and Bilog outcome transforms (#1563).

Turn off fast_computations setting in linear_operator by default (#1547).

Compatibility

Require linear_operator == 0.3.0 (#1538).

Require pyro-ppl >= 1.8.4 (#1606).

Require gpytorch == 1.9.1 (#1612).

New Features

Add eta to get_acquisition_function (#1541).

Support 0d-features in FixedFeatureAcquisitionFunction (#1546).

Add timeout ability to optimization functions (#1562, #1598).

Add MultiModelAcquisitionFunction, an abstract base class for acquisition functions that require multiple types of models (#1584).

Add cache_root option for qNEI in get_acquisition_function (#1608).

Other changes

Docstring corrections (#1551, #1557, #1573).

Removal of _fit_multioutput_independent and allclose_mll (#1570).

Better numerical behavior for fully Bayesian models (#1576).

More verbose Scipy minimize failure messages (#1579).

Lower-bound noise in SaasPyroModel to avoid Cholesky errors (#1586).

Bug fixes

Error rather than failing silently for NaN values in box decomposition (#1554).

Make get_bounds_as_ndarray device-safe (#1567).

v0.8.0(Dec 7, 2022)
Highlights

This release includes some backwards incompatible changes.

Refactor Posterior and MCSampler modules to better support non-Gaussian distributions in BoTorch (#1486).

Introduced a TorchPosterior object that wraps a PyTorch Distribution object and makes it compatible with the rest of the Posterior API.

PosteriorList no longer accepts Gaussian base samples. It should be used with a ListSampler that includes the appropriate sampler for each posterior.

The MC acquisition functions no longer construct a Sobol sampler by default. Instead, they rely on a get_sampler helper, which dispatches an appropriate sampler based on the posterior provided.

The resample and collapse_batch_dims arguments to MCSamplers have been removed. The ForkedRNGSampler and StochasticSampler can be used to get the same functionality.

Refer to the PR for additional changes. We will update the website documentation to reflect these changes in a future release.

#1191 refactors much of botorch.optim to operate based on closures that abstract away how losses (and gradients) are computed. By default, these closures are created using multiply-dispatched factory functions (such as get_loss_closure), which may be customized by registering methods with an associated dispatcher (e.g. GetLossClosure). Future releases will contain tutorials that explore these features in greater detail.

New Features

Add mixed optimization for list optimization (#1342).

Add entropy search acquisition functions (#1458).

Add utilities for straight-through gradient estimators for discretization functions (#1515).

Add support for categoricals in Round input transform and use STEs (#1516).

Add closure-based optimizers (#1191).

Other Changes

Do not count hitting maxiter as optimization failure & update default maxiter (#1478).

BoxDecomposition cleanup (#1490).

Deprecate torch.triangular_solve in favor of torch.linalg.solve_triangular (#1494).

Various docstring improvements (#1496, #1499, #1504).

Remove __getitem__ method from LinearTruncatedFidelityKernel (#1501).

Handle Cholesky errors when fitting a fully Bayesian model (#1507).

Make eta configurable in apply_constraints (#1526).

Support SAAS ensemble models in RFFs (#1530).

Deprecate botorch.optim.numpy_converter (#1191).

Deprecate fit_gpytorch_scipy and fit_gpytorch_torch (#1191).

Bug Fixes

Enforce use of float64 in NdarrayOptimizationClosure (#1508).

Replace deprecated np.bool with equivalent bool (#1524).

Fix RFF bug when using FixedNoiseGP models (#1528).

0.7.3(Nov 10, 2022)
Highlights

#1454 fixes a critical bug that affected multi-output BatchedMultiOutputGPyTorchModels that were using a Normalize or InputStandardize input transform and trained using fit_gpytorch_model/mll with sequential=True (which was the default until 0.7.3). The input transform buffers would be reset after model training, leading to the model being trained on normalized input data but evaluated on raw inputs. This bug had been affecting model fits since the 0.6.5 release.

#1479 changes the inheritance structure of Models in a backwards-incompatible way. If your code relies on isinstance checks with BoTorch Models, especially SingleTaskGP, you should revisit these checks to make sure they still work as expected.

Compatibility

Require linear_operator == 0.2.0 (#1491).

New Features

Introduce bvn, MVNXPB, TruncatedMultivariateNormal, and UnifiedSkewNormal classes / methods (#1394, #1408).

Introduce AffineInputTransform (#1461).

Introduce a subset_transform decorator to consolidate subsetting of inputs in input transforms (#1468).

Other Changes

Add a warning when using float dtype (#1193).

Let Pyre know that AcquisitionFunction.model is a Model (#1216).

Remove custom BlockDiagLazyTensor logic when using Standardize (#1414).

Expose _aug_batch_shape in SaasFullyBayesianSingleTaskGP (#1448).

Adjust PairwiseGP ScaleKernel prior (#1460).

Pull out fantasize method into a FantasizeMixin class, so it isn't so widely inherited (#1462, #1479).

Don't use Pyro JIT by default, since it was causing a memory leak (#1474).

Use get_default_partitioning_alpha for NEHVI input constructor (#1481).

Bug Fixes

Fix batch_shape property of ModelListGPyTorchModel (#1441).

Tutorial fixes (#1446, #1475).

Bug-fix for Proximal acquisition function wrapper for negative base acquisition functions (#1447).

Handle RuntimeError due to constraint violation while sampling from priors (#1451).

Fix bug in model list with output indices (#1453).

Fix input transform bug when sequentially training a BatchedMultiOutputGPyTorchModel (#1454).

Fix a bug in _fit_multioutput_independent that failed mll comparison (#1455).

Fix box decomposition behavior with empty or None Y (#1489).

v0.7.2(Sep 27, 2022)
New Features

A full refactor of model fitting methods (#1134).

This introduces a new fit_gpytorch_mll method that multiple-dispatches on the model type. Users may register custom fitting routines for different combinations of MLLs, Likelihoods, and Models.

Unlike previous fitting helpers, fit_gpytorch_mll does not pass kwargs to optimizer and instead introduces an optional optimizer_kwargs argument.

When a model fitting attempt fails, botorch.fit methods restore modules to their original states.

fit_gpytorch_mll throws a ModelFittingError when all model fitting attempts fail.

Upon returning from fit_gpytorch_mll, mll.training will be True if fitting failed and False otherwise.
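The dispatch idea can be illustrated with a small type-keyed registry in plain Python. This is only an analogue under stated assumptions: BoTorch's actual dispatcher and registration API are not shown in these notes, and every name below (`register`, `fit`, the placeholder classes) is hypothetical.

```python
_REGISTRY = {}

def register(*types):
    """Register a fitting routine for a (mll, likelihood, model) type signature."""
    def decorator(fn):
        _REGISTRY[types] = fn
        return fn
    return decorator

def fit(mll, likelihood, model):
    """Call the most specific routine whose signature matches the argument types."""
    args = (mll, likelihood, model)
    best, best_score = None, None
    for types, fn in _REGISTRY.items():
        if all(isinstance(a, t) for a, t in zip(args, types)):
            # A smaller total MRO distance means a more specific match.
            score = sum(type(a).__mro__.index(t) for a, t in zip(args, types))
            if best_score is None or score < best_score:
                best, best_score = fn, score
    if best is None:
        raise NotImplementedError("no fitting routine registered for these types")
    return best(mll, likelihood, model)

# Hypothetical placeholder classes standing in for MLLs, likelihoods, and models.
class MLL: ...
class Likelihood: ...
class Model: ...
class ApproximateModel(Model): ...

@register(MLL, Likelihood, Model)
def fit_default(mll, likelihood, model):
    return "default"

@register(MLL, Likelihood, ApproximateModel)
def fit_approximate(mll, likelihood, model):
    return "approximate"

result_exact = fit(MLL(), Likelihood(), Model())
result_approx = fit(MLL(), Likelihood(), ApproximateModel())
```

Registering a routine for a more specific combination of types overrides the generic one for those types only, which is the customization point the notes above describe.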

Allow custom bounds to be passed in to SyntheticTestFunction (#1415).

Deprecations

Deprecate weights argument of risk measures in favor of a preprocessing_function (#1400).

Deprecate fit_gpytorch_model; to be superseded by fit_gpytorch_mll.

Other Changes

Support risk measures in MOO input constructors (#1401).

Bug Fixes

Fix fully Bayesian state dict loading when there are more than 10 models (#1405).

Fix batch_shape property of SaasFullyBayesianSingleTaskGP (#1413).

Fix model_list_to_batched ignoring the covar_module of the input models (#1419).

v0.7.1(Sep 13, 2022)
Compatibility

Pin GPyTorch == 1.9.0 (#1397).

Pin linear_operator == 0.1.1 (#1397).

New Features

Implement SaasFullyBayesianMultiTaskGP and related utilities (#1181, #1203).

Other Changes

Support loading a state dict for SaasFullyBayesianSingleTaskGP (#1120).

Update load_state_dict for ModelList to support fully Bayesian models (#1395).

Add is_one_to_many attribute to input transforms (#1396).

Bug Fixes

Fix PairwiseGP on GPU (#1388).

v0.7.0(Sep 7, 2022)
Compatibility

Require python >= 3.8 (via #1347).

Support for python 3.10 (via #1379).

Require PyTorch >= 1.11 (via #1363).

Require GPyTorch >= 1.9.0 (#1347).

GPyTorch 1.9.0 is a major refactor that factors out the lazy tensor functionality into a new LinearOperator library, which required a number of adjustments to BoTorch (#1363, #1377).

Require pyro >= 1.8.2 (#1379).

New Features

Add ability to generate the features appended in the AppendFeatures input transform via a generic callable (#1354).

Add new synthetic test functions for sensitivity analysis (#1355, #1361).

Other Changes

Use time.monotonic() instead of time.time() to measure duration (#1353).

Allow passing Y_samples directly in MARS.set_baseline_Y (#1364).

Bug Fixes

Patch state_dict loading for PairwiseGP (#1359).

Fix batch_shape handling in Normalize and InputStandardize transforms (#1360).

v0.6.6(Aug 12, 2022)
[0.6.6] - Aug 12, 2022

Compatibility

Require GPyTorch >= 1.8.1 (#1347).

New Features

Support batched models in RandomFourierFeatures (#1336).

Add a skip_expand option to AppendFeatures (#1344).

Other Changes

Allow qProbabilityOfImprovement to use batch-shaped best_f (#1324).

Make optimize_acqf re-attempt failed optimization runs and handle optimization errors in optimize_acqf and gen_candidates_scipy better (#1325).

Reduce memory overhead in MARS.set_baseline_Y (#1346).

Bug Fixes

Fix bug where outcome_transform was ignored for ModelListGP.fantasize (#1338).

Fix bug causing get_polytope_samples to sample incorrectly when variables live in multiple dimensions (#1341).

Documentation

Add more descriptive docstrings for models (#1327, #1328, #1329, #1330) and for other classes (#1313).

Expanded on the model documentation at botorch.org/docs/models (#1337).

v0.6.5(Jul 15, 2022)
Compatibility

Require PyTorch >=1.10 (#1293).

Require GPyTorch >=1.7 (#1293).

New Features

Add MOMF (Multi-Objective Multi-Fidelity) acquisition function (#1153).

Support PairwiseLogitLikelihood and modularize PairwiseGP (#1193).

Add in transformed weighting flag to Proximal Acquisition function (#1194).

Add FeasibilityWeightedMCMultiOutputObjective (#1202).

Add outcome_transform to FixedNoiseMultiTaskGP (#1255).

Support Scalable Constrained Bayesian Optimization (#1257).

Support SaasFullyBayesianSingleTaskGP in prune_inferior_points (#1260).

Implement MARS as a risk measure (#1303).

Add MARS tutorial (#1305).

Other Changes

Add Bilog outcome transform (#1189).

Make get_infeasible_cost return a cost value for each outcome (#1191).

Modify risk measures to accept List[float] for weights (#1197).

Support SaasFullyBayesianSingleTaskGP in prune_inferior_points_multi_objective (#1204).

BotorchContainers and BotorchDatasets: Large refactor of the original TrainingData API to allow for more diverse types of datasets (#1205, #1221).

Proximal biasing support for multi-output SingleTaskGP models (#1212).

Improve error handling in optimize_acqf_discrete with a check that choices is non-empty (#1228).

Handle X_pending properly in FixedFeatureAcquisition (#1233, #1234).

PE and PLBO support in Ax (#1240, #1241).

Remove model.train call from get_X_baseline for better caching (#1289).

Support inf values in bounds argument of optimize_acqf (#1302).

Bug Fixes

Update get_gp_samples to support input / outcome transforms (#1201).

Fix cached Cholesky sampling in qNEHVI when using Standardize outcome transform (#1215).

Make task_feature a required input in MultiTaskGP.construct_inputs (#1246).

Fix CUDA tests (#1253).

Fix FixedSingleSampleModel dtype/device conversion (#1254).

Prevent inappropriate transforms by putting input transforms into train mode before converting models (#1283).

Fix sample_points_around_best when using 20 dimensional inputs or prob_perturb (#1290).

Skip bound validation in optimize_acqf if inequality constraints are specified (#1297).

Properly handle RFFs when used with a ModelList with individual transforms (#1299).

Update PosteriorList to support deterministic-only models and fix event_shape (#1300).

Documentation

Add a note about observation noise in the posterior in fit_model_with_torch_optimizer notebook (#1196).

Fix custom botorch model in Ax tutorial to support new interface (#1213).

Update MOO docs (#1242).

Add SMOKE_TEST option to MOMF tutorial (#1243).

Fix ModelListGP.condition_on_observations/fantasize bug (#1250).

Replace space with underscore for proper doc generation (#1256).

Update PBO tutorial to use EUBO (#1262).

v0.6.4(Apr 21, 2022)
New Features

Implement ExpectationPosteriorTransform (#903).

Add PairwiseMCPosteriorVariance, a cheap active learning acquisition function (#1125).

Support computing quantiles in the fully Bayesian posterior, add FullyBayesianPosteriorList (#1161).

Add expectation risk measures (#1173).

Implement Multi-Fidelity GIBBON (Lower Bound MES) acquisition function (#1185).

Other Changes

Add an error message for one shot acquisition functions in optimize_acqf_discrete (#939).

Validate the shape of the bounds argument in optimize_acqf (#1142).

Minor tweaks to SAASBO (#1143, #1183).

Minor updates to tutorials (24f7fda7b40d4aabf502c1a67816ac1951af8c23, #1144, #1148, #1159, #1172, #1180).

Make it easier to specify a custom PyroModel (#1149).

Allow passing in a mean_module to SingleTaskGP/FixedNoiseGP (#1160).

Add a note about acquisitions using gradients to base class (#1168).

Remove deprecated box_decomposition module (#1175).

Bug Fixes

Bug-fixes for ProximalAcquisitionFunction (#1122).

Fix missing warnings on failed optimization in fit_gpytorch_scipy (#1170).

Ignore data related buffers in PairwiseGP.load_state_dict (#1171).

Make fit_gpytorch_model properly honor the debug flag (#1178).

Fix missing posterior_transform in gen_one_shot_kg_initial_conditions (#1187).

v0.6.3.1(Mar 28, 2022)
New Features

Implement SAASBO - SaasFullyBayesianSingleTaskGP model for sample-efficient high-dimensional Bayesian optimization (#1123).

Add SAASBO tutorial (#1127).

Add LearnedObjective (#1131), AnalyticExpectedUtilityOfBestOption acquisition function (#1135), and a few auxiliary classes to support Bayesian optimization with preference exploration (BOPE).

Add BOPE tutorial (#1138).

Other Changes

Use qKG.evaluate in optimize_acqf_mixed (#1133).

Add construct_inputs to SAASBO (#1136).

Bug Fixes

Fix "Constraint Active Search" tutorial (#1124).

Update "Discrete Multi-Fidelity BO" tutorial (#1134).

v0.6.2(Mar 9, 2022)
New Features

Use BOTORCH_MODULAR in tutorials with Ax (#1105).

Add optimize_acqf_discrete_local_search for discrete search spaces (#1111).

Bug Fixes

Fix missing posterior_transform in qNEI and get_acquisition_function (#1113).

v0.6.1(Feb 28, 2022)
New Features

Add Standardize input transform (#1053).

Low-rank Cholesky updates for NEI (#1056).

Add support for non-linear input constraints (#1067).

New MOO problems: MW7 (#1077), disc brake (#1078), penicillin (#1079), RobustToy (#1082), GMM (#1083).

Other Changes

Add Dispatcher (#1009).

Modify qNEHVI to support deterministic models (#1026).

Store tensor attributes of input transforms as buffers (#1035).

Modify NEHVI to support MTGPs (#1037).

Make Normalize input transform input column-specific (#1047).

Improve find_interior_point (#1049).

Remove deprecated botorch.distributions module (#1061).

Avoid costly application of posterior transform in Kronecker & HOGP models (#1076).

Support heteroscedastic perturbations in InputPerturbations (#1088).

Performance Improvements

Make risk measures more memory efficient (#1034).

Bug Fixes

Properly handle empty fixed_features in optimization (#1029).

Fix missing weights in VaR risk measure (#1038).

Fix find_interior_point for negative variables & allow unbounded problems (#1045).

Filter out indefinite bounds in constraint utilities (#1048).

Make non-interleaved base samples use intuitive shape (#1057).

Pad small diagonalization with zeros for KroneckerMultitaskGP (#1071).

Disable learning of bounds in preprocess_transform (#1089).

Catch runtime errors with ill-conditioned covar (#1095).

Fix compare_mc_analytic_acquisition tutorial (#1099).

v0.6.0(Dec 9, 2021)
Compatibility

Require PyTorch >=1.9 (#1011).

Require GPyTorch >=1.6 (#1011).

New Features

New ApproximateGPyTorchModel wrapper for various (variational) approximate GP models (#1012).

New SingleTaskVariationalGP stochastic variational Gaussian Process model (#1012).

Support for Multi-Output Risk Measures (#906, #965).

Introduce ModelList and PosteriorList (#829).

New Constraint Active Search tutorial (#1010).

Add additional multi-objective optimization test problems (#958).

Other Changes

Add covar_module as an optional input of MultiTaskGP models (#941).

Add min_range argument to Normalize transform to prevent division by zero (#931).

Add initialization heuristic for acquisition function optimization that samples around best points (#987).

Update initialization heuristic to perturb a subset of the dimensions of the best points if the dimension is > 20 (#988).

Modify apply_constraints utility to work with multi-output objectives (#994).

Short-cut t_batch_mode_transform decorator on non-tensor inputs (#991).

Performance Improvements

Use lazy covariance matrix in BatchedMultiOutputGPyTorchModel.posterior (#976).

Fast low-rank Cholesky updates for qNoisyExpectedHypervolumeImprovement (#747, #995, #996).

Bug Fixes

Update error handling to new PyTorch linear algebra messages (#940).

Avoid test failures on Ampere devices (#944).

Fixes to the Griewank test function (#972).

Handle empty base_sample_shape in Posterior.rsample (#986).

Handle NotPSDError and hitting maxiter in fit_gpytorch_model (#1007).

Use TransformedPosterior for subclasses of GPyTorchPosterior (#983).

Propagate best_f argument to qProbabilityOfImprovement in input constructors (f5a5f8b6dc20413e67c6234e31783ac340797a8d)

v0.5.1(Sep 2, 2021)
Compatibility

Require GPyTorch >=1.5.1 (#928).

New Features

Add HigherOrderGP composite Bayesian Optimization tutorial notebook (#864).

Add Multi-Task Bayesian Optimization tutorial (#867).

New multi-objective test problems from (#876).

Add PenalizedMCObjective and L1PenaltyObjective (#913).

Add a ProximalAcquisitionFunction for regularizing new candidates towards previously generated ones (#919, #924).

Add a Power outcome transform (#925).

Bug Fixes

Batch mode fix for HigherOrderGP initialization (#856).

Improve CategoricalKernel precision (#857).

Fix an issue with qMultiFidelityKnowledgeGradient.evaluate (#858).

Fix an issue with transforms with HigherOrderGP (#889).

Fix initial candidate generation when parameter constraints are on different device (#897).

Fix bad in-place op in _generate_unfixed_lin_constraints (#901).

Fix an input transform bug in fantasize call (#902).

Fix outcome transform bug in batched_to_model_list (#917).

Other Changes

Make variance optional for TransformedPosterior.mean (#855).

Support transforms in DeterministicModel (#869).

Support batch_shape in RandomFourierFeatures (#877).

Add a maximize flag to PosteriorMean (#881).

Ignore categorical dimensions when validating training inputs in MixedSingleTaskGP (#882).

Refactor HigherOrderGPPosterior for memory efficiency (#883).

Support negative weights for minimization objectives in get_chebyshev_scalarization (#884).

Move train_inputs transforms to model.train/eval calls (#894).

v0.5.0(Jun 29, 2021)
Compatibility

Require PyTorch >=1.8.1 (#832).

Require GPyTorch >=1.5 (#848).

Changes to how input transforms are applied: transform_inputs is applied in model.forward if the model is in train mode, otherwise it is applied in the posterior call (#819, #835).

New Features

Improved multi-objective optimization capabilities:

qNoisyExpectedHypervolumeImprovement acquisition function that improves on qExpectedHypervolumeImprovement in terms of tolerating observation noise and speeding up computation for large q-batches (#797, #822).

qMultiObjectiveMaxValueEntropy acquisition function (913aa0e510dde10568c2b4b911124cdd626f6905, #760).

Heuristic for reference point selection (#830).

FastNondominatedPartitioning for Hypervolume computations (#699).

DominatedPartitioning for partitioning the dominated space (#726).

BoxDecompositionList for handling box decompositions of varying sizes (#712).

Direct, batched dominated partitioning for the two-outcome case (#739).

get_default_partitioning_alpha utility providing heuristic for selecting approximation level for partitioning algorithms (#793).

New method for computing Pareto Frontiers with less memory overhead (#842, #846).
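
For readers unfamiliar with the Pareto frontier computations mentioned in this group, the following is a minimal sketch of the textbook definition. It assumes every objective is maximized, uses a hypothetical function name, and is the plain quadratic-time check rather than the lower-memory method referenced in #842 and #846.

```python
def non_dominated_mask(Y):
    """Return a list of booleans: True where the row of Y is Pareto-optimal.

    Assumes maximization of every objective; Y is a list of equal-length tuples.
    """
    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and strictly better somewhere.
        return all(ai >= bi for ai, bi in zip(a, b)) and any(
            ai > bi for ai, bi in zip(a, b)
        )

    return [not any(dominates(other, y) for other in Y) for y in Y]
```

Duplicate points do not dominate each other under this definition, so both copies are kept, and an empty input yields an empty mask.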

New qLowerBoundMaxValueEntropy acquisition function (a.k.a. GIBBON), a lightweight variant of Multi-fidelity Max-Value Entropy Search using a Determinantal Point Process approximation (#724, #737, #749).

Support for discrete and mixed input domains:

CategoricalKernel for categorical inputs (#771).

MixedSingleTaskGP for mixed search spaces (containing both categorical and ordinal parameters) (#772, #847).

optimize_acqf_discrete for optimizing acquisition functions over fully discrete domains (#777).

Extend optimize_acqf_mixed to allow batch optimization (#804).
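
A kernel over categorical inputs can be sketched as an exponential of a normalized Hamming distance. The form below, `exp(-mean(x1 != x2) / lengthscale)`, is an assumption chosen for illustration; consult the CategoricalKernel docstring for the exact definition used by the library.

```python
import math

def categorical_kernel(x1, x2, lengthscale=1.0):
    """Assumed form: exp(-mean(x1[i] != x2[i]) / lengthscale)."""
    if len(x1) != len(x2):
        raise ValueError("inputs must have the same number of categorical dimensions")
    # Fraction of dimensions in which the two points disagree.
    mismatch = sum(a != b for a, b in zip(x1, x2)) / len(x1)
    return math.exp(-mismatch / lengthscale)
```

Identical points get a kernel value of 1, and the value decays as more categorical dimensions differ; no ordering among category labels is assumed.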

Support for robust / risk-aware optimization:

Risk measures for robust / risk-averse optimization (#821).

AppendFeatures transform (#820).

InputPerturbation input transform for risk-averse BO with implementation errors (#827).

Tutorial notebook for Bayesian Optimization of risk measures (#823).

Tutorial notebook for risk-averse Bayesian Optimization under input perturbations (#828).
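
The risk measures in this group are typically quantities such as Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR). Below is a sample-based sketch under a stated convention: larger outcomes are better, and the tail consists of the worst `ceil(n * (1 - alpha))` samples. The function name and the tail convention are assumptions, not the library's API.

```python
import math

def var_and_cvar(samples, alpha):
    """Empirical VaR and CVaR for outcomes where larger is better.

    Assumed convention: the tail is the worst (smallest) ceil(n * (1 - alpha))
    samples. VaR is the best value in that tail; CVaR is the tail average.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(samples)
    k = max(1, math.ceil(len(ordered) * (1.0 - alpha)))
    tail = ordered[:k]
    return tail[-1], sum(tail) / k
```

Optimizing CVaR instead of the mean favors designs whose bad outcomes are still acceptable, which is the sense in which such objectives are risk-averse.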

More scalable multi-task modeling and sampling:

KroneckerMultiTaskGP model for efficient multi-task modeling for block-design settings (all tasks observed at all inputs) (#637).

Support for transforms in Multi-Task GP models (#681).

Posterior sampling based on Matheron's rule for Multi-Task GP models (#841).

Various changes to simplify and streamline integration with Ax:

Handle non-block designs in TrainingData (#794).

Acquisition function input constructor registry (#788, #802, #845).

Random Fourier Feature (RFF) utilities for fast (approximate) GP function sampling (#750).

DelaunayPolytopeSampler for fast uniform sampling from (simple) polytopes (#741).

Add evaluate method to ScalarizedObjective (#795).

Bug Fixes

Handle the case when all features are fixed in optimize_acqf (#770).

Pass fixed_features to initial candidate generation functions (#806).

Handle batch empty pareto frontier in FastPartitioning (#740).

Handle empty pareto set in is_non_dominated (#743).

Handle edge case of no or a single observation in get_chebyshev_scalarization (#762).

Fix an issue in gen_candidates_torch that caused problems with acquisition functions using fantasy models (#766).

Fix HigherOrderGP dtype bug (#728).

Normalize before clamping in Warp input warping transform (#722).

Fix bug in GP sampling (#764).

Other Changes

Modify input transforms to support one-to-many transforms (#819, #835).

Make initial conditions for acquisition function optimization honor parameter constraints (#752).

Perform optimization only over unfixed features if fixed_features is passed (#839).

Refactor Max Value Entropy Search Methods (#734).

Use Linear Algebra functions from the torch.linalg module (#735).

Use PyTorch's Kumaraswamy distribution (#746).

Improved capabilities and some bugfixes for batched models (#723, #767).

Pass callback argument to scipy.optimize.minimize in gen_candidates_scipy (#744).

Modify behavior of X_pending in multi-objective acquisition functions (#747).

Allow multi-dimensional batch shapes in test functions (#757).

Utility for converting batched multi-output models into batched single-output models (#759).

Explicitly raise NotPSDError in _scipy_objective_and_grad (#787).

Make raw_samples optional if batch_initial_conditions is passed (#801).

Use powers of 2 in qMC docstrings & examples (#812).

v0.4.0(Feb 23, 2021)
Compatibility

Require PyTorch >=1.7.1 (#714).

Require GPyTorch >=1.4 (#714).

New Features

HigherOrderGP - High-Order Gaussian Process (HOGP) model for high-dimensional output regression (#631, #646, #648, #680).

qMultiStepLookahead acquisition function for general look-ahead optimization approaches (#611, #659).

ScalarizedPosteriorMean and project_to_sample_points for more advanced MFKG functionality (#645).

Large-scale Thompson sampling tutorial (#654, #713).

Tutorial for optimizing mixed continuous/discrete domains (application to multi-fidelity KG with discrete fidelities) (#716).

GPDraw utility for sampling from (exact) GP priors (#655).

Add X as optional arg to call signature of MCAcquisitionObjective (#487).

OSY synthetic test problem (#679).

Bug Fixes

Fix matrix multiplication in scalarize_posterior (#638).

Set X_pending in get_acquisition_function in qEHVI (#662).

Make contextual kernel device-aware (#666).

Do not use an MCSampler in MaxPosteriorSampling (#701).

Add ability to subset outcome transforms (#711).

Performance Improvements

Batchify box decomposition for 2d case (#642).

Other Changes

Use scipy distribution in MES quantile bisect (#633).

Use new closure definition for GPyTorch priors (#634).

Allow enabling of approximate root decomposition in posterior calls (#652).

Support for upcoming 21201-dimensional PyTorch SobolEngine (#672, #674).

Refactored various MOO utilities to allow future additions (#656, #657, #658, #661).

Support input_transform in PairwiseGP (#632).

Output shape checks for t_batch_mode_transform (#577).

Check for NaN in gen_candidates_scipy (#688).

Introduce base_sample_shape property to Posterior objects (#718).

v0.3.3(Dec 8, 2020)
Compatibility

Require PyTorch >=1.7 (#614).

Require GPyTorch >=1.3 (#614).

New Features

Models (LCE-A, LCE-M and SAC) for Contextual Bayesian Optimization (#581).

Implements core models from: High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization. Q. Feng, B. Letham, H. Mao, E. Bakshy. NeurIPS 2020.

See Ax for usage of these models.

Hit and run sampler for uniform sampling from a polytope (#592).

Input warping:

Core functionality (#607).

Kumaraswamy Distribution (#606).

Tutorial (8f34871652042219c57b799669a679aab5eed7e3).
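
Input warping with a Kumaraswamy distribution amounts to passing each normalized input through the Kumaraswamy CDF, F(x) = 1 - (1 - x^a)^b. The sketch below shows only that formula on a single scalar; the function name is hypothetical, and how the concentration parameters `a` and `b` are learned is outside its scope.

```python
def kumaraswamy_warp(x, a, b):
    """Kumaraswamy CDF, F(x) = 1 - (1 - x**a)**b, for x in [0, 1] and a, b > 0."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be normalized to [0, 1] before warping")
    return 1.0 - (1.0 - x ** a) ** b
```

With `a = b = 1` the warp is the identity; other values stretch or compress parts of the unit interval while keeping the endpoints fixed, which lets a stationary kernel model non-stationary behavior.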

TuRBO-1 tutorial (#598).

Implements the method from: Scalable Global Optimization via Local Bayesian Optimization. D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, M. Poloczek. NeurIPS 2019.

Bug fixes

Fix bounds of HolderTable synthetic function (#596).

Fix device issue in MOO tutorial (#621).

Other changes

Add train_inputs option to qMaxValueEntropy (#593).

Enable gpytorch settings to override BoTorch defaults for fast_pred_var and debug (#595).

Rename set_train_data_transform -> preprocess_transform (#575).

Modify _expand_bounds() shape checks to work with >2-dim bounds (#604).

Add batch_shape property to models (#588).

Modify qMultiFidelityKnowledgeGradient.evaluate() to work with project, expand and cost_aware_utility (#594).

Add list of papers using BoTorch to website docs (#617).

v0.3.2(Oct 26, 2020)
New Features

Add PenalizedAcquisitionFunction wrapper (#585)

Input transforms

Reversible input transform (#550)

Rounding input transform (#562)

Log input transform (#563)

Differentiable approximate rounding for integers (#561)
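
Differentiable approximate rounding replaces the hard step of `round` with a steep smooth step so that gradients can flow through integer-valued inputs. The scalar sketch below assumes a tanh-based step centered at the midpoint between consecutive integers, with a temperature `tau`; both the exact functional form and the default `tau` are assumptions for illustration.

```python
import math

def approximate_round(x, tau=1e-3):
    """Smooth surrogate for round(x): floor(x) plus a steep tanh step at the midpoint."""
    offset = math.floor(x)
    # Distance from the midpoint between floor(x) and floor(x) + 1, scaled by tau.
    scaled_remainder = (x - offset - 0.5) / tau
    return offset + (math.tanh(scaled_remainder) + 1.0) / 2.0
```

Smaller `tau` gives a sharper step (closer to true rounding) at the cost of gradients that vanish away from the midpoints; larger `tau` trades rounding accuracy for more useful gradients.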

Bug fixes

Fix sign error in UCB when maximize=False (a4bfacbfb2109d3b89107d171d2101e1995822bb)

Fix batch_range sample shape logic (#574)
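
To make the UCB sign issue concrete, here is a scalar sketch of the usual convention: when minimizing, only the posterior mean is negated, while the exploration bonus is still added. The function name and signature are hypothetical and the fixing commit itself is not reproduced here.

```python
import math

def upper_confidence_bound(mean, variance, beta, maximize=True):
    """UCB for a single point; the exploration bonus is always added."""
    bonus = math.sqrt(beta * variance)
    # When minimizing, negate the mean only; the bonus still rewards uncertainty.
    return (mean if maximize else -mean) + bonus
```

Negating the whole expression instead would penalize uncertain points under minimization, turning the acquisition function into a purely exploitative one.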

Other changes

Better support for two stage sampling in preference learning (0cd13d0cb49b1ac8d0971e42f1f0e9dd6126fd9a)

Remove noise term in PairwiseGP and add ScaleKernel by default (#571)

Rename prior to task_covar_prior in MultiTaskGP and FixedNoiseMultiTaskGP (16573fea066d8bb682dc68526f42b6ec7c22a555)

Support only transforming inputs on training or evaluation (#551)

Add equals method for InputTransform (#552)

v0.3.1(Sep 16, 2020)
New Features

Constrained Multi-Objective tutorial (#493)

Multi-fidelity Knowledge Gradient tutorial (#509)

Support for batch qMC sampling (#510)

New evaluate method for qKnowledgeGradient (#515)

Compatibility

Require PyTorch >=1.6 (#535)

Require GPyTorch >=1.2 (#535)

Remove deprecated botorch.gen module (#532)

Bug fixes

Fix bad backward-indexing of task_feature in MultiTaskGP (#485)

Fix bounds in constrained Branin-Currin test function (#491)

Fix max_hv for C2DTLZ2 and make Hypervolume always return a float (#494)

Fix bug in draw_sobol_samples that did not use the proper effective dimension (#505)

Fix constraints for q>1 in qExpectedHypervolumeImprovement (c80c4fdb0f83f0e4f12e4ec4090d0478b1a8b532)

Only use feasible observations in partitioning for qExpectedHypervolumeImprovement in get_acquisition_function (#523)

Improved GPU compatibility for PairwiseGP (#537)

Performance Improvements

Reduce memory footprint in qExpectedHypervolumeImprovement (#522)

Add (q)ExpectedHypervolumeImprovement to nonnegative functions [for better initialization] (#496)

Other changes

Support batched best_f in qExpectedImprovement (#487)

Allow returning the full tree of solutions in OneShotAcquisitionFunction (#488)

Added construct_inputs class method to models to programmatically construct the inputs to the constructor from a standardized TrainingData representation (#477, #482, 3621198d02195b723195b043e86738cd5c3b8e40)

Acquisition function constructors now accept catch-all **kwargs options (#478, e5b69352954bb10df19a59efe9221a72932bfe6c)

Use psd_safe_cholesky in qMaxValueEntropy for better numerical stability (#518)

Added WeightedMCMultiOutputObjective (81d91fd2e115774e561c8282b724457233b6d49f)

Add ability to specify outcomes to all multi-output objectives (#524)

Return optimization output in info_dict for fit_gpytorch_scipy (#534)

Use setuptools_scm for versioning (#539)

v0.3.0(Jul 6, 2020)
New Features

Multi-Objective Acquisition Functions (#466)

q-Expected Hypervolume Improvement

q-ParEGO

Analytic Expected Hypervolume Improvement with auto-differentiation

Multi-Objective Utilities (#466)

Pareto Computation

Hypervolume Calculation

Box Decomposition algorithm
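
For two objectives, the hypervolume dominated by a set of points can be computed with a simple sweep, which may help when reading about the utilities above. This sketch assumes maximization and a reference point that lower-bounds both objectives; it uses a hypothetical function name and is not the box-decomposition algorithm shipped in the release.

```python
def hypervolume_2d(points, ref_point):
    """Area dominated by `points` and bounded below by `ref_point` (maximization)."""
    r1, r2 = ref_point
    # Keep only points that strictly improve on the reference point in both objectives.
    candidates = [(f1, f2) for f1, f2 in points if f1 > r1 and f2 > r2]
    # Sweep from the largest first objective to the smallest.
    candidates.sort(key=lambda p: (-p[0], -p[1]))
    area, best_f2 = 0.0, r2
    for f1, f2 in candidates:
        if f2 > best_f2:  # dominated points add nothing and are skipped
            area += (f1 - r1) * (f2 - best_f2)
            best_f2 = f2
    return area
```

Each Pareto-optimal point contributes one horizontal strip of new area; in more than two dimensions this sweep no longer applies, which is why box decompositions are used.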

Multi-Objective Test Functions (#466)

Suite of synthetic test functions for multi-objective, constrained optimization

Multi-Objective Tutorial (#468)

Abstract ConstrainedBaseTestProblem (#454)

Add optimize_acqf_list method for sequentially, greedily optimizing 1 candidate from each provided acquisition function (d10aec911b241b208c59c192beb9e4d572a092cd)

Bug fixes

Fixed re-arranging mean in MultiTask multi-output models (#450).

Other changes

Move gpt_posterior_settings into models.utils (#449)

Allow specifications of batch dims to collapse in samplers (#457)

Remove outcome transform before model-fitting for sequential model fitting in multi-output models (#458)

v0.2.5(May 14, 2020)
Bug fixes

Fixed issue with broken wheel build (#444).

Other changes

Changed code style to use absolute imports throughout (#443).

v0.2.4(May 13, 2020)
Bug fixes

There was a mysterious issue with the 0.2.3 wheel on pypi, where part of the botorch/optim/utils.py file was not included, which resulted in an ImportError for many central components of the code. Interestingly, the source dist (built with the same command) did not have this issue.

Preserve order in ChainedOutcomeTransform (#440).

New Features

Utilities for estimating the feasible volume under outcome constraints (#437).

v0.2.3(Apr 26, 2020)
Introduces a new Pairwise GP model for Preference Learning with pair-wise preferential feedback, as well as a Sampling Strategies abstraction for generating candidates from a discrete candidate set.

Compatibility

Require PyTorch >=1.5 (#423).

Require GPyTorch >=1.1.1 (#425).

New Features

Add PairwiseGP for preference learning with pair-wise comparison data (#388).

Add SamplingStrategy abstraction for sampling-based generation strategies, including MaxPosteriorSampling (i.e. Thompson Sampling) and BoltzmannSampling (#218, #407).
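
The idea behind Boltzmann sampling from a discrete candidate set can be sketched without the library: each candidate is drawn with probability proportional to `exp(eta * value)`. The helper below is hypothetical; it samples with replacement and applies a plain softmax, both of which are simplifying assumptions rather than a description of BoltzmannSampling itself.

```python
import math
import random

def boltzmann_sample(candidates, acq_values, num_samples, eta=1.0, seed=0):
    """Hypothetical helper: pick candidates with probability proportional to exp(eta * value)."""
    rng = random.Random(seed)
    top = max(acq_values)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(eta * (v - top)) for v in acq_values]
    return rng.choices(candidates, weights=weights, k=num_samples)
```

The temperature-like parameter `eta` interpolates between uniform sampling (`eta = 0`) and greedy selection of the best candidate (large `eta`).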

Deprecations

The existing botorch.gen module is moved to botorch.generation.gen and imports from botorch.gen will raise a warning (an error in the next release) (#218).

Bug fixes

Fix & update a number of tutorials (#394, #398, #393, #399, #403).

Fix CUDA tests (#404).

Fix sobol maxdim limitation in prune_baseline (#419).

Other changes

Better stopping criteria for stochastic optimization (#392).

Improve numerical stability of LinearTruncatedFidelityKernel (#409).

Allow batched best_f in qExpectedImprovement and qProbabilityOfImprovement (#411).

Introduce new logger framework (#412).

Faster indexing in some situations (#414).

More generic BaseTestProblem (9e604fe2188ac85294c143d249872415c4d95823).

v0.2.2(Mar 9, 2020)
Requires Python 3.7 and adds new features for active learning and multi-fidelity optimization, along with a number of bug fixes.

Compatibility

Require PyTorch >=1.4 (#379).

Require Python >=3.7 (#378).

New Features

Add qNegIntegratedPosteriorVariance for Bayesian active learning (#377).

Add FixedNoiseMultiFidelityGP, analogous to SingleTaskMultiFidelityGP (#386).

Support scalarize_posterior for m>1 and q>1 posteriors (#374).

Support subset_output method on multi-fidelity models (#372).

Add utilities for sampling from simplex and hypersphere (#369).
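
Uniform sampling from the probability simplex and from the surface of the unit hypersphere both have standard constructions, sketched below with the standard library. The function names are hypothetical, and the library utilities added in #369 may differ (for example in batching or in the random number source).

```python
import math
import random

def sample_simplex(d, rng):
    """Uniform draw from the probability simplex via normalized exponentials."""
    e = [rng.expovariate(1.0) for _ in range(d)]
    total = sum(e)
    return [v / total for v in e]

def sample_hypersphere(d, rng):
    """Uniform draw from the unit sphere surface via a normalized Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]
```

Simplex samples are useful as random scalarization weights in multi-objective optimization; both helpers take an explicit `random.Random` instance so that draws are reproducible.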

Bug fixes

Fix TestLoader local test discovery (#376).

Fix batch-list conversion of SingleTaskMultiFidelityGP (#370).

Validate tensor args before checking input scaling for more informative error messages (#368).

Fix flaky qNoisyExpectedImprovement test (#362).

Fix test function in closed-loop tutorial (#360).

Fix num_output attribute in BoTorch/Ax tutorial (#355).

Other changes

Require output dimension in MultiTaskGP (#383).

Update code of conduct (#380).

Remove deprecated joint_optimize and sequential_optimize (#363).

v0.2.1(Jan 16, 2020)
Minor bug fix release.

New Features

Add a static method for getting batch shapes for batched MO models (#346).

Bug fixes

Revamp qKG constructor to avoid issue with missing objective (#351).

Make sure MVES can support sampled costs like KG (#352).

Other changes

Allow custom module-to-array handling in fit_gpytorch_scipy (#341).

v0.2.0(Dec 20, 2019)
This release adds the popular Max-value Entropy Search (MES) acquisition function, as well as support for multi-fidelity Bayesian optimization via both the Knowledge Gradient (KG) and MES.

Compatibility

Require PyTorch >=1.3.1 (#313).

Require GPyTorch >=1.0 (#342).

New Features

Add cost-aware KnowledgeGradient (qMultiFidelityKnowledgeGradient) for multi-fidelity optimization (#292).

Add qMaxValueEntropy and qMultiFidelityMaxValueEntropy max-value entropy search acquisition functions (#298).

Add subset_output functionality to (most) models (#324).

Add outcome transforms and input transforms (#321).

Add outcome_transform kwarg to model constructors for automatic outcome transformation and un-transformation (#327).

Add cost-aware utilities for cost-sensitive acquisition functions (#289).

Add DeterministicModel and DeterministicPosterior abstractions (#288).

Add AffineFidelityCostModel (f838eacb4258f570c3086d7cbd9aa3cf9ce67904).

Add project_to_target_fidelity and expand_trace_observations utilities for use in multi-fidelity optimization (1ca12ac0736e39939fff650cae617680c1a16933).

Performance Improvements

New prune_baseline option for pruning X_baseline in qNoisyExpectedImprovement (#287).

Do not use approximate MLL computation for deterministic fitting (#314).

Avoid re-evaluating the acquisition function in gen_candidates_torch (#319).

Use CPU where possible in gen_batch_initial_conditions to avoid memory issues on the GPU (#323).

Bug fixes

Properly register NoiseModelAddedLossTerm in HeteroskedasticSingleTaskGP (671c93a203b03ef03592ce322209fc5e71f23a74).

Fix batch mode for MultiTaskGPyTorchModel (#316).

Honor propagate_grads argument in fantasize of FixedNoiseGP (#303).

Properly handle diag arg in LinearTruncatedFidelityKernel (#320).

Other changes

Consolidate and simplify multi-fidelity models (#308).

New license header style (#309).

Validate shape of best_f in qExpectedImprovement (#299).

Support specifying observation noise explicitly for all models (#256).

Add num_outputs property to the Model API (#330).

Validate output shape of models upon instantiating acquisition functions (#331).

Tests

Silence warnings outside of explicit tests (#290).

Enforce full sphinx docs coverage in CI (#294).

v0.1.4(Oct 2, 2019)
Breaking Changes

Require explicit output dimensions in BoTorch models (#238)

Make joint_optimize / sequential_optimize return acquisition function values (#149) [note deprecation notice below]

standardize now works on the second to last dimension (#263)

Refactor synthetic test functions (#273)
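
Standardizing along the second-to-last dimension means that, for an `n x m` array of `n` observations of `m` outcomes, each outcome column is shifted to zero mean and scaled to unit variance. The nested-list sketch below illustrates this for a single unbatched array; the use of the unbiased (n - 1) estimator and the handling of constant columns are assumptions.

```python
import math

def standardize(Y):
    """Standardize each column of an n x m list of rows (requires n >= 2)."""
    n, m = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(m)]
    # Unbiased (n - 1) standard deviation per outcome column.
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in Y) / (n - 1))
        for j in range(m)
    ]
    # Assumption: leave constant columns unscaled instead of dividing by ~0.
    stds = [s if s > 1e-9 else 1.0 for s in stds]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in Y]
```

Because the statistics are computed per column over the observation axis, outcomes on very different scales end up comparable, which is the behavior the breaking change is about.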

New Features

Add qKnowledgeGradient acquisition function (#272, #276)

Add input scaling check to standard models (#267)

Add cyclic_optimize, convergence criterion class (#269)

Add settings.debug context manager (#242)

Deprecations

Consolidate sequential_optimize and joint_optimize into optimize_acqf (#150)

Bug fixes

Properly pass noise levels to GPs using a FixedNoiseGaussianLikelihood (#241) [requires gpytorch > 0.3.5]

Fix q-batch dimension issue in ConstrainedExpectedImprovement (6c067185f56d3a244c4093393b8a97388fb1c0b3)

Fix parameter constraint issues on GPU (#260)

Minor changes

Add decorator for concatenating pending points (#240)

Draw independent sample from prior for each hyperparameter (#244)

Allow dim > 1111 for gen_batch_initial_conditions (#249)

Allow optimize_acqf to use q>1 for AnalyticAcquisitionFunction (#257)

Allow excluding parameters in fit functions (#259)

Track the final iteration objective value in fit_gpytorch_scipy (#258)

Error out on unexpected dims in parameter constraint generation (#270)

Compute acquisition values in gen_ functions w/o grad (#274)

Tests

Introduce BotorchTestCase to simplify test code (#243)

Refactor tests to have monolithic cuda tests (#261)

v0.1.3(Aug 10, 2019)
Compatibility

Updates to support breaking changes in PyTorch to boolean masks and tensor comparisons (#224).

Require PyTorch >=1.2 (#225).

Require GPyTorch >=0.3.5 (itself a compatibility release).

New Features

Add FixedFeatureAcquisitionFunction wrapper that simplifies optimizing acquisition functions over a subset of input features (#219).

Add ScalarizedObjective for scalarizing posteriors (#210).

Change default optimization behavior to use L-BFGS-B by default for box constraints (#207).

Bug fixes

Add validation to candidate generation (#213), making sure constraints are strictly satisfied (rather than just up to numerical accuracy of the optimizer).

Minor changes

Introduce AcquisitionObjective base class (#220).

Add propagate_grads context manager, replacing the propagate_grads kwarg in model posterior() calls (#221)

Add batch_initial_conditions argument to joint_optimize() for warm-starting the optimization (ec3365a37ed02319e0d2bb9bea03aee89b7d9caa).

Add return_best_only argument to joint_optimize() (#216). Useful for implementing advanced warm-starting procedures.

v0.1.2(Jul 10, 2019)
Bug fixes

Avoid PyTorch bug resulting in bad gradients on GPU by requiring GPyTorch >= 0.3.4

Fixes to resampling behavior in MCSamplers (#204)

Experimental Features

Linear truncated kernel for multi-fidelity Bayesian optimization (#192)

SingleTaskMultiFidelityGP for GP models that have fidelity parameters (#181)

v0.1.1(Jun 28, 2019)
Breaking changes

Rename botorch.qmc to botorch.sampling, and move MC samplers from acquisition.sampler to botorch.sampling.samplers (#172)

New Features

Add condition_on_observations and fantasize to the Model level API (#173)

Support pending observations generically for all MCAcquisitionFunctions (#176)

Add fidelity kernel for training iterations/training data points (#178)

Support for optimization constraints across q-batches (to support things like sample budget constraints) (2a95a6c3f80e751d5cf8bc7240ca9f5b1529ec5b)

Add ModelList <-> Batched Model converter (#187)

New test functions

basic: neg_ackley, cosine8, neg_levy, neg_rosenbrock, neg_shekel (e26dc7576c7bf5fa2ba4cb8fbcf45849b95d324b)

for multi-fidelity BO: neg_aug_branin, neg_aug_hartmann6, neg_aug_rosenbrock (ec4aca744f65ca19847dc368f9fee4cc297533da)

Improved functionality:

More robust model fitting

Catch gpytorch numerical issues and return NaN to the optimizer (#184)

Restart optimization upon failure by sampling hyperparameters from their prior (#188)

Sequentially fit batched and ModelListGP models by default (#189)

Change minimum inferred noise level (e2c64fef1e76d526a33951c5eb75ac38d5581257)

Introduce optional batch limit in joint_optimize to increase scalability of parallel optimization (baab5786e8eaec02d37a511df04442471c632f8a)

Change constructor of ModelListGP to comply with GPyTorch’s IndependentModelList constructor (a6cf739e769c75319a67c7525a023ece8806b15d)

Use torch.random to set default seed for samplers (rather than random) to make sampling reproducible when setting torch.manual_seed (ae507ad97255d35f02c878f50ba68a2e27017815)

Performance Improvements

Use einsum in LinearMCObjective (22ca29535717cda0fcf7493a43bdf3dda324c22d)

Change default Sobol sample size for MCAcquisitionFunctions to be base-2 for better MC integration performance (5d8e81866a23d6bfe4158f8c9b30ea14dd82e032)

Add ability to fit models in SumMarginalLogLikelihood sequentially (and make that the default setting) (#183)

Do not construct the full covariance matrix when computing posterior of single-output BatchedMultiOutputGPyTorchModel (#185)

Bug fixes

Properly handle observation_noise kwarg for BatchedMultiOutputGPyTorchModels (#182)

Fix an issue where f_best was always max for NoisyExpectedImprovement (410de585f07de0c66427d5066947e22227d11537)

Fix bug and numerical issues in initialize_q_batch (844dcd1dc8f418ae42639e211c6bb8e31a75d8bf)

Fix numerical issues with inv_transform for qMC sampling (#162)

Other

Bump GPyTorch minimum requirement to 0.3.3
