A highly efficient and modular implementation of Gaussian Processes in PyTorch




GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian process models with ease.

Internally, GPyTorch differs from many existing approaches to GP inference by performing all inference operations using modern numerical linear algebra techniques like preconditioned conjugate gradients. Implementing a scalable GP method is as simple as providing a matrix multiplication routine with the kernel matrix and its derivative via our LazyTensor interface, or by composing many of our already existing LazyTensors. This allows not only for easy implementation of popular scalable GP techniques, but often also for significantly improved utilization of GPU computing compared to solvers based on the Cholesky decomposition.
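To make the idea of "inference from a matrix multiplication routine" concrete, here is a minimal, library-independent sketch of preconditioned conjugate gradients in plain Python. It is not GPyTorch's implementation: the names `pcg_solve`, `k_matmul`, and `jacobi`, and the toy 3x3 matrix, are assumptions made for illustration. The point is that the solver only ever calls `matmul(v)` and never factorizes or even stores the matrix itself.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def pcg_solve(
    matmul: Callable[[Vector], Vector],
    rhs: Vector,
    precond: Callable[[Vector], Vector] = lambda r: list(r),
    tol: float = 1e-10,
    max_iter: int = 100,
) -> Vector:
    """Solve K x = rhs for symmetric positive definite K using only K @ v products."""
    x = [0.0] * len(rhs)
    r = list(rhs)        # residual rhs - K x, with x = 0 initially
    z = precond(r)       # preconditioned residual
    p = list(z)          # search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Kp = matmul(p)   # the only place the matrix is touched
        alpha = rz / dot(p, Kp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, Kp)]
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy symmetric positive definite matrix standing in for a kernel matrix plus noise.
K = [[1.1, 0.6, 0.1],
     [0.6, 1.1, 0.6],
     [0.1, 0.6, 1.1]]


def k_matmul(v: Vector) -> Vector:
    return [dot(row, v) for row in K]


def jacobi(r: Vector) -> Vector:
    # Diagonal (Jacobi) preconditioner: divide by the diagonal of K.
    return [ri / K[i][i] for i, ri in enumerate(r)]


y = [1.0, 2.0, 3.0]
x = pcg_solve(k_matmul, y, precond=jacobi)
residual = [a - b for a, b in zip(k_matmul(x), y)]
```

Swapping `k_matmul` for a routine that exploits structure (Toeplitz, Kronecker, interpolation) changes the cost of each iteration without touching the solver, which is the design choice the paragraph above describes.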

GPyTorch provides (1) significant GPU acceleration (through inference based on matrix-vector multiplications, MVMs); (2) state-of-the-art implementations of the latest algorithmic advances for scalability and flexibility (SKI/KISS-GP, stochastic Lanczos expansions, LOVE, SKIP, stochastic variational deep kernel learning, ...); (3) easy integration with deep learning frameworks.

Examples, Tutorials, and Documentation

See our numerous examples and tutorials on how to construct all sorts of models in GPyTorch.



Installation

Requirements:

  • Python >= 3.6
  • PyTorch >= 1.7

Install GPyTorch using pip or conda:

pip install gpytorch
conda install gpytorch -c gpytorch

(To use packages globally but install GPyTorch as a user-only package, use pip install --user above.)

Latest (unstable) version

To upgrade to the latest (unstable) version, run

pip install --upgrade git+https://github.com/cornellius-gp/gpytorch.git

ArchLinux Package

Note: Experimental AUR package. For most users, we recommend installation by conda or pip.

GPyTorch is also available on the ArchLinux User Repository (AUR). You can install it with an AUR helper, like yay, as follows:

yay -S python-gpytorch

To discuss any issues related to this AUR package, refer to the comments section of python-gpytorch.

Citing Us

If you use GPyTorch, please cite the following papers:

Gardner, Jacob R., Geoff Pleiss, David Bindel, Kilian Q. Weinberger, and Andrew Gordon Wilson. "GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration." In Advances in Neural Information Processing Systems (2018).

@inproceedings{gardner2018gpytorch,
  title={GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration},
  author={Gardner, Jacob R and Pleiss, Geoff and Bindel, David and Weinberger, Kilian Q and Wilson, Andrew Gordon},
  booktitle={Advances in Neural Information Processing Systems},
  year={2018}
}


To run the unit tests:

python -m unittest

By default, the random seeds are locked down for some of the tests. If you want to run the tests without locking down the seed, run

UNLOCK_SEED=true python -m unittest

If you plan on submitting a pull request, please make use of our pre-commit hooks to ensure that your commits adhere to the general style guidelines enforced by the repo. To do this, navigate to your local repository and run:

pip install pre-commit
pre-commit install

From then on, this will automatically run flake8, isort, black and other tools over the files you commit each time you commit to gpytorch or a fork of it.

The Team

GPyTorch is primarily maintained by:

We would like to thank our other contributors including (but not limited to) David Arbour, Eytan Bakshy, David Eriksson, Jared Frank, Sam Stanton, Bram Wallace, Ke Alexander Wang, Ruihan Wu.


Development of GPyTorch is supported by funding from the Bill and Melinda Gates Foundation, the National Science Foundation, and SAP.

  • Add priors [WIP]

    This is an early attempt at adding priors. Lots of callsites in the code aren't updated yet, so this will fail spectacularly.

    The main thing we need to figure out is how to properly do the optimization using standard gpytorch optimizers that don't support bounds. We should probably modify the smoothed uniform prior so it has full support and is differentiable everywhere but decays rapidly outside the given bounds. Does this sound reasonable?
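One way to picture the proposal is the sketch below: an unnormalized log-density that is flat inside the bounds and falls off with Gaussian tails outside them, so it has full support and a continuous gradient. The function names, the `sigma` parameter, and the choice of quadratic decay are assumptions for illustration only, not the prior that was eventually implemented.

```python
def smoothed_box_log_prob(x: float, lower: float, upper: float, sigma: float = 0.01) -> float:
    """Unnormalized log-density: zero on [lower, upper], Gaussian tails outside."""
    # Distance from x to the interval (zero when x is inside the bounds).
    dist = max(lower - x, 0.0, x - upper)
    return -0.5 * (dist / sigma) ** 2


def smoothed_box_grad(x: float, lower: float, upper: float, sigma: float = 0.01) -> float:
    """Derivative of the log-density with respect to x."""
    if x < lower:
        return (lower - x) / sigma ** 2     # positive: pushes x up toward the bounds
    if x > upper:
        return -(x - upper) / sigma ** 2    # negative: pushes x down toward the bounds
    return 0.0                              # flat inside the bounds
```

Because the gradient is zero at both bounds and grows linearly outside them, an unconstrained optimizer such as Adam is pulled back toward the interval instead of hitting a hard wall; a smaller `sigma` makes the decay steeper.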

    opened by Balandat 50
  • Using batch-GP for learning a single common GP over multiple experiments

    Howdy folks,

    Reading the docs, I understand that batch-GP is meant to learn k independent GPs, from k independent labels y over a common data set x.

    y1 = f1(x), y2 = f2(x), ..., yk = fk(x) , for k independent GPs.

    But how would one go about using batch-GP to learn a single common GP, from k independent experiments of the same underlying process?

    y1=f(x1), y2 = f(x2), ..., yk=f(xk) for one and the same GP

    For instance, I have k sets of data and labels (y) representing measurements of how the temperature changes with altitude (x) (e.g. from weather balloons launched at k different geographical locations), and I want to induce a GP prior that represents the temperature change over altitude between mean sea level and some maximum altitude, marginalized over all the geographical areas.

    Thanks in advance


    opened by Galto2000 25
  • Ensure compatibility with breaking changes in pytorch master branch

    This is a run of the simple_gp_regression example notebook on the current alpha_release branch. Running kissgp_gp_regression_cuda yields similar errors

    import math
    import torch
    import gpytorch
    from matplotlib import pyplot as plt
    %matplotlib inline
    %load_ext autoreload
    %autoreload 2
    from torch.autograd import Variable
    # Training data is 11 points in [0,1] inclusive regularly spaced
    train_x = Variable(torch.linspace(0, 1, 11))
    # True function is sin(2*pi*x) with Gaussian noise N(0,0.04)
    train_y = Variable(torch.sin(train_x.data * (2 * math.pi)) + torch.randn(train_x.size()) * 0.2)
    from torch import optim
    from gpytorch.kernels import RBFKernel
    from gpytorch.means import ConstantMean
    from gpytorch.likelihoods import GaussianLikelihood
    from gpytorch.random_variables import GaussianRandomVariable
    # We will use the simplest form of GP model, exact inference
    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
            # Our mean function is constant in the interval [-1,1]
            self.mean_module = ConstantMean(constant_bounds=(-1, 1))
            # We use the RBF kernel as a universal approximator
            self.covar_module = RBFKernel(log_lengthscale_bounds=(-5, 5))
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            # Return model output as GaussianRandomVariable
            return GaussianRandomVariable(mean_x, covar_x)
    # initialize likelihood and model
    likelihood = GaussianLikelihood(log_noise_bounds=(-5, 5))
    model = ExactGPModel(train_x.data, train_y.data, likelihood)
    # Find optimal model hyperparameters
    # Use adam optimizer on model and likelihood parameters
    optimizer = optim.Adam(list(model.parameters()) + list(likelihood.parameters()), lr=0.1)
    optimizer.n_iter = 0
    training_iter = 50
    for i in range(training_iter):
        # Zero gradients from previous iteration
        optimizer.zero_grad()
        # Output from model
        output = model(train_x)
        # Calc loss and backprop gradients
        loss = -model.marginal_log_likelihood(likelihood, output, train_y)
        loss.backward()
        optimizer.n_iter += 1
        print('Iter %d/%d - Loss: %.3f   log_lengthscale: %.3f   log_noise: %.3f' % (
            i + 1, training_iter, loss.data[0],
            model.covar_module.log_lengthscale.data[0, 0],
    TypeError                                 Traceback (most recent call last)
    <ipython-input-8-bdcf88774fd0> in <module>()
         14     output = model(train_x)
         15     # Calc loss and backprop gradients
    ---> 16     loss = -model.marginal_log_likelihood(likelihood, output, train_y)
         17     loss.backward()
         18     optimizer.n_iter += 1
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/models/exact_gp.py in marginal_log_likelihood(self, likelihood, output, target, n_data)
         43             raise RuntimeError('You must train on the training targets!')
    ---> 45         mean, covar = likelihood(output).representation()
         46         n_data = target.size(-1)
         47         return gpytorch.exact_gp_marginal_log_likelihood(covar, target - mean).div(n_data)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/module.py in __call__(self, *inputs, **kwargs)
        158                 raise RuntimeError('Input must be a RandomVariable or Variable, was a %s' %
        159                                    input.__class__.__name__)
    --> 160         outputs = self.forward(*inputs, **kwargs)
        161         if isinstance(outputs, Variable) or isinstance(outputs, RandomVariable) or isinstance(outputs, LazyVariable):
        162             return outputs
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/likelihoods/gaussian_likelihood.py in forward(self, input)
         14         assert(isinstance(input, GaussianRandomVariable))
         15         mean, covar = input.representation()
    ---> 16         noise = gpytorch.add_diag(covar, self.log_noise.exp())
         17         return GaussianRandomVariable(mean, noise)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/__init__.py in add_diag(input, diag)
         36         return input.add_diag(diag)
         37     else:
    ---> 38         return _add_diag(input, diag)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/functions/__init__.py in add_diag(input, diag)
         18                        component added.
         19     """
    ---> 20     return AddDiag()(input, diag)
    /data/users/balandat/fbsource/fbcode/buck-out/dev-nosan/gen/experimental/ae/bento_kernel_ae_experimental#link-tree/gpytorch/functions/add_diag.py in forward(self, input, diag)
         12         if input.ndimension() == 3:
         13             diag_mat = diag_mat.unsqueeze(0).expand_as(input)
    ---> 14         return diag_mat.mul_(val).add_(input)
         16     def backward(self, grad_output):
    TypeError: mul_ received an invalid combination of arguments - got (Variable), but expected one of:
     * (float value)
          didn't match because some of the arguments have invalid types: (!Variable!)
     * (torch.FloatTensor other)
          didn't match because some of the arguments have invalid types: (!Variable!)
    opened by Balandat 25
  • import gpytorch error

    $ sudo python setup.py install [sudo] password for ubuntu: running install running bdist_egg running egg_info writing dependency_links to gpytorch.egg-info/dependency_links.txt writing top-level names to gpytorch.egg-info/top_level.txt writing requirements to gpytorch.egg-info/requires.txt writing gpytorch.egg-info/PKG-INFO reading manifest file 'gpytorch.egg-info/SOURCES.txt' writing manifest file 'gpytorch.egg-info/SOURCES.txt' installing library code to build/bdist.linux-x86_64/egg running install_lib running build_py copying gpytorch/libfft/init.py -> build/lib.linux-x86_64-3.5/gpytorch/libfft running build_ext generating cffi module 'build/temp.linux-x86_64-3.5/gpytorch.libfft._libfft.c' already up-to-date creating build/bdist.linux-x86_64/egg creating build/bdist.linux-x86_64/egg/gpytorch creating build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/init.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/mean.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/means/constant_mean.py -> build/bdist.linux-x86_64/egg/gpytorch/means copying build/lib.linux-x86_64-3.5/gpytorch/gp_model.py -> build/bdist.linux-x86_64/egg/gpytorch copying build/lib.linux-x86_64-3.5/gpytorch/init.py -> build/bdist.linux-x86_64/egg/gpytorch creating build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/init.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/constant_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/independent_random_variables.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/samples_random_variable.py -> 
build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/gaussian_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/batch_random_variables.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/categorical_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables copying build/lib.linux-x86_64-3.5/gpytorch/random_variables/bernoulli_random_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/random_variables creating build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/init.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/gaussian_likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods copying build/lib.linux-x86_64-3.5/gpytorch/likelihoods/bernoulli_likelihood.py -> build/bdist.linux-x86_64/egg/gpytorch/likelihoods creating build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/init.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/kronecker_product_lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/toeplitz_lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/lazy/lazy_variable.py -> build/bdist.linux-x86_64/egg/gpytorch/lazy copying build/lib.linux-x86_64-3.5/gpytorch/module.py -> build/bdist.linux-x86_64/egg/gpytorch creating 
build/bdist.linux-x86_64/egg/gpytorch/inference copying build/lib.linux-x86_64-3.5/gpytorch/inference/init.py -> build/bdist.linux-x86_64/egg/gpytorch/inference creating build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/init.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/exact_gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/posterior_models/variational_gp_posterior.py -> build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models copying build/lib.linux-x86_64-3.5/gpytorch/inference/inference.py -> build/bdist.linux-x86_64/egg/gpytorch/inference creating build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/init.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/log_normal_cdf.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/normal_cdf.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/dsmm.py -> build/bdist.linux-x86_64/egg/gpytorch/functions copying build/lib.linux-x86_64-3.5/gpytorch/functions/add_diag.py -> build/bdist.linux-x86_64/egg/gpytorch/functions creating build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/toeplitz.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/interpolation.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/init.py -> 
build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/lincg.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/fft.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/lanczos_quadrature.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/function_factory.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/kronecker_product.py -> build/bdist.linux-x86_64/egg/gpytorch/utils copying build/lib.linux-x86_64-3.5/gpytorch/utils/circulant.py -> build/bdist.linux-x86_64/egg/gpytorch/utils creating build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/init.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/grid_interpolation_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/rbf_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/spectral_mixture_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels copying build/lib.linux-x86_64-3.5/gpytorch/kernels/index_kernel.py -> build/bdist.linux-x86_64/egg/gpytorch/kernels creating build/bdist.linux-x86_64/egg/gpytorch/libfft copying build/lib.linux-x86_64-3.5/gpytorch/libfft/init.py -> build/bdist.linux-x86_64/egg/gpytorch/libfft copying build/lib.linux-x86_64-3.5/gpytorch/libfft/_libfft.abi3.so -> build/bdist.linux-x86_64/egg/gpytorch/libfft byte-compiling build/bdist.linux-x86_64/egg/gpytorch/means/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/means/mean.py to mean.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/means/constant_mean.py to constant_mean.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/gp_model.py to gp_model.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/constant_random_variable.py to constant_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/independent_random_variables.py to independent_random_variables.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/samples_random_variable.py to samples_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/gaussian_random_variable.py to gaussian_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/batch_random_variables.py to batch_random_variables.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/random_variable.py to random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/categorical_random_variable.py to categorical_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/random_variables/bernoulli_random_variable.py to bernoulli_random_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/likelihood.py to likelihood.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/gaussian_likelihood.py to gaussian_likelihood.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/likelihoods/bernoulli_likelihood.py to bernoulli_likelihood.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/lazy/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/kronecker_product_lazy_variable.py to kronecker_product_lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/toeplitz_lazy_variable.py to toeplitz_lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/lazy/lazy_variable.py to lazy_variable.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/module.py to module.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/gp_posterior.py to gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/exact_gp_posterior.py to exact_gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/posterior_models/variational_gp_posterior.py to variational_gp_posterior.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/inference/inference.py to inference.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/log_normal_cdf.py to log_normal_cdf.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/normal_cdf.py to normal_cdf.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/dsmm.py to dsmm.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/functions/add_diag.py to add_diag.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/toeplitz.py to toeplitz.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/interpolation.py to interpolation.cpython-35.pyc byte-compiling 
build/bdist.linux-x86_64/egg/gpytorch/utils/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/lincg.py to lincg.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/fft.py to fft.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/lanczos_quadrature.py to lanczos_quadrature.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/function_factory.py to function_factory.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/kronecker_product.py to kronecker_product.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/utils/circulant.py to circulant.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/init.py to init.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/grid_interpolation_kernel.py to grid_interpolation_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/kernel.py to kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/rbf_kernel.py to rbf_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/spectral_mixture_kernel.py to spectral_mixture_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/kernels/index_kernel.py to index_kernel.cpython-35.pyc byte-compiling build/bdist.linux-x86_64/egg/gpytorch/libfft/init.py to init.cpython-35.pyc creating stub loader for gpytorch/libfft/_libfft.abi3.so byte-compiling build/bdist.linux-x86_64/egg/gpytorch/libfft/_libfft.py to _libfft.cpython-35.pyc creating build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying gpytorch.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying 
gpytorch.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt zip_safe flag not set; analyzing archive contents... gpytorch.libfft.pycache._libfft.cpython-35: module references file creating 'dist/gpytorch-0.1-py3.5-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it removing 'build/bdist.linux-x86_64/egg' (and everything under it) Processing gpytorch-0.1-py3.5-linux-x86_64.egg removing '/usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg' (and everything under it) creating /usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg Extracting gpytorch-0.1-py3.5-linux-x86_64.egg to /usr/local/lib/python3.5/dist-packages gpytorch 0.1 is already the active version in easy-install.pth

    Installed /usr/local/lib/python3.5/dist-packages/gpytorch-0.1-py3.5-linux-x86_64.egg Processing dependencies for gpytorch==0.1 Searching for cffi==1.10.0 Best match: cffi 1.10.0 Adding cffi 1.10.0 to easy-install.pth file

    Using /usr/local/lib/python3.5/dist-packages Searching for pycparser==2.18 Best match: pycparser 2.18 Adding pycparser 2.18 to easy-install.pth file

    Using /usr/local/lib/python3.5/dist-packages Finished processing dependencies for gpytorch==0.1

    $ python
    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import gpytorch
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/ubuntu/gpytorch-master/gpytorch/__init__.py", line 3, in <module>
        from .lazy import LazyVariable, ToeplitzLazyVariable
      File "/home/ubuntu/gpytorch-master/gpytorch/lazy/__init__.py", line 2, in <module>
        from .toeplitz_lazy_variable import ToeplitzLazyVariable
      File "/home/ubuntu/gpytorch-master/gpytorch/lazy/toeplitz_lazy_variable.py", line 4, in <module>
        from gpytorch.utils import toeplitz
      File "/home/ubuntu/gpytorch-master/gpytorch/utils/toeplitz.py", line 2, in <module>
        import gpytorch.utils.fft as fft
      File "/home/ubuntu/gpytorch-master/gpytorch/utils/fft.py", line 1, in <module>
        from .. import libfft
      File "/home/ubuntu/gpytorch-master/gpytorch/libfft/__init__.py", line 3, in <module>
        from ._libfft import lib as _lib, ffi as _ffi
    ImportError: No module named 'gpytorch.libfft._libfft'

    opened by chalesguo 22
  • Heteroskedastic likelihoods and log-noise models

    Allows specifying generic (log-)noise models that are used to obtain out-of-sample noise estimates. For example, a GP fit on the (log-)measured standard errors of the observed data can be plugged into the GaussianLikelihood and then fit jointly with the GP that models the data itself.

    enhancement WIP 
    opened by Balandat 20
  • Arbitrary number of batch dimensions for LazyTensors

    Major refactors

    • [x] Refactor _get_indices from all LazyTensors
    • [x] Simplify _getitem to handle all cases - including tensor indices
    • [x] Write efficient _getitem for (most) LazyTensors
      • [x] CatLazyTensor
      • [x] BlockDiagLazyTensor
      • [x] ToeplitzLazyTensor
    • [x] Write efficient _get_indices for all LazyTensors
    • [x] Add a custom _expand_batch method for certain LazyTensors
    • [x] Add a custom _unsqueeze_batch method for certain LazyTensors
    • [x] BlockDiagLazyTensor and SumBatchLazyTensor use an explicit batch dimension (rather than implicit one) for the block structure. Also they can sum/block along any batch dimension.
    • [x] Custom _sum_batch and _prod_batch methods
      • [x] NonLazyTensor
      • [x] DiagLazyTensor
      • [x] InterpolatedLazyTensor
      • [x] ZeroLazyTensor

    New features

    • [x] LazyTensors now handle multiple batch dimensions
    • [x] LazyTensors have squeeze and unsqueeze methods
    • [x] Replace sum_batch with sum (can accept arbitrary dimensions)
    • [x] Replace mul_batch with prod (can accept arbitrary dimensions)
    • [x] LazyTensor.mul now expects a tensor of size *constant_size, 1, 1 for constant mul. (More consistent with the Tensor api).
    • [x] Add broadcasting capabilities to remaining LazyTensors


    Tests

    • [x] Add MultiBatch tests for all LazyTensors
    • [x] Add unit tests for BlockDiagLazyTensor and SumBatchLazyTensor using any batch dimension for summing/blocking
    • [x] Add tests for sum and prod methods
    • [x] Add tests for constant mul
    • [x] Add tests for permuting dimensions
    • [x] Add tests for LazyEvaluatedKernelTensor

    Miscellaneous todos (as part of the whole refactoring process)

    • [x] Ensure that InterpolatedLazyTensor.diag didn't become more inefficient
    • [x] Make CatLazyTensor work on batch dimensions
    • [x] Add to LT docs that users might have to overwrite _getitem, _get_indices, _unsqueeze_batch, _expand_batch, and transpose.
    • [x] Fix #573


    The new __getitem__ method reduces all possible indices to two cases:

    • The row and/or column of the LT is absorbed into one of the batch dimensions (this happens when a batch dimension is tensor indexed and the row/column are as well). This calls the sub-method _get_indices, in which all dimensions are indexed by Tensor indices. The output is a Tensor.
    • Neither the row nor column are absorbed into one of the batch dimensions. In this case, the _getitem sub-method is called, and the resulting output will be an LT with a reduced row and column.

    Closes #369 Closes #490 Closes #533 Closes #532 Closes #573

    enhancement refactor 
    opened by gpleiss 19
  • Replicating results presented in Doubly Stochastic Variational Inference for Deep Gaussian Processes

    Hi, has anybody succeeded in replicating the results of the paper Doubly Stochastic Variational Inference for Deep Gaussian Processes by Salimbeni and Deisenroth in GPyTorch? There is an example DeepGP notebook referring to the paper, but when I run it on the datasets used in the paper I often observe divergence in the test log-likelihood (the attached plot shows training on the kin8nm dataset).

    The divergence does not occur every time, but I am not sure what causes it and I see no way to control it...

    I am attaching my modified notebook with reading of the datasets, a model without residual connections, batch size and layer dimensions as in the paper. Any idea what is happening here?


    Thanks, Jan

    opened by JanSochman 18
  • Add TriangularLazyTensor

    Adds a new TriangularLazyTensor abstraction. This tensor can be upper or lower (default) triangular. This simplifies a bunch of stuff with solves, dets, logprobs etc.

    Some of the changes with larger blast radius in this PR are:

    1. CholLazyTensor now takes in a TriangularLazyTensor
    2. The _cholesky method is expected to return a TriangularLazyTensor
    3. The _cholesky method now takes an upper kwarg (this allows working with both lower and upper variants of TriangularLazyTensor)
    4. DiagLazyTensor is now subclassed from TriangularLazyTensor
    5. The memoization functionality is updated to allow caching results depending on args/kwargs (required for dealing with the upper/lower kwargs). By setting ignore_args=False in the @cached decorator, the existing behavior can be replicated.
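
    To make the memoization change in item 5 concrete, here is a minimal sketch of a cache keyed on call arguments. The decorator name cached_by_args and the Factor class are hypothetical and written for illustration; this is not the implementation in gpytorch.utils.memoize.

```python
import functools
import pickle


def cached_by_args(method):
    """Cache a method's results per instance, keyed on its args/kwargs."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        cache = self.__dict__.setdefault("_memo", {})
        # The key includes the call arguments, so upper=True and
        # upper=False are stored as separate entries.
        key = (method.__name__, pickle.dumps((args, sorted(kwargs.items()))))
        if key not in cache:
            cache[key] = method(self, *args, **kwargs)
        return cache[key]

    return wrapper


class Factor:
    def __init__(self):
        self.calls = 0

    @cached_by_args
    def cholesky(self, upper=False):
        self.calls += 1
        return "upper factor" if upper else "lower factor"


f = Factor()
f.cholesky()            # computed
f.cholesky()            # served from the cache
f.cholesky(upper=True)  # different kwargs, so computed again
print(f.calls)  # 2
```

    A cache that ignored the arguments would return the lower factor for the upper=True call, which is the failure mode this change guards against.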

    Some improvements:

    1. CholLazyTensor now has a more efficient inv_matmul and inv_quad methods using the factorization of the matrix.
    2. KroneckerProductLazyTensor now returns a Cholesky decomposition that itself uses a Kronecker product representation [previously suggested in #1086]
    3. Added a test_cholesky test to the LazyTensorTestCase (this covers some previously uncovered cases explicitly)
    4. There were a number of hard-to-spot issues due to hacky manual cache handling - I replaced all these call sites with the cache helpers from gpytorch.utils.memoize, which is the correct way to go about this.
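
    The Kronecker improvement in item 2 rests on the fact that the lower Cholesky factor of a Kronecker product is the Kronecker product of the lower Cholesky factors. A small NumPy check with dense matrices follows; the sizes and random matrices are arbitrary, and it does not use KroneckerProductLazyTensor.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_spd(n):
    """Return a random symmetric positive-definite n x n matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)


A, B = random_spd(3), random_spd(4)

# Cholesky of the full 12 x 12 Kronecker product.
L_full = np.linalg.cholesky(np.kron(A, B))
# Kronecker product of the two small Cholesky factors.
L_kron = np.kron(np.linalg.cholesky(A), np.linalg.cholesky(B))

print(np.allclose(L_full, L_kron))
```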
    enhancement refactor 
    opened by Balandat 18
  • [Docs] Pointer to get started with (bayesian) GPLVM

    I am in the process of exploring gpytorch for some of my GP applications. Currently I use pyro for GPLVM tasks (i.e. https://pyro.ai/examples/gplvm.html). I am always interested in trying out various approaches, so I would like to see how I can do similar things in gpytorch.

    Specifically, I am interested in the bayesian GPLVM as described in Titsias et al 2010.

    I have found some documentation on handling uncertain inputs, so I am guessing that would be a good place to start, but I would love to hear some thoughts from any of the gpytorch developers.

    opened by holmrenser 17
  • [Bug] Upstream changes to tensor comparisons breaks things

    🐛 Bug

    After https://github.com/pytorch/pytorch/pull/21113 a bunch of tests are failing b/c of the change in tensor comparison behavior (return type from uint8 to bool). Creating this issue to track the fix.

    bug compatibility 
    opened by Balandat 17
  • Query regarding custom kernel implementation.

    Hi, I am very new to the GPyTorch library and was looking at the examples related to custom kernel creation. My overall goal is to create a non-stationary kernel (with multiple lengthscales and independent multioutput features).

    I understood that K' = K(x-x') * K(x+x') is a non-stationary kernel ( I am using K as RBF kernel ). I created K(x+x') by just negating the x' ( torch.neg(x') ) and passed it to the covar_dist() function provided by the gpytorch library.

    I want to implement something as below:

                self.covariance_module = gpytorch \
                    .kernels \
                    .ScaleKernel (
                        base_kernel = gpytorch
                        .kernels
                        .RBFKernel ( batch_shape = torch.Size ( [ config [ 'Dy' ] ] ),
                                     ard_num_dims = config [ 'Dy' ] ),
                        batch_shape = torch.Size ( [ config [ 'Dy' ] ] )
                    )
                # Below definition fails with error :
                # Matrix not positive definite after repeatedly adding jitter up to 1.0e-06.
                self.covariance_module_ = gpytorch \
                    .kernels \
                    .ScaleKernel (
                        base_kernel = TestKernel ( batch_shape = torch.Size ( [ config [ 'Dy' ] ] ) ,
                                                   ard_num_dims = config [ 'Dy' ] ) ,
                        batch_shape = torch.Size ( [ config [ 'Dy' ] ] )
                    )
                # Below definition works fine.
                # self.covariance_module_ = TestKernel ()

    and in the forward() method I am multiplying two kernels as below:

    covariance_x = self.covariance_module ( x ) * self.covariance_module_ ( x )

    My K(x+x') kernel function is as below:

    class TestKernel ( gpytorch.kernels.Kernel ) :
        has_lengthscale = True
        def forward ( self, x1, x2, diag = False, last_dim_is_batch = False, **params ):
            x1_ = x1.div ( self.lengthscale )
            x2_ = x2.div ( self.lengthscale )
            diff = self.covar_dist ( x1_, torch.neg ( x2_ ), square_dist = True, diag = diag,
                                     postprocess = True, dist_postprocess_func = postprocess_rbf,
                                     **params )
            return diff

    Thank you very much in advance for your reply.
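
    One thing worth checking is whether K(x+x') is a valid (positive semi-definite) kernel on its own. The sketch below is plain NumPy rather than the GPyTorch code above, and the two input points and the unit lengthscale are assumptions chosen for illustration. It may explain part of the "not positive definite" error, but it is not a confirmed diagnosis.

```python
import numpy as np

# Two 1-D inputs, chosen to make the effect easy to see.
x = np.array([1.0, -1.0])

# Gram matrix of k(x, x') = exp(-0.5 * (x + x')**2) with unit lengthscale.
K = np.exp(-0.5 * (x[:, None] + x[None, :]) ** 2)

eigenvalues = np.linalg.eigvalsh(K)
print(eigenvalues)  # the smallest eigenvalue is negative
```

    Here the Gram matrix has exp(-2) on the diagonal and 1 off the diagonal, so its eigenvalues are exp(-2) - 1 and exp(-2) + 1, and the first is negative. A Cholesky factorization of such a matrix fails unless the added jitter exceeds the magnitude of that eigenvalue.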

    opened by Ganesh1009 0
  • Update pre-commit config

    This started off with a benign black version upgrade but ended up upgrading pretty much all of the pre-commits.

    opened by Balandat 0
  • [Question] Impact of x scale on results

    Hello, I am trying to fit some observations with the simplest model from the tutorials:

    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
    # Training data (omitted for brevity).
    #train_y = ...
    #train_x = ...
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    model = ExactGPModel(train_x, train_y, likelihood)
    # Training code (omitted for brevity).
    # Optimizer: Adam, lr=0.1
    # Loss: ExactMarginalLogLikelihood

    After training, I am evaluating the model by feeding a linspace with 100 points with this pattern:

    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        test_x = torch.linspace(0, train_x[-1], 100, dtype=torch.double)
        observed_pred = likelihood(model(test_x))

    However, I am experiencing very different results depending on how I choose to scale the x dimension.

    X in [0..1]


    X in [0..100]


    X in [0..1000]


    I am confused: I did expect a result invariant w.r.t. the scale chosen for X dimension. Is this behaviour consistent with the theory of Gaussian Processes?

    Thanks in advance for any help!
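
    For context, an RBF kernel depends on the inputs only through the distance divided by the lengthscale, so the model is invariant to the scale of X only if the lengthscale is rescaled with it. The sketch below uses plain Python; the lengthscale of 0.5 and the point spacing are assumptions for illustration, not GPyTorch defaults.

```python
import math


def rbf(x1, x2, lengthscale):
    """RBF kernel value for two scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)


# Two neighbouring inputs when X is in [0, 1] (assumed spacing of 0.01).
k_unit = rbf(0.50, 0.51, lengthscale=0.5)

# The same two points after rescaling X to [0, 1000], same lengthscale.
k_scaled = rbf(500.0, 510.0, lengthscale=0.5)

# Rescaling the lengthscale by the same factor restores the correlation.
k_matched = rbf(500.0, 510.0, lengthscale=500.0)

print(k_unit, k_scaled, k_matched)
```

    With a fixed starting lengthscale and a limited number of optimizer steps, the fit on a large X range can start from a kernel that treats neighbouring points as almost uncorrelated. Normalizing the inputs to a unit range before training is one common way to avoid depending on the optimizer to find the right scale.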

    opened by LMolr 4
  • [Bug] Negative variances obtained for reasons I do not understand

    🐛 Bug

    From time to time I observe negative variances from my trained gpytorch models when I run,

    prediction_distribution = likelihood(model)

    I don't think sharing a specific code snippet would help here. As I mentioned in a previous comment on another negative-variance issue (#864), this behavior is not reproducible even with identical Python, gpytorch, and torch versions, so a snippet I share here might well give reasonable non-negative variances when someone else runs it.

    What I am having trouble understanding is this: if the covariance matrix of the posterior distribution output by a GP at test time has the standard form (image omitted; source: page 3 of these class notes),

    then how do the variance values I see on running,

    with torch.no_grad():
        # Inference on an independent model list type model
        prediction_dist = likelihood(*model(ux, ux))
    print('vars: ', prediction_dist[0].covariance_matrix.detach().numpy(), prediction_dist[1].covariance_matrix.detach().numpy())

    ever become negative?

    Expected Behavior

    I expect to not see negative variances no matter how I train and test my GP Model. In fact I would have expected that the diagonal of the covariance matrices of gpytorch.distributions not be allowed any negative entries at all.
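
    As a point of reference, the posterior variance is a difference of two non-negative quantities, so floating-point cancellation can push a value that is exactly zero (or tiny) slightly below zero. The NumPy sketch below illustrates that effect with a noise-free unit-lengthscale RBF kernel; it is an assumption-laden toy example, not GPyTorch's prediction code, and it does not claim to explain the case reported here.

```python
import numpy as np


def rbf(a, b):
    """Unit-lengthscale RBF kernel matrix between 1-D input arrays."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2)


train_x = np.linspace(0.0, 4.0, 5)
test_x = train_x[2:3]            # predict exactly at a training input

K = rbf(train_x, train_x)        # noise-free training covariance
k_star = rbf(train_x, test_x)    # cross-covariance, shape (5, 1)

# Posterior variance: k(x*, x*) - k*^T K^{-1} k*.
# The exact answer is 0 here; the computed value is at the level of
# rounding error and may have either sign.
var = (rbf(test_x, test_x) - k_star.T @ np.linalg.solve(K, k_star)).item()
print(var)

# A common safeguard is to clamp the result at zero.
safe_var = max(var, 0.0)
```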

    opened by ishank-juneja 0
  • [Bug]No exploration on the boundary

    🐛 Bug

    I'm trying to use GPyTorch to do Bayesian optimization. I rewrote the bayes_opt structure to use a gpytorch GPR with the same kernel and hyperparameters (Matern, nu=2.5, UCB as the acquisition function, same hyperparameter for UCB). The result using gpytorch is weird (image omitted). I know the optimum is on the boundary and that BO over-explores boundaries [1], but with gpytorch it never touches the boundary. For comparison, the result using bayes_opt, which uses the sklearn GPR internally, reaches the boundary after just a few iterations (image omitted).

    [1] Siivola, Eero, et al. "Correcting boundary over-exploration deficiencies in Bayesian optimization with virtual derivative sign observations." 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2018.

    opened by XieZikai 5
  • [WIP] Implement variance reduction in SLQ logdet backward pass.

    Based on "Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning" by Wenger et al., 2021.

    When using iterative methods (i.e. CG/SLQ) to compute the log determinant, the forward pass currently computes: logdet K \approx logdet P + SLQ( P^{-1/2} K P^{-1/2} ), where P is a preconditioner, and SLQ is a stochastic estimate of the log determinant. If the preconditioner is a good approximation of K, then this forward pass can be seen as a form of variance reduction.

    In this PR, we apply this same variance reduction strategy to the backward pass. We compute the backward pass as: d logdet(K)/dtheta \approx d logdet(P)/dtheta + d SLQ/dtheta
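
    The forward-pass decomposition relies on the exact identity logdet K = logdet P + logdet(P^{-1/2} K P^{-1/2}); only the second term needs a stochastic estimate. The NumPy sketch below checks the identity with dense matrices. The diagonal preconditioner is an assumption made for brevity (the checklist below refers to pivoted Cholesky), and the second term is computed exactly rather than with SLQ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive-definite "kernel" matrix.
M = rng.standard_normal((6, 6))
K = M @ M.T + 6.0 * np.eye(6)

# Assumed preconditioner for illustration: the diagonal of K.
p = np.diag(K)
P_inv_sqrt = np.diag(1.0 / np.sqrt(p))

logdet_K = np.linalg.slogdet(K)[1]
logdet_P = np.sum(np.log(p))
# In practice this last term is what the stochastic estimate targets;
# here it is computed exactly to check the identity.
logdet_rest = np.linalg.slogdet(P_inv_sqrt @ K @ P_inv_sqrt)[1]

print(logdet_K, logdet_P + logdet_rest)
```

    The closer P is to K, the closer the preconditioned matrix is to the identity and the smaller the term left for the stochastic estimator, which is the sense in which the preconditioner reduces variance.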


    • [x] Implement pivoted cholesky as a torch.autograd.Function, so that we can compute backward passes through it.
    • [ ] Redo inv_quad_logdet function to apply variance reduction in the forward and backward passes.
    opened by gpleiss 0
  • [Bug?] Much larger variance with MultiTask kernel compared to Independent Model List

    🐛 Bug

    After training a MultiTask-kernel-based model and a collection of independent models tied together in an independent model list on the same dataset, I see predictive variances that differ by orders of magnitude. It is unclear why, since the parameters common to the two learned models (the MultiTask model MTmodel and the Independent Model List model) seem to be quite similar.

    To reproduce

    ** Code snippet to reproduce **

    import torch
    import gpytorch
    import numpy as np
    class ExactGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super().__init__(train_x, train_y, likelihood)
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel(ard_num_dims=4))
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
    class MultitaskGPModel(gpytorch.models.ExactGP):
        def __init__(self, train_x, train_y, likelihood):
            super(MultitaskGPModel, self).__init__(train_x, train_y, likelihood)
            # https://docs.gpytorch.ai/en/stable/means.html#multitaskmean
            self.mean_module = gpytorch.means.MultitaskMean(
                gpytorch.means.ConstantMean(), num_tasks=2
            )
            # Composition of index kernel and RBF kernel
            self.covar_module = gpytorch.kernels.MultitaskKernel(
                gpytorch.kernels.RBFKernel(ard_num_dims=4), num_tasks=2, rank=2
            )
        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            # https://docs.gpytorch.ai/en/stable/distributions.html#multitaskmultivariatenormal
            return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x, interleaved=False)
    # Number of train samples
    nsamps = 1000
    # Fix seed for reproducibility
    # Joint input space tensor (u_t, x_t) to hold all inputs and trajectories
    train_x = torch.tensor(np.random.uniform(low=-1.0, high=1.0, size=(nsamps, 4))).float()
    # Generate output samples
    # A and B matrices
    A = torch.tensor([[1., 0.],
                      [0., 1.]])
    B = torch.tensor([[-0.2, 0.1],
                     [0.15, 0.15]])
    # Output states starting time index 1, no observation noise
    train_y = torch.zeros(nsamps, 2)
    # Obtain the output states $(x_{t+1, 1}, x_{t+1, 2})$
    for i in range(nsamps):
        # Get the output next state
        x_next = torch.matmul(A, train_x[i, 2:4]) + torch.matmul(B, train_x[i, 0:2])
        # No observation noise added
        train_y[i, :] = x_next
    # dataset = torch.cat([train_x, train_y], dim=1)
    likelihood1 = gpytorch.likelihoods.GaussianLikelihood()
    model1 = ExactGPModel(train_x, train_y[:, 0], likelihood1)
    likelihood2 = gpytorch.likelihoods.GaussianLikelihood()
    model2 = ExactGPModel(train_x, train_y[:, 1], likelihood2)
    # Collect the sub-models in an IndependentMultiOutputGP, and the respective likelihoods in a MultiOutputLikelihood
    model = gpytorch.models.IndependentModelList(model1, model2)
    likelihood = gpytorch.likelihoods.LikelihoodList(model1.likelihood, model2.likelihood)
    mll = gpytorch.mlls.SumMarginalLogLikelihood(likelihood, model)
    # Perform Ind. Model List Training
    training_iterations = 50
    # Find optimal model hyper-parameters
    # Use the Adam optimizer
    optimizer = torch.optim.Adam([
        {'params': model.parameters()},  # Includes all submodel and all likelihood parameters
    ], lr=0.1)
    print("Training Ind. Model List\n- - - - - - - - - - ")
    for i in range(training_iterations):
        optimizer.zero_grad()
        output = model(*model.train_inputs)
        loss = -mll(output, model.train_targets)
        loss.backward()
        if (i + 1) % 5 == 0:
            print('Iter %d/%d - Loss: %.3f' % (i + 1, training_iterations, loss.item()))
        optimizer.step()
    print("- - - - - - - - - - ")
    # MTGaussianLikelihood allows for modelling a full 2x2 Noise Cov. Prior
    MTlikelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=2)
    MTmodel = MultitaskGPModel(train_x, train_y, MTlikelihood)
    training_iterations = 50
    # Find optimal MTmodel hyperparameters
    # Use the adam optimizer
    optimizer = torch.optim.Adam([
        {'params': MTmodel.parameters()},  # Includes GaussianLikelihood parameters
    ], lr=0.1)
    # "Loss" for GPs - the marginal log likelihood
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(MTlikelihood, MTmodel)
    print("Training MT Model\n- - - - - - - - - - ")
    for i in range(training_iterations):
        optimizer.zero_grad()
        output = MTmodel(train_x)
        loss = -mll(output, train_y)
        loss.backward()
        if (i + 1) % 5 == 0:
            print('Iter %d/%d - Loss: %.3f' % (i + 1, training_iterations, loss.item()))
        optimizer.step()
    print("- - - - - - - - - - ")
    # View the parameters (and others specific to MT)-
    # (1) Learned value of observation-noise covariance
    # (2) Learned constant mean prior
    # (3) Learned kernel scale parameter (\sigma)
    # (4) Learned kernel length scale (\ell)
    # Output 1 y_{a}
    print("- - - - - - - - - \nModel 1a\n- - - - - - - - - ")
    print("Learned Noise Covariance")
    print("Learned constant mean for the prior")
    print("Learned kernel scale (variance of kernel sigma)")
    print("Learned kernel length scales (one for each input) \ell")
    # Output 2 y_{b}
    print("- - - - - - - - - \nModel 1b\n- - - - - - - - - ")
    print("Learned Noise Covariance")
    print("Learned constant mean for the prior")
    print("Learned kernel scale (variance of kernel sigma)")
    print("Learned kernel length scales (one for each input) \ell")
    # MT Model
    print("- - - - - - - - - \nModel 2 (MultiTask=MT)\n- - - - - - - - - ")
    print("Learned Noise Covariance")
    print("Learned constant mean for the prior comp. 1")
    print("Learned constant mean for the prior comp. 2")
    print("Learned static Index Kernel/Fixed Covariance K_{TT} matrix")
    print("Learned kernel length scales (one for each input) \ell")
    # Set models into eval mode
    model.eval()
    likelihood.eval()
    MTmodel.eval()
    MTlikelihood.eval()
    # Shift this distance away from a train data point
    shift = 0.05
    # Train data-point
    ux1 = train_x[1:2, :]
    ux2 = train_x[1:2, :] + shift
    # Performing inference on the training data points themselves
    with torch.no_grad(), gpytorch.settings.fast_pred_var():
        # Get distributions of type multivariate-normal
        prediction_dist1 = likelihood(*model(ux1, ux1))
        prediction_dist2 = likelihood(*model(ux2, ux2))
        # Get distribution of type multi-task multi-variate normal\
        prediction_dist3 = MTlikelihood(MTmodel(ux1))
        prediction_dist4 = MTlikelihood(MTmodel(ux2))
    print("Indp Model List Mean and Variance on a Train Point")
    print('mean: ', prediction_dist1[0].mean.detach().numpy(), prediction_dist1[1].mean.detach().numpy())
    print('vars: ', prediction_dist1[0].covariance_matrix.detach().numpy(), prediction_dist1[1].covariance_matrix.detach().numpy())
    print("Indp Model List Mean and Variance Nearby a Train Point")
    print('mean: ', prediction_dist2[0].mean.detach().numpy(), prediction_dist2[1].mean.detach().numpy())
    print('vars: ', prediction_dist2[0].covariance_matrix.detach().numpy(), prediction_dist2[1].covariance_matrix.detach().numpy())
    print("MT-Model Mean and Variance on a Train Point")
    print('mean: ', prediction_dist3.mean.detach().numpy())
    print('vars:\n', prediction_dist3.covariance_matrix.detach().numpy())
    print("MT-Model Mean and Variance Nearby a Train Point")
    print('mean: ', prediction_dist4.mean.detach().numpy())
    print('vars:\n', prediction_dist4.covariance_matrix.detach().numpy())
    print("Actual Data Point (True Label)")
    print(train_y[1:2, :])

    ** Stack trace/error message **

    Training Ind. Model List
    - - - - - - - - - - 
    /home/ishank/Desktop/Gaussian-Process-Dynamics/venv/lib/python3.8/site-packages/gpytorch/utils/linear_cg.py:266: UserWarning: An output with one or more elements was resized since it had shape [11], which does not match the required output shape [1, 11].This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at  ../aten/src/ATen/native/Resize.cpp:23.)
    Iter 5/50 - Loss: 0.652
    Iter 10/50 - Loss: 0.430
    Iter 15/50 - Loss: 0.200
    Iter 20/50 - Loss: -0.039
    Iter 25/50 - Loss: -0.280
    Iter 30/50 - Loss: -0.533
    Iter 35/50 - Loss: -0.793
    Iter 40/50 - Loss: -1.047
    Iter 45/50 - Loss: -1.304
    Iter 50/50 - Loss: -1.555
    - - - - - - - - - - 
    Training MT Model
    - - - - - - - - - - 
    Iter 5/50 - Loss: 0.991
    Iter 10/50 - Loss: 0.768
    Iter 15/50 - Loss: 0.540
    Iter 20/50 - Loss: 0.301
    Iter 25/50 - Loss: 0.055
    Iter 30/50 - Loss: -0.197
    Iter 35/50 - Loss: -0.454
    Iter 40/50 - Loss: -0.711
    Iter 45/50 - Loss: -0.968
    Iter 50/50 - Loss: -1.224
    - - - - - - - - - - 
    - - - - - - - - - 
    Model 1a
    - - - - - - - - - 
    Learned Noise Covariance
    tensor([0.0055], grad_fn=<AddBackward0>)
    Learned constant mean for the prior
    Parameter containing:
    tensor([-0.0137], requires_grad=True)
    Learned kernel scale (variance of kernel sigma)
    tensor(0.3450, grad_fn=<SoftplusBackward0>)
    Learned kernel length scales (one for each input) \ell
    tensor([[3.4328, 3.5159, 1.5461, 3.3662]], grad_fn=<SoftplusBackward0>)
    - - - - - - - - - 
    Model 1b
    - - - - - - - - - 
    Learned Noise Covariance
    tensor([0.0055], grad_fn=<AddBackward0>)
    Learned constant mean for the prior
    Parameter containing:
    tensor([-0.0089], requires_grad=True)
    Learned kernel scale (variance of kernel sigma)
    tensor(0.3588, grad_fn=<SoftplusBackward0>)
    Learned kernel length scales (one for each input) \ell
    tensor([[3.3585, 3.3732, 3.5119, 1.6963]], grad_fn=<SoftplusBackward0>)
    - - - - - - - - - 
    Model 2 (MultiTask=MT)
    - - - - - - - - - 
    Learned Noise Covariance
    tensor([0.0055], grad_fn=<AddBackward0>)
    Learned constant mean for the prior comp. 1
    Parameter containing:
    tensor([0.0083], requires_grad=True)
    Learned constant mean for the prior comp. 2
    Parameter containing:
    tensor([-0.0142], requires_grad=True)
    Learned static Index Kernel/Fixed Covariance K_{TT} matrix
    Parameter containing:
    tensor([[-0.1277,  0.5546],
            [ 0.6061,  0.2162]], requires_grad=True)
    Learned kernel length scales (one for each input) \ell
    tensor([[3.3520, 3.3918, 2.7205, 2.7693]], grad_fn=<SoftplusBackward0>)
    Indp Model List Mean and Variance on a Train Point
    mean:  [-0.66263324] [0.43995908]
    vars:  [[0.00555626]] [[0.00556101]]
    Indp Model List Mean and Variance Nearby a Train Point
    mean:  [-0.61773294] [0.5056697]
    vars:  [[0.00555528]] [[0.00555663]]
    MT-Model Mean and Variance on a Train Point
    mean:  [[-0.69353205  0.44850466]]
     [[0.56116694 0.04254041]
     [0.04254041 0.6265246 ]]
    MT-Model Mean and Variance Nearby a Train Point
    mean:  [[-0.6492302   0.51464087]]
     [[0.56116694 0.04254041]
     [0.04254041 0.6265246 ]]
    Actual Data Point (True Label)
    tensor([[-0.6583,  0.4381]])

    Expected Behavior

    The covariance matrix of the posterior obtained from the MultiTask kernel model is strangely frozen at [[0.56116694, 0.04254041], [0.04254041, 0.6265246]] for both the training data point and a shifted version of it.

    I find 2 problems with the covariance matrix obtained from the MultiTask version.

    1. The diagonal entries, which are the variances, are quite large: much larger than the corresponding variances obtained from the Independent Model List. This is strange to me, since the K_{TT} index kernel matrix associated with the MultiTask kernel has entries smaller than 1.0 (as it should, since my understanding is that it is a static matrix of correlation coefficients), and the length scales and noise variances (in the range 1.5-3.5 and about 0.0056, respectively) are very similar for the MT model and the Ind. Model List. Since the final variance is the composition of K_{TT} and an RBF kernel, I would have expected the variances from the two models to have the same order of magnitude.
    2. They remain completely unchanged between these 2 points (and on other test points, for that matter).
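
    One check on the printed numbers, under the assumption that the 2x2 "K_{TT}" parameter shown in the output is the low-rank factor of the index kernel (covar_factor in gpytorch.kernels.IndexKernel) rather than the task covariance itself. In that case the task covariance would be F F^T plus a per-task variance term that is not in the printout. This is an observation about the reported values, not a confirmed diagnosis.

```python
import numpy as np

# Values copied from the "Learned static Index Kernel" printout above.
F = np.array([[-0.1277, 0.5546],
              [0.6061, 0.2162]])

# Low-rank part of the task covariance, under the assumption that the
# printed matrix is a factor F and the covariance is F @ F.T + diag(v).
low_rank = F @ F.T
print(low_rank)
```

    The off-diagonal entry of F F^T comes out at approximately 0.0425, which is close to the off-diagonal 0.04254041 of the reported predictive covariance. The diagonal cannot be reproduced from the printout alone because the per-task variance term is not shown.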

    System information

    Please complete the following information:

    • gpytorch 1.5.1
    • pytorch 1.10.0+cu102
    • Ubuntu 20.04

    Additional context

    Colab Notebook Version- https://colab.research.google.com/drive/1OalLncVeGtNHh-DqjnkScfy46uTtNud_?usp=sharing

    opened by ishank-juneja 7
  • Fix `LazyEvaluatedKernelTensor._unsqueeze_batch`

    This is in response to #1813, which caused some bugs as the indexing code in lazy tensors did not support None indexing.


    • #1827
    • pytorch/botorch#980
    • pytorch/botorch#976

    Tests are passing locally* for GPyTorch (with this patch) and BoTorch (with patch pytorch/botorch#976).

    *Except for some CUDA tests that failed before, and are unrelated

    opened by valtron 4
  • [Bug] GetItem in BatchRepeated LazyEvaluatedKernelTensor Fails

    🐛 Bug

    This is seemingly directly caused by #1813 and causes downstream botorch errors (especially https://github.com/pytorch/botorch/issues/980). I've tracked it down to the

    import torch
    from gpytorch.kernels import MaternKernel
    from gpytorch.lazy import KroneckerProductLazyTensor, BatchRepeatLazyTensor
    kernel = MaternKernel()
    BatchRepeatLazyTensor(kernel(torch.randn(30)), torch.Size((10,))).diag()

    produces an indexing error.

    To reproduce

    ** Stack trace/error message **

    IndexError                                Traceback (most recent call last)
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/lazy_evaluated_kernel_tensor.py in _getitem(self, row_index, col_index, *batch_indices)
        111         try:
    --> 112             x2 = x2[(*batch_indices, col_index, dim_index)]
        113         # We're going to handle multi-batch indexing with a try-catch loop
    IndexError: too many indices for tensor of dimension 2
    During handling of the above exception, another exception occurred:
    RuntimeError                              Traceback (most recent call last)
    /var/folders/ms/lbkrq70x7lnbnztclff1ln6r0000gn/T/ipykernel_8975/568402408.py in <module>
          7 kernel = MaternKernel()
    ----> 8 BatchRepeatLazyTensor(kernel(torch.randn(30)), torch.Size((10,))).diag()
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/lazy_tensor.py in diag(self)
       1066         row_col_iter = torch.arange(0, self.matrix_shape[-1], dtype=torch.long, device=self.device)
    -> 1067         return self[..., row_col_iter, row_col_iter]
       1069     def dim(self):
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/lazy_tensor.py in __getitem__(self, index)
       2144                 self, (*batch_indices, row_index, col_index)
       2145             )
    -> 2146             res = self._get_indices(row_index, col_index, *batch_indices)
       2147         else:
       2148             res = self._getitem(row_index, col_index, *batch_indices)
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/batch_repeat_lazy_tensor.py in _get_indices(self, row_index, col_index, *batch_indices)
         82         # Now call the sub _get_indices method
    ---> 83         res = self.base_lazy_tensor._get_indices(row_index, col_index, *batch_indices)
         84         return res
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/lazy_tensor.py in _get_indices(self, row_index, col_index, *batch_indices)
        305         batch_indices = tuple(index.expand(final_shape) for index in batch_indices)
    --> 307         base_lazy_tensor = self._getitem(_noop_index, _noop_index, *batch_indices)._expand_batch(final_shape)
        309         # Create some interoplation indices and values
    ~/Documents/GitHub/gpytorch/gpytorch/lazy/lazy_evaluated_kernel_tensor.py in _getitem(self, row_index, col_index, *batch_indices)
        115         except IndexError:
        116             if any(not isinstance(bi, slice) for bi in batch_indices):
    --> 117                 raise RuntimeError(
        118                     "Attempting to tensor index a non-batch matrix's batch dimensions. "
        119                     f"Got batch index {batch_indices} but my shape was {self.shape}"
    RuntimeError: Attempting to tensor index a non-batch matrix's batch dimensions. Got batch index (tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ....

    Expected Behavior

    Similar behavior to pre-#1813.

    Wondering if this is even possible to do without evaluating the kernel at all because we'd need to a) batch repeat the data (which is slow but possible) or b) evaluate the kernel and then batch unsqueeze it.

    I'll try to put up a PR for fixing this soon, but am stuck currently.

    System information

    Please complete the following information:

    • gpytorch master

    Additional context

    Unit test for BatchRepeatLazyTensor doesn't check the LazyEvaluatedKernelTensor case, but does check ToeplitzLazyTensor. Maybe should also update unit test.

    cc @valtron @Balandat @saitcakmak @gpleiss for visibility

    opened by wjmaddox 1
  • `LazyEvaluatedKernelTensor.add_jitter` evaluates the kernel

    The add_jitter method of the LazyEvaluatedKernelTensor class explicitly evaluates the kernel: https://github.com/cornellius-gp/gpytorch/blob/c58dc3a77c7be7ee4434484e359a0bbc1c2d27f6/gpytorch/lazy/lazy_evaluated_kernel_tensor.py#L237

    Is there a particular reason for doing so? Adding the jitter lazily, as the parent class LazyTensor does, would drastically decrease memory consumption.
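
    To illustrate what a lazy jitter addition means, here is a minimal NumPy sketch. The class LazyJitterSum is hypothetical and is not GPyTorch's LazyTensor; it only shows that (K + jitter * I) v can be computed from a matrix-vector product with K, without ever forming the sum.

```python
import numpy as np


class LazyJitterSum:
    """Hypothetical stand-in for K + jitter * I that never forms the sum."""

    def __init__(self, matmul_fn, n, jitter):
        self.matmul_fn = matmul_fn  # computes K @ v without exposing K
        self.n = n
        self.jitter = jitter

    def matmul(self, v):
        # (K + jitter * I) @ v == K @ v + jitter * v
        return self.matmul_fn(v) + self.jitter * v


K = np.array([[2.0, 0.5], [0.5, 1.0]])
lazy = LazyJitterSum(lambda v: K @ v, n=2, jitter=1e-3)

v = np.array([1.0, -1.0])
print(lazy.matmul(v))
```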

    opened by pierocor 1
  • v1.5.1(Sep 2, 2021)

    New features

    • Add gpytorch.kernels.PiecewisePolynomialKernel (#1738)
    • Include ability to turn off diagonal correction for SGPR models (#1717)
    • Include ability to cast LazyTensor to half and float types (#1726)

    Performance improvements

    • Specialty MVN log_prob method for Gaussians with sum-of-Kronecker covariances (#1674)
    • Ability to specify devices when concatenating rows of LazyTensors (#1712)
    • Improvements to LazyTensor symeig method (#1725)

    Bug fixes

    • Fix to computing batch sizes of kernels (#1685)
    • Fix SGPR prediction when fast_computations flags are turned off (#1709)
    • Improve stability of stable_qr function (#1714)
    • Fix bugs with pyro integration for full Bayesian inference (#1721)
    • num_classes in gpytorch.likelihoods.DirichletLikelihood should be an integer (#1728)
  • v1.5.0(Jun 24, 2021)

    This release adds 2 new model classes, as well as a number of bug fixes:

    • GPLVM models for unsupervised learning
    • Polya-Gamma GPs for GP classification

    In addition, this release contains numerous improvements to SGPR models (that have also been included in prior bug-fix releases).

    New features

    • Add example notebook that demos binary classification with Polya-Gamma augmentation (#1523)
    • New model class: Bayesian GPLVM with Stochastic Variational Inference (#1605)
    • Periodic kernel handles multi-dimensional inputs (#1593)
    • Add missing data gaussian likelihoods (#1668)


    • Speed up SGPR models (#1517, #1528, #1670)


    • Fix erroneous loss for ExactGP multitask models (#1647)
    • Fix pyro sampling (#1594)
    • Fix initialize bug for additive kernels (#1635)
    • Fix matrix multiplication of rectangular ZeroLazyTensor (#1295)
    • Dirichlet GPs use true train targets not labels (#1641)
  • v1.4.2(May 18, 2021)

    Various bug fixes, including

    • Use current PyTorch functionality (#1611, #1586)
    • Bug fixes to Lanczos factorization (#1607)
    • Fixes to SGPR model (#1607)
    • Various fixes to LazyTensor math (#1576, #1584)
    • SmoothedBoxPrior has a sample method (#1546)
    • Fixes to additive-structure models (#1582)
    • Doc fixes (#1603)
    • Fix to index kernel and LCM kernels (#1608, #1592)
    • Fixes to KeOps bypass (#1609)
  • v1.4.1(Apr 15, 2021)


    • Simplify interface for 3+ layer DSPP models (#1565)
    • Fix marginal log likelihood calculation for exact Bayesian inference w/ Pyro (#1571)
    • Remove CG warning for small matrices (#1562)
    • Fix Pyro cluster-multitask example notebook (#1550)
    • Fix gradients for KeOps tensors (#1543)
    • Ensure that gradients are passed through lazily-evaluated kernels (#1518)
    • Fix bugs for models with batched fantasy observations (#1529, #1499)
    • Correct default latent_dim value for LMC variational models (#1512)

    New features

    • Create gpytorch.utils.grid.ScaleToBounds utility to replace gpytorch.utils.grid.scale_to_bounds method (#1566)
    • Fix skip connections in Deep GP example (#1531)
    • Add fantasy point support for structured kernel interpolation models (#1545)


    • Add default values to all gpytorch.settings (#1564)
    • Improve Hadamard multitask notebook (#1537)


    • Speed up SGPR models (#1517, #1528)
  • v1.4.0(Feb 23, 2021)

    This release includes many major speed improvements, especially to Kronecker-factorized multi-output models.

    Performance improvements

    • Major speed improvements for Kronecker product multitask models (#1355, #1430, #1440, #1469, #1477)
    • Unwhitened VI speed improvements (#1487)
    • SGPR speed improvements (#1493)
    • Large scale exact GP speed improvements (#1495)
    • Random Fourier feature speed improvements (#1446, #1493)

    New Features

    • Dirichlet Classification likelihood (#1484) - based on Milios et al. (NeurIPS 2018)
    • MultivariateNormal objects have a base_sample_shape attribute for low-rank/degenerate distributions (#1502)

    New documentation

    • Tutorial for designing your own kernels (#1421)

    Debugging utilities

    • Better naming conventions for AdditiveKernel and ProductKernel (#1488)
    • gpytorch.settings.verbose_linalg context manager for seeing what linalg routines are run (#1489)
    • Unit test improvements (#1430, #1437)

    Bug Fixes

    • inverse_transform is applied to the initial values of constraints (#1482)
    • psd_safe_cholesky obeys cholesky_jitter settings (#1476)
    • fix scaling issue with priors on variational models (#1485)

    Breaking changes

    • MultitaskGaussianLikelihoodKronecker (deprecated) is fully incorporated in MultitaskGaussianLikelihood (#1471)
  • v1.3.1(Jan 19, 2021)


    • Spectral mixture kernels work with SKI (#1392)
    • Natural gradient descent is compatible with batch-mode GPs (#1416)
    • Fix prior mean in whitened SVGP (#1427)
    • RBFKernelGrad has no more in-place operations (#1389)
    • Fixes to ConstantDiagLazyTensor (#1381, #1385)


    • Include example notebook for multitask Deep GPs (#1410)
    • Documentation updates (#1408, #1434, #1385, #1393)


    • KroneckerProductLazyTensors use root decompositions of children (#1394)
    • SGPR now uses Woodbury formula and matrix determinant lemma (#1356)
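
    For reference, the two identities named above can be written out for a low-rank-plus-noise covariance of the kind SGPR works with. The notation is an assumption made for this sketch and does not come from the release notes: n training points, m inducing points, cross-covariance K_nm, inducing covariance K_mm, and noise variance sigma^2. Both right-hand sides only need factorizations of m x m matrices.

```latex
\begin{align}
(\sigma^2 I_n + K_{nm} K_{mm}^{-1} K_{mn})^{-1}
  &= \sigma^{-2} I_n - \sigma^{-2} K_{nm} (\sigma^2 K_{mm} + K_{mn} K_{nm})^{-1} K_{mn}, \\
\det(\sigma^2 I_n + K_{nm} K_{mm}^{-1} K_{mn})
  &= \sigma^{2(n-m)} \, \frac{\det(\sigma^2 K_{mm} + K_{mn} K_{nm})}{\det(K_{mm})}.
\end{align}
```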


    • Delta distributions have an arg_constraints attribute (#1422)
    • Cholesky factorization now takes optional diagonal noise argument (#1377)
  • v1.3.0(Nov 30, 2020)

    This release primarily focuses on performance improvements, and adds contour integral quadrature based variational models.

    Major Features

    Variational models with contour integral quadrature

    Minor Features

    Performance improvements

    • Kronecker product models compute a deterministic logdet (faster than the Lanczos-based logdet) (#1332)
    • Improve efficiency of KroneckerProductLazyTensor symeig method (#1338)
    • Improve SGPR efficiency (#1356)
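
    The release notes do not say which identity the deterministic logdet relies on. A minimal sketch, assuming the standard Kronecker identity logdet(A ⊗ B) = m·logdet(A) + n·logdet(B) for an n x n matrix A and an m x m matrix B, is shown below in plain Python. The helper names kron and logdet_spd are illustrative and are not GPyTorch functions.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices given as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def logdet_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
                # det(A) is the product of the squared Cholesky diagonal entries.
                total += 2.0 * math.log(l[i][j])
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return total


a = [[2.0, 0.5], [0.5, 1.0]]                               # n x n, n = 2
b = [[3.0, 1.0, 0.0], [1.0, 2.0, 0.5], [0.0, 0.5, 1.5]]    # m x m, m = 3

n, m = len(a), len(b)
direct = logdet_spd(kron(a, b))                      # factorizes the full 6 x 6 matrix
structured = m * logdet_spd(a) + n * logdet_spd(b)   # touches only the small factors
```

    The two values agree up to floating-point error, but the structured version never forms the full product matrix, and unlike a stochastic estimate it returns the same value on every call.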

    Other improvements

    • SpectralMixtureKernel accepts arbitrary batch shapes (#1350)
    • Variational models pass around arbitrary **kwargs to the forward method (#1339)
    • gpytorch.settings context managers keep track of their default value (#1347)
    • Kernel objects can be pickle-d (#1336)

    Bug Fixes

    • Fix requires_grad checks in gpytorch.inv_matmul (#1322)
    • Fix reshaping bug for batch independent multi-output GPs (#1368)
    • ZeroMean accepts a batch_shape argument (#1371)
    • Various doc fixes/improvements (#1327, #1343, #1315, #1373)
  • v1.2.1(Oct 26, 2020)

    This release includes the following fixes:

    • Fix caching issues with variational GPs (#1274, #1311)
    • Ensure that constraint bounds are properly cast to floating point types (#1307)
    • Fix bug with broadcasting multitask multivariate normal shapes (#1312)
    • Bypass KeOps for small/rectangular kernels (#1319)
    • Fix issues with eigenvectors=False in LazyTensor#symeig (#1283)
    • Fix issues with fixed-noise LazyTensor preconditioner (#1299)
    • Doc fixes (#1275, #1301)
  • v1.2.0(Aug 30, 2020)

    Major Features

    New variational and approximate models

    This release adds a number of new features for approximate GP models:

    • Linear model of coregionalization for variational multitask GPs (#1180)
    • Deep Sigma Point Process models (#1193)
    • Mean-field decoupled (MFD) models from "Parametric Gaussian Process Regressors" (Jankowiak et al., 2020) (#1179)
    • Implement natural gradient descent (#1258)
    • Additional non-conjugate likelihoods (Beta, StudentT, Laplace) (#1211)

    New kernels

    We have just added a number of new specialty kernels:

    • gpytorch.kernels.GaussianSymmetrizedKLKernel for performing regression with uncertain inputs (#1186)
    • gpytorch.kernels.RFFKernel (random Fourier features kernel) (#1172, #1233)
    • gpytorch.kernels.SpectralDeltaKernel (a parametric kernel for patterns/extrapolation) (#1231)

    More scalable sampling

    • Large-scale sampling with contour integral quadrature from Pleiss et al., 2020 (#1194)

    Minor features

    • Ability to set amount of jitter added when performing Cholesky factorizations (#1136)
    • Improve scalability of KroneckerProductLazyTensor (#1199, #1208)
    • Improve speed of preconditioner (#1224)
    • Add symeig and svd methods to LazyTensors (#1105)
    • Add TriangularLazyTensor for Cholesky methods (#1102)

    Bug fixes

    • Fix initialization code for gpytorch.kernels.SpectralMixtureKernel (#1171)
    • Fix bugs with LazyTensor addition (#1174)
    • Fix issue with loading smoothed box priors (#1195)
    • Throw warning when variances are not positive, check for valid correlation matrices (#1237, #1241, #1245)
    • Fix sampling issues with Pyro integration (#1238)
  • v1.1.1(Apr 24, 2020)

    Major features

    • GPyTorch is compatible with PyTorch 1.5 (latest release)
    • Several bugs with task-independent multitask models are fixed (#1110)
    • Task-dependent multitask models are more batch-mode compatible (#1087, #1089, #1095)

    Minor features

    • gpytorch.priors.MultivariateNormalPrior has an expand method (#1018)
    • Better broadcasting for batched inducing point models (#1047)
    • LazyTensor repeating works with rectangular matrices (#1068)
    • gpytorch.kernels.ScaleKernel inherits the active_dims property from its base kernel (#1072)
    • Fully-bayesian models can be saved (#1076)

    Bug Fixes

    • gpytorch.kernels.PeriodicKernel is batch-mode compatible (#1012)
    • Fix gpytorch.priors.MultivariateNormalPrior expand method (#1018)
    • Fix indexing issues with LazyTensors (#1029)
    • Fix constants with gpytorch.mlls.GammaRobustVariationalELBO (#1038, #1053)
    • Prevent doubly-computing derivatives of kernel inputs (#1042)
    • Fix initialization issues with gpytorch.kernels.SpectralMixtureKernel (#1052)
    • Fix stability of gpytorch.variational.DeltaVariationalStrategy
  • v1.0.0(Dec 20, 2019)

    Major New Features and Improvements

    Each feature in this section comes with a new example notebook and documentation on how to use it. Check the new docs!

    • Added support for deep Gaussian processes (#564).
    • KeOps integration has been added -- replace certain gpytorch.kernels.SomeKernel with gpytorch.kernels.keops.SomeKernel with KeOps installed, and run exact GPs on 100000+ data points (#812).
    • Variational inference has undergone significant internal refactoring! All old variational objects should still function, but many are deprecated. (#903).
    • Our integration with Pyro has been completely overhauled and is now much improved. For examples of interesting GP + Pyro models, see our new examples (#903).
    • Our example notebooks have been completely reorganized, and our documentation surrounding them has been rewritten to hopefully provide a better tutorial to GPyTorch (#954).
    • Added support for fully Bayesian GP modelling via NUTS (#918).

    Minor New Features and Improvements

    • GridKernel and GridInterpolationKernel now support rectangular grids (#888).
    • Added cylindrical kernel (#577).
    • Added polynomial kernel (#668).
    • Added tutorials on basic usage (hyperparameters, saving/loading, etc) (#685).
    • get_fantasy_model now supports batched models (#693).
    • Added a prior_mode context manager that causes GP models to evaluate in prior mode (#707).
    • Added linear mean (#676).
    • Added horseshoe prior (#719).
    • Added polynomial kernel with derivatives (#783).
    • Fantasy model computations now use QR for solving least squares problems, improving numerical stability (#790).
    • All legacy functions have been removed, in favor of new function format in PyTorch (#799).
    • Added Newton Girard kernel (#821).
    • GP predictions now automatically clear caches when backpropagating through them. Previously, if you wanted to train through a GP in eval mode, you had to clear the caches manually by toggling the GP back to train mode and then to eval mode again. This is no longer necessary (#916).
    • Added rational quadratic kernel (#330)
    • Switch to using torch.cholesky_solve and torch.logdet now that they support batch mode / backwards (#880)
    • Better / less redundant parameterization for correlation matrices e.g. in IndexKernel (#912).
    • Kernels now define __getitem__, which allows slicing batch dimensions (#782).
    • Performance improvements in the small data regime, e.g. n < 2000 (#926).
    • Increased the size of kernel matrix for which Cholesky is the default solve strategy to n=800 (#946).
    • Added an option for manually specifying a different preconditioner for AddedDiagLazyTensor (#930).
    • Added precommit hooks that enforce code style (#927).
    • Lengthscales have been refactored, and kernels have an is_stationary attribute (#925).
    • All of our example notebooks now get smoke tested by our CI.
    • Added a deterministic_probes setting that causes our MLL computation to be fully deterministic when using CG+Lanczos, which improves L-BFGS convergence (#929).
    • The use of the Woodbury formula for preconditioner computations is now fully replaced by QR, which improves numerical stability (#968).

    Bug fixes

    • Fix a type error when calling backward on gpytorch.functions.logdet (#711).
    • Variational models now properly skip posterior variance calculations if the skip_posterior_variances context is active (#741).
    • Fixed an issue with diag mode for PeriodicKernel (#761).
    • Stability improvements for inv_softplus and inv_sigmoid (#776).
    • Fix incorrect size handling in InterpolatedLazyTensor for rectangular matrices (#906)
    • Fix indexing in IndexKernel for batch mode (#911).
    • Fixed an issue where slicing batch mode lazy covariance matrices resulted in incorrect behavior (#782).
    • Cholesky gives a better error when there are NaNs (#944).
    • Use psd_safe_cholesky in prediction strategies rather than torch.cholesky (#956).
    • An error is now raised if Cholesky is used with KeOps, which is not supported (#959).
    • Fixed a bug where NaNs could occur during interpolation (#971).
    • Fix MLL computation for heteroskedastic noise models (#870).
  • v0.3.6(Oct 13, 2019)

  • v0.3.5(Aug 10, 2019)

    This release addresses breaking changes in the recent PyTorch 1.2 release. Currently, GPyTorch will run on either PyTorch 1.1 or PyTorch 1.2.

    A full list of new features and bug fixes will be coming soon in a GPyTorch 0.4 release.

  • v0.3.4a(Aug 10, 2019)

  • v0.3.0(Apr 15, 2019)

    New Features

    • Implement kernel checkpointing, allowing exact GPs on up to 1M data points with multiple GPUs (#499)
    • GPyTorch now supports hard parameter constraints (e.g. bounds) via the register_constraint method on Module (#596)
    • All GPyTorch objects now support multiple batch dimensions. In addition to training b GPs simultaneously, you can now train a b1 x b2 matrix of GPs simultaneously if you so choose (#492, #589, #627)
    • RBFKernelGrad now supports ARD (#602)
    • FixedNoiseGaussianLikelihood offers a better interface for dealing with known observation noise values. WhiteNoiseKernel is now hard deprecated (#593)
    • InvMatmul, InvQuadLogDet and InvQuad are now twice differentiable (#603)
    • Likelihood has been redesigned. See the new documentation for details if you are creating custom likelihoods (#591)
    • Better support for more flexible Pyro models. You can now define likelihoods of the form p(y|f, z) where f is a GP and z are arbitrary latent variables learned by Pyro (#591).
    • Parameters can now be recursively initialized with full names, e.g. model.initialize(**{"covar_module.base_kernel.lengthscale": 1., "covar_module.outputscale": 1.}) (#484)
    • Added ModelList and LikelihoodList for training multiple GPs when batch mode can't be used -- see example notebooks (#471)
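
    To illustrate what initializing with full names means, here is a toy sketch of how a dotted name can be resolved by walking nested attributes. The Node class is a hypothetical stand-in written for this example. It is not GPyTorch's Module and it does not reproduce the library's implementation.

```python
class Node:
    """Hypothetical stand-in for a module that owns parameters and submodules."""

    def __init__(self, **attrs):
        for name, value in attrs.items():
            setattr(self, name, value)

    def initialize(self, **kwargs):
        for full_name, value in kwargs.items():
            # Split "a.b.c" into the owner path ["a", "b"] and the leaf name "c".
            *path, leaf = full_name.split(".")
            target = self
            for part in path:
                target = getattr(target, part)
            if not hasattr(target, leaf):
                raise AttributeError(f"unknown parameter: {full_name}")
            setattr(target, leaf, value)
        return self


model = Node(covar_module=Node(outputscale=0.5, base_kernel=Node(lengthscale=0.1)))
model.initialize(**{
    "covar_module.base_kernel.lengthscale": 1.0,
    "covar_module.outputscale": 1.0,
})
```

    The dictionary unpacking (**{...}) is needed because dotted names are not valid Python keyword identifiers, which is why the example in the release notes uses the same form.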

    Performance and stability improvements

    • CG termination now relies more on a tolerance check, so it rarely stops before reaching a good solve. If it does stop early, a warning is raised with suggested courses of action. (#569)
    • In non-ARD mode, RBFKernel and MaternKernel use custom backward implementations for performance (#517)
    • Up to a 3x performance improvement in the regime where the test set is very small (#615)
    • The noise parameter in GaussianLikelihood now has a default lower bound, similar to sklearn (#596)
    • psd_safe_cholesky now adds successively increasing amounts of jitter rather than only once (#610)
    • Variational inference initialization now uses psd_safe_cholesky rather than torch.cholesky to initialize with the prior (#610)
    • The pivoted Cholesky preconditioner now uses a QR decomposition for its solve rather than the Woodbury formula for speed and stability (#617)
    • GPyTorch now uses Cholesky for solves with very small matrices rather than CG, resulting in reduced overhead for that setting (#586)
    • Cholesky can additionally be turned on manually for help debugging (#586)
    • Kernel distance computations now use torch.cdist when on PyTorch 1.1.0 in the non-batch setting (#642)
    • CUDA unit tests now default to using the least used available GPU when run (#515)
    • MultiDeviceKernel is now much faster (#491)
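
    The psd_safe_cholesky item above describes a retry strategy: if the factorization fails, add a small value to the diagonal and try again with a larger value each time. A minimal pure-Python sketch of that idea follows. The starting jitter of 1e-6, the factor of 10 and the function names are assumptions chosen for illustration. They are not taken from GPyTorch's source.

```python
import math


def cholesky(a):
    """Plain Cholesky factorization; raises ValueError if `a` is not positive definite."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(d)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def cholesky_with_jitter(a, initial_jitter=1e-6, max_tries=3):
    """Retry Cholesky, adding 10x more diagonal jitter after each failed attempt."""
    try:
        return cholesky(a), 0.0
    except ValueError:
        pass
    n = len(a)
    jitter = initial_jitter
    for _ in range(max_tries):
        shifted = [[a[i][j] + (jitter if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            return cholesky(shifted), jitter
        except ValueError:
            jitter *= 10.0
    raise ValueError("matrix is not positive definite even with jitter")


# A singular (rank-1) matrix: plain Cholesky fails, the jittered version succeeds.
singular = [[1.0, 1.0], [1.0, 1.0]]
factor, used_jitter = cholesky_with_jitter(singular)
```

    Returning the jitter that was actually used makes it easy to log how far the matrix had to be perturbed, which is useful when debugging an ill-conditioned kernel.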

    Bug Fixes

    • Fixed an issue with variational covariances at test time (#638)
    • Fixed an issue where the training covariance wasn't being detached for variance computations, occasionally resulting in backward errors (#566)
    • Fixed an issue where active_dims in kernels was being applied twice (#576)
    • Fixes and stability improvements for MultiDeviceKernel (#560)
    • Fixed an issue where fast_pred_var was failing for single training inputs (#574)
    • Fixed an issue when initializing parameter values with non-tensor values (#630)
    • Fixed an issue with handling the preconditioner log determinant value for MLL computation (#634)
    • Fixed an issue where prior_dist was being cached for VI, which was problematic for pyro models (#599)
    • Fixed a number of issues with LinearKernel, including one where the variance could go negative (#584)
    • Fixed a bug where training inputs couldn't be set with set_train_data if they are currently None (#565)
    • Fixed a number of bugs in MultitaskMultivariateNormal (#545, #553)
    • Fixed an indexing bug in batch_symeig (#547)
    • Fixed an issue where MultitaskMultivariateNormal wasn't interleaving rows correctly (#540)


    • GPyTorch is now fully Python 3.6, and we've begun to include static type hints (#581)
    • Parameters in GPyTorch no longer have default singleton batch dimensions. For example, the default shape of lengthscale is now torch.Size([1]) rather than torch.Size([1, 1]) (#605)
    • setup.py now includes optional dependencies, reads requirements from requirements.txt, and does not require torch if pytorch-nightly is installed (#495)
  • v0.2.1(Feb 9, 2019)


    You can install GPyTorch via Anaconda (#463)

    Speed and stability

    • Kernel distances use the JIT for fast computations (#464)
    • LinearCG uses the JIT for fast computations (#464)
    • Improve the stability of computing kernel distances (#455)


    Variational inference improvements

    • Sped up variational models by batching all matrix solves in one call (#454)
    • Can use the same set of inducing points for batch variational GPs (#445)
    • Whitened variational inference for improved convergence (#493)
    • Variational log likelihoods for BernoulliLikelihood are computed with quadrature (#473)
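
    As an illustration of the last item, the expected log likelihood under a Gaussian variational distribution is a one-dimensional integral, so it can be approximated with a quadrature rule. The sketch below assumes a 3-point Gauss-Hermite rule and a probit link with labels in {-1, +1}. Both are assumptions made for this example, and the function names are not GPyTorch APIs.

```python
import math

# 3-point Gauss-Hermite rule for integrals of exp(-x^2) * g(x) over the real line.
_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def gaussian_expectation(g, mean, std):
    """Approximate E[g(f)] for f ~ N(mean, std^2) with Gauss-Hermite quadrature."""
    # Substituting f = mean + sqrt(2) * std * x turns the Gaussian density
    # into exp(-x^2) / sqrt(pi), which is the weight function of the rule.
    total = sum(w * g(mean + math.sqrt(2.0) * std * x)
                for x, w in zip(_NODES, _WEIGHTS))
    return total / math.sqrt(math.pi)


def probit_log_lik(f, y):
    """log p(y | f) for a Bernoulli likelihood with a probit link, y in {-1, +1}."""
    return math.log(0.5 * (1.0 + math.erf(y * f / math.sqrt(2.0))))


# Expected log likelihood of the label y = +1 under q(f) = N(0.5, 0.8^2).
expected_ll = gaussian_expectation(lambda f: probit_log_lik(f, 1.0), mean=0.5, std=0.8)
```

    A 3-point rule is exact for polynomials up to degree 5, so it recovers E[f^2] = mean^2 + std^2 exactly. For the log-probit integrand it is only an approximation, and a practical implementation would use more nodes.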

    Multi-GPU Gaussian processes

    • Can train and test GPs by dividing the kernel onto multiple GPUs (#450)

    GPs with derivatives

    • Can define RBFKernels for observations and their derivatives (#462)


    • LazyTensors can broadcast matrix multiplication (#459)
    • Can use @ sign for matrix multiplication with LazyTensors


    • Convenience methods for training/testing multiple GPs in a list (#471)


    • Added a gpytorch.settings.fast_computations feature to (optionally) use Cholesky-based inference (#456)
    • Distributions define event shapes (#469)
    • Can recursively initialize parameters on GP modules (#484)


    • Can initialize noise in GaussianLikelihood (#479)
    • Fixed bugs in SGPR kernel (#487)
  • v0.1.1(Jan 2, 2019)



    • Batch GPs, which previously were a feature, are now well-documented and much more stable (see docs)
    • Can add "fantasy observations" to models.
    • Option for exact marginal log likelihood and sampling computations (this is slower, but potentially useful for debugging) (gpytorch.settings.fast_computations)

    Bug fixes

  • 0.1.0.rc5(Nov 19, 2018)

    Stability of hyperparameters

    • Hyperparameters that are constrained to be positive (e.g. variance, lengthscale, etc.) are now parameterized through the softplus function (log(1 + e^x)) rather than through the log function
    • This dramatically improves the numerical stability and optimization of hyperparameters
    • Old models that were trained with log parameters will still work, but this is deprecated.
    • Inference now handles certain numerical floating point round-off errors more gracefully.
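
    To make the softplus change concrete, here is a minimal sketch of the transform and its inverse in plain Python. The function names are illustrative and are not GPyTorch APIs. The raw value is what the optimizer updates, and the transformed value is the positive hyperparameter.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + e^x)."""
    # For positive x, e^x can overflow, so use x + log(1 + e^-x) instead.
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def inv_softplus(y: float) -> float:
    """Inverse of softplus for y > 0, i.e. log(e^y - 1)."""
    # Written as y + log(1 - e^-y) to avoid overflow for large y.
    return y + math.log(-math.expm1(-y))


raw = 50.0
softplus_value = softplus(raw)     # grows linearly in the raw parameter
log_param_value = math.exp(raw)    # grows exponentially in the raw parameter
```

    With a log parameterization, a raw value of 50 already maps to roughly 5e21, so a single large gradient step can push a lengthscale or variance to an extreme value. Under softplus the same raw value maps to about 50, which is one reason the optimization behaves better.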

    Various stability improvements to variational inference

    Other changes

    • GridKernel can be used for data that lies on a perfect grid.
    • New preconditioner for LazyTensors.
    • Use batched cholesky functions for improved performance (requires updating PyTorch)
  • 0.1.0.rc4(Nov 8, 2018)

    New features

    • Implement diagonal correction for basic variational inference, improving predictive variance estimates. This is on by default.
    • LazyTensor._quad_form_derivative now has a default implementation! While custom implementations are likely to still be faster in many cases, this means that it is no longer required to implement a custom _quad_form_derivative when implementing a new LazyTensor subclass.

    Bug fixes

    • Fix a number of critical bugs for the new variational inference.
    • Do some hyperparameter tuning for the SV-DKL example notebook, and include fancier NN features like batch normalization.
    • Made it more likely that operations internally preserve the ability to perform preconditioning for linear solves and log determinants. This may have a positive impact on model performance in some cases.
  • 0.1.0.rc3(Oct 29, 2018)

    Variational inference has been refactored

    • Easier to experiment with different variational approximations
    • Massive performance improvement for SV-DKL

    Experimental Pyro integration for variational inference

    Lots of tiny bug fixes

    (Too many to name, but everything should be better 😬)

  • 0.1.0.rc2(Oct 29, 2018)

  • alpha(Oct 2, 2018)

    Alpha release

    We strongly encourage you to check out our beta release for lots of improvements! However, if you still need an old version, or need to use PyTorch 0.4, you can install this release.

  • 0.1.0.rc1(Oct 2, 2018)

    Beta release

    GPyTorch is now available on pip! Install it with pip install gpytorch.

    Important! This release requires the preview build of PyTorch (>= 1.0). You should either build from source or install pytorch-nightly. See the PyTorch docs for specific installation instructions.

    If you were previously using GPyTorch, see the migration guide to help you move over.

    What's new

    • Batch mode: it is possible to train multiple GPs simultaneously
    • Improved multitask models

    Breaking changes

    • gpytorch.random_variables have been replaced by gpytorch.distributions. These build upon PyTorch distributions.
      • gpytorch.random_variables.GaussianRandomVariable -> gpytorch.distributions.MultivariateNormal.
      • gpytorch.random_variables.MultitaskGaussianRandomVariable -> gpytorch.distributions.MultitaskMultivariateNormal.


    • gpytorch.utils.scale_to_bounds is now gpytorch.utils.grid.scale_to_bounds


    • GridInterpolationKernel, GridKernel, InducingPointKernel - the attribute base_kernel_module has become base_kernel (for consistency)
    • AdditiveGridInterpolationKernel no longer exists. Now use AdditiveStructureKernel(GridInterpolationKernel(...)).
    • MultiplicativeGridInterpolationKernel no longer exists. Now use ProductStructureKernel(GridInterpolationKernel(...)).

    Attributes (n_* -> num_*)

    • IndexKernel: n_tasks -> num_tasks
    • LCMKernel: n_tasks -> num_tasks
    • MultitaskKernel: n_tasks -> num_tasks
    • MultitaskGaussianLikelihood: n_tasks -> num_tasks
    • SoftmaxLikelihood: n_features -> num_features
    • MultitaskMean: n_tasks -> num_tasks
    • VariationalMarginalLogLikelihood: n_data -> num_data
    • SpectralMixtureKernel: n_dimensions -> ard_num_dims, n_mixtures -> num_mixtures
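
    When porting older code, the renames listed above can be applied mechanically. The sketch below encodes the list as a lookup table and rewrites keyword arguments accordingly. The helper migrate_kwargs is hypothetical, and it assumes the renames apply to constructor keyword arguments as well as to attributes.

```python
# Old name -> new name, per class, as listed in the release notes above.
RENAMED_KWARGS = {
    "IndexKernel": {"n_tasks": "num_tasks"},
    "LCMKernel": {"n_tasks": "num_tasks"},
    "MultitaskKernel": {"n_tasks": "num_tasks"},
    "MultitaskGaussianLikelihood": {"n_tasks": "num_tasks"},
    "SoftmaxLikelihood": {"n_features": "num_features"},
    "MultitaskMean": {"n_tasks": "num_tasks"},
    "VariationalMarginalLogLikelihood": {"n_data": "num_data"},
    "SpectralMixtureKernel": {
        "n_dimensions": "ard_num_dims",
        "n_mixtures": "num_mixtures",
    },
}


def migrate_kwargs(class_name, kwargs):
    """Return a copy of `kwargs` with the old names replaced by the new ones."""
    renames = RENAMED_KWARGS.get(class_name, {})
    # Names that are not in the table, or classes that are not listed, pass through unchanged.
    return {renames.get(key, key): value for key, value in kwargs.items()}


new_kwargs = migrate_kwargs("SpectralMixtureKernel", {"n_mixtures": 4, "n_dimensions": 2})
```

    Note that SpectralMixtureKernel is the one case that is not a plain n_ to num_ substitution: n_dimensions becomes ard_num_dims.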
Supplementary code for the AISTATS 2021 paper "Matern Gaussian Processes on Graphs".

Matern Gaussian Processes on Graphs This repo provides an extension for gpflow with Matérn kernels, inducing variables and trainable models implemente

null 23 Nov 10, 2021
A Tensorflow based library for Time Series Modelling with Gaussian Processes

Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob

Secondmind Labs 13 Dec 2, 2021
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis A more detailed re

Lincedo Lab 4 Jun 9, 2021
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.

AnimDL - Download & Stream Your Favorite Anime AnimDL is an incredibly powerful tool for downloading and streaming anime. Core features Abuses the dev

KR 161 Nov 24, 2021
Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.

Restormer: Efficient Transformer for High-Resolution Image Restoration Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,

Syed Waqas Zamir 177 Dec 2, 2021
Robust, modular and efficient implementation of advanced Hamiltonian Monte Carlo algorithms

AdvancedHMC.jl AdvancedHMC.jl provides a robust, modular and efficient implementation of advanced HMC algorithms. An illustrative example for Advanced

The Turing Language 133 Nov 21, 2021
Official Pytorch implementation of ICLR 2018 paper Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge.

Deep Learning for Physical Processes: Integrating Prior Scientific Knowledge: Official Pytorch implementation of ICLR 2018 paper Deep Learning for Phy

emmanuel 42 Aug 11, 2021
Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

Multi-speaker DGP This repository provides official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch. O

sarulab-speech 20 Sep 22, 2021
Implementation of "Fast and Flexible Temporal Point Processes with Triangular Maps" (Oral @ NeurIPS 2020)

Fast and Flexible Temporal Point Processes with Triangular Maps This repository includes a reference implementation of the algorithms described in "Fa

Oleksandr Shchur 14 Nov 10, 2021
PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

null 1 Nov 28, 2021
QuakeLabeler is a Python package to create and manage your seismic training data, processes, and visualization in a single place, so you can focus on building the next big thing.

QuakeLabeler Quake Labeler was born from the need for seismologists and developers who are not AI specialists to easily, quickly, and independently bu

Hao Mai 3 Nov 12, 2021
Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Faster R-CNN and Mask R-CNN in PyTorch 1.0 maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all model

Facebook Research 8.7k Nov 27, 2021
PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

VIN: Value Iteration Networks This is an implementation of Value Iteration Networks (VIN) in PyTorch to reproduce the results.(TensorFlow version) Key

Xingdong Zuo 209 Nov 17, 2021
Official code for the ICLR 2021 paper Neural ODE Processes

Neural ODE Processes Official code for the paper Neural ODE Processes (ICLR 2021). Abstract Neural Ordinary Differential Equations (NODEs) use a neura

Cristian Bodnar 32 Nov 25, 2021
Disentangled Cycle Consistency for Highly-realistic Virtual Try-On, CVPR 2021

Disentangled Cycle Consistency for Highly-realistic Virtual Try-On, CVPR 2021 [WIP] The code for CVPR 2021 paper 'Disentangled Cycle Consistency for H

ChongjianGE 69 Nov 23, 2021
A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation

Aboleth A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation [1] with stochastic gradient variational Bayes

Gradient Institute 126 Nov 29, 2021
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)

Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021).

Qi Zeng 32 Dec 2, 2021
Newt - a Gaussian process library in JAX.

Newt

AaltoML 0 Nov 2, 2021
Multi-Output Gaussian Process Toolkit

Multi-Output Gaussian Process Toolkit Paper - API Documentation - Tutorials & Examples The Multi-Output Gaussian Process Toolkit is a Python toolkit f

GAMES 73 Nov 29, 2021