A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation

Overview

Aboleth


A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation [1] with stochastic gradient variational Bayes inference [2].

Features

Some of the features of Aboleth:

  • Bayesian fully-connected, embedding and convolutional layers using SGVB [2] for inference.
  • Random Fourier and arc-cosine features for approximate Gaussian processes. Optional variational optimisation of these feature weights as per [1] (see the sketch after this list).
  • Imputation layers with parameters that are learned as part of a model.
  • Noise Contrastive Priors [3] for better out-of-domain uncertainty estimation.
  • Very flexible construction of networks, e.g. multiple inputs, ResNets etc.
  • Compatible and interoperable with other neural net frameworks such as Keras (see the demos for more information).
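
As a sketch of how these pieces compose, the random Fourier features mentioned above can be stacked with a variational output layer to form an approximate GP regressor. This is illustrative only; the ab.RandomFourier and ab.RBF names are taken from the Aboleth documentation and their default arguments are assumed:

import tensorflow as tf
import aboleth as ab

# A single-layer approximate GP: random Fourier features approximating an
# RBF kernel, followed by a Bayesian linear (variational) output layer.
gp_net = (
    ab.InputLayer(name="X", n_samples=5) >>
    ab.RandomFourier(n_features=100, kernel=ab.RBF()) >>
    ab.DenseVariational(output_dim=1)
)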

Why?

The purpose of Aboleth is to provide a set of high-performance, lightweight components for building Bayesian neural nets and approximate (deep) Gaussian process computational graphs. We aim for minimal abstraction over pure TensorFlow, so you can still assign parts of the computational graph to different hardware, use your own data feeds/queues, and manage your own sessions, etc.

Here is an example of building a simple Bayesian neural net classifier with one hidden layer and Normal prior/posterior distributions on the network weights:

import tensorflow as tf
import aboleth as ab

# Define the network, ">>" implements function composition,
# the InputLayer gives a kwarg for this network, and
# allows us to specify the number of samples for stochastic
# gradient variational Bayes.
net = (
    ab.InputLayer(name="X", n_samples=5) >>
    ab.DenseVariational(output_dim=100) >>
    ab.Activation(tf.nn.relu) >>
    ab.DenseVariational(output_dim=1)
)

# Data placeholders; D is the number of input features
X_ = tf.placeholder(tf.float32, shape=(None, D))
Y_ = tf.placeholder(tf.float32, shape=(None, 1))

# Build the network, nn, and the parameter regularization, kl
nn, kl = net(X=X_)

# Define the likelihood model
likelihood = tf.distributions.Bernoulli(logits=nn).log_prob(Y_)

# Build the final loss function to use with TensorFlow train,
# where N is the total number of observations in the training set
loss = ab.elbo(likelihood, kl, N)

# Now your TensorFlow training code here!
...
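
From here, a minimal training loop might look like the following. This is only a sketch in TensorFlow 1.x style; X and Y are assumed to be numpy arrays holding the (N, D) features and (N, 1) labels, and the optimiser settings are arbitrary:

# Fit the model with a simple full-batch gradient descent loop
optimizer = tf.train.AdamOptimizer()
train_step = optimizer.minimize(loss)

# Predictive class probabilities, averaging over the n_samples dimension
# (assumed to be the leading dimension of nn)
probs = tf.reduce_mean(tf.nn.sigmoid(nn), axis=0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(1000):
        sess.run(train_step, feed_dict={X_: X, Y_: Y})
    p = sess.run(probs, feed_dict={X_: X})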

At the moment the focus of Aboleth is on supervised tasks; however, this is subject to change in subsequent releases if there is interest in other capabilities.

Installation

NOTE: Aboleth is a Python 3 library only. Some of the functionality within it depends on features only found in Python 3. Sorry.

To get up and running quickly you can use pip and get the Aboleth package from PyPI:

$ pip install aboleth

For the best performance on your architecture, we recommend installing TensorFlow from sources.

Alternatively, to install the additional dependencies required by the demos:

$ pip install aboleth[demos]

To install in development mode, along with the packages required for development, we recommend you clone the repository from GitHub:

$ git clone [email protected]:data61/aboleth.git

Then in the directory that you cloned into, issue the following:

$ pip install -e .[dev]

Getting Started

See the quick start guide to get started and, for a more in-depth guide, have a look at our tutorials. Also see the demos folder for more examples of creating and training algorithms with Aboleth.

The full project documentation can be found on readthedocs.

References

[1] Cutajar, K., Bonilla, E., Michiardi, P. and Filippone, M. Random Feature Expansions for Deep Gaussian Processes. In ICML, 2017.
[2] Kingma, D. P. and Welling, M. Auto-encoding Variational Bayes. In ICLR, 2014.
[3] Hafner, D., Tran, D., Irpan, A., Lillicrap, T. and Davidson, J. Reliable Uncertainty Estimates in Deep Neural Networks using Noise Contrastive Priors. arXiv preprint arXiv:1807.09289, 2018.

License

Copyright 2017 CSIRO (Data61)

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Comments
  • Checkpoint best model instead of n most recent

    Find a clean way to checkpoint the current best model, according to some scalar metric on the hold-out validation set, instead of the default most-recent criterion. See e.g.

    • https://stackoverflow.com/questions/39252901/tensorflow-save-the-model-with-smallest-validation-error
    • https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/04_Save_Restore.ipynb
    opened by ltiao 4
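
    A minimal sketch of the "save on best" idea above (not Aboleth-specific); train_op, train_feed and val_feed are hypothetical names for the training op and feed dictionaries:

    import tensorflow as tf

    saver = tf.train.Saver(max_to_keep=1)
    best_val = float("inf")

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(100):
            sess.run(train_op, feed_dict=train_feed)   # one training step
            val = sess.run(loss, feed_dict=val_feed)   # hold-out validation loss
            if val < best_val:                         # improved: overwrite the checkpoint
                best_val = val
                saver.save(sess, "./best_model.ckpt")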
  • Variable MC samples specified with placeholders causes issues for Reshape layer

    Traceback (most recent call last):
      File "mnist_softmax_regression.py", line 112, in <module>
        main()
      File "mnist_softmax_regression.py", line 79, in main
        logits, reg = net(X=X)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 69, in __call__
        Net, KL = self._build(**kwargs)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 98, in _build
        Net, KL = self.stack(**kwargs)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 159, in stackfunc
        result1, loss1 = layer1(*args, **kwargs)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 159, in stackfunc
        result1, loss1 = layer1(*args, **kwargs)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 159, in stackfunc
        result1, loss1 = layer1(*args, **kwargs)
      [Previous line repeated 7 more times]
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 160, in stackfunc
        result, loss2 = layer2(result1)
      File "/home/tia00c/projects/aboleth/aboleth/baselayers.py", line 32, in __call__
        Net, KL = self._build(X)
      File "/home/tia00c/projects/aboleth/aboleth/layers.py", line 236, in _build
        new_shape = (int(X.shape[0]), tf.shape(X)[1]) + self.target_shape
    TypeError: __int__ returned non-int (type NoneType)
    
    opened by ltiao 2
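
    The error above occurs because X.shape[0] has no static value when n_samples is fed via a placeholder, so int() fails. A common TensorFlow idiom (sketched here, not necessarily how the layer should be fixed) is to fall back on the dynamic shape whenever the static dimension is unknown:

    import tensorflow as tf

    def leading_dim(X):
        """Static leading dimension of X if known, otherwise the dynamic one."""
        n = X.shape[0].value           # None when n_samples comes from a placeholder
        return n if n is not None else tf.shape(X)[0]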
  • Should we allow a user to make var in the variational layers a tf.Variable?

    At the moment, only a float is accepted for var, and this is automatically initialised as a tf.Variable in DenseVariational and EmbedVariational. Should we allow a user to choose whether to make this a tf.Variable or not (which would be consistent with how lenscale in ab.kernels and var in ab.likelihoods are handled)?

    question 
    opened by dsteinberg 2
  • Investigate tf.distributions to replace our likelihoods

    https://www.tensorflow.org/api_docs/python/tf/distributions

    Most of these have a sample() method already, and there are a bunch of multivariate normal distributions in contrib as well, so we may almost have a drop-in replacement.

    It looks like there is a ReparameterizationType for variational inference? https://www.tensorflow.org/api_docs/python/tf/distributions/ReparameterizationType

    We may also be able to replace our likelihoods with these, but it would require us to instantiate these distribution classes with our layers as an input. Right now our likelihoods don't require inputs.

    enhancement question in progress 
    opened by dsteinberg 2
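
    For reference, a small sketch of what a drop-in likelihood might look like with tf.distributions, using the nn and Y_ names from the README example:

    # Gaussian likelihood over the network output; log_prob gives per-datum
    # log likelihoods, sample draws from the predictive distribution.
    likelihood = tf.distributions.Normal(loc=nn, scale=1.0)
    log_like = likelihood.log_prob(Y_)
    Y_draws = likelihood.sample()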
  • KL re-weighting across Minibatches

    opened by ltiao 2
  • Remove learn_prior flag from dense_var?

    Rather allow an end-user to wrap the reg parameter with something like ab.pos(tf.Variable(1.)). This would avoid having to propagate this flag all the way to here:

    https://github.com/determinant-io/aboleth/blob/develop/aboleth/distributions.py#L82

    question 
    opened by dsteinberg 2
  • Implement and investigate a shared stddev posterior on weights for variational layers

    I've noticed that a posterior of the form

    prod_ij N(w_ij | mu_ij, sigma)

    in some cases converges quickly compared to the other posteriors... but it may be overly restrictive in terms of its uncertainty estimates.

    enhancement question 
    opened by dsteinberg 1
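
    A sketch of the posterior proposed above, with per-weight means and a single shared standard deviation (names and shapes are illustrative only):

    import tensorflow as tf

    n_in, n_out = 5, 3
    mu = tf.Variable(tf.random_normal((n_in, n_out)))   # per-weight means
    log_sigma = tf.Variable(0.)                          # one shared scale parameter
    qW = tf.distributions.Normal(loc=mu, scale=tf.exp(log_sigma))  # scale broadcasts
    W = qW.sample()                                      # reparameterised weight draw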
  • Initialisation shapes are inconsistent

    • we have swapped n_in and n_out in initialisers.py; our 2-d weights are typically (n_out, n_in)
    • we need to generalise these functions to more than 2-dimensional matrices, see the conv layers

    There may be a better design here that keeps track of these dimensions... i.e. functions like sample_W in layers.py are horrible!

    bug 
    opened by dsteinberg 1
  • Replace distributions.py with TensorFlow probability trainable_distributions?

    Unless the design is too different...

    This would also require speed/efficiency testing

    https://github.com/tensorflow/probability/blob/master/tensorflow_probability/python/trainable_distributions.py

    enhancement help wanted question 
    opened by dsteinberg 1
  • Remove MAP layers where possible in favour of using Keras layers?

    This will have to be tested to see what the difference is in terms of performance, and also whether there is an equivalent of an embedding layer (EmbedMAP).

    We could also turn the MAP layers into light wrappers around Keras layers -- this would break fewer demos and keep the interface consistent too.

    help wanted question 
    opened by dsteinberg 1
  • added demo. probably should create a separate doc page instead of dem…

    Was trying to close some stale issues today and ended up trying to finish #105 more thoroughly. I added a demo and wrote some documentation on this. It is currently under demos but probably deserves a page of its own. Here's a preview:

    [screenshots of the demo documentation page]

    It does raise some fundamental design questions in Aboleth, e.g. do we really need to implement our own MAP layers? It makes sense for variational layers, since we need to explicitly work with the distribution over layer parameters, but for MAP layers, Keras / TensorFlow layers effectively already implement those that exist in Aboleth (DenseMAP, EmbedMAP, Conv2dMAP), and practically all other conceivable layers. Since we have access to the regularization losses with Keras layers (I'm not sure about the pure TensorFlow tf.layers), we can just add these to the KL as outlined in the demo.

    It seems to me that all MAP layers can be subsumed into a higher-order wrapper layer. If it makes sense to do so, we should raise this as a separate issue at a later stage and properly investigate this.

    opened by ltiao 1
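
    A sketch of the mechanism described above: a Keras layer standing in for a MAP layer, with its regularisation losses folded into the kl term (X_ and kl as in the README example; the layer size and penalty are arbitrary):

    dense = tf.keras.layers.Dense(
        units=64,
        activation=tf.nn.relu,
        kernel_regularizer=tf.keras.regularizers.l2(0.01))
    hidden = dense(X_)                      # build the layer on the input tensor
    kl = kl + tf.add_n(dense.losses)        # add its penalties to the ELBO's KL term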
  • Implement Flipout sampling?

    https://arxiv.org/abs/1803.04386

    Eqn. (4) could be implemented as an alternative to _sample_W in layers.py, caveats:

    • This may be hard to get working with full-covariance weights
    • This may require the mean and std. dev. of the weight samples to be output independently for the vectorized implementation.
    enhancement somedaymaybe 
    opened by dsteinberg 0
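
    For reference, a sketch of the flipout perturbation for a dense layer: one shared weight-noise draw per minibatch, decorrelated across examples by random sign vectors. Names and shapes are illustrative only:

    import tensorflow as tf

    batch, n_in, n_out = 32, 8, 4
    x = tf.random_normal((batch, n_in))
    W_mean = tf.Variable(tf.random_normal((n_in, n_out)))
    W_std = tf.nn.softplus(tf.Variable(tf.zeros((n_in, n_out))))

    dW = W_std * tf.random_normal((n_in, n_out))           # shared noise for the batch
    s = tf.sign(tf.random_uniform((batch, n_in)) - 0.5)    # per-example input signs
    r = tf.sign(tf.random_uniform((batch, n_out)) - 0.5)   # per-example output signs
    y = tf.matmul(x, W_mean) + tf.matmul(x * s, dW) * r    # flipout-perturbed output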