Overview

zeus is a Python implementation of the Ensemble Slice Sampling method.

  • Fast & Robust Bayesian Inference,
  • Efficient Markov Chain Monte Carlo (MCMC),
  • Black-box inference, no hand-tuning,
  • Excellent performance in terms of autocorrelation time and convergence rate,
  • Scales to multiple CPUs without any extra effort,
  • Automated convergence diagnostics.

Example

For instance, if you wanted to draw samples from a 10-dimensional Gaussian, you would do something like:

import zeus
import numpy as np

def log_prob(x, ivar):
    # Log-density of a zero-mean Gaussian with inverse variances `ivar`.
    return -0.5 * np.sum(ivar * x**2.0)

nsteps, nwalkers, ndim = 1000, 100, 10
ivar = 1.0 / np.random.rand(ndim)        # random inverse variances
start = np.random.randn(nwalkers, ndim)  # initial walker positions

sampler = zeus.EnsembleSampler(nwalkers, ndim, log_prob, args=[ivar])
sampler.run_mcmc(start, nsteps)
chain = sampler.get_chain(flat=True)     # samples, flattened across walkers
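
zeus can also distribute the log-probability evaluations over multiple CPUs through its pool interface; a minimal sketch using the standard library (assuming log_prob and its arguments are picklable):

from multiprocessing import Pool

# On platforms that spawn processes (Windows, macOS), run this under
# an `if __name__ == "__main__":` guard.
with Pool() as pool:
    sampler = zeus.EnsembleSampler(nwalkers, ndim, log_prob, args=[ivar], pool=pool)
    sampler.run_mcmc(start, nsteps)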

Documentation

Read the docs at zeus-mcmc.readthedocs.io

Installation

To install zeus using pip run:

pip install zeus-mcmc

To install zeus in an Anaconda/Conda environment use:

conda install -c conda-forge zeus-mcmc

Attribution

Please cite the following papers if you find this code useful in your research:

@article{karamanis2021zeus,
  title={zeus: A Python implementation of Ensemble Slice Sampling for efficient Bayesian parameter inference},
  author={Karamanis, Minas and Beutler, Florian and Peacock, John A.},
  journal={arXiv preprint arXiv:2105.03468},
  year={2021}
}

@article{karamanis2020ensemble,
  title={Ensemble slice sampling: Parallel, black-box and gradient-free inference for correlated \& multimodal distributions},
  author={Karamanis, Minas and Beutler, Florian},
  journal={arXiv preprint arXiv:2002.06212},
  year={2020}
}

Licence

Copyright 2019-2021 Minas Karamanis and contributors.

zeus is free software made available under the GPL-3.0 License. For details see the LICENSE file.

Comments
  • Constrained sampling in parameter space

    Hi, I'm trying to fit a model using Zeus; however, I don't have an analytic model, which forces me to interpolate my model data. I run into issues with the sampler when the walkers explore the parameter space too far from where I expect the MAP to be. For instance, I define one parameter to have a uniform prior on (0.8, 1.2), but when the walkers explore ~1.5 my model interpolation breaks (above the interpolation bounds). I have tried filling the absent values (outside the interpolation range) with NaNs or infs, but those seem to be problematic. Extrapolation also seems to be problematic, since I get

    RuntimeError: Number of expansions exceeded maximum limit! 
    Make sure that the pdf is well-defined. 
    Otherwise increase the maximum limit (maxiter=10^4 by default).
    

    But this could be another bug, and in any case I doubt extrapolating is a good approach. In short, do you have some way to constrain the walkers to a certain domain in parameter space?

    Thanks in advance,

    opened by dforero0896 13
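
    A common pattern for hard bounds in MCMC is to reject out-of-bounds points at the log-probability level, before the interpolator is ever evaluated; a minimal sketch with a hypothetical bound and a placeholder Gaussian likelihood (treat it as a sketch rather than a guaranteed fix, since the report above notes that infinities have been problematic in some versions):

        import numpy as np

        low, high = 0.8, 1.2  # hypothetical prior bounds for the parameter

        def log_prob(theta):
            # Reject out-of-bounds points before touching the interpolator.
            if not (low < theta[0] < high):
                return -np.inf
            return -0.5 * np.sum((theta - 1.0) ** 2)  # placeholder log-likelihood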
  • zeus has no attribute 'EnsembleSampler'

    I am trying to reproduce the manual's working example, but it gives me the following error: AttributeError: module 'zeus' has no attribute 'EnsembleSampler'

    It works with emcee, though!

    opened by savinbeniwal 10
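
    A quick way to check whether an old release or a shadowing local module is being imported (early releases exposed the sampler under a different name, as a later comment in this thread notes):

        import zeus

        print(getattr(zeus, "__version__", "unknown"))  # recent releases have EnsembleSampler
        print(zeus.__file__)  # confirm this is the installed package, not a local zeus.py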
  • Stepping procedure

    Is there a particular reason why the stepping-out is done in linear steps rather than by doubling the step width (as in other implementations)? I am running into the limit (1e4) when testing on a 100-d Gaussian with standard deviations ranging from 1e-1 to 1e-9, and increasing the limit did not solve this.

    opened by JohannesBuchner 8
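
    For reference, Neal (2003) grows the slice interval geometrically ("doubling") instead of in fixed-width steps; a minimal one-dimensional sketch of the two strategies, omitting the extra acceptance test that doubling requires during shrinkage:

        import numpy as np

        def step_out_linear(logf, x0, logy, w, maxiter=10**4):
            # Widen [L, R] in fixed increments of w until both ends leave the slice.
            L = x0 - w * np.random.rand()
            R = L + w
            for _ in range(maxiter):
                if logf(L) < logy and logf(R) < logy:
                    break
                if logf(L) >= logy:
                    L -= w
                if logf(R) >= logy:
                    R += w
            return L, R

        def step_out_doubling(logf, x0, logy, w, maxiter=64):
            # Double the interval each iteration; 2^k growth covers widely
            # different scales in few steps.
            L = x0 - w * np.random.rand()
            R = L + w
            for _ in range(maxiter):
                if logf(L) < logy and logf(R) < logy:
                    break
                if np.random.rand() < 0.5:
                    L -= R - L
                else:
                    R += R - L
            return L, R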
  • Print parameter values if likelihood fails

    I am working on a problem where some points in parameter space give invalid inputs for which my likelihood fails. When this happens, I would like zeus to print the parameter values at the point that caused the failure (as emcee does), to make it easier to debug and understand the physical reason for the likelihood failure.

    Can you implement this please?

    enhancement 
    opened by seshnadathur 7
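
    Until something like this lands, a user-side wrapper can report the failing point; a minimal sketch in which log_like stands in for the user's likelihood:

        import numpy as np

        def log_like(theta):
            return -0.5 * np.sum(np.log(theta))  # stand-in; produces NaN for theta <= 0

        def safe_log_prob(theta):
            try:
                val = log_like(theta)
            except Exception as err:
                print("likelihood raised at", theta, ":", err)
                raise
            if not np.isfinite(val):
                print("non-finite log-likelihood at", theta)
            return val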
  • Resuming a chain more efficiently

    Currently, I think that if you resume a chain (by passing the result of get_last_sample to a new call to run_mcmc), time is wasted recomputing the posteriors of the starting points, which were already computed last time.

    Emcee lets you optionally pass in the posteriors of the starting point, if they are known. Would this be possible for zeus?

    opened by joezuntz 3
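
    A minimal sketch of the resume pattern under discussion (note that in zeus get_last_sample is a property rather than a method, as the next comment points out):

        sampler.run_mcmc(start, nsteps)  # first run
        last = sampler.get_last_sample   # property in zeus, not a call
        sampler.run_mcmc(last, nsteps)   # resume; starting log-probs are recomputed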
  • get_last_sample should be function

    This is a minor issue, as I can work around the difference using get_chain instead.

    In emcee, get_last_sample is a function that returns the last sample; in zeus it is a property. For compatibility, it would be useful to match the emcee API. It also doesn't make sense to me for a property to be called get_XXX.

    PS Thanks for working on this software and algorithm - initial tests are looking good

    opened by jeremysanders 3
  • pip version does not have EnsembleSampler

    I used a pip install. Then tried to follow the example here: https://zeus-mcmc.readthedocs.io/en/latest/

    It didn't work, but changing EnsembleSampler to sampler did.

    opened by AdityaSavara 3
  • Can I resume jobs with the same Zeus sampler?

    I am looking to make it so that I can cancel a Python script mid-run and then resume the job. With Emcee, I do this via the backend feature.

    I want to have access to all the previous chains and samples so that I can use previously computed autocorrelation times, and whatnot. So simply passing the previous chains to a new instance of the sampler doesn't quite cut it.

    Does Zeus have a specific feature that supports resuming jobs? If not, no problem; I can achieve the desired effect by pickling the EnsembleSampler (I hope!). I couldn't find anything in the docs, but figured I'd ask before writing code to do this.

    opened by Jammy2211 2
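
    If pickling does work for the installed version (untested here, as the question itself hedges), the pattern would be the standard one:

        import pickle

        with open("sampler.pkl", "wb") as f:
            pickle.dump(sampler, f)

        # ... later, possibly in a fresh process ...
        with open("sampler.pkl", "rb") as f:
            sampler = pickle.load(f)
        sampler.run_mcmc(sampler.get_last_sample, nsteps)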
  • add vectorisation option and some emcee compatibility

    Hi @minaskar,

    Great work here! Thanks for creating zeus.

    I am attaching a pull request that adds some emcee compatibility, making it easier to use the samplers interchangeably.

    Some changes:

    • store log probability in samples, so they can be retrieved later
    • add some functions supported by emcee to allow interchangeable runs
    • make samples axis order consistent with emcee

    I just saw that this also addresses issue #1.

    I hope these changes are of interest to you.

    Cheers, Johannes

    opened by JohannesBuchner 2
  • [Feature Request] Add a `CITATION.cff`

    GitHub recently released a new feature where repository owners can add a CITATION.cff file, making it easy for others to cite the repository. Adding a CITATION.cff would make the attribution process very easy for others (myself included 😅) who want to cite this work.

    opened by SauravMaheshkar 1
  • Remove error suppression by ChainManager in case of pickling failure

    Hi, thanks for this great sampler! I was having trouble getting my sampling to work with MPI: everything seemed to run fine, but I got no output. The reason was that my likelihood function was unpicklable, which was a bit hard to track down because the pickling errors were suppressed by the ChainManager. So here is a PR to no longer suppress these exceptions. If there is a good reason for the suppression, or a better solution, feel free to close this.

    opened by ewoudwempe 1
  • Update sklearn requirement to scikit-learn

    The requirement sklearn is deprecated and causes issues when installing zeus:

      DEPRECATION: sklearn is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
      Running setup.py install for sklearn: started
      Running setup.py install for sklearn: finished with status 'error'
      error: subprocess-exited-with-error
      
      × Running setup.py install for sklearn did not run successfully.
      │ exit code: 1
      ╰─> [18 lines of output]
          The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
          rather than 'sklearn' for pip commands.
          
          Here is how to fix this error in the main use cases:
          - use 'pip install scikit-learn' rather than 'pip install sklearn'
          - replace 'sklearn' by 'scikit-learn' in your pip requirements files
            (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
          - if the 'sklearn' package is used by one of your dependencies,
            it would be great if you take some time to track which package uses
            'sklearn' instead of 'scikit-learn' and report it to their issue tracker
          - as a last resort, set the environment variable
            SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
          
          More information is available at
          https://github.com/scikit-learn/sklearn-pypi-package
          
          If the previous advice does not cover your use case, feel free to report it at
          https://github.com/scikit-learn/sklearn-pypi-package/issues/new
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: legacy-install-failure
    
    × Encountered error while trying to install package.
    ╰─> sklearn
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for output from the failure.
    Error: Process completed with exit code 1.
    

    I can fix this myself for what I'm doing, but I advise changing the requirement to scikit-learn so other users don't have issues.

    opened by Jammy2211 0
  • Poor parallelisation scaling

    I found that running zeus with n_MPI = n_walker/2 gives poor efficiency, with half of the CPU time being spent idling.

    If I understand the code in EnsembleSampler.sample correctly, the stepping-out procedure is repeated until all walkers in the ensemble have reached their step-out position, and only then does the shrinking procedure begin. This means that the ensemble has to wait until the last walker reaches its step-out position, during which all other walkers are idling. Please correct me if I have misunderstood the implementation.

    Since the stepping-out and shrinking procedures are independent for each walker once the directions are set, it should be possible to restructure the loops so that walkers can start shrinking as soon as they have finished stepping out, rather than having to wait for the last walker.

    On a somewhat unrelated note, is there a reason the maxsteps are distributed randomly between left and right here: https://github.com/minaskar/zeus/blob/master/zeus/ensemble.py#L566 ?

    opened by tilmantroester 0
  • Provide user option to override default logging

    When calling zeus from a script/package that does its own logging, the fact that zeus relies on the root logger is somewhat vexing as it may result in duplicate logging. This commit exposes the option to pass an existing logger to the sampler so that the user can modify logging behaviour if they so desire, or capture zeus' log output in the same streams as other logs. This also enables the user to prevent zeus from clobbering handlers that have already been defined for the root logger.

    The changes do not alter the behaviour of existing code - the default of None ensures that any existing code will follow the current logic. While there are other ways of avoiding duplicate logs, this does provide more customisation options to users at rather low cost to maintain.

    opened by pscicluna 0
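
    Independently of this PR, the standard library already lets a caller redirect root-logger output; a minimal sketch that uses no zeus-specific API:

        import logging

        root = logging.getLogger()
        root.handlers.clear()  # drop handlers that zeus (or anything else) attached
        handler = logging.FileHandler("run.log")
        handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
        root.addHandler(handler)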
  • Bug: Autocorrelation fails with newest Scipy

    In autocorr.py, scipy is imported as import scipy as sp. The function _autocorr_func_1d uses scipy for the fft function. However, in newer versions of scipy you must call the fft function in the following way, or else an AttributeError will occur:

        from scipy import fft
        fft.fft()  # how to use the fft function

    Please change this soon. Thank you.

    opened by TroyGustke 0
  • Computation of R-hat Statistic

    First of all, thanks for the package and all your hard work!

    I think I've encountered a couple of issues/bugs in the computation of the R-hat statistic.

    1. The first is just a typo, I think. On lines 139-140 the chain means and variances are flattened by the list comprehension, whereas something like:
    _means = np.vstack(means)
    _vars = np.vstack(vars)
    

    will keep the structure of the chains, so that np.var(_means, ddof=1, axis=0) and np.mean(_vars, axis=0) give the between-chain and within-chain variances, respectively, across all parameters (right now they end up as scalars, since _means and _vars are flat).

    2. This is similar in spirit to Issue #22, but with a more significant effect here: on lines 120-121, where each split is reshaped to (-1, ndim), samples from all walkers are collapsed into each split, homogenizing them and leading to unrealistically low R-hat values. In my case I had ~28 walkers, many of which were stuck in well-separated modes and barely mixing at all, but which nonetheless had a quite low split R-hat as recorded by the callback, because each of the two splits had samples from all 28 walkers, making them statistically similar.

      Since this is an ensemble method I had to spend some time convincing myself, but I really think R-hat should be computed across all (possibly split) walkers, rather than by grouping them together. I can share my trace plots and make the case for this in more detail if it's helpful. With change (1) above, fixing this would just be a matter of removing the reshape operation; then nsplits would determine how many splits are made within each walker.

    Thanks again, and let me know what your thoughts are. I'm happy to help implement these changes, too.

    opened by Bobby-Huggins 0
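
    For reference, a minimal sketch of the per-walker split R-hat (Gelman et al.) that this report argues for; zeus' actual implementation lives in its callbacks and may differ:

        import numpy as np

        def split_rhat(chain):
            # chain has shape (nsteps, nwalkers, ndim); split each walker in half.
            n = chain.shape[0] // 2
            splits = np.concatenate([chain[:n], chain[n:2 * n]], axis=1)
            means = splits.mean(axis=0)             # per-split means, (2*nwalkers, ndim)
            variances = splits.var(axis=0, ddof=1)  # per-split variances
            B = n * means.var(axis=0, ddof=1)       # between-chain variance
            W = variances.mean(axis=0)              # within-chain variance
            var_hat = (n - 1) / n * W + B / n
            return np.sqrt(var_hat / W)             # per-parameter R-hat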
  • Deprecation Warning for Collections

    ...\anaconda3\envs\zeus\Scripts\zeus.py:2: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
      from collections import Iterable as IterableType

    It seems this should become:

        from collections.abc import Iterable as IterableType

    However, I could not find zeus.py in the repository, so I could not make a pull request to fix this.

    I imagine this is likely to become an error in the next year or so, if not fixed.

    opened by AdityaSavara 0
Owner
Minas Karamanis, Cosmology PhD Student at University of Edinburgh