Overview

zeus is a Python implementation of the Ensemble Slice Sampling method.

  • Fast & Robust Bayesian Inference,
  • Efficient Markov Chain Monte Carlo (MCMC),
  • Black-box inference, no hand-tuning,
  • Excellent performance in terms of autocorrelation time and convergence rate,
  • Scales to multiple CPUs without any extra effort,
  • Automated convergence diagnostics.

Example

For instance, if you wanted to draw samples from a 10-dimensional Gaussian, you would do something like:

import zeus
import numpy as np

def log_prob(x, ivar):
    # Log-density of a zero-mean Gaussian with inverse variances `ivar`.
    return -0.5 * np.sum(ivar * x**2.0)

nsteps, nwalkers, ndim = 1000, 100, 10
ivar = 1.0 / np.random.rand(ndim)        # random inverse variances
start = np.random.randn(nwalkers, ndim)  # initial walker positions

sampler = zeus.EnsembleSampler(nwalkers, ndim, log_prob, args=[ivar])
sampler.run_mcmc(start, nsteps)
chain = sampler.get_chain(flat=True)     # samples, flattened across walkers
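
zeus can also distribute the log-probability evaluations over multiple CPUs through its pool interface; a minimal sketch using the standard library (assuming log_prob and its arguments are picklable):

from multiprocessing import Pool

# On platforms that spawn processes (Windows, macOS), run this under
# an `if __name__ == "__main__":` guard.
with Pool() as pool:
    sampler = zeus.EnsembleSampler(nwalkers, ndim, log_prob, args=[ivar], pool=pool)
    sampler.run_mcmc(start, nsteps)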

Documentation

Read the docs at zeus-mcmc.readthedocs.io

Installation

To install zeus using pip run:

pip install zeus-mcmc

To install zeus in an Anaconda/Conda environment use:

conda install -c conda-forge zeus-mcmc

Attribution

Please cite the following papers if you find this code useful in your research:

@article{karamanis2021zeus,
  title={zeus: A Python implementation of Ensemble Slice Sampling for efficient Bayesian parameter inference},
  author={Karamanis, Minas and Beutler, Florian and Peacock, John A.},
  journal={arXiv preprint arXiv:2105.03468},
  year={2021}
}

@article{karamanis2020ensemble,
  title={Ensemble slice sampling: Parallel, black-box and gradient-free inference for correlated \& multimodal distributions},
  author={Karamanis, Minas and Beutler, Florian},
  journal={arXiv preprint arXiv:2002.06212},
  year={2020}
}

Licence

Copyright 2019-2021 Minas Karamanis and contributors.

zeus is free software made available under the GPL-3.0 License. For details see the LICENSE file.

Comments
  • Constrained sampling in parameter space

    Hi, I'm trying to fit a model using Zeus; however, I don't have an analytic model, which forces me to interpolate my model data. I run into issues with the sampler when the walkers explore the parameter space too far from where I expect the MAP to be. For instance, I define one parameter to have a uniform prior on (0.8, 1.2), but when the walkers explore ~1.5 my model interpolation breaks (above the interpolation bounds). I have tried filling the absent values (outside the interpolation range) with NaNs or infs, but those seem to be problematic. Extrapolation also seems to be problematic, since I get

    RuntimeError: Number of expansions exceeded maximum limit! 
    Make sure that the pdf is well-defined. 
    Otherwise increase the maximum limit (maxiter=10^4 by default).
    

    But this could be another bug, and in any case I doubt extrapolating is a good approach. In short, do you have some way to constrain the walkers to a certain domain in parameter space?

    Thanks in advance,

    opened by dforero0896 13
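
    A common pattern for hard bounds in MCMC is to reject out-of-bounds points at the log-probability level, before the interpolator is ever evaluated; a minimal sketch with a hypothetical bound and a placeholder Gaussian likelihood (treat it as a sketch rather than a guaranteed fix, since the report above notes that infinities have been problematic in some versions):

        import numpy as np

        low, high = 0.8, 1.2  # hypothetical prior bounds for the parameter

        def log_prob(theta):
            # Reject out-of-bounds points before touching the interpolator.
            if not (low < theta[0] < high):
                return -np.inf
            return -0.5 * np.sum((theta - 1.0) ** 2)  # placeholder log-likelihood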
  • zeus has no attribute 'EnsembleSampler'

    I am trying to reproduce the manual's working example, but it gives me the following error: AttributeError: module 'zeus' has no attribute 'EnsembleSampler'

    It works with emcee, though!

    opened by savinbeniwal 10
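
    A quick way to check whether an old release or a shadowing local module is being imported (early releases exposed the sampler under a different name, as a later comment in this thread notes):

        import zeus

        print(getattr(zeus, "__version__", "unknown"))  # recent releases have EnsembleSampler
        print(zeus.__file__)  # confirm this is the installed package, not a local zeus.py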
  • Stepping procedure

    Is there a particular reason why the stepping-out is done in linear steps rather than by doubling the step width (as in other implementations)? I am running into the limit (1e4) when testing on a 100-d Gaussian with standard deviations ranging from 1e-1 to 1e-9, and increasing the limit did not solve this.

    opened by JohannesBuchner 8
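
    For reference, Neal (2003) grows the slice interval geometrically ("doubling") instead of in fixed-width steps; a minimal one-dimensional sketch of the two strategies, omitting the extra acceptance test that doubling requires during shrinkage:

        import numpy as np

        def step_out_linear(logf, x0, logy, w, maxiter=10**4):
            # Widen [L, R] in fixed increments of w until both ends leave the slice.
            L = x0 - w * np.random.rand()
            R = L + w
            for _ in range(maxiter):
                if logf(L) < logy and logf(R) < logy:
                    break
                if logf(L) >= logy:
                    L -= w
                if logf(R) >= logy:
                    R += w
            return L, R

        def step_out_doubling(logf, x0, logy, w, maxiter=64):
            # Double the interval each iteration; 2^k growth covers widely
            # different scales in few steps.
            L = x0 - w * np.random.rand()
            R = L + w
            for _ in range(maxiter):
                if logf(L) < logy and logf(R) < logy:
                    break
                if np.random.rand() < 0.5:
                    L -= R - L
                else:
                    R += R - L
            return L, R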
  • Print parameter values if likelihood fails

    I am working on a problem where some points in parameter space give invalid inputs for which my likelihood fails. When this happens, I would like zeus to print the parameter values at the point that caused the failure (as emcee does), to make it easier to debug and understand the physical reason for the likelihood failure.

    Can you implement this please?

    enhancement 
    opened by seshnadathur 7
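
    Until something like this lands, a user-side wrapper can report the failing point; a minimal sketch in which log_like stands in for the user's likelihood:

        import numpy as np

        def log_like(theta):
            return -0.5 * np.sum(np.log(theta))  # stand-in; produces NaN for theta <= 0

        def safe_log_prob(theta):
            try:
                val = log_like(theta)
            except Exception as err:
                print("likelihood raised at", theta, ":", err)
                raise
            if not np.isfinite(val):
                print("non-finite log-likelihood at", theta)
            return val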
  • Resuming a chain more efficiently

    Currently, I think that if you resume a chain (by passing the result of get_last_sample to a new call to run_mcmc), time is wasted recomputing the posteriors of the starting points, which were already computed last time.

    Emcee lets you optionally pass in the posteriors of the starting point, if they are known. Would this be possible for zeus?

    opened by joezuntz 3
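
    A minimal sketch of the resume pattern under discussion (note that in zeus get_last_sample is a property rather than a method, as the next comment points out):

        sampler.run_mcmc(start, nsteps)  # first run
        last = sampler.get_last_sample   # property in zeus, not a call
        sampler.run_mcmc(last, nsteps)   # resume; starting log-probs are recomputed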
  • get_last_sample should be function

    This is a minor issue, as I can work around the difference using get_chain instead.

    In emcee, get_last_sample is a function that returns the last sample; in zeus it is a property. For compatibility, it would be useful to match the emcee API. It also doesn't make sense to me for a property to be called get_XXX.

    PS Thanks for working on this software and algorithm - initial tests are looking good

    opened by jeremysanders 3
  • pip version does not have EnsembleSampler

    I used a pip install. Then tried to follow the example here: https://zeus-mcmc.readthedocs.io/en/latest/

    It didn't work, but changing EnsembleSampler to sampler did.

    opened by AdityaSavara 3
  • Can I resume jobs with the same Zeus sampler?

    I am looking to make it so that I can cancel a Python script mid-run and then resume the job. With Emcee, I do this via the backend feature.

    I want to have access to all the previous chains and samples so that I can use previously computed autocorrelation times, and whatnot. So simply passing the previous chains to a new instance of the sampler doesn't quite cut it.

    Does Zeus have a specific feature that supports resuming jobs? If not, no problem; I can achieve the desired effect by pickling the EnsembleSampler (I hope!). I couldn't find anything in the docs, but figured I'd ask before writing code to do this.

    opened by Jammy2211 2
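
    If pickling does work for the installed version (untested here, as the question itself hedges), the pattern would be the standard one:

        import pickle

        with open("sampler.pkl", "wb") as f:
            pickle.dump(sampler, f)

        # ... later, possibly in a fresh process ...
        with open("sampler.pkl", "rb") as f:
            sampler = pickle.load(f)
        sampler.run_mcmc(sampler.get_last_sample, nsteps)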
  • add vectorisation option and some emcee compatibility

    Hi @minaskar,

    Great work here! Thanks for creating zeus.

    I am attaching a pull request that adds some emcee compatibility, making it easier to use the samplers interchangeably.

    Some changes:

    • store log probability in samples, so they can be retrieved later
    • add some functions supported by emcee to allow interchangeable runs
    • make samples axis order consistent with emcee

    I just saw that this also addresses issue #1.

    I hope these changes are of interest to you.

    Cheers, Johannes

    opened by JohannesBuchner 2
  • [Feature Request] Add a `CITATION.cff`

    GitHub recently released a new feature where repository owners can add a CITATION.cff file, making it easy for others to cite the repository. Adding a CITATION.cff would make the attribution process very easy for others (myself included 😅) who want to cite this work.

    opened by SauravMaheshkar 1
  • Remove error suppression by ChainManager in case of pickling failure

    Hi, thanks for this great sampler! I was having trouble getting my sampling to work with MPI: everything seemed to run fine, but I got no output. The reason was that my likelihood function was unpicklable, which was a bit hard to track down because the pickling errors were suppressed by the ChainManager. So here is a PR to no longer suppress these exceptions. If there is a good reason for the suppression, or a better solution, feel free to close this.

    opened by ewoudwempe 1
  • Update sklearn requirement to scikit-learn

    The requirement sklearn is deprecated and causes issues when installing zeus:

      DEPRECATION: sklearn is being installed using the legacy 'setup.py install' method, because it does not have a 'pyproject.toml' and the 'wheel' package is not installed. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/8559
      Running setup.py install for sklearn: started
      Running setup.py install for sklearn: finished with status 'error'
      error: subprocess-exited-with-error
      
      × Running setup.py install for sklearn did not run successfully.
      │ exit code: 1
      ╰─> [18 lines of output]
          The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
          rather than 'sklearn' for pip commands.
          
          Here is how to fix this error in the main use cases:
          - use 'pip install scikit-learn' rather than 'pip install sklearn'
          - replace 'sklearn' by 'scikit-learn' in your pip requirements files
            (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
          - if the 'sklearn' package is used by one of your dependencies,
            it would be great if you take some time to track which package uses
            'sklearn' instead of 'scikit-learn' and report it to their issue tracker
          - as a last resort, set the environment variable
            SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
          
          More information is available at
          https://github.com/scikit-learn/sklearn-pypi-package
          
          If the previous advice does not cover your use case, feel free to report it at
          https://github.com/scikit-learn/sklearn-pypi-package/issues/new
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: legacy-install-failure
    
    × Encountered error while trying to install package.
    ╰─> sklearn
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for output from the failure.
    Error: Process completed with exit code 1.
    

    I can fix this myself for what I'm doing, but I advise changing the requirement to scikit-learn so other users don't have issues.

    opened by Jammy2211 0
  • Poor parallelisation scaling

    I found that running zeus with n_MPI = n_walker/2 gives poor efficiency, with half of the CPU time being spent idling.

    If I understand the code in EnsembleSampler.sample correctly, the stepping-out procedure is repeated until all walkers in the ensemble have reached their step-out position, and only then does the shrinking procedure begin. This means that the ensemble has to wait until the last walker reaches its step-out position, during which all other walkers are idling. Please correct me if I have misunderstood the implementation.

    Since the stepping-out and shrinking procedures are independent for each walker once the directions are set, it should be possible to restructure the loops so that walkers can start shrinking as soon as they have finished stepping out, rather than having to wait for the last walker.

    On a somewhat unrelated note, is there a reason the maxsteps are distributed randomly between left and right here: https://github.com/minaskar/zeus/blob/master/zeus/ensemble.py#L566 ?

    opened by tilmantroester 0
  • Provide user option to override default logging

    When calling zeus from a script/package that does its own logging, the fact that zeus relies on the root logger is somewhat vexing as it may result in duplicate logging. This commit exposes the option to pass an existing logger to the sampler so that the user can modify logging behaviour if they so desire, or capture zeus' log output in the same streams as other logs. This also enables the user to prevent zeus from clobbering handlers that have already been defined for the root logger.

    The changes do not alter the behaviour of existing code - the default of None ensures that any existing code will follow the current logic. While there are other ways of avoiding duplicate logs, this does provide more customisation options to users at rather low cost to maintain.

    opened by pscicluna 0
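
    Independently of this PR, the standard library already lets a caller redirect root-logger output; a minimal sketch that uses no zeus-specific API:

        import logging

        root = logging.getLogger()
        root.handlers.clear()  # drop handlers that zeus (or anything else) attached
        handler = logging.FileHandler("run.log")
        handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
        root.addHandler(handler)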
  • Bug: Autocorrelation fails with newest Scipy

    In autocorr.py, scipy is imported as import scipy as sp. The function _autocorr_func_1d uses scipy for the fft function. However, in newer versions of scipy you must call the fft function in the following way, or else an AttributeError will occur:

        from scipy import fft
        fft.fft()  # how to use the fft function

    Please change this soon. Thank you.

    opened by TroyGustke 0
  • Computation of R-hat Statistic

    First of all, thanks for the package and all your hard work!

    I think I've encountered a couple of issues/bugs in the computation of the R-hat statistic.

    1. The first is just a typo, I think. On lines 139-140 the chain means and variances are flattened by the list comprehension, whereas something like:
    _means = np.vstack(means)
    _vars = np.vstack(vars)
    

    will keep the structure of the chains, so that np.var(_means, ddof=1, axis=0) and np.mean(_vars, axis=0) give the between-chain and within-chain variances, respectively, across all parameters (right now they end up as scalars, since _means and _vars are flat).

    2. This is similar in spirit to Issue #22, but with a more significant effect here: on lines 120-121, where each split is reshaped to (-1, ndim), samples from all walkers are collapsed into each split, homogenizing them and leading to unrealistically low R-hat values. In my case I had ~28 walkers, many of which were stuck in well-separated modes and barely mixing at all, but which nonetheless had a quite low split R-hat as recorded by the callback, because each of the two splits had samples from all 28 walkers, making them statistically similar.

      Since this is an ensemble method I had to spend some time convincing myself, but I really think R-hat should be computed across all (possibly split) walkers, rather than by grouping them together. I can share my trace plots and make the case for this in more detail if it's helpful. With change (1) above, fixing this would just be a matter of removing the reshape operation; then nsplits would determine how many splits are made within each walker.

    Thanks again, and let me know what your thoughts are. I'm happy to help implement these changes, too.

    opened by Bobby-Huggins 0
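
    For reference, a minimal sketch of the per-walker split R-hat (Gelman et al.) that this report argues for; zeus' actual implementation lives in its callbacks and may differ:

        import numpy as np

        def split_rhat(chain):
            # chain has shape (nsteps, nwalkers, ndim); split each walker in half.
            n = chain.shape[0] // 2
            splits = np.concatenate([chain[:n], chain[n:2 * n]], axis=1)
            means = splits.mean(axis=0)             # per-split means, (2*nwalkers, ndim)
            variances = splits.var(axis=0, ddof=1)  # per-split variances
            B = n * means.var(axis=0, ddof=1)       # between-chain variance
            W = variances.mean(axis=0)              # within-chain variance
            var_hat = (n - 1) / n * W + B / n
            return np.sqrt(var_hat / W)             # per-parameter R-hat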
  • Deprecation Warning for Collections

    ...\anaconda3\envs\zeus\Scripts\zeus.py:2: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
      from collections import Iterable as IterableType

    It seems this should become:

        from collections.abc import Iterable as IterableType

    However, I could not find zeus.py in the repository, so I could not make a pull request to fix this.

    I imagine this is likely to become an error in the next year or so, if not fixed.

    opened by AdityaSavara 0
Owner
Minas Karamanis, Cosmology PhD Student at University of Edinburgh