PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disributions on polytopes

Overview

PolytopeSampler

PolytopeSampler is a Matlab implementation of constrained Riemannian Hamiltonian Monte Carlo for sampling from high dimensional disributions on polytopes. It is able to sample efficiently from sets and distributions with more than 100K dimensions.

Quick Tutorial

PolytopeSampler samples from distributions of the form exp(-f(x)), for a convex function f, subject to constraints Aineq * x <= bineq, Aeq * x = beq and lb <= x <= ub.

The function f can be specified by arrays containing its first and second derivative or function handles. Only the first derivative is required. By default, f is empty, which represents a uniform distribution. If the first derivative is a function handle, then the function and its second derivatives must also be provided.

To sample N points from a polytope P, you can call sample(P, N). The function sample will

  1. Find an initial feasible point
  2. Run constrained Hamiltonian Monte Carlo
  3. Test convergence of the sampling algorithm by computing Effective Sample Size (ESS) and terminate when ESS >= N. If the target distribution is uniform, a uniformity test will also be performed.

Extra parameters can be set up using opts. Some useful parameters include maxTime and maxStep. By default, they are set to

                        maxTime: 86400 (max sampling time in seconds)
                        maxStep: 300000 (maximum number of steps)

The output is a struct o, which stores samples generated in o.samples and a summary of the sample in o.summary. o.samples is an array of size dim x #steps.

Example

We demonstrate PolytopeSampler using a simple example, sampling uniformly from a simplex. The polytope is defined by

>> P = struct;
>> d = 10;
>> P.Aeq = ones(1, d);
>> P.beq = 1;
>> P.lb = zeros(d, 1);

The polytope has dimension d = 10 with constraint sum_i x_i = 1 and x >= 0. This is a simplex. To generate 200 samples uniformly from the polytope P, we call the function sample().

>> o = sample(P, 200);
  Time spent |  Time reamin |                  Progress | Samples |  AccProb | StepSize |  MixTime
00d:00:00:01 | 00d:00:00:00 | ######################### | 211/200 | 0.989903 | 0.200000 |     11.2
Done!

We can access the samples generated using

>> o.samples

We can print a summary of the samples:

>> o.summary

ans =

  10×7 table

                     mean        std         25%         50%         75%      n_ess      r_hat 
                   ________    ________    ________    ________    _______    ______    _______

    samples[1]     0.093187    0.091207    0.026222    0.064326    0.13375    221.51    0.99954
    samples[2]     0.092815    0.086905    0.027018    0.066017    0.13221    234.59     1.0301
    samples[3]      0.10034    0.090834    0.030968    0.075631    0.13788    216.56     1.0159
    samples[4]      0.10531    0.092285    0.035363    0.077519     0.1481    235.25     1.0062
    samples[5]      0.10437    0.087634    0.034946    0.080095     0.1533    212.54    0.99841
    samples[6]       0.1029    0.093724    0.028774    0.074354    0.15135     227.6     1.0052
    samples[7]       0.1042    0.083084    0.038431    0.081964    0.15352    231.54     1.0008
    samples[8]     0.088778    0.086902    0.025565    0.062473    0.11837    229.69     1.0469
    samples[9]      0.10627     0.09074    0.036962    0.084294    0.15125    211.64    0.99856
    samples[10]     0.10184    0.084699    0.035981    0.074923    0.14578    230.63     1.0277

n_ess shows the effective sample size of the samples generated. r_hat tests the convergence of the sampling algorithm. A value of r_hat close to 1 indicates that the algorithm has converged properly.

See demo.m for more examples, including examples of sampling from non-uniform distributions.

Comments
  • Better chol and leverage score

    Better chol and leverage score

    There are two things need to be done

    1. run both col and leverage score row by row instead of col by col
    2. Reorder the loop so that it is more cache friendly
    enhancement 
    opened by YinTat 2
  • Fixed naming conflict that prevented IBM ILOG CPLEX to initialize in …

    Fixed naming conflict that prevented IBM ILOG CPLEX to initialize in …

    Dear all,

    I identified the naming of the function setfield.m to create problems when intializing the IBM ILOG Cplex solver in the cobratoolbox. This is the result of a naming conflict with the Cplex object. I changed the name of this function to Setfield.m to solve this issue. I hope that you will accept this pull request. Please let me know if you have any concerns or questions.

    Kind regards, Tim Hensen (ORCID: https://orcid.org/0000-0003-2459-3755)

    opened by trjhensen 1
  • Volumetric Barrier

    Volumetric Barrier

    Note that tv_ball2 has slow mixing time tv_ball has fast mixing time.

    In general, seems many polytope is small due to the uniform weight of log barrier. Probably it is better to switch to log barrier

    enhancement 
    opened by YinTat 1
  • Make sure the program works well for naive solver

    Make sure the program works well for naive solver

    For dense matrices, our linear solver is pretty slow. So, make sure the program can select the right solver and the program runs efficiently using JL.

    (Example: ME_matrix_GlcAer_WT)

    bug 
    opened by YinTat 0
  • Many changes

    Many changes

    Here is the list of changes:

    • Support SIMD, but it is not optimized now.
    • Support parallel chains
    • Clean up the C codes
    • The sampler do some "warm start" first
    • uniform test further thin the sample. Seems "ess" is not independent enough
    opened by YinTat 0
Owner
null
HiPlot makes understanding high dimensional data easy

HiPlot - High dimensional Interactive Plotting HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and

Facebook Research 2.4k Jan 4, 2023
A Python toolbox for gaining geometric insights into high-dimensional data

"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview

Contextual Dynamics Laboratory 1.6k Feb 17, 2021
HiPlot makes understanding high dimensional data easy

HiPlot - High dimensional Interactive Plotting HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and

Facebook Research 2k Feb 17, 2021
A simple python script using Numpy and Matplotlib library to plot a Mohr's Circle when given a two-dimensional state of stress.

Mohr's Circle Calculator This is a really small personal project done for Department of Civil Engineering, Delhi Technological University (formerly, D

Agyeya Mishra 0 Jul 17, 2021
The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

Dan Foreman-Mackey 1.3k Jan 4, 2023
A high performance implementation of HDBSCAN clustering. http://hdbscan.readthedocs.io/en/latest/

HDBSCAN Now a part of scikit-learn-contrib HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over va

Leland McInnes 91 Dec 29, 2022
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews

hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?

HoloViz 697 Jan 6, 2023
The open-source tool for building high-quality datasets and computer vision models

The open-source tool for building high-quality datasets and computer vision models. Website • Docs • Try it Now • Tutorials • Examples • Blog • Commun

Voxel51 2.4k Jan 7, 2023
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews

hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?

HoloViz 349 Feb 15, 2021
The open-source tool for building high-quality datasets and computer vision models

The open-source tool for building high-quality datasets and computer vision models. Website • Docs • Try it Now • Tutorials • Examples • Blog • Commun

Voxel51 209 Feb 17, 2021
High-level geospatial data visualization library for Python.

geoplot: geospatial data visualization geoplot is a high-level Python geospatial plotting library. It's an extension to cartopy and matplotlib which m

Aleksey Bilogur 1k Jan 1, 2023
Some problems of SSLC ( High School ) before outputs and after outputs

Some problems of SSLC ( High School ) before outputs and after outputs 1] A Python program and its output (output1) while running the program is given

Fayas Noushad 3 Dec 1, 2021
ecoglib: visualization and statistics for high density microecog signals

ecoglib: visualization and statistics for high density microecog signals This library contains high-level analysis tools for "topos" and "chronos" asp

null 1 Nov 17, 2021
Collection of scripts for making high quality beautiful math-related posters.

Poster Collection of scripts for making high quality beautiful math-related posters. The poster can have as large printing size as 3x2 square feet wit

Nattawut Phetmak 3 Jun 9, 2022
Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from time series data.

ts2vg: Time series to visibility graphs The Python ts2vg package provides high-performance algorithm implementations to build visibility graphs from t

Carlos Bergillos 26 Dec 17, 2022
High performance, editable, stylable datagrids in jupyter and jupyterlab

An ipywidgets wrapper of regular-table for Jupyter. Examples Two Billion Rows Notebook Click Events Notebook Edit Events Notebook Styling Notebook Pan

J.P. Morgan Chase 75 Dec 15, 2022
D-Analyst : High Performance Visualization Tool

D-Analyst : High Performance Visualization Tool D-Analyst is a high performance data visualization built with python and based on OpenGL. It allows to

null 4 Apr 14, 2022
Parallel t-SNE implementation with Python and Torch wrappers.

Multicore t-SNE This is a multicore modification of Barnes-Hut t-SNE by L. Van der Maaten with python and Torch CFFI-based wrappers. This code also wo

Dmitry Ulyanov 1.7k Jan 9, 2023
Parallel t-SNE implementation with Python and Torch wrappers.

Multicore t-SNE This is a multicore modification of Barnes-Hut t-SNE by L. Van der Maaten with python and Torch CFFI-based wrappers. This code also wo

Dmitry Ulyanov 1.5k Feb 17, 2021