Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations.

Related tags

elicited
Overview

Elicited

Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations.

Credit to Brett Hoover, packaging by @magoo

Usage

pip install elicited
import elicited as e

elicited is just a helper tool when using numpy and scipy, so you'll need these in your code.

import numpy as np
from scipy.stats import poisson, zipf, beta, pareto, lognorm

Lognormal

See Occurance and Applications for examples of lognormal distributions in nature.

Expert: Most customers hold around $20K (mode) but I could imagine a customer with $2.5M (max)

mode = 20000
max = 2500000

mean, stdv = e.elicitLogNormal(mode, max)
asset_values = lognorm(s=stdv, scale=np.exp(mean))
asset_values.rvs(100)

Pareto

The 80/20 rule. See Occurance and Applications

Expert: The legal costs of an incident could be devastating. Typically costs are almost zero (val_min) but a black swan could be $100M (val_max).

b = e.elicitPareto(val_min, val_max)
p = pareto(b, loc=val_min-1., scale=1.))

PERT

See PERT Distribution

Expert: Our customers have anywhere from $500-$6000 (val_min / val_max), but it's most typically around $4500 (val_mod)

PERT_a, PERT_b = e.elicitPERT(val_min, val_mod, val_max)
pert = beta(PERT_a, PERT_b, loc=val_min, scale=val_max-val_min)

Zipf's

See Applications

Expert: If we get sued, there will only be a few litigants (nMin). Very rarely it could be 30 or more litigants (nMax), maybe once every thousand cases (pMax) it would be more.

nMin = 1
nMax = 30
pMax = 1/1000

Zs = e.elicitZipf(nMin, nMax, pMax, report=True)

litigants = zipf(Zs, nMin-1)

litigants.rvs(100)

Reference: Other Useful Elicitations

Listed as a courtesy, these distributions are simple enough to elicit data into directly without a helper function.

Uniform

A "zero knowledge" distribution where all values within the range have equal probability of appearing. Similar to random.randint(a, b)

Expert: The crowd will be between 50 (min) and 500 (max) due to fire code restrictions and the existing residents in the building.

from scipy.stats import uniform

min = 50
max = 500

range = max - min

crowd_size = uniform(min, range)
crowd_size.rvs(100)

Poisson

Expert: About 3000 Customers (average) add a credit card to their account every quarter.

from scipy.stats import poisson
average = 3000
upsells = poisson(average)
upsells.rvs(100)
Owner
Ryan McGeehan
Founder / Advisor @ HackerOne Former Director of Security @ Coinbase Former Director of Security @ Facebook
Ryan McGeehan
Gaussian processes in TensorFlow

Website | Documentation (release) | Documentation (develop) | Glossary Table of Contents What does GPflow do? Installation Getting Started with GPflow

GPflow 1.5k Oct 22, 2021
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilis

Blei Lab 4.7k Oct 14, 2021
A probabilistic programming library for Bayesian deep learning, generative models, based on Tensorflow

ZhuSuan is a Python probabilistic programming library for Bayesian deep learning, which conjoins the complimentary advantages of Bayesian methods and

Tsinghua Machine Learning Group 2.1k Oct 21, 2021
A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

Disclaimer This project is stable and being incubated for long-term support. It may contain new experimental code, for which APIs are subject to chang

Uber Open Source 696 Oct 23, 2021
Using approximate bayesian posteriors in deep nets for active learning

Bayesian Active Learning (BaaL) BaaL is an active learning library developed at ElementAI. This repository contains techniques and reusable components

ElementAI 444 Oct 17, 2021
Functional tensors for probabilistic programming

Funsor Funsor is a tensor-like library for functions and distributions. See Functional tensors for probabilistic programming for a system description.

null 182 Oct 13, 2021
Retentioneering 384 Oct 17, 2021
Pandas and Dask test helper methods with beautiful error messages.

beavis Pandas and Dask test helper methods with beautiful error messages. test helpers These test helper methods are meant to be used in test suites.

Matthew Powers 7 Sep 9, 2021
pyhsmm MITpyhsmm - Bayesian inference in HSMMs and HMMs. MIT

Bayesian inference in HSMMs and HMMs This is a Python library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and expli

Matthew Johnson 501 Oct 13, 2021
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an

PyMC 6.1k Oct 23, 2021
Fast, flexible and easy to use probabilistic modelling in Python.

Please consider citing the JMLR-MLOSS Manuscript if you've used pomegranate in your academic work! pomegranate is a package for building probabilistic

Jacob Schreiber 2.7k Oct 18, 2021
statDistros is a Python library for dealing with various statistical distributions

StatisticalDistributions statDistros statDistros is a Python library for dealing with various statistical distributions. Now it provides various stati

null 1 Oct 3, 2021
Supply a wrapper ``StockDataFrame`` based on the ``pandas.DataFrame`` with inline stock statistics/indicators support.

Stock Statistics/Indicators Calculation Helper VERSION: 0.3.2 Introduction Supply a wrapper StockDataFrame based on the pandas.DataFrame with inline s

Cedric Zhuang 869 Oct 18, 2021
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 6.8k Oct 24, 2021
Additional tools for particle accelerator data analysis and machine information

PyLHC Tools This package is a collection of useful scripts and tools for the Optics Measurements and Corrections group (OMC) at CERN. Documentation Au

PyLHC 3 Oct 5, 2021
A highly efficient and modular implementation of Gaussian Processes in PyTorch

GPyTorch GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian

null 2.5k Oct 22, 2021
Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

weightedcalcs weightedcalcs is a pandas-based Python library for calculating weighted means, medians, standard deviations, and more. Features Plays we

Jeremy Singer-Vine 87 Sep 26, 2021
PyPSA: Python for Power System Analysis

1 Python for Power System Analysis Contents 1 Python for Power System Analysis 1.1 About 1.2 Documentation 1.3 Functionality 1.4 Example scripts as Ju

null 548 Oct 24, 2021
We're Team Arson and we're using the power of predictive modeling to combat wildfires.

We're Team Arson and we're using the power of predictive modeling to combat wildfires. Arson Map Inspiration There’s been a lot of wildfires in Califo

Jerry Lee 2 Oct 17, 2021