Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Last update: Oct 13, 2021

Related tags

Deep Learning bayesian_model_revision

Overview

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Installation

We use pip to install things into a python virtual environment. Refer to requirements.txt for package requirements. We use nestly + SCons to run simulations.

File descriptions

generate_data_single_pop.py -- Simulate a data stream from a single population following a logistic regression model.

Inputs:
- --simulation: string for selecting the type of distribution shift. Options for this argument are the keys in SIM_SETTINGS in constants.py.
Outputs:
- --out-file: pickle file containing the data stream

generate_data_two_pop.py -- Simulate a data stream from two subpopulations, where each are generated using logistic regression models. Similar arguments as generate_data_single_pop.py. The percentage split beween the two subpopulations is controlled by the --subpopulations argument.

Outputs:
- --out-file: pickle file containing the data stream

create_modeler.py -- Creates a model developer who fits the original prediction model and may propose a continually refitted model at each time point.

Inputs:
- --data-file: pickle file with the entire data stream
- --simulation: string for selecting the model refitting strategy by the model developer. Options are to keep the model locked (locked), refit on all accumulated data (cumulative_refit), and refit on the latest observations within some window length (boxed, window length specified by --max-box). The last two options is to train an ensemble with the original and the cumulative_refit models (combo_refit) and train an ensemble with the original and the boxed models (combo_boxed).
Outputs:
- --out-file: pickle file containing the modeler

main.py -- Given the data and the model developer, run online model recalibration/revision using MarBLR and BLR.

Inputs:
- --data-file: pickle file with the entire data stream
- --model-file: pickle file with the model developer
- --type-i-regret-factor: Type I regret will be controlled at the rate of args.type_i_regret_factor * (Initial loss of the original model)
- --reference-recalibs: comma-separated string to select which other online model revisers to run. Options are no updating at all locked, ADAM adam, cumulative logistic regression cumulativeLR.
Outputs:
- --obs-scores-file: csv file containing predicted probabilities and observed outcomes on the data stream
- --history-file: csv file containing the predicted and actual probabilities on a held-out test data stream (only available if the data stream was simulated)
- --scores-file: csv file containing performance measures on a held-out test data stream (only available if the data stream was simulated)
- --recalibrators-file: pickle file containing the history of the online model revisers

Reproducing simulation results

The simulation_recalib folder contains the first set of simulations for online model recalibration. The simulation_revise folder contains the second set of simulations where we perform online logistic revision. The simulation_revise folder contains the third set of simulations where we perform online ensembling of the original model with a continually refitted model. The copd_analysis folder contains code for online model recalibration and revision for the COPD dataset. To reproduce the simulations, run scons.

Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

This codebase is being actively maintained, please create and issue if you have issues using it Basics All data files are included under losses and ea

32 Nov 9, 2021

Opinionated code formatter, just like Python's black code formatter but for Beancount

beancount-black Opinionated code formatter, just like Python's black code formatter but for Beancount Try it out online here Features MIT licensed - b

16 Oct 11, 2022

a delightful machine learning tool that allows you to train, test and use models without writing code

igel A delightful machine learning tool that allows you to train/fit, test and use models without writing code Note I'm also working on a GUI desktop

3k Jan 5, 2023

Pytorch Lightning code guideline for conferences

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

1k Jan 2, 2023

Automatically Build Multiple ML Models with a Single Line of Code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Auto-ViML Automatically Build Variant Interpretable ML models fast! Auto_ViML is pronounced "auto vimal" (autovimal logo created by Sanket Ghanmare) N

397 Dec 30, 2022

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Related tags

Overview

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Installation

File descriptions

Reproducing simulation results

You might also like...

Code for all the Advent of Code'21 challenges mostly written in python

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

Opinionated code formatter, just like Python's black code formatter but for Beancount

a delightful machine learning tool that allows you to train, test and use models without writing code

Pytorch Lightning code guideline for conferences

Automatically Build Multiple ML Models with a Single Line of Code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.

Code samples for my book "Neural Networks and Deep Learning"

Code for: https://berkeleyautomation.github.io/bags/

Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Owner

This is the official source code for SLATE. We provide the code for the model, the training code, and a dataset loader for the 3D Shapes dataset. This code is implemented in Pytorch.

TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.

A code generator from ONNX to PyTorch code

This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

Convert Python 3 code to CUDA code.

Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from dataset creation to model training and evaluation. Comes with pretrained models.

Code for the prototype tool in our paper "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning".

Low-code/No-code approach for deep learning inference on devices