Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Jack Parker-Holder

Last update: Nov 16, 2022

Overview

Population-Based Bandits (PB2)

Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits.

The framework combines ray (using rllib and tune) with GPy, and is heavily inspired by the ray tune pbt_ppo example.

NOTE: PB2 is included in the ray.tune library, which is the officially supported implementation. The link to the code is here, and the accompanying blog post is here.
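For readers who want to try the ray.tune version, a minimal sketch of constructing the scheduler might look like the following. It assumes a ray installation in which PB2 can be imported from ray.tune.schedulers.pb2 (this also needs GPy and scikit-learn). The metric name, the hyperparameter names, and their bounds are illustrative assumptions, not the settings used in the paper.

```python
def pb2_kwargs(perturbation_interval=50_000):
    """Keyword arguments for ray.tune's PB2 scheduler (illustrative values only)."""
    return {
        "time_attr": "timesteps_total",
        "metric": "episode_reward_mean",  # assumed name of the reported metric
        "mode": "max",
        "perturbation_interval": perturbation_interval,
        "quantile_fraction": 0.25,
        # Assumed search space: each entry is [lower bound, upper bound].
        "hyperparam_bounds": {
            "lr": [1e-5, 1e-3],
            "clip_param": [0.1, 0.5],
        },
    }


def make_pb2_scheduler(**overrides):
    """Build the scheduler; ray is imported lazily so the helper above works without it."""
    from ray.tune.schedulers.pb2 import PB2

    kwargs = pb2_kwargs()
    kwargs.update(overrides)
    return PB2(**kwargs)
```

The returned object would then be passed to tune as its scheduler, with one trial per population member.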

Running the Code

To run the IMPALA experiment, use the command:

python run_impala.py

To run the PPO experiment, use the command:

python run_ppo.py

Config

Each script exposes several settings. You can choose the following:

- env_name: the environment to train on, for example BreakoutNoFrameSkip-v4.
- method: either pb2 or pbt (or asha for PPO).
- freq: the frequency of hyperparameter updates; we use 500,000 for IMPALA and 50,000 for PPO.
- seed: the random seed; we used 0 1 2 3 4 5 6... and plan to add more seeds.
- max: the maximum number of timesteps; we used 10,000,000 for IMPALA and 1,000,000 for PPO.
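As a rough illustration of how these settings fit together, the sketch below parses them with argparse. The `--` flag spellings, the defaults (taken from the PPO values and the example environment above), and the helper `num_intervals` are assumptions made for this example; check run_impala.py and run_ppo.py for the actual interface.

```python
import argparse


def build_parser():
    """Hypothetical option parser mirroring the settings listed above."""
    parser = argparse.ArgumentParser(description="Sketch of PB2 experiment options")
    parser.add_argument("--env_name", type=str, default="BreakoutNoFrameSkip-v4")
    parser.add_argument("--method", type=str, default="pb2",
                        choices=["pb2", "pbt", "asha"])
    parser.add_argument("--freq", type=int, default=50_000)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--max", type=int, default=1_000_000)
    return parser


def num_intervals(max_timesteps, freq):
    """Number of update intervals in a run, assuming freq and max share one unit."""
    return max_timesteps // freq
```

Under that assumption, the values quoted above give the same number of update intervals for both algorithms: 10,000,000 / 500,000 = 20 for IMPALA and 1,000,000 / 50,000 = 20 for PPO.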

It should also be possible to adapt this code to run other ray tune schedulers. We used it for ASHA in our PPO experiments. We are also working to include a BOHB baseline.
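One way to make the scheduler swappable is to map the method name to a ray.tune scheduler class and its arguments. The sketch below is not taken from this repository: the function names, the metric name, the `lr` search space, and the choice of `freq` as ASHA's grace period are all assumptions for illustration. The class names and keyword arguments are those of ray.tune's PB2, PopulationBasedTraining, and ASHAScheduler.

```python
import importlib


def scheduler_spec(method, freq, max_t):
    """Return (module path, class name, kwargs) for the requested scheduler."""
    common = {
        "time_attr": "timesteps_total",
        "metric": "episode_reward_mean",  # assumed metric name
        "mode": "max",
    }
    if method == "pb2":
        kwargs = {**common, "perturbation_interval": freq,
                  "hyperparam_bounds": {"lr": [1e-5, 1e-3]}}
        return "ray.tune.schedulers.pb2", "PB2", kwargs
    if method == "pbt":
        kwargs = {**common, "perturbation_interval": freq,
                  "hyperparam_mutations": {"lr": [1e-5, 1e-4, 1e-3]}}
        return "ray.tune.schedulers", "PopulationBasedTraining", kwargs
    if method == "asha":
        # Assumption: use freq as the minimum budget before a trial can be stopped.
        kwargs = {**common, "max_t": max_t, "grace_period": freq}
        return "ray.tune.schedulers", "ASHAScheduler", kwargs
    raise ValueError(f"unknown method: {method!r}")


def make_scheduler(method, freq, max_t):
    """Instantiate the scheduler; this step requires ray to be installed."""
    module_name, class_name, kwargs = scheduler_spec(method, freq, max_t)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)(**kwargs)
```

Adding another scheduler then amounts to one more branch in `scheduler_spec`.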

Please get in touch with any questions: jackph [at] robots [dot] ox [dot] ac [dot] uk

Citing PB2

Finally, if you found this repo useful, please consider citing us:

@inproceedings{NEURIPS2020_c7af0926,
 author = {Parker-Holder, Jack and Nguyen, Vu and Roberts, Stephen J},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {17200--17211},
 publisher = {Curran Associates, Inc.},
 title = {Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits},
 url = {https://proceedings.neurips.cc/paper/2020/file/c7af0926b294e47e52e46cfebe173f20-Paper.pdf},
 volume = {33},
 year = {2020}
}
