Code for the paper Jointly Efficient and Optimal Algorithms for Logistic Bandits, by Louis Faury, Marc Abeille, Clément Calauzènes and Kwang-Sun Jun.
Install
Clone the repository and run:
$ pip install .
Usage
This code implements the adaECOLog algorithms (OFU and TS variants), both from the aforementioned paper, along with several baselines (oldest to newest):
GLM-UCB from Filippi et al. 2010,
OL2M from Zhang et al. 2016,
GLOC from Jun et al. 2017,
LogUCB1 from Faury et al. 2020,
OFULog-r from Abeille et al. 2021.
Experiments can be run for several Logistic Bandit (i.e., structured Bernoulli feedback) environments, such as static and time-varying finite arm-sets, or infinite arm-sets (e.g., the unit ball).
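To make "structured Bernoulli feedback" concrete, here is a minimal standard-library sketch of the standard logistic bandit reward model: pulling an arm returns a Bernoulli reward whose mean is the sigmoid of the inner product between the arm and the unknown parameter. The function names (`sigmoid`, `pull`, `sample_unit_ball`) are illustrative assumptions and are not taken from this repository.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic link function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def pull(arm: list[float], theta_star: list[float], rng: random.Random) -> int:
    """Draw a Bernoulli reward with mean sigmoid(<arm, theta_star>)."""
    mean = sigmoid(sum(a * t for a, t in zip(arm, theta_star)))
    return 1 if rng.random() < mean else 0


def sample_unit_ball(dim: int, rng: random.Random) -> list[float]:
    """Sample a point uniformly from the unit ball in R^dim."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    radius = rng.random() ** (1.0 / dim)  # radius law for a uniform draw in the ball
    return [radius * x / norm for x in direction]


rng = random.Random(0)
theta_star = [3.0, 0.0]           # hypothetical parameter with norm 3
arm = sample_unit_ball(2, rng)    # one arm from the infinite (unit ball) arm-set
reward = pull(arm, theta_star, rng)
```

In this sketch, a static finite arm-set would be a fixed list of such arms, while a time-varying one would be redrawn at every round.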
Single Experiment
Single experiments (one algorithm for one environment) can be run with scripts/run_example.py. The script instantiates the algorithm and environment specified in the file scripts/configs/example_config.py and plots the regret.
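As a rough illustration of the quantity being plotted, the sketch below computes a cumulative pseudo-regret: at each round, the gap between the best expected reward in the arm-set and the expected reward of the arm actually played. It assumes the usual logistic bandit definition and a fixed finite arm-set; the helper names are hypothetical and the repository's own computation may differ.

```python
import math
from itertools import accumulate


def sigmoid(z):
    """Logistic link function."""
    return 1.0 / (1.0 + math.exp(-z))


def expected_reward(arm, theta_star):
    """Mean of the Bernoulli reward for a given arm."""
    return sigmoid(sum(a * t for a, t in zip(arm, theta_star)))


def cumulative_regret(played_arms, arm_set, theta_star):
    """Running sum of (best expected reward - expected reward of the played arm)."""
    best = max(expected_reward(a, theta_star) for a in arm_set)
    gaps = [best - expected_reward(a, theta_star) for a in played_arms]
    return list(accumulate(gaps))


arm_set = [[1.0, 0.0], [0.0, 1.0]]
theta_star = [1.0, 0.0]
# Playing the suboptimal arm first, then the optimal arm:
regret = cumulative_regret([[0.0, 1.0], [1.0, 0.0]], arm_set, theta_star)
```

Here the regret grows in the first round only, since the second round plays the optimal arm.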
Benchmark
Benchmarks can be run with scripts/run_all.py. This script runs experiments for every config file in scripts/configs/generated_configs/ and stores the results in scripts/logs/.
Plot results
You can use scripts/plot_regret.py to plot regret curves. This script plots regret curves for all logs in scripts/logs/ that match the indicated dimension and parameter norm.
usage: plot_regret.py [-h] [-d [D]] [-pn [PN]]
Plot regret curves (by default for dimension=2 and parameter norm=3)
optional arguments:
-h, --help show this help message and exit
-d [D] Dimension (default: 2)
-pn [PN] Parameter norm (default: 4.0)
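The bracketed form `[-d [D]]` in the usage line is what `argparse` prints for an option declared with `nargs="?"`. The sketch below is a reconstruction from the help text only, not the script's actual source: the `int`/`float` types are assumptions, and the defaults mirror the option listing above.

```python
import argparse

# Reconstructed from the help text; the argument types are assumptions.
parser = argparse.ArgumentParser(
    prog="plot_regret.py",
    description="Plot regret curves",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument("-d", type=int, nargs="?", default=2, help="Dimension")
parser.add_argument("-pn", type=float, nargs="?", default=4.0, help="Parameter norm")

# Equivalent to running: python plot_regret.py -d 2 -pn 3
args = parser.parse_args(["-d", "2", "-pn", "3"])
print(args.d, args.pn)  # 2 3.0
```

Passing both `-d` and `-pn` explicitly, as in this call, avoids relying on the defaults.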
Generating configs
You can automatically generate config files with scripts/generate_configs.py.
usage: generate_configs.py [-h] [-dims DIMS [DIMS ...]] [-pn PN [PN ...]] [-algos ALGOS [ALGOS ...]] [-r [R]] [-hz [HZ]] [-ast [AST]] [-ass [ASS]] [-fl [FL]]
Automatically creates configs, stored in configs/generated_configs/
optional arguments:
-h, --help show this help message and exit
-dims DIMS [DIMS ...]
Dimension (default: None)
-pn PN [PN ...] Parameter norm (||theta_star||) (default: None)
-algos ALGOS [ALGOS ...]
Algorithms. Possibilities include GLM-UCB, LogUCB1, OFULog-r, OL2M, GLOC or adaECOLog (default: None)
-r [R] # of independent runs (default: 20)
-hz [HZ] Horizon, normalized (later multiplied by sqrt(dim)) (default: 1000)
-ast [AST] Arm set type. Must be either fixed_discrete, tv_discrete or ball (default: fixed_discrete)
-ass [ASS] Arm set size, normalized (later multiplied by dim) (default: 10)
-fl [FL] Failure level, must be in (0,1) (default: 0.05)
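The `-hz` and `-ass` values are normalized, so the effective horizon and arm-set size depend on the dimension. The sketch below works through that scaling as described in the help text; the function name is hypothetical, and truncating the horizon to an integer is an assumption about how the script rounds.

```python
import math


def effective_settings(dim: int, hz: int = 1000, ass: int = 10) -> tuple[int, int]:
    """Return (horizon, arm_set_size) after scaling the normalized values."""
    horizon = int(hz * math.sqrt(dim))  # truncation is an assumption
    arm_set_size = ass * dim
    return horizon, arm_set_size


print(effective_settings(2))  # (1414, 20)
print(effective_settings(4))  # (2000, 40)
```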
For instance, running python generate_configs.py -dims 2 -pn 3 4 5 -algos GLM-UCB GLOC OL2M adaECOLog generates configs in dimension 2 for GLM-UCB, GLOC, OL2M and adaECOLog, for environments of ground-truth norm 3, 4 and 5 (all other settings are left at their defaults).
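Assuming the script produces one config per (dimension, parameter norm, algorithm) combination, the command above would yield 1 x 3 x 4 = 12 configs. The sketch below enumerates that grid; the dictionary keys are hypothetical and do not reflect the actual config file format.

```python
from itertools import product

dims = [2]
param_norms = [3, 4, 5]
algos = ["GLM-UCB", "GLOC", "OL2M", "adaECOLog"]

# One entry per combination; key names are illustrative only.
configs = [
    {"dim": d, "param_norm": pn, "algo_name": algo}
    for d, pn, algo in product(dims, param_norms, algos)
]

print(len(configs))  # 12
```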