# Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study

Supplementary materials for: Kentaro Matsuura, Junya Honda, Imad El Hanafi, Takashi Sozu, Kentaro Sakamaki. "Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study." *Statistics in Medicine* 202x (doi: xxxxx).
## How to Set Up

We recommend using Linux, or WSL on Windows, because the Ray package for Python is more stable on Linux. For example, on Ubuntu 20.04 (where Python 3.8 is preinstalled), we were able to install the necessary packages with the following commands.

### Install Ray
```
sudo apt update
sudo apt upgrade
sudo apt install python3-pip
sudo pip3 install tensorflow numpy pandas gym
sudo apt install cmake
sudo pip3 install -U ray
sudo pip3 install 'ray[rllib]'
```
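To confirm the installation succeeded, a quick sanity check along these lines (our suggestion, not a script in this repository) can be run in Python:

```python
# Sanity check: the core dependencies import and a local Ray instance starts.
import gym
import numpy as np
import ray
import tensorflow as tf

ray.init(num_cpus=1)
print("ray", ray.__version__)
print("tensorflow", tf.__version__)
ray.shutdown()
```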
### Install R and RPy2
```
echo -e "\n## For R package" | sudo tee -a /etc/apt/sources.list
echo "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/" | sudo tee -a /etc/apt/sources.list
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo apt update
sudo apt install make g++ r-base
sudo apt install libxml2-dev libssl-dev libcurl4-openssl-dev
sudo pip3 install rpy2
```
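If you want to verify that rpy2 can find the system R installation, the following minimal check (not part of the repository) should print the R version:

```python
# Minimal rpy2 check: evaluate an R expression from Python.
import rpy2.robjects as ro

print(ro.r("R.version.string")[0])  # e.g. "R version 4.0.x (...)"
```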
### Install DoseFinding package in R

```
install.packages('DoseFinding')
```
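Because the environment calls DoseFinding through rpy2, it is also worth confirming that the package is visible from Python. A minimal check (our suggestion):

```python
# Load the DoseFinding R package from Python via rpy2.
from rpy2.robjects.packages import importr

DoseFinding = importr("DoseFinding")
print(DoseFinding.__rname__)  # prints "DoseFinding" on success
```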
## How to Use

### Change simulation settings
To change the simulation settings, you need to understand `MCPMod/envs/MCPModEnv.py`. This part is somewhat difficult because of the interaction between R and Python; we therefore plan to create an R package that makes our method easier to use.
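As a rough orientation before reading the actual file, the pattern is a `gym.Env` whose terminal reward is computed by calling into R through rpy2. The sketch below is a simplified illustration only: all names, spaces, and the reward logic are placeholders, not the real `MCPModEnv` (which fits MCPMod with the DoseFinding package at the end of the trial).

```python
import gym
import numpy as np
import rpy2.robjects as ro
from gym import spaces


class ToyDoseEnv(gym.Env):
    """Toy dose-allocation environment; the real MCPModEnv is more involved."""

    def __init__(self, n_doses=5, n_patients=150, cohort_size=10):
        self.n_doses = n_doses
        self.n_patients = n_patients
        self.cohort_size = cohort_size
        # Action: the dose to which the next cohort is allocated.
        self.action_space = spaces.Discrete(n_doses)
        # Observation: allocation counts per dose (simplified).
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(n_doses,), dtype=np.float32)

    def reset(self):
        self.counts = np.zeros(self.n_doses, dtype=np.float32)
        return self.counts.copy()

    def step(self, action):
        self.counts[action] += self.cohort_size
        done = bool(self.counts.sum() >= self.n_patients)
        reward = 0.0
        if done:
            # The real environment scores the completed trial in R here
            # (MCPMod fitting via DoseFinding); this placeholder only
            # evaluates a trivial R expression to show the rpy2 call.
            r_sum = ro.r("sum")(ro.FloatVector(self.counts))
            reward = float(r_sum[0]) / self.n_patients
        return self.counts.copy(), reward, done, {}
```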
### Obtain an adaptive allocation rule
To obtain RL-MAE by learning, run `learn_RL-MAE.py`, for example:

```
nohup python3 learn_RL-MAE.py > std.log 2> err.log &
```
To obtain the other RL methods, change `reward_type` on line 25 of `learn_RL-MAE.py` to another value such as `score_TD`, then run the modified file.
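For orientation, the training setup presumably looks something like the following RLlib (Ray 1.x) sketch. Only `reward_type`, the `score_TD` value, and the `MCPMod-v0` environment name come from this document; the configuration keys, the stopping rule, and the way `reward_type` reaches the environment are our assumptions, and the actual script registers `MCPMod-v0` before training:

```python
import ray
from ray import tune

# Line 25 of learn_RL-MAE.py: switch the reward here, e.g. to score_TD.
reward_type = "score_TD"

ray.init()
tune.run(
    "PPO",
    config={
        "env": "MCPMod-v0",                          # registered by the repository code
        "env_config": {"reward_type": reward_type},  # hypothetical key
        "framework": "tf",
    },
    stop={"training_iteration": 1000},  # hypothetical stopping rule
    checkpoint_at_end=True,
)
```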
When we used a `c2-standard-4` instance (4 vCPUs, 16 GB RAM) on Google Cloud Platform, the learning completed within a day.
### Simulate a single trial
After the learning, a checkpoint is saved in `~/ray_results/PPO_MCPMod-v0_[datetime]-[xxx]/checkpoint-[yyy]/`. To simulate a single trial using the obtained rule, move the checkpoint files (`checkpoint` and `checkpoint.tune_metadata`) from that directory to `checkpoint/` in this repository, and rename them as you like (see the example files). Then run `simulate-single-trial_RL-MAE.py`:

```
python3 simulate-single-trial_RL-MAE.py
```
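Internally, the script presumably restores the PPO agent from the checkpoint and steps the environment with the greedy policy. A minimal sketch of that pattern with the Ray 1.x RLlib API (the checkpoint path is an example, and `MCPMod-v0` must already be registered with gym):

```python
import gym
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
trainer = PPOTrainer(config={"env": "MCPMod-v0", "framework": "tf"})
trainer.restore("checkpoint/checkpoint")  # path to the renamed checkpoint file

# Run one simulated trial with the learned allocation rule.
env = gym.make("MCPMod-v0")
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = trainer.compute_action(obs, explore=False)  # deterministic allocation
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("trial reward:", total_reward)
```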