Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2

Overview

DreamerPro

Official implementation of DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations in TensorFlow 2. A re-implementation of Temporal Predictive Coding for Model-Based Planning in Latent Space is also included.

DreamerPro makes large performance gains on the DeepMind Control suite both in the standard setting and when there are complex background distractions. This is achieved by combining Dreamer with prototypical representations that free the world model from reconstructing visual details.

Setup

Dependencies

First clone the repository, and then set up a conda environment with all required dependencies using the requirements.txt file:

git clone https://github.com/fdeng18/dreamer-pro.git
cd dreamer-pro
conda create --name dreamer-pro python=3.8 conda-forge::cudatoolkit conda-forge::cudnn
conda activate dreamer-pro
pip install --upgrade pip
pip install -r requirements.txt

DreamerPro has not been tested on Atari, but if you would like to try, the Atari ROMs can be imported by following these instructions.

Natural background videos

Our natural background setting follows TPC. For convenience, we have included their code to download the background videos. Simply run:

python download_videos.py

This will download the background videos into kinetics400/videos.

Training

DreamerPro

For standard DMC, run:

cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer_pro/{run} --task dmc_{task} --configs defaults dmc norm_off

Here, {task} should be replaced by the actual task, and {run} should be assigned an integer indicating the independent runs of the same model on the same task. For example, to start the first run on walker_run:

cd DreamerPro
python dreamerv2/train.py --logdir log/dmc_walker_run/dreamer_pro/1 --task dmc_walker_run --configs defaults dmc norm_off

For natural background DMC, run:

cd DreamerPro
python dreamerv2/train.py --logdir log/nat_{task}/dreamer_pro/{run} --task nat_{task} --configs defaults dmc reward_1000

TPC

DreamerPro is based on a newer version of Dreamer. For fair comparison, we re-implement TPC based on the same version. Our re-implementation obtains better results in the natural background setting than reported in the original TPC paper.

For standard DMC, run:

cd TPC
python dreamerv2/train.py --logdir log/dmc_{task}/tpc/{run} --task dmc_{task} --configs defaults dmc

For natural background DMC, run:

cd TPC
python dreamerv2/train.py --logdir log/nat_{task}/tpc/{run} --task nat_{task} --configs defaults dmc reward_1000

Dreamer

For standard DMC, run:

cd Dreamer
python dreamerv2/train.py --logdir log/dmc_{task}/dreamer/{run} --task dmc_{task} --configs defaults dmc

For natural background DMC, run:

cd Dreamer
python dreamerv2/train.py --logdir log/nat_{task}/dreamer/{run} --task nat_{task} --configs defaults dmc reward_1000 --precision 32

We find it necessary to use --precision 32 in the natural background setting for numerical stability.

Outputs

The training process can be monitored via TensorBoard. We have also included performance curves in plots. Note that these curves may appear different from what is shown in TensorBoard. This is because the evaluation return in the performance curves is averaged over 10 episodes, while TensorBoard only shows the evaluation return of the last episode.

Acknowledgments

This repository is largely based on the TensorFlow 2 implementation of Dreamer. We would like to thank Danijar Hafner for releasing and updating his clean implementation. In addition, we also greatly appreciate the help from Tung Nguyen in implementing TPC.

You might also like...
Tensorflow 2 implementation of the paper: Learning and Evaluating Representations for Deep One-class Classification published at ICLR 2021

Deep Representation One-class Classification (DROC). This is not an officially supported Google product. Tensorflow 2 implementation of the paper: Lea

CURL: Contrastive Unsupervised Representations for Reinforcement Learning

CURL Rainbow Status: Archive (code is provided as-is, no updates expected) This is an implementation of CURL: Contrastive Unsupervised Representations

Pretraining Representations For Data-Efficient Reinforcement Learning

Pretraining Representations For Data-Efficient Reinforcement Learning Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Ch

Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX
Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX (FLAX). Implementation is built on

Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Model-based 3D Hand Reconstruction via Self-Supervised Learning, CVPR2021

S2HAND: Model-based 3D Hand Reconstruction via Self-Supervised Learning S2HAND presents a self-supervised 3D hand reconstruction network that can join

[IJCAI-2021] A benchmark of data-free knowledge distillation from paper
[IJCAI-2021] A benchmark of data-free knowledge distillation from paper "Contrastive Model Inversion for Data-Free Knowledge Distillation"

DataFree A benchmark of data-free knowledge distillation from paper "Contrastive Model Inversion for Data-Free Knowledge Distillation" Authors: Gongfa

The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

A repository that shares tuning results of trained models generated by TensorFlow / Keras. Post-training quantization (Weight Quantization, Integer Quantization, Full Integer Quantization, Float16 Quantization), Quantization-aware training. TensorFlow Lite. OpenVINO. CoreML. TensorFlow.js. TF-TRT. MediaPipe. ONNX. [.tflite,.h5,.pb,saved_model,tfjs,tftrt,mlmodel,.xml/.bin, .onnx]
Owner
null
A Planar RGB-D SLAM which utilizes Manhattan World structure to provide optimal camera pose trajectory while also providing a sparse reconstruction containing points, lines and planes, and a dense surfel-based reconstruction.

ManhattanSLAM Authors: Raza Yunus, Yanyan Li and Federico Tombari ManhattanSLAM is a real-time SLAM library for RGB-D cameras that computes the camera

null 117 Dec 28, 2022
Prototypical python implementation of the trust-region algorithm presented in Sequential Linearization Method for Bound-Constrained Mathematical Programs with Complementarity Constraints by Larson, Leyffer, Kirches, and Manns.

Prototypical python implementation of the trust-region algorithm presented in Sequential Linearization Method for Bound-Constrained Mathematical Programs with Complementarity Constraints by Larson, Leyffer, Kirches, and Manns.

null 3 Dec 2, 2022
Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation (CVPR 2021)

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation (CVPR 2021, official Pytorch implementatio

Microsoft 247 Dec 25, 2022
Prototypical Networks for Few shot Learning in PyTorch

Prototypical Networks for Few shot Learning in PyTorch Simple alternative Implementation of Prototypical Networks for Few Shot Learning (paper, code)

Orobix 835 Jan 8, 2023
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

null 46 Nov 9, 2022
Label-Free Model Evaluation with Semi-Structured Dataset Representations

Label-Free Model Evaluation with Semi-Structured Dataset Representations Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch

null 8 Oct 6, 2022
"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction (CVPRW 2022) Yuanhao Cai, Jing Lin, Zudi Lin, Haoqian Wang, Yulun Z

Yuanhao Cai 274 Jan 5, 2023
Vanilla and Prototypical Networks with Random Weights for image classification on Omniglot and mini-ImageNet. Made with Python3.

vanilla-rw-protonets-project Vanilla Prototypical Networks and PNs with Random Weights for image classification on Omniglot and mini-ImageNet. Made wi

Giovani Candido 8 Aug 31, 2022
Cross-Image Region Mining with Region Prototypical Network for Weakly Supervised Segmentation

Cross-Image Region Mining with Region Prototypical Network for Weakly Supervised Segmentation The code of: Cross-Image Region Mining with Region Proto

LiuWeide 16 Nov 26, 2022
Semi-Supervised Graph Prototypical Networks for Hyperspectral Image Classification, IGARSS, 2021.

Semi-Supervised Graph Prototypical Networks for Hyperspectral Image Classification, IGARSS, 2021. Bobo Xi, Jiaojiao Li, Yunsong Li and Qian Du. Code f

Bobo Xi 7 Nov 3, 2022