RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

Related tags

Deep Learning RE3
Overview

State Entropy Maximization with Random Encoders for Efficient Exploration (RE3) (ICML 2021)

Code for State Entropy Maximization with Random Encoders for Efficient Exploration.

In this repository, we provide code for RE3 algorithm described in the paper linked above. We provide code in three sub-directories: rad_re3 containing code for the combination of RE3 and RAD, dreamer_re3 containing code for the combination of RE3 and Dreamer, and a2c_re3 containing code for the combination of RE3 and A2C.

We also provide raw data(.csv) and code for visualization in the data directory.

If you find this repository useful for your research, please cite:

@inproceedings{seo2021state,
  title={State Entropy Maximization with Random Encoders for Efficient Exploration},
  author={Seo, Younggyo and Chen, Lili and Shin, Jinwoo and Lee, Honglak and Abbeel, Pieter and Lee, Kimin},
  booktitle={International Conference on Machine Learning},
  year={2021}
}

RAD + RE3

Our code is built on top of the DrQ repository.

Installation

You could install all dependencies by following command:

conda env install -f conda_env.yml

You should also install custom version of dm_control to run experiments on Walker Run Sparse and Cheetah Run Sparse. You could do this by following command:

cd ../envs/dm_control
pip install .

Instructions

RAD

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3 use_state_entropy=false

RAD + RE3

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3

We provide all scripts to reproduce Figure 4 (RAD, RAD + RE3) in scripts directory.

Dreamer + RE3

Our code is built on top of the Dreamer repository.

Installation

You could install all dependencies by following command:

pip3 install --user tensorflow-gpu==2.2.0
pip3 install --user tensorflow_probability
pip3 install --user git+git://github.com/deepmind/dm_control.git
pip3 install --user pandas
pip3 install --user matplotlib

# Install custom dm_control environments for walker_run_sparse / cheetah_run_sparse
cd ../envs/dm_control
pip3 install .

Instructions

Dreamer

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer/12345 --task dmc_pendulum_swingup --precision 32 --beta 0.0 --seed 12345

Dreamer + RE3

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer_re3/12345 --task dmc_pendulum_swingup --precision 32 --k 53 --beta 0.1 --seed 12345

We provide all scripts to reproduce Figure 4 (Dreamer, Dreamer + RE3) in scripts directory.

A2C + RE3

Training code can be found in rl-starter-files directory, which is forked from rl-starter-files, which uses a modified A2C implementation from torch-ac. Note that currently there is only support for A2C.

Installation

All of the dependencies are in the requirements.txt file in rl-starter-files. They can be installed manually or with the following command:

pip3 install -r requirements.txt

You will also need to install our cloned version of torch-ac with these commands:

cd torch-ac
pip3 install -e .

Instructions

See instructions in rl-starter-files directory. Example scripts can be found in rl-starter-files/rl-starter-files/run_sent.sh.

You might also like...
Final project code: Implementing MAE with downscaled encoders and datasets, for ESE546 FA21 at University of Pennsylvania

546 Final Project: Masked Autoencoder Haoran Tang, Qirui Wu 1. Training To train the network, please run mae_pretraining.py. Please modify folder path

[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

AMOS This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: Pretraining Text Encoders wi

Random-Afg - Afghanistan Random Old Idz Cloner Tools
Random-Afg - Afghanistan Random Old Idz Cloner Tools

AFGHANISTAN RANDOM OLD IDZ CLONER TOOLS Install $ apt update $ apt upgrade $ apt

AntroPy: entropy and complexity of (EEG) time-series in Python
AntroPy: entropy and complexity of (EEG) time-series in Python

AntroPy is a Python 3 package providing several time-efficient algorithms for computing the complexity of time-series. It can be used for example to e

PyTorch code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised DA
PyTorch code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised DA

PyTorch Code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation Viraj Prabhu, Shivam Khare, Deeks

ICLR21 Tent: Fully Test-Time Adaptation by Entropy Minimization

⛺️ Tent: Fully Test-Time Adaptation by Entropy Minimization This is the official project repository for Tent: Fully-Test Time Adaptation by Entropy Mi

noisy labels; missing labels; semi-supervised learning; entropy; uncertainty; robustness and generalisation.

ProSelfLC: CVPR 2021 ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks For any specific discussion or potential fu

Implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hashing by Maximizing Bit Entropy
Implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hashing by Maximizing Bit Entropy

Deep Unsupervised Image Hashing by Maximizing Bit Entropy This is the PyTorch implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hash

Semi-supervised Domain Adaptation via Minimax Entropy
Semi-supervised Domain Adaptation via Minimax Entropy

Semi-supervised Domain Adaptation via Minimax Entropy (ICCV 2019) Install pip install -r requirements.txt The code is written for Pytorch 0.4.0, but s

Comments
  • Bump lxml from 4.6.2 to 4.6.3 in /envs/dm_control

    Bump lxml from 4.6.2 to 4.6.3 in /envs/dm_control

    Bumps lxml from 4.6.2 to 4.6.3.

    Changelog

    Sourced from lxml's changelog.

    4.6.3 (2021-03-21)

    Bugs fixed

    • A vulnerability (CVE-2021-28957) was discovered in the HTML Cleaner by Kevin Chung, which allowed JavaScript to pass through. The cleaner now removes the HTML5 formaction attribute.
    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump lxml from 4.5.0 to 4.6.2 in /envs/dm_control

    Bump lxml from 4.5.0 to 4.6.2 in /envs/dm_control

    Bumps lxml from 4.5.0 to 4.6.2.

    Changelog

    Sourced from lxml's changelog.

    4.6.2 (2020-11-26)

    Bugs fixed

    • A vulnerability (CVE-2020-27783) was discovered in the HTML Cleaner by Yaniv Nizry, which allowed JavaScript to pass through. The cleaner now removes more sneaky "style" content.

    4.6.1 (2020-10-18)

    Bugs fixed

    • A vulnerability was discovered in the HTML Cleaner by Yaniv Nizry, which allowed JavaScript to pass through. The cleaner now removes more sneaky "style" content.

    4.6.0 (2020-10-17)

    Features added

    • GH#310: lxml.html.InputGetter supports __len__() to count the number of input fields. Patch by Aidan Woolley.

    • lxml.html.InputGetter has a new .items() method to ease processing all input fields.

    • lxml.html.InputGetter.keys() now returns the field names in document order.

    • GH-309: The API documentation is now generated using sphinx-apidoc. Patch by Chris Mayo.

    Bugs fixed

    • LP#1869455: C14N 2.0 serialisation failed for unprefixed attributes when a default namespace was defined.

    • TreeBuilder.close() raised AssertionError in some error cases where it should have raised XMLSyntaxError. It now raises a combined exception to keep up backwards compatibility, while switching to XMLSyntaxError as an interface.

    4.5.2 (2020-07-09)

    ... (truncated)

    Commits
    • 4cb5736 Work around Py2's lack of "re.ASCII".
    • c30106f Prepare release of 4.6.2.
    • a105ab8 Prevent combinations of <math/svg> and <style> to sneak JavaScript through th...
    • c053dc1 Add a recipe for a look-ahead generator to allow modifications during tree it...
    • b083124 lxml actually works in Py3.9.
    • 0f80590 lxml actually works in Py3.9.
    • fd8893c Add a doc note that the .find() methods are usually faster than one might exp...
    • eb6df27 Update release version on homepage.
    • 69b5c9b Automate the build artefact downloading from github and appveyor.
    • 61432a8 Prepare release of lxml 4.6.1.
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump pillow from 8.0.1 to 8.1.1 in /envs/dm_control

    Bump pillow from 8.0.1 to 8.1.1 in /envs/dm_control

    Bumps pillow from 8.0.1 to 8.1.1.

    Release notes

    Sourced from pillow's releases.

    8.1.1

    https://pillow.readthedocs.io/en/stable/releasenotes/8.1.1.html

    8.1.0

    https://pillow.readthedocs.io/en/stable/releasenotes/8.1.0.html

    Changes

    Dependencies

    Deprecations

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    8.1.1 (2021-03-01)

    • Use more specific regex chars to prevent ReDoS. CVE-2021-25292 [hugovk]

    • Fix OOB Read in TiffDecode.c, and check the tile validity before reading. CVE-2021-25291 [wiredfool]

    • Fix negative size read in TiffDecode.c. CVE-2021-25290 [wiredfool]

    • Fix OOB read in SgiRleDecode.c. CVE-2021-25293 [wiredfool]

    • Incorrect error code checking in TiffDecode.c. CVE-2021-25289 [wiredfool]

    • PyModule_AddObject fix for Python 3.10 #5194 [radarhere]

    8.1.0 (2021-01-02)

    • Fix TIFF OOB Write error. CVE-2020-35654 #5175 [wiredfool]

    • Fix for Read Overflow in PCX Decoding. CVE-2020-35653 #5174 [wiredfool, radarhere]

    • Fix for SGI Decode buffer overrun. CVE-2020-35655 #5173 [wiredfool, radarhere]

    • Fix OOB Read when saving GIF of xsize=1 #5149 [wiredfool]

    • Makefile updates #5159 [wiredfool, radarhere]

    • Add support for PySide6 #5161 [hugovk]

    • Use disposal settings from previous frame in APNG #5126 [radarhere]

    • Added exception explaining that repr_png saves to PNG #5139 [radarhere]

    • Use previous disposal method in GIF load_end #5125 [radarhere]

    ... (truncated)

    Commits
    • 741d874 8.1.1 version bump
    • 179cd1c Added 8.1.1 release notes to index
    • 7d29665 Update CHANGES.rst [ci skip]
    • d25036f Credits
    • 973a4c3 Release notes for 8.1.1
    • 521dab9 Use more specific regex chars to prevent ReDoS
    • 8b8076b Fix for CVE-2021-25291
    • e25be1e Fix negative size read in TiffDecode.c
    • f891baa Fix OOB read in SgiRleDecode.c
    • cbfdde7 Incorrect error code checking in TiffDecode.c
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
Owner
Younggyo Seo
Ph.D Student @ Graduate School of AI, KAIST
Younggyo Seo
Code for paper ECCV 2020 paper: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop.

Who Left the Dogs Out? Evaluation and demo code for our ECCV 2020 paper: Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization

Benjamin Biggs 29 Dec 28, 2022
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

null 15 Dec 27, 2022
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at EMNLP 2021.

MultiModal-InfoMax This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Informa

Deep Cognition and Language Research (DeCLaRe) Lab 89 Dec 26, 2022
Python Implementation of algorithms in Graph Mining, e.g., Recommendation, Collaborative Filtering, Community Detection, Spectral Clustering, Modularity Maximization, co-authorship networks.

Graph Mining Author: Jiayi Chen Time: April 2021 Implemented Algorithms: Network: Scrabing Data, Network Construbtion and Network Measurement (e.g., P

Jiayi Chen 3 Mar 3, 2022
Self-Supervised Learning with Kernel Dependence Maximization

Self-Supervised Learning with Kernel Dependence Maximization This is the code for SSL-HSIC, a self-supervised learning loss proposed in the paper Self

DeepMind 29 Dec 29, 2022
Joint learning of images and text via maximization of mutual information

mutual_info_img_txt Joint learning of images and text via maximization of mutual information. This repository incorporates the algorithms presented in

Ruizhi Liao 10 Dec 22, 2022
This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

TransUNet This repo holds code for TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation Usage

null 1.4k Jan 4, 2023
TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks

TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks [Paper] [Project Website] This repository holds the source code, pretra

Humam Alwassel 83 Dec 21, 2022
GAN encoders in PyTorch that could match PGGAN, StyleGAN v1/v2, and BigGAN. Code also integrates the implementation of these GANs.

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions. This is the official code release fo

owl 37 Dec 24, 2022
PyTorch Implement of Context Encoders: Feature Learning by Inpainting

Context Encoders: Feature Learning by Inpainting This is the Pytorch implement of CVPR 2016 paper on Context Encoders 1) Semantic Inpainting Demo Inst

null 321 Dec 25, 2022