RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

Younggyo Seo

Last update: Nov 29, 2022


Overview

State Entropy Maximization with Random Encoders for Efficient Exploration (RE3) (ICML 2021)

Code for State Entropy Maximization with Random Encoders for Efficient Exploration.

In this repository, we provide code for the RE3 algorithm described in the paper linked above. The code is organized in three sub-directories: rad_re3 contains the combination of RE3 and RAD, dreamer_re3 contains the combination of RE3 and Dreamer, and a2c_re3 contains the combination of RE3 and A2C.
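For orientation, the core idea from the paper can be sketched in a few lines: observations are embedded by a randomly initialized encoder that is never trained, and each state receives an intrinsic reward of log(distance to its k-th nearest neighbor + 1) in that embedding space. The following is a minimal, dependency-free sketch and not the code in this repository. The linear projection standing in for the convolutional encoder, the exclusion of a point from its own neighbor list, and the names make_random_encoder and intrinsic_rewards are all assumptions made for illustration.

```python
import math
import random


def make_random_encoder(in_dim, out_dim, seed=0):
    """Return a fixed random linear map (a stand-in for a frozen random CNN)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(obs):
        # The weights are captured once and never updated.
        return [sum(w * x for w, x in zip(row, obs)) for row in weights]

    return encode


def intrinsic_rewards(embeddings, k=3):
    """Reward each embedding with log(1 + distance to its k-th nearest neighbor)."""
    rewards = []
    for i, y in enumerate(embeddings):
        # Distances to every other embedding, sorted from nearest to farthest.
        dists = sorted(
            math.dist(y, other) for j, other in enumerate(embeddings) if j != i
        )
        rewards.append(math.log(dists[k - 1] + 1.0))
    return rewards


if __name__ == "__main__":
    encode = make_random_encoder(in_dim=4, out_dim=8)
    rng = random.Random(1)
    # Five observations clustered near the origin plus one far-away observation.
    observations = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(5)]
    observations.append([5.0, 5.0, 5.0, 5.0])
    print(intrinsic_rewards([encode(o) for o in observations], k=3))
```

States that lie far from previously seen embeddings receive larger rewards, which is what pushes the agent toward less-visited regions. The actual implementations in rad_re3, dreamer_re3 and a2c_re3 should be consulted for the encoder architecture and reward normalization used in the experiments.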

We also provide raw data (.csv) and visualization code in the data directory.
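As a rough idea of how per-seed CSV logs could be combined before plotting, the sketch below averages one column across several runs. The layout of the files in the data directory is not described here, so the column names step and episode_reward and the helper mean_curve are hypothetical; adjust them to match the actual files.

```python
import csv
import io
from collections import defaultdict


def mean_curve(csv_files, step_col="step", value_col="episode_reward"):
    """Average `value_col` per `step_col` across several runs.

    `csv_files` is an iterable of open text files (one per seed). The column
    names are hypothetical defaults; pass the names used by the actual files.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for handle in csv_files:
        for row in csv.DictReader(handle):
            step = int(float(row[step_col]))
            totals[step] += float(row[value_col])
            counts[step] += 1
    # One (step, mean value) pair per step, in increasing step order.
    return [(step, totals[step] / counts[step]) for step in sorted(totals)]


if __name__ == "__main__":
    # Two tiny in-memory "runs" stand in for per-seed CSV files.
    run_a = io.StringIO("step,episode_reward\n0,1.0\n1000,3.0\n")
    run_b = io.StringIO("step,episode_reward\n0,2.0\n1000,5.0\n")
    for step, value in mean_curve([run_a, run_b]):
        print(step, value)
```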

If you find this repository useful for your research, please cite:

@inproceedings{seo2021state,
  title={State Entropy Maximization with Random Encoders for Efficient Exploration},
  author={Seo, Younggyo and Chen, Lili and Shin, Jinwoo and Lee, Honglak and Abbeel, Pieter and Lee, Kimin},
  booktitle={International Conference on Machine Learning},
  year={2021}
}

RAD + RE3

Our code is built on top of the DrQ repository.

Installation

You can install all dependencies with the following command:

conda env create -f conda_env.yml

You should also install the custom version of dm_control to run experiments on Walker Run Sparse and Cheetah Run Sparse. You can do this with the following commands:

cd ../envs/dm_control
pip install .

Instructions

RAD

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3 use_state_entropy=false

RAD + RE3

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3

We provide all scripts to reproduce Figure 4 (RAD, RAD + RE3) in the scripts directory.

Dreamer + RE3

Our code is built on top of the Dreamer repository.

Installation

You can install all dependencies with the following commands:

pip3 install --user tensorflow-gpu==2.2.0
pip3 install --user tensorflow_probability
pip3 install --user git+https://github.com/deepmind/dm_control.git
pip3 install --user pandas
pip3 install --user matplotlib

# Install custom dm_control environments for walker_run_sparse / cheetah_run_sparse
cd ../envs/dm_control
pip3 install .

Instructions

Dreamer

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer/12345 --task dmc_pendulum_swingup --precision 32 --beta 0.0 --seed 12345

Dreamer + RE3

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer_re3/12345 --task dmc_pendulum_swingup --precision 32 --k 53 --beta 0.1 --seed 12345

We provide all scripts to reproduce Figure 4 (Dreamer, Dreamer + RE3) in the scripts directory.
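When launching several runs by hand, it can help to generate the command lines programmatically. The sketch below rebuilds the two example commands above for an arbitrary task and seed. It uses only the flags shown in those examples; the helper name dreamer_command and the seed list are assumptions made for illustration, not part of the repository.

```python
import shlex


def dreamer_command(task, seed, use_re3, k=53, beta=0.1):
    """Build the dreamer.py argument list for one run.

    The flags mirror the example commands above; the directory layout
    ./logdir/<task>/<method>/<seed> follows the same examples.
    """
    method = "dreamer_re3" if use_re3 else "dreamer"
    args = [
        "python", "dreamer.py",
        "--logdir", f"./logdir/{task}/{method}/{seed}",
        "--task", task,
        "--precision", "32",
    ]
    if use_re3:
        # RE3 runs set the number of neighbors k and a non-zero beta.
        args += ["--k", str(k), "--beta", str(beta)]
    else:
        # The plain Dreamer baseline is run with beta set to zero.
        args += ["--beta", "0.0"]
    args += ["--seed", str(seed)]
    return args


if __name__ == "__main__":
    # The seed list is an arbitrary illustration, not the set used in the paper.
    for seed in (12345, 23451, 34512):
        print(shlex.join(dreamer_command("dmc_pendulum_swingup", seed, use_re3=True)))
```

Each returned list can be passed directly to subprocess.run, or printed with shlex.join and pasted into a shell.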

A2C + RE3

Training code can be found in the rl-starter-files directory, which is forked from rl-starter-files and uses a modified A2C implementation from torch-ac. Note that only A2C is currently supported.

Installation

All of the dependencies are in the requirements.txt file in rl-starter-files. They can be installed manually or with the following command:

pip3 install -r requirements.txt

You will also need to install our cloned version of torch-ac with these commands:

cd torch-ac
pip3 install -e .

Instructions

See the instructions in the rl-starter-files directory. Example scripts can be found in rl-starter-files/rl-starter-files/run_sent.sh.



Owner

Younggyo Seo

Ph.D Student @ Graduate School of AI, KAIST

