ATAC: Adversarially Trained Actor Critic

Microsoft

Last update: Dec 8, 2022

Adversarially Trained Actor Critic for Offline Reinforcement Learning by Ching-An Cheng*, Tengyang Xie*, Nan Jiang, and Alekh Agarwal.
https://arxiv.org/abs/2202.02446

Setup

Clone the repository and create a conda environment.

git clone https://github.com/microsoft/ATAC.git
conda create -n atac python=3.8
cd ATAC

Prerequisite: Install Mujoco

(Optional) Install free mujoco210 for mujoco_py and mujoco211 for dm_control.

bash install_mujoco.sh
echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin:/usr/lib/nvidia" >> ~/.bashrc
source ~/.bashrc
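
After sourcing ~/.bashrc, it can help to confirm that LD_LIBRARY_PATH really contains the two directories appended above. The following is a minimal Python sketch, not part of the ATAC repository: the helper name missing_library_dirs is hypothetical, and it only inspects the environment variable, so it does not prove that MuJoCo itself is installed correctly.

```python
import os


def missing_library_dirs(ld_library_path, required):
    """Return the required directories that are absent from an LD_LIBRARY_PATH string."""
    # Normalize every entry (expand "~", drop trailing slashes) before comparing.
    present = {
        os.path.normpath(os.path.expanduser(entry))
        for entry in ld_library_path.split(":")
        if entry
    }
    return [
        directory
        for directory in required
        if os.path.normpath(os.path.expanduser(directory)) not in present
    ]


if __name__ == "__main__":
    # The two directories that the echo command above appends.
    required = ["~/.mujoco/mujoco210/bin", "/usr/lib/nvidia"]
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("missing entries:", missing_library_dirs(current, required))
```

An empty list means both entries are present in the current shell; anything else suggests ~/.bashrc has not been sourced in this session.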

Install ATAC

conda activate atac
pip install -e .[mujoco210]
# or below, if the original paid mujoco is used.
pip install -e .[mujoco200]

Run ATAC

python scripts/main.py
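
To run several seeds, the command line can be assembled programmatically. The sketch below only builds and prints the commands; it does not launch training. The flags -e, --gpu_id, and --seed are taken from a command quoted in the comments on this page, while the helper name atac_command and the choice of seeds are hypothetical. Consult scripts/main.py for the options it actually supports.

```python
import shlex


def atac_command(env_name, seed, gpu_id=0, script="scripts/main.py"):
    """Build the argument list for one run of the training script."""
    return [
        "python", script,
        "-e", env_name,
        "--gpu_id", str(gpu_id),
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Print one command per seed; pass a list to subprocess.run to execute it.
    for seed in range(3):
        print(shlex.join(atac_command("hopper-medium-expert-v2", seed)))
```

Keeping the arguments as a list avoids shell-quoting problems if a command is later passed to subprocess.run.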

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Comments

Difference with Conservative-Q Learning (CQL)

The relative pessimism (1)(2)(2) proposed in ATAC seems to be exactly the same as the learning objective (3) in [1], and Algorithm 2 in ATAC looks remarkably similar to Algorithm 1 in [1] if some implementation caveats are omitted. Could you explain the major difference between ATAC and CQL?

[1] Kumar et al. Conservative Q-Learning for Offline Reinforcement Learning. NeurIPS 2020.
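
For readers following this question, the relative-pessimism formulation under discussion can be written roughly as below. This is a sketch of the population-level objective as recalled from the ATAC paper (arXiv:2202.02446), not a quotation: here \mu is the data distribution, \Pi the policy class, \mathcal{F} the critic class, \mathcal{T}^{\pi} the Bellman operator for \pi, and \beta \ge 0 a trade-off weight. The equation numbering and exact notation should be checked against the paper.

```latex
\begin{aligned}
\hat{\pi} &\in \operatorname*{argmax}_{\pi \in \Pi} \; \mathcal{L}_{\mu}(\pi, f^{\pi}), \\
\text{s.t.}\quad f^{\pi} &\in \operatorname*{argmin}_{f \in \mathcal{F}} \; \mathcal{L}_{\mu}(\pi, f) + \beta\, \mathcal{E}_{\mu}(\pi, f), \\
\mathcal{L}_{\mu}(\pi, f) &:= \mathbb{E}_{\mu}\big[ f(s, \pi) - f(s, a) \big], \\
\mathcal{E}_{\mu}(\pi, f) &:= \mathbb{E}_{\mu}\big[ \big( (f - \mathcal{T}^{\pi} f)(s, a) \big)^{2} \big].
\end{aligned}
```

The first term compares the critic's value of the learned policy's action with its value of the logged action at the same state, which is why the pessimism is called relative; the second term penalizes Bellman inconsistency on the data.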

opened by emailweixu 5

Why does training end at epoch 50?

Hello, I have tried to reproduce the results reported in the ATAC paper. However, when I run the official code, the experiment automatically ends at epoch 50, and I cannot find the cause. Could you help? For example, I ran 'python scripts/main.py -e hopper-medium-expert-v2 --gpu_id 0 --seed 15'. Are there any other hyperparameters that need to be given? @chinganc

opened by yuxudong20 4

Question about D4RL MuJoCo benchmark

Thanks for sharing the code. I have one question. It seems that you are using D4RL v2 (C.2), and in Table 1 you mention that "the baseline results are from the respective papers". However, some previous papers used D4RL v0. I believe the buffer quality differs between v0 and v2 (see the TD3+BC paper), so the comparison might be biased.

opened by HYDesmondLiu 4

[Bug in win11] when I run "main.py"

Hi, thank you very much for providing such a good open-source algorithm. I am running into problems when running "main.py" on Windows.

Warning: Flow failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'flow'
Warning: FrankaKitchen failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'mujoco'
Warning: CARLA failed to import. Set the environment variable D4RL_SUPPRESS_IMPORT_ERROR=1 to suppress this message.
No module named 'carla'
pybullet build time: Nov 5 2022 13:03:11
Traceback (most recent call last):
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\record_writer.py", line 58, in open_file
    factory = REGISTERED_FACTORIES[prefix]
KeyError: '.\exp_data\OfflineATAC_hopper-medium-replay-v2\beta_16_discount_0.99_norm_constraint_100_policy_lr_5e-07_value_lr_0.0005_use_two_qfs_True_fixed_alpha_None_q_eval_mode_0.5_0.5_n_warmstart_steps_100000_seed_0\events.out.tfevents.1667737712.Pavelzzp'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\pythonproject\ATAC-master\scripts\main.py", line 234, in <module>
    run(**train_kwargs)
  File "D:\pythonproject\ATAC-master\scripts\main.py", line 197, in run
    full_score = train_agent(train_func,
  File "c:\users\pavel\atac\src\atac\garage_tools\rl_utils.py", line 47, in train_agent
    score = wrapped_train_func(**train_kwargs)
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\garage\experiment\experiment.py", line 368, in __call__
    ctxt = self._make_context(self._get_options(*args), **kwargs)
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\garage\experiment\experiment.py", line 329, in _make_context
    dowel.TensorBoardOutput(log_dir, x_axis=options['x_axis']))
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\dowel\tensor_board_output.py", line 57, in __init__
    self._writer = tbX.SummaryWriter(log_dir, flush_secs=flush_secs)
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\writer.py", line 301, in __init__
    self._get_file_writer()
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\writer.py", line 349, in _get_file_writer
    self.file_writer = FileWriter(logdir=self.logdir,
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\writer.py", line 105, in __init__
    self.event_writer = EventFileWriter(
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\event_file_writer.py", line 106, in __init__
    self._ev_writer = EventsWriter(os.path.join(
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\event_file_writer.py", line 43, in __init__
    self._py_recordio_writer = RecordWriter(self._file_name)
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\record_writer.py", line 179, in __init__
    self._writer = open_file(path)
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\tensorboardX\record_writer.py", line 61, in open_file
    return open(path, 'wb')
FileNotFoundError: [Errno 2] No such file or directory: '.\exp_data\OfflineATAC_hopper-medium-replay-v2\beta_16_discount_0.99_norm_constraint_100_policy_lr_5e-07_value_lr_0.0005_use_two_qfs_True_fixed_alpha_None_q_eval_mode_0.5_0.5_n_warmstart_steps_100000_seed_0\events.out.tfevents.1667737712.Pavelzzp'
Exception ignored in: <function LogOutput.__del__ at 0x0000020BA6A3D9D0>
Traceback (most recent call last):
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\dowel\logger.py", line 176, in __del__
    self.close()
  File "D:\Anacondazzp\envs\pavelzzp\lib\site-packages\dowel\tensor_board_output.py", line 156, in close
    self._writer.close()
AttributeError: 'TensorBoardOutput' object has no attribute '_writer'

How can I solve this problem? Thank you very much.
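
One possible cause, offered here as an assumption rather than a diagnosis confirmed in this thread, is the classic Windows MAX_PATH limit of 260 characters: the event file name in the FileNotFoundError above is very long even before it is joined with the working directory. The sketch below measures the joined path length. The helper names are hypothetical, and the working directory is a guess, since the report does not say where the script was launched.

```python
import ntpath

# Classic Win32 MAX_PATH: 260 characters including the terminating NUL,
# so the longest usable path is 259 characters.
WINDOWS_MAX_PATH = 260


def full_path_length(base_dir, relative_path):
    """Length of the normalized Windows path formed by base_dir + relative_path."""
    return len(ntpath.normpath(ntpath.join(base_dir, relative_path)))


def exceeds_max_path(base_dir, relative_path, limit=WINDOWS_MAX_PATH):
    """True if the joined path would not fit within the classic MAX_PATH limit."""
    return full_path_length(base_dir, relative_path) >= limit


if __name__ == "__main__":
    # Relative path copied from the FileNotFoundError above.
    event_file = ntpath.join(
        r".\exp_data\OfflineATAC_hopper-medium-replay-v2",
        "beta_16_discount_0.99_norm_constraint_100_policy_lr_5e-07_value_lr_0.0005"
        "_use_two_qfs_True_fixed_alpha_None_q_eval_mode_0.5_0.5"
        "_n_warmstart_steps_100000_seed_0",
        "events.out.tfevents.1667737712.Pavelzzp",
    )
    # Assumed working directory (hypothetical; not stated in the report).
    base_dir = r"D:\pythonproject\ATAC-master"
    print(full_path_length(base_dir, event_file), exceeds_max_path(base_dir, event_file))
```

If the check reports that the limit is exceeded, common things to try are running from a directory closer to the drive root or enabling long-path support in Windows; whether either resolves this particular report is not established here.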

opened by PavelZhao 2
