This repository contains unofficial code reproducing Agent57

Last update: Dec 29, 2022

Overview

Agent57

This repository contains an unofficial PyTorch reproduction of Agent57, the agent that outperformed the standard human benchmark on all 57 Atari games.

Directory

agent.py
Defines the agent that plays a specific environment.

buffer.py
Defines the replay buffer that stores experiences with priorities.

learner.py
Defines the learner that updates parameters, such as those of the Q-networks and the functions related to intrinsic reward.

main.py
Runs the main pipeline.

model.py
Defines models such as the Q-network and the functions related to intrinsic reward.

segment_tree.py
Defines the segment tree that selects a segment index according to priority.

tester.py
Defines the tester that evaluates the performance of Agent57.

utils.py
Defines helper classes and functions such as UCB and the Retrace operator.
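
To make the role of segment_tree.py more concrete, here is a minimal sketch of a sum segment tree used for priority-proportional sampling, the usual technique behind a prioritized buffer. It is an illustration under assumptions, not the repository's code: the class name SumSegmentTree and the methods update, total, find, and sample are hypothetical, and the capacity is assumed to be a power of two.

```python
import random


class SumSegmentTree:
    """Binary sum tree over `capacity` leaves (hypothetical helper, not the repo's class)."""

    def __init__(self, capacity: int):
        # Assumption: capacity is a power of two, which keeps the index arithmetic simple.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.capacity = capacity
        # Node 1 is the root; leaves live at indices capacity .. 2 * capacity - 1.
        self.tree = [0.0] * (2 * capacity)

    def update(self, index: int, priority: float) -> None:
        """Set the priority of leaf `index` and refresh every partial sum above it."""
        node = index + self.capacity
        self.tree[node] = priority
        node //= 2
        while node >= 1:
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
            node //= 2

    def total(self) -> float:
        """Sum of all priorities (stored at the root)."""
        return self.tree[1]

    def find(self, mass: float) -> int:
        """Return the leaf whose cumulative-priority interval contains `mass`."""
        node = 1
        while node < self.capacity:
            left = 2 * node
            if mass < self.tree[left]:
                node = left                 # target lies in the left subtree
            else:
                mass -= self.tree[left]     # skip the left subtree's mass
                node = left + 1
        return node - self.capacity

    def sample(self, rng: random.Random) -> int:
        """Draw a leaf index with probability proportional to its priority."""
        return self.find(rng.random() * self.total())


# Usage: four segments with priorities 1, 2, 3, 4.
tree = SumSegmentTree(capacity=4)
for i, p in enumerate([1.0, 2.0, 3.0, 4.0]):
    tree.update(i, p)

rng = random.Random(0)
indices = [tree.sample(rng) for _ in range(5)]  # segment 3 is the most likely draw
```

Both update and find walk a single root-to-leaf path, so each costs O(log capacity), which is why a tree is preferred over rescanning every priority on each draw.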

Requirements

python==3.9.5
matplotlib==3.4.2
ray==1.4.1
lz4==3.1.3
numpy==1.21.0
omegaconf==2.1.1
torch==1.9.0

Installation

pip install -r requirements.txt

Usage

python main.py

Citation

Agent57: Outperforming the Atari Human Benchmark

https://arxiv.org/abs/2003.13350

Comments

RayOutOfMemoryError: More than 95% of the memory on node xxx is used

Hello, thank you so much for the reproduction code!

I encountered a RayOutOfMemoryError when running the code. To address it, I tried reducing num_agents from the default of 16 to 8 and then to 4, but the issue remains unsolved. I don't know whether other parameters (such as num_rollout and num_arms) should be changed as well.

My machine is a Linux server with 128GB of RAM and 4 2080-Ti GPUs. Could you please show me how to configure the parameters appropriately?

Looking forward to your reply, thanks!

opened by AptX395 1

cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

I am trying to run your code on a fresh install of Ubuntu 20.04 with Python 3.9.5 and CUDA 11.6 / cuDNN 8.3.2, but executing main.py produces the following cuDNN error:

$ python main.py 
2022-01-21 16:02:17,793	INFO services.py:1272 -- View the Ray dashboard at http://127.0.0.1:8265
(pid=36888) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36888) [Powered by Stella]
(pid=36874) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36874) [Powered by Stella]
(pid=36881) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36881) [Powered by Stella]
(pid=36885) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36885) [Powered by Stella]
(pid=36882) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36882) [Powered by Stella]
(pid=36875) A.L.E: Arcade Learning Environment (version +978d2ce)
(pid=36875) [Powered by Stella]
====================================================================================================
Traceback (most recent call last):
  File "/home/nate/Desktop/Atom/agent57_pytorch/main.py", line 267, in <module>
    main(parser.parse_args())
  File "/home/nate/Desktop/Atom/agent57_pytorch/main.py", line 144, in main
    in_q_weight, ex_q_weight, embed_weight, trained_lifelong_weight, indices, priorities, in_q_loss, ex_q_loss, embed_loss, lifelong_loss = ray.get(finished_learner[0])
  File "/home/nate/miniconda3/lib/python3.9/site-packages/ray/_private/client_mode_hook.py", line 62, in wrapper
    return func(*args, **kwargs)
  File "/home/nate/miniconda3/lib/python3.9/site-packages/ray/worker.py", line 1495, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::Learner.update_network() (pid=36888, ip=192.168.137.71)
  File "python/ray/_raylet.pyx", line 501, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 451, in ray._raylet.execute_task.function_executor
  File "/home/nate/miniconda3/lib/python3.9/site-packages/ray/_private/function_manager.py", line 563, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/home/nate/Desktop/Atom/agent57_pytorch/learner.py", line 262, in update_network
    priorities, in_q_loss, ex_q_loss = self.qnet_update(weights, segments)
  File "/home/nate/Desktop/Atom/agent57_pytorch/learner.py", line 308, in qnet_update
    ex_target_qvalues = self.get_qvalues(self.ex_target_q_network, ex_h0, ex_c0)
  File "/home/nate/Desktop/Atom/agent57_pytorch/learner.py", line 371, in get_qvalues
    _, (h, c) = q_network(self.states[t],
  File "/home/nate/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nate/Desktop/Atom/agent57_pytorch/model.py", line 99, in forward
    x, states = self.lstm(x.unsqueeze(0), states)
  File "/home/nate/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/nate/miniconda3/lib/python3.9/site-packages/torch/nn/modules/rnn.py", line 679, in forward
    result = _VF.lstm(input, hx, self._flat_weights, self.bias, self.num_layers,
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Have you encountered an error like this during development? Are you using an older version of CUDA / cuDNN? Please let me know if you have any suggestions.

opened by Obliman 2