RLMeta is a light-weight flexible framework for Distributed Reinforcement Learning Research.

RLMeta

rlmeta - a flexible lightweight research framework for Distributed Reinforcement Learning based on PyTorch and moolib

Installation

To build from source, please install PyTorch first, and then run the commands below.

$ git clone https://github.com/facebookresearch/rlmeta
$ cd rlmeta
$ git submodule sync && git submodule update --init --recursive
$ pip install -e .

Run an Example

To run the Atari Pong example with the PPO algorithm:

$ cd examples/atari/ppo
$ python atari_ppo.py env="PongNoFrameskip-v4" num_epochs=20

We use hydra to define configs for training jobs. The configs are defined in

./conf/conf_ppo.yaml

The logs and checkpoints will be automatically saved to

./outputs/{YYYY-mm-dd}/{HH:MM:SS}/
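
Because every run gets its own timestamped directory, it can help to locate the most recent log programmatically. The following is a minimal sketch, assuming the directory layout above and a log file named atari_ppo.log; `latest_log` is a hypothetical helper and not part of rlmeta.

```python
from pathlib import Path
from typing import Optional


def latest_log(outputs_dir: str = "./outputs",
               log_name: str = "atari_ppo.log") -> Optional[Path]:
    """Return the newest <outputs_dir>/<date>/<time>/<log_name>, or None."""
    # Zero-padded date and time directory names sort chronologically,
    # so the largest path in sorted order belongs to the newest run.
    candidates = sorted(Path(outputs_dir).glob(f"*/*/{log_name}"))
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    # Prints None when no run has been recorded yet.
    print(latest_log())
```

The returned path can be passed to the --log_file option of the plotting script.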

After training, we can draw the training curve by running

$ python ../../plot.py --log_file=./outputs/{YYYY-mm-dd}/{HH:MM:SS}/atari_ppo.log --fig_file=./atari_ppo.png --xkey=time

One example of the training curve is shown below.

[Figure: atari_ppo training curve]

License

rlmeta is licensed under the MIT License. See LICENSE for details.

Comments
  • m_server::push time out and m_server::act time out


    • I was trying to execute the example program atari_ppo.py on the following machine: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz, 32 GB RAM, GTX 1080 with 8 GB of memory, Ubuntu 16.04, CUDA 10.2. I edited my configuration file conf_ppo.yaml to reduce the resource usage:
    m_server_name: "m_server"
    m_server_addr: "127.0.0.1:4411"
    
    r_server_name: "r_server"
    r_server_addr: "127.0.0.1:4412"
    
    c_server_name: "c_server"
    c_server_addr: "127.0.0.1:4413"
    
    train_device: "cuda:0"
    infer_device: "cuda:0"
    
    timeout: 180
    
    env: "PongNoFrameskip-v4"
    max_episode_steps: 2700
    
    num_train_rollouts: 1 
    num_train_workers: 1
    
    num_eval_rollouts: 1
    num_eval_workers: 1
    
    replay_buffer_size: 1024 
    prefetch: 2
    
    batch_size: 32
    lr: 3e-4
    push_every_n_steps: 50
    
    num_epochs: 1000
    steps_per_epoch: 3000
    
    num_eval_episodes: 20
    
    train_seed: 123
    eval_seed: 456
    

    Here is what I got:

    [2022-01-18 18:34:54,797][root][INFO] - {'m_server_name': 'm_server', 'm_server_addr': '127.0.0.1:4411', 'r_server_name': 'r_server', 'r_server_addr': '127.0.0.1:4412', 'c_server_name': 'c_server', 'c_server_addr': '127.0.0.1:4413', 'train_device': 'cuda:0', 'infer_device': 'cuda:0', 'env': 'PongNoFrameskip-v4', 'max_episode_steps': 2700, 'num_train_rollouts': 1, 'num_train_workers': 1, 'num_eval_rollouts': 1, 'num_eval_workers': 1, 'replay_buffer_size': 1024, 'prefetch': 2, 'batch_size': 8, 'lr': 0.0003, 'push_every_n_steps': 100, 'num_epochs': 20, 'steps_per_epoch': 300, 'num_eval_episodes': 20, 'train_seed': 123, 'eval_seed': 456}
    [2022-01-18 18:35:08,193][root][INFO] - Warming up replay buffer: [    0 / 1024 ]
    [2022-01-18 18:35:09,194][root][INFO] - Warming up replay buffer: [    0 / 1024 ]
    [2022-01-18 18:35:10,196][root][INFO] - Warming up replay buffer: [    0 / 1024 ]
    [2022-01-18 18:35:11,198][root][INFO] - Warming up replay buffer: [    0 / 1024 ]
    [2022-01-18 18:35:12,220][root][INFO] - Warming up replay buffer: [    0 / 1024 ]
    [2022-01-18 18:35:13,222][root][INFO] - Warming up replay buffer: [  894 / 1024 ]
    [2022-01-18 18:35:14,228][root][INFO] - Warming up replay buffer: [  894 / 1024 ]
    [2022-01-18 18:35:15,229][root][INFO] - Warming up replay buffer: [  894 / 1024 ]
    [2022-01-18 18:35:16,231][root][INFO] - Warming up replay buffer: [ 1024 / 1024 ]
    Exception in callback handle_task_exception(<Task finishe...) timed out')>) at /media/research/ml2558/rlmeta/rlmeta/utils/asycio_utils.py:11
    handle: <Handle handle_task_exception(<Task finishe...) timed out')>) at /media/research/ml2558/rlmeta/rlmeta/utils/asycio_utils.py:11>
    Traceback (most recent call last):
      File "/home/ml2558/miniconda3/lib/python3.9/asyncio/events.py", line 80, in _run
        self._context.run(self._callback, *self._args)
      File "/media/research/ml2558/rlmeta/rlmeta/utils/asycio_utils.py", line 17, in handle_task_exception
        raise e
      File "/media/research/ml2558/rlmeta/rlmeta/utils/asycio_utils.py", line 13, in handle_task_exception
        task.result()
      File "/media/research/ml2558/rlmeta/rlmeta/core/loop.py", line 161, in _run_loop
        stats = await self._run_episode(env, agent, index)
      File "/media/research/ml2558/rlmeta/rlmeta/core/loop.py", line 182, in _run_episode
        action = await agent.async_act(timestep)
      File "/media/research/ml2558/rlmeta/rlmeta/agents/ppo/ppo_agent.py", line 78, in async_act
        action, logpi, v = await self.model.async_act(
    RuntimeError: Call (m_server::act) timed out
    Error executing job with overrides: ['env=PongNoFrameskip-v4', 'num_epochs=20']
    Traceback (most recent call last):
      File "/media/research/ml2558/rlmeta/examples/atari/ppo/atari_ppo.py", line 96, in main
        stats = agent.train(cfg.steps_per_epoch)
      File "/media/research/ml2558/rlmeta/rlmeta/agents/ppo/ppo_agent.py", line 139, in train
        self.model.push()
      File "/media/research/ml2558/rlmeta/rlmeta/core/model.py", line 69, in push
        self.client.sync(self.server_name, "push", state_dict)
    RuntimeError: Call (m_server::<unknown>) timed out
    
    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
    

    I tried modifying the timeout but got the same error. Any hints on how to resolve this?

    opened by lmlaaron 5
  • Replay buffer crashes after being cleared


    Minimal example:

    import torch
    from _rlmeta_extension import UniformSampler
    from rlmeta.core.replay_buffer import ReplayBuffer
    from rlmeta.storage import TensorCircularBuffer
    
    replay_buffer = ReplayBuffer(TensorCircularBuffer(12), UniformSampler())
    
    while True:
        for t in torch.randn(size=(12,2)).chunk(12,dim=0):
            replay_buffer.append(t)
            replay_buffer.sample(12)
        replay_buffer.clear()
    

    Stack trace:

    RuntimeError: output with shape [2] doesn't match the broadcast shape [1, 2]
    Exception raised from mark_resize_outputs at ../aten/src/ATen/TensorIterator.cpp:1181 (most recent call first):
    frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x3e (0x7fd72c9a220e in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libc10.so)
    frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x5c (0x7fd72c97d5e8 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libc10.so)
    frame #2: at::TensorIteratorBase::mark_resize_outputs(at::TensorIteratorConfig const&) + 0x241 (0x7fd755cf6301 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #3: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x64 (0x7fd755cf6e54 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #4: <unknown function> + 0x19d4f8c (0x7fd755f11f8c in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #5: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x62 (0x7fd755f12ec2 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #6: at::_ops::copy_::redispatch(c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool) + 0x75 (0x7fd756886555 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #7: <unknown function> + 0x46e94f5 (0x7fd758c264f5 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #8: at::_ops::copy_::redispatch(c10::DispatchKeySet, at::Tensor&, at::Tensor const&, bool) + 0x75 (0x7fd756886555 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #9: <unknown function> + 0x46ea6ad (0x7fd758c276ad in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #10: at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) + 0x16e (0x7fd7568cdbce in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #11: <unknown function> + 0x495df (0x7fd7024265df in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    frame #12: <unknown function> + 0x4a0c0 (0x7fd7024270c0 in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    frame #13: <unknown function> + 0x1dd0f (0x7fd7023fad0f in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    <omitting python frames>
    frame #31: <unknown function> + 0x3feb0 (0x7fd7aa936eb0 in /lib64/libc.so.6)
    frame #32: __libc_start_main + 0x80 (0x7fd7aa936f60 in /lib64/libc.so.6)
    frame #33: _start + 0x25 (0x5649803a1095 in /home/d3sm0/.venvs/torch_env/bin/python)
    
    opened by d3sm0 4
  • Add more logging + ability to push to more downstream models


    This PR adds:

    • repr functions for some classes
    • Rich based logging: https://rich.readthedocs.io/en/stable/introduction.html
    • Extra parameter additional_downstream_models to agent.train that can be used to push to more than one downstream model (e.g., if there are multiple parallel loops, push to the model that each loop is using).
    CLA Signed 
    opened by EntilZha 4
  • Added namespace as a member of Remotable and updated example


    Added an identifier for Remotable classes. This allows distinguishing instances of the same class, or of different classes sharing the same method, as long as the user defines a method.

    Alternatively or additionally, we could add the function name to the identifier to further distinguish between different instantiated classes with the same name. For example, PPOAgent.forward() and APPOAgent.forward() would not need an additional identifier.

    CLA Signed 
    opened by JD-ETH 4
  • Switch to new OpenAI Gym step API


    The new version of OpenAI Gym uses a new step API, which returns (observations, reward, termination, truncation, info) instead of (observations, reward, done, info). We have to update the wrappers to support this.

    Progress is tracked in this issue.
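
The difference between the two step formats can be sketched as a pair of conversion helpers. This is only an illustration, not the change made in rlmeta: `to_new_step` and `to_old_step` are hypothetical names, and the sketch assumes that a time-limit ending in the old format is signalled by the "TimeLimit.truncated" key in info.

```python
from typing import Any, Dict, Tuple

OldStep = Tuple[Any, float, bool, Dict[str, Any]]
NewStep = Tuple[Any, float, bool, bool, Dict[str, Any]]


def to_new_step(step: OldStep) -> NewStep:
    """Convert (obs, reward, done, info) to the 5-tuple form."""
    obs, reward, done, info = step
    # Assumption: time-limit endings are marked with this info key.
    truncated = bool(info.get("TimeLimit.truncated", False))
    # An episode that ended for any other reason counts as terminated.
    terminated = done and not truncated
    return obs, reward, terminated, truncated, info


def to_old_step(step: NewStep) -> OldStep:
    """Convert the 5-tuple form back to (obs, reward, done, info)."""
    obs, reward, terminated, truncated, info = step
    # The old API folds both ending conditions into one flag.
    return obs, reward, terminated or truncated, info
```

Converting to the old format loses the distinction between termination and truncation, which matters for bootstrapping the value at the final step.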

    opened by xiaomengy 2
  • Longer-term and relation to other RL libraries under Meta


    Hi, excited to see this work on distributed RL, building off moolib (and TorchBeast originally). I'm wondering what the longer-term direction of this project is?

    Will functionality be merged into TorchRL (which mentions an upcoming IMPALA implementation)? https://github.com/facebookresearch/rl#upcoming-features

    Is moolib still being maintained? https://github.com/facebookresearch/moolib/issues/32#issuecomment-1085730793

    There are so many RL libraries these days.

    opened by etaoxing 2
  • How to sample partial trajectories?


    Many value estimation methods rely on sub-sequences of a trajectory (e.g. Retrace, GAE, n-step returns, lambda-returns). How can this be achieved with the current samplers? A simple workaround would be to use a clever idx for each sample and use __get__ to extract the sub-sequence one element at a time, but I believe it might hurt performance.

    Other ideas? Otherwise, how can this be implemented in the C++ code?
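
For illustration, the kind of sampling asked about here can be sketched in plain Python: draw a start index uniformly and take one contiguous slice per window, rather than fetching elements one at a time. This is a sketch over an in-memory sequence only. `sample_subsequences` is a hypothetical helper; it does not use rlmeta's samplers and does not describe how the C++ side could implement it.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_subsequences(trajectory: Sequence[T], length: int,
                        num_samples: int,
                        rng: random.Random) -> List[Sequence[T]]:
    """Draw contiguous windows of `length` steps, uniform over start index."""
    if length <= 0 or length > len(trajectory):
        raise ValueError("length must be in [1, len(trajectory)]")
    max_start = len(trajectory) - length
    # randint is inclusive on both ends, so the last valid start is reachable.
    starts = [rng.randint(0, max_start) for _ in range(num_samples)]
    # One slice per window instead of one lookup per element.
    return [trajectory[s:s + length] for s in starts]


if __name__ == "__main__":
    steps = list(range(10))  # stand-in for the stored transitions
    print(sample_subsequences(steps, length=4, num_samples=3,
                              rng=random.Random(0)))
```

Each window can then be fed to an n-step or GAE computation; windows from the same trajectory may overlap.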

    opened by d3sm0 1
  • Add passthrough rescaler + Git CI style checker


    PR adds:

    • A passthrough rescaler to use if you don't want to rescale rewards.
    • GitHub workflow to run the yapf format checker on the main branch and on PRs to the main branch

    Tested the build on my fork here after adding this PR branch (then deleting it before making the PR).

    CLA Signed 
    opened by EntilZha 1
  • Refactor Atari Models and Atari Game settings


    This PR adds the following changes.

    1. Switch to the recommended settings for Atari Game based on https://arxiv.org/abs/1709.06009.
    2. Refactor Atari model's implementation and add Impala backbone.
    3. Update the default hyper-parameters of Ape-X DQN to R2D2-like settings.
    CLA Signed 
    opened by xiaomengy 0
  • Deprecate old atari_wrappers


    This PR deprecates the old Atari wrappers.

    1. Switch to Atari-v5 env as suggested in https://brosa.ca/blog/ale-release-v0.7
    2. Use gym.wrappers.AtariPreprocessing to replace old atari_wrappers.
    3. Add random seed for model server.
    4. Remove TimeLimitWrapper and switch to gym.wrappers.TimeLimit.
    CLA Signed 
    opened by xiaomengy 0
  • Update Ape-X DQN implementation with tricks in MEME


    This PR updates Ape-X DQN implementation with tricks introduced in DeepMind's MEME paper. https://arxiv.org/pdf/2209.07550.pdf

    • Bootstrapping with online net
    • Q-value clip
    CLA Signed 
    opened by xiaomengy 0
  • Pip installation fails in virtual env and SIGILL on DGX machines


    It seems that pip install -e . prepares the proper directories but does not include the built package. We solved this by adding:

    +        include_package_data=False,
    +        packages=find_packages(include=['rlmeta', 'rlmeta.*']),
    

    here: https://github.com/facebookresearch/rlmeta/blob/c43d0f11922b2b8d513b3227242844596dbc34e5/setup.py#L87

    Nit: It might be useful to provide an easy way to pass a CUDA/cuDNN path to cmake, maybe something like DCUDNN_LIBRARY_PATH=os.environ.get("CUDA_LIBRARY_PATH", "").

    Finally, the flag --march=native might cause some issues, especially on HPC clusters. We removed it for our cluster and managed to train reliably on different machines.

    opened by d3sm0 0
  • TensorCircularBuffer with capacity larger of 1mln fails


    A replay buffer with a capacity of 1 million tries to allocate 846.72 GB. Steps to reproduce:

    from rlmeta.storage import TensorCircularBuffer
    import torch
    
    rb = TensorCircularBuffer(capacity=int(1e6))
    rb.append(torch.randn(10, 3, 84, 84))
    

    Log:

    RuntimeError: [enforce fail at alloc_cpu.cpp:66] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 846720000000 bytes. Error code 12 (Cannot allocate memory)
    frame #0: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x55 (0x7fd5b71980c5 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libc10.so)
    frame #1: c10::alloc_cpu(unsigned long) + 0x7ac (0x7fd5b71894cc in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libc10.so)
    frame #2: <unknown function> + 0x23bc3 (0x7fd5b7176bc3 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libc10.so)
    frame #3: at::detail::empty_generic(c10::ArrayRef<long>, c10::Allocator*, c10::DispatchKeySet, c10::ScalarType, c10::optional<c10::MemoryFormat>) + 0x7bf (0x7fd5e04a5b2f in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #4: at::detail::empty_cpu(c10::ArrayRef<long>, c10::ScalarType, bool, c10::optional<c10::MemoryFormat>) + 0x40 (0x7fd5e04a64a0 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #5: at::detail::empty_cpu(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x34 (0x7fd5e04a64f4 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #6: at::native::empty_cpu(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x1f (0x7fd5e09b826f in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #7: <unknown function> + 0x24f700b (0x7fd5e122a00b in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #8: at::_ops::empty_memory_format::redispatch(c10::DispatchKeySet, c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0xe3 (0x7fd5e0f75653 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #9: <unknown function> + 0x24d200f (0x7fd5e120500f in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #10: at::_ops::empty_memory_format::call(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, c10::optional<c10::MemoryFormat>) + 0x1b7 (0x7fd5e0fb3077 in /home/d3sm0/.venvs/torch_env/lib64/python3.10/site-packages/torch/lib/libtorch_cpu.so)
    frame #11: <unknown function> + 0x4586c (0x7fd5b5ba886c in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    frame #12: <unknown function> + 0x49700 (0x7fd5b5bac700 in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    frame #13: <unknown function> + 0x4a0c0 (0x7fd5b5bad0c0 in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    frame #14: <unknown function> + 0x1dd0f (0x7fd5b5b80d0f in /home/d3sm0/code/research/rlmeta/_rlmeta_extension.cpython-310-x86_64-linux-gnu.so)
    <omitting python frames>
    frame #30: <unknown function> + 0x3feb0 (0x7fd65cacbeb0 in /lib64/libc.so.6)
    frame #31: __libc_start_main + 0x80 (0x7fd65cacbf60 in /lib64/libc.so.6)
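
The number of bytes in the log matches what a buffer would need if it preallocated one slot of the appended tensor's shape per unit of capacity, with 4-byte float32 elements (the default dtype of torch.randn). That reading is inferred from the numbers, not confirmed from the rlmeta source. The arithmetic can be checked with a short sketch, where `buffer_bytes` is a hypothetical helper:

```python
import math
from typing import Tuple


def buffer_bytes(capacity: int, item_shape: Tuple[int, ...],
                 bytes_per_element: int = 4) -> int:
    """Bytes needed to preallocate `capacity` dense slots of `item_shape`."""
    return capacity * math.prod(item_shape) * bytes_per_element


total = buffer_bytes(int(1e6), (10, 3, 84, 84))
print(total)        # 846720000000, the figure reported in the log
print(total / 1e9)  # 846.72
```

Under that assumption each appended item occupies about 847 KB, so the item size, rather than the capacity alone, drives the allocation.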
    
    opened by d3sm0 1
  • Moolib Backend Issues


    Recently there have been several issues with the moolib backend.

    1. Based on the observation in https://github.com/facebookresearch/moolib/issues/36, there is a performance regression in moolib.
    2. There are several installation issues in moolib.

    Based on this, we are thinking about building another backend that does not use moolib. This issue was opened to track the progress.

    PR for gRPC backend: https://github.com/facebookresearch/rlmeta/pull/63

    opened by xiaomengy 4
  • Add ProcessManager to maintain processes.


    Currently the processes are created directly in Server and Loop. It is very common that there are some zombie processes left when the main process terminates. It may be better to have a ProcessManager to manage the processes on a single node.

    This issue was opened to track the feature request.
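
One possible shape for such a manager is sketched below. It is purely hypothetical and not rlmeta's design: the class body, its methods and the use of subprocess are all assumptions made for illustration. The idea is to keep every child process in one registry and to terminate and reap them all when the owner exits.

```python
import subprocess
import sys
from typing import List


class ProcessManager:
    """Hypothetical sketch: tracks child processes and reaps them on exit."""

    def __init__(self) -> None:
        self._procs: List[subprocess.Popen] = []

    def start(self, args: List[str]) -> subprocess.Popen:
        proc = subprocess.Popen(args)
        self._procs.append(proc)
        return proc

    def shutdown(self, timeout: float = 5.0) -> None:
        for proc in self._procs:
            if proc.poll() is None:  # still running
                proc.terminate()
        for proc in self._procs:
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # did not stop in time
                proc.wait()  # reap it so no zombie is left behind
        self._procs.clear()

    def __enter__(self) -> "ProcessManager":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.shutdown()


if __name__ == "__main__":
    with ProcessManager() as manager:
        manager.start([sys.executable, "-c", "import time; time.sleep(60)"])
    # Leaving the with-block terminates and reaps the child.
```

Using a context manager means the cleanup also runs when the main process leaves the block through an exception.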

    opened by xiaomengy 0
Owner
Meta Research