Deep Reinforcement Learning for mobile robot navigation in ROS Gazebo simulator

Overview

DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using a Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles. Obstacles are detected by laser readings, and the goal is given to the robot in polar coordinates. The network is trained in the ROS Gazebo simulator with PyTorch. Tested with ROS Melodic on Ubuntu 18.04 with Python 3.6.9 and PyTorch 1.10.
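
As an illustration of what "a goal in polar coordinates" can mean, the following minimal sketch converts a goal position into a distance and a heading-relative angle. The function name, argument names, and the choice of wrapping the angle to [-pi, pi) are assumptions made for this example, not code taken from the repository.

```python
import math


def goal_in_polar(robot_x, robot_y, robot_yaw, goal_x, goal_y):
    """Return (distance, angle) of the goal relative to the robot's heading.

    Hypothetical helper: poses are assumed to be in a common world frame,
    with robot_yaw in radians.
    """
    dx = goal_x - robot_x
    dy = goal_y - robot_y
    distance = math.hypot(dx, dy)
    # Bearing of the goal in the world frame, minus the robot's heading.
    angle = math.atan2(dy, dx) - robot_yaw
    # Wrap to [-pi, pi) so that a goal slightly to the left or right
    # of the heading produces a small angle.
    angle = (angle + math.pi) % (2 * math.pi) - math.pi
    return distance, angle


# Robot at the origin facing along +y, goal at (1, 1).
distance, angle = goal_in_polar(0.0, 0.0, math.pi / 2, 1.0, 1.0)
print(round(distance, 3), round(angle, 3))
```

A negative angle here means the goal lies clockwise from the robot's heading under the usual counter-clockwise-positive convention.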

Training example:

Pre-print of the article:

Some more information is given in the article at: https://arxiv.org/abs/2103.07119

Please cite as:
@misc{cimurs2021goaldriven,
title={Goal-Driven Autonomous Exploration Through Deep Reinforcement Learning},
author={Reinis Cimurs and Il Hong Suh and Jin Han Lee},
year={2021},
eprint={2103.07119},
archivePrefix={arXiv},
primaryClass={cs.RO}
}

Main dependencies:

Clone the repository:

$ cd ~
### Clone this repo
$ git clone https://github.com/reiniscimurs/DRL-robot-navigation

The network can be run with a standard 2D laser, but this implementation uses a simulated 3D Velodyne sensor.
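
One way a 3D point cloud can stand in for a 2D laser is to keep only points near the sensor height and record the closest point in each angular sector. The sketch below shows that idea under stated assumptions: the function name, the number of sectors, the 180-degree field of view, the height band, and the maximum range are all illustrative values, not the settings used by this repository.

```python
import math


def points_to_sectors(points, num_sectors=20, max_range=10.0,
                      min_height=-0.2, max_height=0.2):
    """Reduce (x, y, z) points in the sensor frame to per-sector minimum distances.

    Hypothetical helper: covers the front half-plane (-90 to +90 degrees),
    with sector 0 on the robot's right. Sectors with no points keep max_range.
    """
    sectors = [max_range] * num_sectors
    half_fov = math.pi / 2
    width = math.pi / num_sectors
    for x, y, z in points:
        # Ignore points far above or below the sensor (floor, ceiling).
        if not (min_height <= z <= max_height):
            continue
        angle = math.atan2(y, x)
        # Ignore points behind the sensor.
        if not (-half_fov <= angle < half_fov):
            continue
        index = min(int((angle + half_fov) / width), num_sectors - 1)
        sectors[index] = min(sectors[index], math.hypot(x, y))
    return sectors


cloud = [(2.0, 0.1, 0.0), (1.0, 0.1, 0.0), (0.1, -1.0, 0.0),
         (1.0, 0.1, 1.0), (-1.0, 0.1, 0.0)]
print(points_to_sectors(cloud, num_sectors=4))
```

The resulting list has the same shape as a coarse 2D laser scan, which is why a network trained on such readings could in principle also accept data from a planar laser.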

Compile the workspace:

$ cd ~/DRL-robot-navigation/catkin_ws
### Compile
$ catkin_make_isolated

Open a terminal and set up sources:

$ export ROS_HOSTNAME=localhost
$ export ROS_MASTER_URI=http://localhost:11311
$ export ROS_PORT_SIM=11311
$ export GAZEBO_RESOURCE_PATH=~/DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch
$ source ~/.bashrc
$ cd ~/DRL-robot-navigation/catkin_ws
$ source devel_isolated/setup.bash
### Run the training
$ cd ~/DRL-robot-navigation/TD3
$ python3 velodyne_td3.py
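
Because several of the problems reported below involve Gazebo or the ROS master not starting, it may help to confirm that the variables exported above are actually visible to the Python process before launching the training. The following is a small sketch of such a check. The function name is hypothetical, and the rule that ROS_PORT_SIM should match the port in ROS_MASTER_URI is an assumption based only on the two values being equal in the commands above.

```python
import os
from urllib.parse import urlparse

# Variables exported in the setup commands above.
REQUIRED_VARS = ("ROS_HOSTNAME", "ROS_MASTER_URI",
                 "ROS_PORT_SIM", "GAZEBO_RESOURCE_PATH")


def check_ros_env(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    uri = env.get("ROS_MASTER_URI")
    port = env.get("ROS_PORT_SIM")
    if uri and port:
        # Assumption: the simulator port is meant to agree with the master URI port.
        master_port = urlparse(uri).port
        if str(master_port) != port:
            problems.append(
                f"ROS_PORT_SIM={port} does not match the port in ROS_MASTER_URI ({master_port})"
            )
    return problems


if __name__ == "__main__":
    for problem in check_ros_env(os.environ):
        print("warning:", problem)
```

Running this in the same terminal as the training script prints nothing when all four variables are set consistently.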

To kill the training process:

$ killall -9 rosout roslaunch rosmaster gzserver nodelet robot_state_publisher gzclient python python3

Gazebo environment:

Rviz:

Comments
  • Rviz issue

    Hi, thank you for sharing your code. I'm new to ROS/Gazebo. I have installed ROS Noetic and played with some tutorials. We are aiming to develop path planning algorithms inspired by hemispheric lateralization, and I found your work interesting. Currently, I'm trying to replicate path planning using your code. To keep things simple, I have just loaded the Gazebo environment. The robot does not move in Rviz, although it is moving in Gazebo. The robot model, camera, and laser scan all show a status error, e.g. No transform from [r1/base_link] to [base_link]. Thanks in advance.

    My file velodyne_ab.py looks like this:

    #############################################
    import os
    import time
    import numpy as np
    from velodyne_env import GazeboEnv

    # variables initialization
    # ----
    env = GazeboEnv('multi_robot_scenario.launch', 1, 1, 1)

    for i in range(10000):
        print("AB Step")
        if i % 100 == 0 and i > 0:
            env.reset()

        action = (1 + np.random.normal(0, expl_noise, size=action_dim)).clip(-max_action, max_action)
        a_in = [(action[0] + 1) / 2, action[1]]
        next_state, reward, done, target = env.step(a_in)

    Originally posted by @AbuZalayedHassan in https://github.com/reiniscimurs/DRL-robot-navigation/issues/1#issuecomment-982048248

    opened by reiniscimurs 13
  • I got stack at somewhere. Help me please.

    Dear Reiniscimurs,

    A few months back, with your help, I reproduced your work. Thanks. Now I am trying to replace the robot model you used with a Turtlebot3 [Burger or Waffle]. I am partially successful: when I run the code, Gazebo loads the world and spawns the model. However, the training does not start. See the screenshot below.

    ISSUES:

    • The laser is not visible even though I set it to 'True' in the Xacro file.
    • The program stops, as you can see in the screenshot.

    I guess I am not subscribing to some topics. But how? Screenshot from 2022-09-14 15-20-48

    opened by Hoggaan 11
  • simulation problems, modules can not be loaded

    Hi Reinis, so sorry to bother you. When I ran python3 velodyne_td3.py, Gazebo could not be started, and Rviz was a mess, like this: Screenshot 2022-04-26 19-26-11

    It seems that the modules of the world cannot be loaded, and the errors look like this: Screenshot 2022-04-26 19-25-51

    Could you please tell me how to fix this? Thanks!

    opened by Trevor233 10
  • Add Image information to TD3 training

    Hi, Reinis Cimurs:

    Thank you for supporting this excellent project.

    Have you ever tried adding image (front camera) information as one of the states in your TD3 network?

    Do you think adding more information about the surrounding environment will improve the navigation performance? Could you give me some hints on how to improve the performance of your project?

    I would appreciate your reply. Thank you.

    opened by Barry2333 10
  • Robot keeps circling around

    Hi Reinis, it's me again. When I run your program, I find that after some training the robot just stays in place and keeps circling around.

    https://user-images.githubusercontent.com/104433600/172398376-785704f1-d439-463d-a2b1-bb4216a8ecf3.mp4

    And the reward just remains the same: Screenshot 2022-06-07 21-37-42

    Could you tell me where the problem is? Thanks.

    opened by Trevor233 8
  • Issues with Gazebo and RVis. NOT LAUNCHING.

    Hi there, I was trying to run this training process, but Gazebo is not coming up and RViz is not running properly either. However, the algorithm seems to be running. See the pictures below. Gazebo and RViz work in other scenarios. Can you please take a look and figure out what is wrong?

    • Distro: ROS Noetic. Screenshot from 2022-06-23 21-25-00 Screenshot from 2022-06-23 21-25-37
    opened by Hoggaan 6
  • install issue

    After successfully installing according to the official instructions, I ran python3 velodyne_td3.py to train, but I hit the following issue and Gazebo did not start.

    image

    opened by wenzhi-zhang 6
  • How to set up sources with an anaconda environment and Noetic branch?

    Hi, thanks for sharing the Noetic branch. I am testing this branch with an Anaconda environment. First, I opened a terminal without a conda env and compiled the workspace:

    $ cd ~/DRL-robot-navigation/catkin_ws
    ### Compile
    $ catkin_make_isolated
    (It succeeds!)

    Next, I created a conda env, activated it, and installed some modules as follows:

    $ conda activate py3.6.9
    $ pip install torch==1.2.0 -f https://download.pytorch.org/whl/torch_stable.html
    $ pip install pyyaml
    $ pip install rospkg
    $ pip install squaternion
    $ pip install attr
    $ pip install attrs
    $ pip install netifaces defusedxml

    Finally, I opened a new terminal and set up sources. The first two lines are added to activate the conda env and source ROS:

    $ conda activate py3.6.9
    $ source /opt/ros/noetic/setup.bash
    $ export ROS_HOSTNAME=localhost
    $ export ROS_MASTER_URI=http://localhost:11311
    $ export ROS_PORT_SIM=11311
    $ export GAZEBO_RESOURCE_PATH=~/DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch
    ### source ~/.bashrc (this line is deleted because it exits conda env py3.6.9 and switches to conda env base)
    $ cd ~/DRL-robot-navigation/catkin_ws
    $ source devel_isolated/setup.bash
    $ cd ~/DRL-robot-navigation/TD3
    $ python3 velodyne_td3.py

    The result is the following error:

    [ INFO] [1638523319.979820797]: Finished loading Gazebo ROS API Plugin.
    [ INFO] [1638523319.980631275]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...
    [INFO] [1638523320.065212, 0.000000]: Loading model XML from ros parameter robot_description
    [INFO] [1638523320.070311, 0.000000]: Waiting for service /gazebo/spawn_urdf_model
    [ INFO] [1638523320.797520126]: waitForService: Service [/gazebo/set_physics_properties] is now available.
    [ INFO] [1638523320.854882519]: Physics dynamic reconfigure ready.
    [INFO] [1638523320.976913, 0.000000]: Calling service /gazebo/spawn_urdf_model
    [ INFO] [1638523497.485188702, 0.201000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
    [ INFO] [1638523497.486945746, 0.201000000]: Camera Plugin (ns = /) <tf_prefix_>, set to ""
    [ INFO] [1638523497.497253932, 0.201000000]: Camera Plugin: The 'robotNamespace' param was empty
    [ INFO] [1638523497.498658911, 0.201000000]: Camera Plugin (ns = r1) <tf_prefix_>, set to ""
    [INFO] [1638523498.697978, 0.201000]: Spawn status: SpawnModel: Successfully spawned entity
    [urdf_spawner-2] process has finished cleanly
    log file: /home/agent/.ros/log/770251f6-541a-11ec-aeed-e7829104ff20/urdf_spawner-2*.log
    [ INFO] [1638523499.690473792, 0.201000000]: Laser Plugin: The 'robotNamespace' param was empty
    [ INFO] [1638523499.690532451, 0.201000000]: Starting Laser Plugin (ns = r1)
    [ INFO] [1638523499.691076904, 0.201000000]: Laser Plugin (ns = r1) <tf_prefix_>, set to ""
    [ INFO] [1638523499.697194393, 0.201000000]: Velodyne laser plugin missing <min_intensity>, defaults to no clipping
    [ INFO] [1638523499.698738449, 0.201000000]: Velodyne laser plugin ready, 16 lasers
    [ INFO] [1638523499.710498725, 0.201000000]: Starting plugin DiffDrive(ns = r1/)
    [ INFO] [1638523499.710623456, 0.201000000]: DiffDrive(ns = r1/): = Debug
    [ INFO] [1638523499.711047215, 0.201000000]: DiffDrive(ns = r1/): <tf_prefix> =
    [DEBUG] [1638523499.711102141, 0.201000000]: DiffDrive(ns = r1/): = cmd_vel
    [DEBUG] [1638523499.711116354, 0.201000000]: DiffDrive(ns = r1/): = odom
    [DEBUG] [1638523499.711153205, 0.201000000]: DiffDrive(ns = r1/): = odom
    [DEBUG] [1638523499.711164916, 0.201000000]: DiffDrive(ns = r1/): = base_link
    [DEBUG] [1638523499.711235821, 0.201000000]: DiffDrive(ns = r1/): = false
    [ WARN] [1638523499.711250520, 0.201000000]: DiffDrive(ns = r1/): missing default is true
    [DEBUG] [1638523499.711290227, 0.201000000]: DiffDrive(ns = r1/): = true
    [DEBUG] [1638523499.711344992, 0.201000000]: DiffDrive(ns = r1/): = 0.29999999999999999
    [DEBUG] [1638523499.711362835, 0.201000000]: DiffDrive(ns = r1/): = 0.17999999999999999
    [DEBUG] [1638523499.711377474, 0.201000000]: DiffDrive(ns = r1/): = 1.8
    [DEBUG] [1638523499.711391185, 0.201000000]: DiffDrive(ns = r1/): = 20
    [DEBUG] [1638523499.711405342, 0.201000000]: DiffDrive(ns = r1/): = 50
    [DEBUG] [1638523499.711453639, 0.201000000]: DiffDrive(ns = r1/): = world := 1
    [DEBUG] [1638523499.711485094, 0.201000000]: DiffDrive(ns = r1/): = left_hub_joint
    [DEBUG] [1638523499.711499787, 0.201000000]: DiffDrive(ns = r1/): = right_hub_joint
    [ WARN] [1638523499.711518908, 0.201000000]: GazeboRosDiffDrive Plugin (ns = ) missing , defaults to 1
    [ INFO] [1638523499.711922234, 0.201000000]: DiffDrive(ns = r1/): Advertise joint_states
    [ INFO] [1638523499.712257380, 0.201000000]: DiffDrive(ns = r1/): Try to subscribe to cmd_vel
    [ INFO] [1638523499.713753381, 0.201000000]: DiffDrive(ns = r1/): Subscribe to cmd_vel
    [ INFO] [1638523499.714150768, 0.201000000]: DiffDrive(ns = r1/): Advertise odom on odom
    [ INFO] [1638523499.721365047, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: chassis_swivel_joint
    [ INFO] [1638523499.721390973, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: swivel_wheel_joint
    [ INFO] [1638523499.721424750, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: left_hub_joint
    [ INFO] [1638523499.721433820, 0.201000000]: GazeboRosJointStatePublisher is going to publish joint: right_hub_joint
    [ INFO] [1638523499.721446512, 0.201000000]: Starting GazeboRosJointStatePublisher Plugin (ns = r1/)!, parent name: r1
    [DEBUG] [1638523499.732124524, 0.211000000]: Trying to publish message of type [sensor_msgs/LaserScan/90c7ef2dc6895d81024acba2ac42f369] on a publisher with type [sensor_msgs/LaserScan/90c7ef2dc6895d81024acba2ac42f369]
    [DEBUG] [1638523499.793156774, 0.222000000]: Trying to publish message of type [nav_msgs/Odometry/cd5e73d190d741a2f92e81eda573aca7] on a publisher with type [nav_msgs/Odometry/cd5e73d190d741a2f92e81eda573aca7]
    [DEBUG] [1638523499.793237857, 0.222000000]: Trying to publish message of type [sensor_msgs/JointState/3066dcd76a6cfaef579bd0f34173e9fd] on a publisher with type [sensor_msgs/JointState/3066dcd76a6cfaef579bd0f34173e9fd]
    [DEBUG] [1638523499.807064731, 0.235000000]: Trying to publish message of type [sensor_msgs/CameraInfo/c9a58c1b0b154e0e6da7578cb991d214] on a publisher with type [sensor_msgs/CameraInfo/c9a58c1b0b154e0e6da7578cb991d214]

    Exception in thread /r1/odom:
    Traceback (most recent call last):
      File "/home/agent/anaconda3/envs/py3.6.9/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/home/agent/anaconda3/envs/py3.6.9/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_pubsub.py", line 185, in robust_connect_subscriber
        conn.receive_loop(receive_cb)
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_base.py", line 846, in receive_loop
        self.close()
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_base.py", line 858, in close
        self.socket.close()
    AttributeError: 'NoneType' object has no attribute 'close'

    Exception in thread /r1/front_laser/scan:
    Traceback (most recent call last):
      File "/home/agent/anaconda3/envs/py3.6.9/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/home/agent/anaconda3/envs/py3.6.9/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_pubsub.py", line 185, in robust_connect_subscriber
        conn.receive_loop(receive_cb)
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_base.py", line 846, in receive_loop
        self.close()
      File "/opt/ros/noetic/lib/python3/dist-packages/rospy/impl/tcpros_base.py", line 858, in close
        self.socket.close()
    AttributeError: 'NoneType' object has no attribute 'close'

    opened by AgentEXPL 6
  • 'NoneType' object has no attribute 'close'

    Hey, bro! I've got an issue when I run velodyne_td3.py. I can see Rviz and Gazebo after the command "gzclient", but I cannot see output such as epochs, rewards, and so on. Have you met problems like this?

    opened by blackjacket996 5
  • robot just circling around

    Hi Mr. Reinis Cimurs.

    I have already emailed the same question to the email address I found on the other issue post.

    I have been following and enjoyed your research.

    It is amazing to see that you have trained a robot to explore an unknown environment and create a map.

    I want to train my robot to have this ability so that my robot can explore unknown environments without collision.

    I believe a good way to start is to try to reproduce the training result you had in https://github.com/reiniscimurs/DRL-robot-navigation repository.

    However, I have trained the robot for almost 2 days, 63 evaluations have been done, and the robot still shows odd behavior, such as circling around without hitting a wall or obstacle, so that it avoids the worst-case scenario, which is a collision.

    I have run your program in the following environment: Ubuntu 18.04, ROS Melodic, Python 3, CUDA 10.2, GeForce graphics card.

    There are a few suspicious things I want to point out.

    1. I am getting a tf error message in RViz, as follows.

    "for frame [r1/front_laser]: No transform to fixed frame [base_link] tf error: lookup would require extrapolation into the future. Requested time ~ but the latest data is at time ~ when looking up transform from frame [r1/front_laser] to frame [base_link]"

    => This error message keeps appearing and disappearing.

    2. I don't see the Gazebo environment when I run velodyne_td3.py. => I am not sure whether this is normal. I am guessing Gazebo is not shown because the GUI option is set to false.

    So my questions are:

    Q1. Do I need to fine-tune to train the robot successfully? I.e., I want the robot to show the behavior that your robot does in your repository.

    Q2. How long does it take to get the expected behavior (i.e. no collisions and successful navigation to the given goal)?

    Q3. Are there any tips and tricks I should know before I run your TD3 program?

    Thank you for any help in advance.

    Best,

    *tf error https://drive.google.com/file/d/1QK8YxdSSVKZgA7mnzIbk1M4pWwS_Kpsz/view?usp=sharing

    *circle behavior https://drive.google.com/file/d/1zO5x73s4CcbEEpgZ69MibqZK3I0h_aKW/view?usp=sharing

    opened by samwooseTW 5
  • Roscore not running

    Average Reward over 10 Evaluation Episodes, Epoch 19: -156.269393, 1.000000
    Unable to register with master node [http://localhost:11311]: master may not be running yet. Will keep trying.

    Recently, when training reached about epoch 19-23, I could no longer connect to the master node. After this problem, the car kept spinning in the simulation and could not enter the next round. What might be the reason?

    ------------------ Original Message ------------------
    From: "reiniscimurs/DRL-robot-navigation" ***@***.***>;
    Sent: Wednesday, December 29, 2021, 2:31 AM
    ***@***.***>; ***@***.******@***.***>;
    Subject: Re: [reiniscimurs/DRL-robot-navigation] About adding LSTM (Issue #9)

    Hi,

    I am not really sure how would I be able to help you with this in any way as you are not giving any specific information about it. I don't think I can help you with purely theoretical extensions of TD3 work.

    The only thing I can say is that I don't understand why would you use VAE encoding for laser data in the first place. Clearly for dimensionality reduction, but what kind of result are you expecting to come out of it?

    Second, LSTMs are used for time-series data. The implemented TD3 network in this repo assumes independence in the batched data. Meaning, that when a batch is selected for training, it is a single state-action tuple that it is trained on. There is no information about the previous states in the environment, so I am not sure how you would implement LSTM with this repo without significantly rewriting the code.

    Please provide more specific implementation descriptions in the future, as it is virtually impossible to answer a question when it is phrased in this manner. Good luck with your implementation and extension.
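
The independence assumption described in the reply above can be illustrated with a minimal replay buffer sketch. This is not the repository's buffer: the class name, the tuple layout, and the seed parameter are assumptions made for the example. The point it shows is that each sampled item is a single transition drawn uniformly at random, so the batch carries no information about temporal order, which is what an LSTM would need.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of independent transitions (hypothetical example)."""

    def __init__(self, capacity, seed=None):
        # Oldest transitions are dropped automatically once capacity is reached.
        self.storage = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, done, next_state):
        self.storage.append((state, action, reward, done, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement: neighbouring time steps are
        # not kept together, so sequence information is lost by design.
        return self.rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(state=[step], action=[0.0, 0.0], reward=-1.0,
               done=False, next_state=[step + 1])
print(len(buffer), len(buffer.sample(2)))
```

Supporting a recurrent network would require storing and sampling whole sequences instead of single tuples, which is the kind of rewrite the reply refers to.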

    Originally posted by @hjj-666 in https://github.com/reiniscimurs/DRL-robot-navigation/issues/9#issuecomment-1003483012

    opened by reiniscimurs 5
  • Some question

    Excuse me. After I have trained the model, do I need to save the file at the end? I'm a ROS beginner. If I want to apply it to a real device and environment, how can I do it? I have a Pioneer 3DX and a Hokuyo UST-10LX.

    opened by chih7715 2
  • run on a headless server

    Hi, thank you for releasing this cool project! My goal is to run the simulation and training on a headless server (an OS without a GUI). Could I face issues with the Gazebo simulator, or should I set some flag to make it work properly? Follow-up question: what do you think about running everything inside a Docker container? Do you have any tips on which image to use?

    Thank you. Regards!

    opened by rosanom 3
  • Problems with training and tensorboard visualization.

    Hi Reinis Cimurs, thanks for sharing your amazing work! I encountered these problems when trying to visualize the training process. As you can see in the training terminal, AttributeError: 'NoneType' object has no attribute 'close' is raised, but the training is still running in Rviz and Gazebo. In another terminal, I ran the command tensorboard --logdir ./runs in the directory /DRL-robot-navigation/TD3. As shown, it reported that no dashboards are active for the current data set. Is there any config that I need to change? Could you give me some help with these problems? Thank you very much.

    opened by 525753936 5
  • the convergence of network

    Thank you for your open source code; it is helpful to me. I tried to reproduce your experiment, but I couldn't get good navigation performance. After many episodes of training, the mobile robot still could not avoid obstacles. Could you tell me what the final navigation accuracy of your experiment is?

    opened by cx-cheng 3
  • Question for customize world file for training

    Hi, thanks for sharing this great project. I have a question about customizing the world file. I changed TD3.world in empty_world.launch to my own world file. However, when I run the training, it can't load it. Can anyone give me some instructions? Thank you.

    opened by apriljt 1
Yolo ros - YOLO-ROS for HUAWEI ATLAS200

YOLO-ROS YOLO-ROS for NVIDIA YOLO-ROS for HUAWEI ATLAS200, please checkout for b

ChrisLiu 5 Oct 18, 2022
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

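Simplified Chinese | English PaddleRobotics PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source algorithms for human-robot interaction, complex motion control, environment perception, and SLAM localization and navigation. Human-robot interaction: active multimodal interaction technology TFVT-HRI. Active multimodal interaction technology uses vision, speech, touch sensors, and other inputs to the robot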

null 185 Dec 26, 2022
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

PyTorch code to reproduce LyDROO algorithm [1], which is an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. It applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems and solves each subproblem via DROO algorithm. It includes:

Liang HUANG 87 Dec 28, 2022
SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research

SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research

null 59 Feb 25, 2022
Guiding evolutionary strategies by (inaccurate) differentiable robot simulators @ NeurIPS, 4th Robot Learning Workshop

Guiding Evolutionary Strategies by Differentiable Robot Simulators In recent years, Evolutionary Strategies were actively explored in robotic tasks fo

Vladislav Kurenkov 4 Dec 14, 2021
👨‍💻 run nanosaur in simulation with Gazebo/Ignition

nanosaur_gazebo nanosaur The smallest NVIDIA Jetson dinosaur robot, open-source, fully 3D printable, based on ROS2 & Isaac ROS. Designed & ma

nanosaur 9 Jul 19, 2022
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We propose a new approach to detect anomalies in mobile robot data. We investigate each data set separately with two clustering methods, hierarchical and k-means. Two sub-methods are used to produce anomaly scores. We then merge these two scores to produce a merged anomaly score as the result.

Zekeriyya Demirci 1 Jan 9, 2022
Position detection system of mobile robot in the warehouse enviroment

Autonomous-Forklift-System About | GUI | Tests | Starting | License | Author | About An application that run the autonomous forklift paletization a

Kamil Goś 1 Nov 24, 2021
MohammadReza Sharifi 27 Dec 13, 2022
Space robot - (Course Project) Using the space robot to capture the target satellite that is disabled and spinning, then stabilize and fix it up

Space robot - (Course Project) Using the space robot to capture the target satellite that is disabled and spinning, then stabilize and fix it up

Mingrui Yu 3 Jan 7, 2022
[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

SurRoL IROS 2021 SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning Features dVRK compati

Med-AIR@CUHK 55 Jan 3, 2023
[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

PG-MORL This repository contains the implementation for the paper Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Contro

MIT Graphics Group 65 Jan 7, 2023
Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments (CoRL 2020)

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments [Project website] [Paper] This project is a PyTorch

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 49 Nov 28, 2022
Robot Reinforcement Learning on the Constraint Manifold

Implementation of "Robot Reinforcement Learning on the Constraint Manifold"

null 31 Dec 5, 2022
Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX

CQL-JAX This repository implements Conservative Q Learning for Offline Reinforcement Reinforcement Learning in JAX (FLAX). Implementation is built on

Karush Suri 8 Nov 7, 2022
Reinforcement-learning - Repository of the class assignment questions for the course on reinforcement learning

DSE 314/614: Reinforcement Learning This repository containing reinforcement lea

Manav Mishra 4 Apr 15, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 8, 2023
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 5, 2023
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 19.3k Feb 12, 2021