SpiderBot_DeepRL

Title: Implementation of Single and Multi-Agent Deep Reinforcement Learning Algorithms for a Walking Spider Robot

Author(s): Arijit Dasgupta, Chong Yu Quan

Welcome to our project! For this project, we aim to take our SpiderBot and make it walk using deep reinforcement learning. The code is written entirely in Python 3.7.7 and the following Python libraries are required for our code to work.

pybullet==3.0.6
numpy==1.18.5
matplotlib==3.3.2
tensorflow_probability==0.11.1
seaborn==0.11.0
pandas==1.1.4
tensorflow==2.3.1

Other than this, no additional software is needed for the code to work. The PyBullet physics engine is used for simulation, with an OpenGL GUI for visualisation. This repository contains the following:

A requirement.txt listing the required Python libraries
SolidWorks CADs of the SpiderBot
URDFs of the SpiderBot
Folders for Training Logs & Plots
Two saved models of the SpiderBot Agent
Source Code for the Deep RL Implementation
Training Code to train the SpiderBot with Deep RL
Validation Code to test trained models
Postprocessing Code to generate plots of training

The code supports the following 5 algorithms (with their characteristics defined):

Algorithm	Agent (Actor)	Policy	Learning Network	Actions per Time-Step	Action Space	State Space
MAD3QN	Multiple (Decentralised)	Decentralised	Separate	Multiple	Discrete	Continuous
MAA2C	Multiple (Decentralised)	Decentralised	Separate	Multiple	Discrete	Continuous
A2CMA	Single (Centralised)	Decentralised	Hybrid	Multiple	Discrete	Continuous
A2CSA	Single (Centralised)	Centralised	Hybrid	Single	Discrete	Continuous
DDPG	Single (Centralised)	Centralised	Separate	Multiple	Continuous	Continuous

We will now walk through the folders and files.

Folders

SpiderBot_CADs

This folder contains all the part and assembly files for the SpiderBot. There are options for 3-legged, 4-legged, 6-legged & 8-legged SpiderBots.

SpiderBot_URDFs

This folder contains all URDF files and associated STL files for the SpiderBot. There are options for 3-legged, 4-legged, 6-legged & 8-legged SpiderBots.

Training_Logs & Training_Plots

These folders store the CSV files of training data and the PDF plots of training.

Saved_Models

This folder contains two saved models trained with DDPG. The FullyTrained model (375 episodes) walks well and covers up to 9 metres in the forward direction. The PartiallyTrained model (50 episodes) moves forward only a short distance.

Source Code

SpiderBot_Environment.py

This file has the p_gym class, which uses pybullet to load the plane environment (no obstacles) and the SpiderBot into the physics engine. The class allows an agent to retrieve state observations for a single leg or the whole SpiderBot and to set a target velocity for the joints of the SpiderBot. Finally, it uses information from the physics engine to determine the reward for a time step.

SpiderBot_Neural_Network.py

This file has the classes for the fully-connected neural networks used. The TensorFlow 2 API is used to develop the neural networks, which are customised according to the algorithm and the number of SpiderBot legs. There is also a call method that performs a forward propagation through the neural network.
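
As an illustration of what a forward propagation through a fully-connected network involves, here is a minimal sketch in plain Python (not TensorFlow, so it runs without dependencies). The layer sizes, weights, and the names dense, relu and forward are illustrative assumptions and are not taken from SpiderBot_Neural_Network.py.

```python
import math

def dense(inputs, weights, biases, activation=None):
    """One fully-connected layer: out_j = act(b_j + sum_i in_i * w[i][j])."""
    outputs = []
    for j, bias in enumerate(biases):
        total = bias + sum(x * weights[i][j] for i, x in enumerate(inputs))
        outputs.append(activation(total) if activation else total)
    return outputs

def relu(x):
    return max(0.0, x)

def forward(state, layers):
    """Pass a state vector through each (weights, biases, activation) layer in turn."""
    activations = state
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations

# Hypothetical tiny actor: 2 state inputs -> 2 hidden units (ReLU) -> 1 action (tanh).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0], [1.0]], [0.0], math.tanh),
]
action = forward([1.0, 2.0], layers)
```

A tanh output layer is a common choice for continuous actions because it bounds the output, which can then be scaled to the allowed joint velocity range. Whether the repository's networks do this is not stated here.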

SpiderBot_Agent.py

This long file contains all the operations of the agent for all 5 algorithms. The constructor initialises the neural networks based on the chosen algorithm. The class can also update the target networks for DDPG & MAD3QN. Additionally, it has a method to apply gradients for each of the algorithms. In these methods, the TensorFlow 2 computational graph and gradient tapes are used to backpropagate the loss function. Finally, the class can save and load all models.
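
A target network update of the kind mentioned above is often implemented as a soft (Polyak) update, where the target weights move a small fraction tau towards the online weights. The sketch below shows that rule on plain Python lists. It assumes this is how the tau hyperparameter (0.005 in the training config) is used, and the function name soft_update is hypothetical.

```python
def soft_update(target_weights, online_weights, tau=0.005):
    """Return target weights moved a fraction tau towards the online weights."""
    return [
        tau * online + (1.0 - tau) * target
        for target, online in zip(target_weights, online_weights)
    ]

# Toy example with two scalar "weights".
target = [0.0, 1.0]
online = [1.0, 1.0]
target = soft_update(target, online, tau=0.005)
```

With a small tau the target network changes slowly, which tends to stabilise the critic's learning targets.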

SpiderBot_Replay_Buffer.py

This file contains the replay_buffer class that handles experience replay storage and operations like logging and sampling with a batch size.
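
A minimal sketch of an experience replay buffer with logging and batch sampling is shown below. It is not the repository's replay_buffer class: the class name ReplayBuffer, the method names and the transition layout are assumptions made for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped when full."""

    def __init__(self, max_mem_size, batch_size, seed=None):
        self.memory = deque(maxlen=max_mem_size)
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def log(self, state, action, reward, next_state, done):
        """Store one transition."""
        self.memory.append((state, action, reward, next_state, done))

    def can_sample(self):
        """True once enough transitions are stored to fill a batch."""
        return len(self.memory) >= self.batch_size

    def sample(self):
        """Draw batch_size transitions uniformly at random, grouped by field."""
        batch = self.rng.sample(list(self.memory), self.batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

# Toy usage: capacity 3, so only the last 3 of 5 logged transitions survive.
buffer = ReplayBuffer(max_mem_size=3, batch_size=2, seed=0)
for step in range(5):
    buffer.log([step], [0.0], float(step), [step + 1], False)
states, actions, rewards, next_states, dones = buffer.sample()
```

Sampling uniformly from stored transitions breaks the correlation between consecutive time steps, which is the usual motivation for experience replay in off-policy methods such as DDPG.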

SpiderBot_Walk.py

This file contains the walk function, which handles all training operations and is where all the classes interact with each other. The function loops through the episodes and trains the SpiderBot. The training-related data is logged and saved as a CSV file in the Training_Logs folder, while the best models are saved to the Saved_Models folder during training.

SpiderBot_Postprocessing.py

This file handles the post-processing: it takes the CSV file from the Training_Logs folder, generates the plots, and saves them into the Training_Plots folder.

Main Code

SpiderBot_Train_Model.py

This file allows the user to set up the training session. In this file, the user can set 3 levels of configuration for training. The general config section has options for choosing the algorithm, number of legs, target location, number of episodes, etc. The hyperparameter config section handles all hyperparameters of the training process. The reward structure config section provides options for all the scalar rewards. The user must set all of these configs and run the file to train the SpiderBot. TIP: Training without the GUI is faster, especially if you use a CUDA-enabled NVIDIA GPU.

SpiderBot_Validation.py

This file allows the user to validate and test a trained model, specially made for the Professors and TAs of SpiderBot to visualise our fully trained model.

How to train a model?

Unzip the SpiderBot_URDFS.zip file into the same directory. Open up SpiderBot_Train_Model.py for editing. The most important parameter, which you must define, is training_name. It is unique to a particular training session, and all saved models, logs and plots are named after it. After that, set up your General Config:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ GENERAL CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
training_name = "insert_training_name_here"
model = "DDPG"
num_of_legs = 8 
episodes = 375
target_location = 3
use_GUI = True
do_post_process = True
save_best_model = True
save_data = True

Following that, set up the configurations for the hyperparameters:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ HYPERPARAMETER CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
time_step_size = 120./240
upper_angle = 60
lower_angle = -60
lr_actor = 0.00005
lr_critic = 0.0001
discount_rate = 0.9
update_target = None
tau = 0.005
max_mem_size = 1000000
batch_size = 512
max_action = 10
min_action = -10
noise = 1
epsilon = 1
epsilon_decay = 0.0001
epsilon_min = 0.01
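
To give a sense of how the exploration settings above might be used, here is a hedged sketch. It assumes that epsilon is decayed linearly by epsilon_decay down to epsilon_min (a multiplicative schedule is equally possible), and that noise is the standard deviation of Gaussian noise added to continuous actions before clipping them to [min_action, max_action]. The function names are hypothetical and none of this is confirmed by the source.

```python
import random

def decay_epsilon(epsilon, epsilon_decay=0.0001, epsilon_min=0.01):
    """Linearly reduce epsilon by epsilon_decay, never going below epsilon_min."""
    return max(epsilon_min, epsilon - epsilon_decay)

def noisy_action(action, noise=1.0, min_action=-10.0, max_action=10.0, rng=random):
    """Add zero-mean Gaussian noise to each joint velocity and clip to the valid range."""
    return [
        min(max_action, max(min_action, a + rng.gauss(0.0, noise)))
        for a in action
    ]

# Under the linear assumption, epsilon reaches its floor after roughly 9900 steps.
epsilon = 1.0
for _ in range(20000):
    epsilon = decay_epsilon(epsilon)

# Exploration noise on three hypothetical joint velocities.
explored = noisy_action([9.5, -9.5, 0.0], rng=random.Random(0))
```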

Finally, set up the configuration for the reward structure:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~ REWARD STRUCTURE CONFIG ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
forward_motion_reward = 500
forward_distance_reward = 250
sideways_velocity_punishment = 500
sideways_distance_penalty = 250
time_step_penalty = 1
flipped_penalty = 500
goal_reward = 500
out_of_range_penalty = 500
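
The sketch below shows one way scalar settings like these could be combined into a reward for a single time step. The values are copied from the config above, but the formula itself (linear scaling of velocities and distances, plus one-off penalties and bonuses) is an assumption for illustration and may differ from what SpiderBot_Environment.py actually computes. The function name step_reward and its arguments are hypothetical.

```python
# Values copied from the reward structure config; how they are combined is assumed.
REWARDS = {
    "forward_motion_reward": 500,
    "forward_distance_reward": 250,
    "sideways_velocity_punishment": 500,
    "sideways_distance_penalty": 250,
    "time_step_penalty": 1,
    "flipped_penalty": 500,
    "goal_reward": 500,
    "out_of_range_penalty": 500,
}

def step_reward(forward_velocity, forward_distance, sideways_velocity,
                sideways_distance, flipped=False, reached_goal=False,
                out_of_range=False, cfg=REWARDS):
    """Combine the scalar settings into one per-time-step reward (assumed form)."""
    reward = cfg["forward_motion_reward"] * forward_velocity
    reward += cfg["forward_distance_reward"] * forward_distance
    reward -= cfg["sideways_velocity_punishment"] * abs(sideways_velocity)
    reward -= cfg["sideways_distance_penalty"] * abs(sideways_distance)
    reward -= cfg["time_step_penalty"]  # constant cost for every step taken
    if flipped:
        reward -= cfg["flipped_penalty"]
    if reached_goal:
        reward += cfg["goal_reward"]
    if out_of_range:
        reward -= cfg["out_of_range_penalty"]
    return reward

walking = step_reward(0.5, 0.25, 0.0, 0.0)                    # 250 + 62.5 - 1
flipped_over = step_reward(0.0, 0.0, 0.0, 0.0, flipped=True)  # -1 - 500
```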

Then run the Python code:

> python SpiderBot_Train_Model.py

How to Validate/Test our Models?

To test the fully trained model, just run SpiderBot_Validation.py.

> python SpiderBot_Validation.py

If you wish to run the other saved model, the partially trained one, you can open up SpiderBot_Validation.py and edit the training_name from DDPG_FullyTrained to DDPG_PartiallyTrained in the config section as shown:

#~~~~~~~~~~~~ VALIDATION CONFIG SETUP ~~~~~~~~~~~~#
training_name = "DDPG_PartiallyTrained"
model = "DDPG"
target_location = 8
episodes = 100000000000 # A large number is set to put the simulation on loop

Video Demonstration
