SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo
Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan, Mark Tjersland
paper / project site / blog
This repo contains the code to train the SimNet architecture from scratch on procedurally generated simulation data (no transfer learning required). We also provide a small set of in-house, manually labelled validation data with 3D oriented bounding box labels.
Training the model
Requirements
You will need an NVIDIA GPU with at least 12GB of GPU memory. All code was developed and tested on Ubuntu 20.04.
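The GPU memory requirement above can be checked before starting a run. The following is a minimal sketch, not part of the repo: gpu_memory_ok is a hypothetical helper, and the 12000 MiB threshold is an assumption, chosen because the total reported by nvidia-smi can be slightly below a card's nominal size.

```shell
# Hypothetical helper (not part of the simnet repo).
# Reads one memory.total value in MiB per line on stdin and succeeds
# if at least one GPU meets the threshold passed as $1.
gpu_memory_ok() {
    min_mib="$1"
    found=1
    while read -r mib; do
        # Skip blank or non-numeric lines such as "[N/A]".
        case "$mib" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$mib" -ge "$min_mib" ]; then
            found=0
        fi
    done
    return "$found"
}

# Only query the driver if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | gpu_memory_ok 12000; then
        echo "Found a GPU with enough memory"
    else
        echo "No GPU with at least 12000 MiB found" >&2
    fi
fi
```

The helper reads from stdin so the check can be exercised without a GPU by piping in sample values.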
All commands are assumed to be run from the root of the simnet repo directory (represented by $SIMNET_REPO in commands below).
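The commands in this README expect $SIMNET_REPO to be defined in the current shell. One way to set it, assuming the shell is already at the root of the checkout (nothing in this README indicates the repo sets it for you):

```shell
# Assumption: the current directory is the root of the simnet checkout.
export SIMNET_REPO="$(pwd)"
echo "SIMNET_REPO=$SIMNET_REPO"
```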
Setup
Python
Create a Python 3.8 conda environment and install the requirements:
cd $SIMNET_REPO
conda create -y --prefix ./env python=3.8
./env/bin/python -m pip install --upgrade pip
./env/bin/python -m pip install -r frozen_requirements.txt
Docker
Make sure Docker is installed and working without requiring sudo. If it is not installed, follow the official instructions for setting it up.
docker ps
Wandb
Launch a wandb local server for logging training results (you do not need to do this if you already have a wandb account set up). This uses Docker to start a local web server at http://localhost:8080 that you can use to visualize training progress and validation images. You will have to visit the http://localhost:8080/authorize page to get the local API access token (this can take a few minutes the first time). Once you have the key, paste it into the terminal to continue.
cd $SIMNET_REPO
./env/bin/wandb local
Datasets
Download and untar the train+val datasets simnet2021a.tar (18GB, md5 checksum: b8e1d3cb7200b44b1de223e87141f14b). This file contains all the training and validation data you need to replicate our small objects results.
cd $SIMNET_REPO
wget https://tri-robotics-public.s3.amazonaws.com/github/simnet/datasets/simnet2021a.tar -P datasets
tar xf datasets/simnet2021a.tar -C datasets
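Because the archive is 18GB, it can be worth comparing its MD5 digest with the checksum listed above, ideally before extracting. The following is a minimal sketch using md5sum from GNU coreutils; verify_md5 is a hypothetical helper, not something the repo provides.

```shell
# Hypothetical helper (not part of the simnet repo).
# Compares the MD5 digest of the file in $1 with the expected digest in $2.
verify_md5() {
    file="$1"
    expected="$2"
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual, expected $expected)" >&2
        return 1
    fi
}

# Check the downloaded archive only if it is present.
if [ -f datasets/simnet2021a.tar ]; then
    verify_md5 datasets/simnet2021a.tar b8e1d3cb7200b44b1de223e87141f14b || echo "Archive checksum does not match; consider re-downloading" >&2
fi
```

A mismatch usually means the download was truncated or corrupted; re-running the wget command is the simplest fix.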
Train and Validate
Overfit test:
./runner.sh net_train.py @config/net_config_overfit.txt
Full training run (requires 12GB GPU memory):
./runner.sh net_train.py @config/net_config.txt
Results
Check wandb (http://localhost:8080) to see training progress. On a Titan V, training takes about 48 hours to converge, but decent validation results can be seen after about 24 hours.
Example validation image visualization:
Example 3D oriented bounding box mAP on validation dataset:
Licenses
The source code is released under the MIT license.
The datasets are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.