# CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

## Introduction
This is the official PyTorch implementation of our paper CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds. This repository is still under construction.
For more information, please visit our project page.
Result visualization on real data. Our models, trained on synthetic data only, directly generalize to real data, assuming the availability of object masks but not part masks. Left: results on a laptop trajectory from the BMVC dataset. Right: results on a real drawers trajectory we captured, where a Kinova Jaco2 arm pulls out the top drawer.
## Citation
If you find our work useful in your research, please consider citing:
```
@article{weng2021captra,
  title={CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds},
  author={Weng, Yijia and Wang, He and Zhou, Qiang and Qin, Yuzhe and Duan, Yueqi and Fan, Qingnan and Chen, Baoquan and Su, Hao and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:2104.03437},
  year={2021}
}
```
## Updates
- [2021/04/14] Released code, data, and pretrained models for testing & evaluation.
## Installation
- Our code has been tested with
  - Ubuntu 16.04, 20.04, and macOS (CPU-only)
  - CUDA 11.0
  - Python 3.7.7
  - PyTorch 1.6.0
- We recommend using Anaconda to create an environment named `captra` dedicated to this repository:

  ```bash
  conda create -n captra python=3.7
  conda activate captra
  ```
- Create a directory for code, data, and experiment checkpoints.

  ```bash
  mkdir captra && cd captra
  ```
- Clone this repository.

  ```bash
  git clone https://github.com/HalfSummer11/CAPTRA.git
  cd CAPTRA
  ```
- Install dependencies.

  ```bash
  pip install -r requirements.txt
  ```
- Compile the CUDA code for the PointNet++ backbone.

  ```bash
  cd network/models/pointnet_lib
  python setup.py install
  ```
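Optionally, a quick sanity check (a minimal sketch, not part of the official setup) can confirm that the environment matches the tested configuration above:

```python
import torch

# Tested configuration: Python 3.7.7, PyTorch 1.6.0, CUDA 11.0
print("PyTorch:", torch.__version__)        # expect 1.6.0
print("CUDA runtime:", torch.version.cuda)  # expect 11.0; None on CPU-only builds
print("CUDA available:", torch.cuda.is_available())
```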
## Datasets
- Create a directory for all datasets under `captra`.

  ```bash
  mkdir data && cd data
  ```
- If you put the datasets at a different location, make sure to point `basepath` in `CAPTRA/configs/obj_config/obj_info_*.yml` to them (see the sketch below).
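If you do relocate the data, a quick check like the following can confirm that the config resolves (a minimal sketch; that `basepath` sits at the top level of the YAML file is an assumption, so adjust the lookup to the actual layout):

```python
import os
import yaml  # PyYAML

# Assumption: `basepath` is a top-level key in the config; adjust if nested.
with open("configs/obj_config/obj_info_nocs.yml") as f:
    cfg = yaml.safe_load(f)

basepath = cfg.get("basepath")
print("basepath =", basepath)
print("exists:", os.path.isdir(str(basepath)))
```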
### NOCS-REAL275

```bash
mkdir nocs_data && cd nocs_data
```
#### Test
- Download and unzip `nocs_model_corners.tar`, which stores the 3D bounding boxes of normalized object models.

  ```bash
  wget http://download.cs.stanford.edu/orion/captra/nocs_model_corners.tar
  tar -xzvf nocs_model_corners.tar
  ```
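  For reference, one typical use of these corner files is recovering a per-instance scale. Below is a minimal sketch; the file name and the (2, 3) min/max-corner layout are assumptions, not the documented format.

  ```python
  import numpy as np

  # Hypothetical file name and layout: a (2, 3) array holding the min/max
  # corners of one instance's normalized bounding box.
  corners = np.load("data/nocs_data/nocs_model_corners/some_instance.npy")
  extent = corners[1] - corners[0]  # per-axis box size
  scale = np.linalg.norm(extent)    # diagonal length, a common scale proxy
  print("extent:", extent, "scale:", scale)
  ```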
- Create `nocs_full` to hold the original NOCS data. Download and unzip "Real Dataset - Test" from the original NOCS dataset, which contains 6 real test trajectories.

  ```bash
  mkdir nocs_full && cd nocs_full
  wget http://download.cs.stanford.edu/orion/nocs/real_test.zip
  unzip real_test.zip
  ```
- Generate and run the pre-processing script.

  ```bash
  cd CAPTRA/datasets/nocs_data/preproc_nocs
  # generate the script for data preprocessing;
  # --parallel and --num_proc set the number of parallel processes used below
  python generate_all.py --data_path ../../../../data/nocs_data --data_type=test_only --parallel --num_proc=10 > nocs_preproc.sh
  # the actual data preprocessing
  bash nocs_preproc.sh
  ```
- After the steps above, the folder should look like the tree in [File Structure - Dataset Folder Structure](#dataset-folder-structure).
### SAPIEN Synthetic Articulated Object Dataset

```bash
mkdir sapien_data && cd sapien_data
```
#### Test
- Download and unzip the object URDF models and the testing trajectories.

  ```bash
  wget http://download.cs.stanford.edu/orion/captra/sapien_urdf.tar
  wget http://download.cs.stanford.edu/orion/captra/sapien_test.tar
  tar -xzvf sapien_urdf.tar
  tar -xzvf sapien_test.tar
  ```
## Testing & Evaluation

### Download Pretrained Model Checkpoints
- Create a folder `runs` under `captra` for experiments.

  ```bash
  mkdir runs && cd runs
  ```
- Download our pretrained model checkpoints:
  - NOCS-REAL275: `nocs_ckpt.tar`
  - SAPIEN synthetic articulated object dataset: `sapien_ckpt.tar`
- Unzip them in `runs`:

  ```bash
  tar -xzvf nocs_ckpt.tar
  ```

  which should give

  ```
  runs
  ├── 1_bottle_rot    # RotationNet for the bottle category
  ├── 1_bottle_coord  # CoordinateNet for the bottle category
  ├── 2_bowl_rot
  └── ...
  ```
### Testing
- To generate pose predictions for a certain category, run the corresponding script in `CAPTRA/scripts` (unless otherwise specified, all scripts are run from `CAPTRA`), e.g. for the bottle category from NOCS-REAL275:

  ```bash
  bash scripts/track/nocs/1_bottle.sh
  ```
- The predicted poses will be saved under the experiment folder `1_bottle_rot` (see [File Structure - Experiment Folder Structure](#experiment-folder-structure)).
- To test the tracking speed for articulated objects in SAPIEN, make sure to set `--batch_size=1` in the script. You may use `--dataset_length=500` to avoid running through the whole test set.
### Evaluation
- To evaluate the pose predictions produced in the previous step, uncomment and run the corresponding line in `CAPTRA/scripts/eval.sh`, e.g. for the bottle category from NOCS-REAL275:

  ```bash
  python misc/eval/eval.py --config config_track.yml --obj_config obj_info_nocs.yml --obj_category=1 --experiment_dir=../runs/1_bottle_rot
  ```
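Evaluation writes per-frame errors to `results/err.csv` under the experiment folder (see File Structure below). A minimal sketch of how the file might be aggregated, assuming only a standard CSV header:

```python
import pandas as pd

df = pd.read_csv("../runs/1_bottle_rot/results/err.csv")
print(df.columns.tolist())         # see which per-frame metrics are reported
print(df.mean(numeric_only=True))  # per-metric mean over all frames
```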
## File Structure

### Overall Structure
The working directory should be organized as follows.
```
captra
├── CAPTRA           # this repository
├── data             # datasets
│   ├── nocs_data    # NOCS-REAL275
│   └── sapien_data  # synthetic dataset of articulated objects from SAPIEN
└── runs             # folders for individual experiments
    ├── 1_bottle_coord
    ├── 1_bottle_rot
    └── ...
```
### Code Structure
Below is an overview of our code. Only the most relevant folders/files are shown.
```
CAPTRA
├── configs               # configuration files
│   ├── all_config        # experiment configs
│   ├── pointnet_config   # PointNet++ configs (radius, etc.)
│   ├── obj_config        # dataset configs
│   └── config.py         # parser
├── datasets              # data preprocessing & dataset definitions
│   ├── arti_data         # articulated data
│   │   └── ...
│   ├── nocs_data         # NOCS-REAL275 data
│   │   ├── ...
│   │   └── preproc_nocs  # prepare NOCS data
│   └── ...               # utility functions
├── pose_utils            # utility functions for pose/bounding box computation
├── utils.py
├── misc                  # evaluation and visualization
│   ├── eval
│   └── visualize
├── scripts               # scripts for training/testing
└── network               # main part
    ├── data              # torch dataloader definitions
    ├── models            # model definition
    │   ├── pointnet_lib
    │   ├── pointnet_utils.py
    │   ├── backbones.py
    │   ├── blocks.py     # the above defines backbone/building blocks
    │   ├── loss.py
    │   ├── networks.py   # defines CoordinateNet and RotationNet
    │   └── model.py      # defines models for training/tracking
    ├── trainer.py        # training agent
    ├── parse_args.py     # parse arguments for train/test
    ├── test.py           # test
    ├── train.py          # train
    └── train_nocs_mix.py # finetune with a mixture of synthetic/real data
```
### Experiment Folder Structure

For each experiment, a dedicated folder in `captra/runs` is organized as follows.
```
1_bottle_rot
├── log                # training/testing log files
│   └── log.txt
├── ckpt               # model checkpoints
│   ├── model_0001.pt
│   └── ...
└── results
    ├── data*          # per-trajectory raw network outputs
    │   ├── bottle_shampoo_norm_scene_4.pkl
    │   └── ...
    ├── err.csv**      # per-frame error
    └── err.pkl**      # per-frame error
```

*: generated after testing with `--save`

**: generated after running `misc/eval/eval.py`
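The raw per-trajectory outputs under `results/data` are plain pickles and can be inspected directly. Below is a minimal sketch; the structure of the payload is not documented here, so it only lists the top-level keys:

```python
import pickle

# One per-trajectory raw output, produced by testing with --save.
path = "../runs/1_bottle_rot/results/data/bottle_shampoo_norm_scene_4.pkl"
with open(path, "rb") as f:
    result = pickle.load(f)

print(type(result))
if isinstance(result, dict):
    print(list(result.keys()))  # top-level fields of the saved output
```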
### Dataset Folder Structure
```
nocs_data
├── nocs_model_corners  # instance bounding box information
├── nocs_full           # original NOCS data, organized in frames (not object-centric)
│   ├── real_test
│   │   ├── scene_1
│   │   └── ...
│   ├── real_train
│   ├── train
│   └── val
├── instance_list*      # collects each instance's occurrences in nocs_full/*/
├── render*             # per-instance segmented data for training
├── preproc**           # cached data
└── splits**            # data lists for train/test
```

*: generated after data preprocessing

**: generated during training/testing
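Before preprocessing, a quick check that the required inputs are in place can save a failed run (a minimal sketch; it only tests the folders that must exist up front, since starred entries are generated later):

```python
import os

root = "data/nocs_data"
# Folders required before preprocessing; starred entries are generated later.
for sub in ("nocs_model_corners", "nocs_full/real_test"):
    status = "ok" if os.path.isdir(os.path.join(root, sub)) else "MISSING"
    print(f"{sub}: {status}")
```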
```
sapien_data
├── urdf          # instance URDF models
├── render_seq    # testing trajectories
├── render**      # single-frame training/validation data
├── preproc_seq*  # cached testing trajectory data
├── preproc**     # cached single-frame data
└── splits*       # data lists for train/test
```

*: generated during training/testing

**: generated during training
## Acknowledgements

This implementation is based on the following repositories. We thank the authors for open-sourcing their great work!