Official PyTorch implementation of CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds


CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

(Teaser figure)

Introduction

This is the official PyTorch implementation of our paper CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds. This repository is still under construction.

For more information, please visit our project page.

Result visualization on real data. Our models, trained on synthetic data only, can directly generalize to real data, assuming the availability of object masks but not part masks. Left: results on a laptop trajectory from BMVC dataset. Right: results on a real drawers trajectory we captured, where a Kinova Jaco2 arm pulls out the top drawer.

Citation

If you find our work useful in your research, please consider citing:

@article{weng2021captra,
  title={CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds},
  author={Weng, Yijia and Wang, He and Zhou, Qiang and Qin, Yuzhe and Duan, Yueqi and Fan, Qingnan and Chen, Baoquan and Su, Hao and Guibas, Leonidas J},
  journal={arXiv preprint arXiv:2104.03437},
  year={2021}
}
Updates

  • [2021/04/14] Released code, data, and pretrained models for testing & evaluation.

Installation

  • Our code has been tested with

    • Ubuntu 16.04, 20.04, and macOS (CPU only)
    • CUDA 11.0
    • Python 3.7.7
    • PyTorch 1.6.0
  • We recommend using Anaconda to create an environment named captra dedicated to this repository by running the following:

    conda create -n captra python=3.7
    conda activate captra
  • Create a directory for code, data, and experiment checkpoints.

    mkdir captra && cd captra
  • Clone the repository

    git clone https://github.com/HalfSummer11/CAPTRA.git
    cd CAPTRA
  • Install dependencies.

    pip install -r requirements.txt
  • Compile the CUDA code for the PointNet++ backbone.

    cd network/models/pointnet_lib
    python setup.py install
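
If the build appears to succeed but you later hit CUDA errors at run time, a quick environment check can help narrow things down. The snippet below is a minimal sketch using only standard PyTorch calls; the pointnet2_cuda import at the end is an assumed name for the compiled extension and may differ in your build (check setup.py).

    # Quick sanity check of the PyTorch/CUDA setup (generic PyTorch only).
    import torch

    print("PyTorch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))
        print("Compute capability:", torch.cuda.get_device_capability(0))
    try:
        import pointnet2_cuda  # assumed module name for the compiled extension; adjust to your build
        print("PointNet++ CUDA extension imported successfully")
    except ImportError as e:
        print("PointNet++ extension not importable:", e)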

Datasets

  • Create a directory for all datasets under captra

    mkdir data && cd data
    • Make sure to point basepath in CAPTRA/configs/obj_config/obj_info_*.yml to your dataset directory if you put it in a different location.
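
If it is unclear whether a config picks up the right path, a check like the following can be used. This is a rough sketch only: it assumes the config is plain YAML with a top-level basepath key and that PyYAML is installed; adjust the filename and key lookup to the actual file.

    # Print the dataset root a config file points to (run from the captra directory).
    import yaml

    with open("CAPTRA/configs/obj_config/obj_info_nocs.yml") as f:
        cfg = yaml.safe_load(f)
    if isinstance(cfg, dict):
        print("basepath =", cfg.get("basepath", "<no top-level basepath key>"))
    else:
        print("Unexpected config structure:", type(cfg))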

NOCS-REAL275

mkdir nocs_data && cd nocs_data

Test

  • Download and unzip nocs_model_corners.tar, which contains the 3D bounding boxes of normalized object models.

    wget http://download.cs.stanford.edu/orion/captra/nocs_model_corners.tar
    tar -xzvf nocs_model_corners.tar
  • Create nocs_full to hold the original NOCS data. Download and unzip "Real Dataset - Test" from the original NOCS dataset, which contains 6 real test trajectories.

    mkdir nocs_full && cd nocs_full
    wget http://download.cs.stanford.edu/orion/nocs/real_test.zip
    unzip real_test.zip
  • Generate and run the preprocessing script.

    cd CAPTRA/datasets/nocs_data/preproc_nocs
    python generate_all.py --data_path ../../../../data/nocs_data --data_type=test_only --parallel --num_proc=10 > nocs_preproc.sh # generate the script for data preprocessing
    # --parallel and --num_proc specify the number of parallel processes used below
    bash nocs_preproc.sh # the actual data preprocessing
  • After the steps above, the folder should look like File Structure - Dataset Folder Structure.
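
As a quick check that the test trajectories were unpacked where the preprocessing expects them, something like the following can be run. It is a minimal sketch that only lists directories; the scene_* layout is the one shown under File Structure - Dataset Folder Structure.

    # Count the files in each real test trajectory (run from the captra directory).
    import os

    root = "data/nocs_data/nocs_full/real_test"
    for scene in sorted(os.listdir(root)):
        scene_dir = os.path.join(root, scene)
        if os.path.isdir(scene_dir):
            print(f"{scene}: {len(os.listdir(scene_dir))} files")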

SAPIEN Synthetic Articulated Object Dataset

mkdir sapien_data && cd sapien_data

Test

  • Download and unzip the object URDF models and testing trajectories.

    wget http://download.cs.stanford.edu/orion/captra/sapien_urdf.tar
    wget http://download.cs.stanford.edu/orion/captra/sapien_test.tar
    tar -xzvf sapien_urdf.tar
    tar -xzvf sapien_test.tar

Testing & Evaluation

Download Pretrained Model Checkpoints

  • Create a folder runs under captra for experiments

    mkdir runs && cd runs
  • Download our pretrained model checkpoints for NOCS-REAL275 and the SAPIEN synthetic articulated object dataset.

  • Unzip them in runs

    tar -xzvf nocs_ckpt.tar  

    which should give

    runs
    ├── 1_bottle_rot 	# RotationNet for the bottle category
    ├── 1_bottle_coord 	# CoordinateNet for the bottle category
    ├── 2_bowl_rot 
    └── ...

Testing

  • To generate pose predictions for a certain category, run the corresponding script in CAPTRA/scripts (unless otherwise specified, all scripts are run from CAPTRA), e.g. for the bottle category from NOCS-REAL275,

    bash scripts/track/nocs/1_bottle.sh
  • The predicted pose will be saved under the experiment folder 1_bottle_rot (see File Structure - Experiment Folder Structure).

  • To test the tracking speed for articulated objects in SAPIEN, make sure to set --batch_size=1 in the script. You may use --dataset_length=500 to avoid running through the whole test set.

Evaluation

  • To evaluate the pose predictions produced in the previous step, uncomment and run the corresponding line in CAPTRA/scripts/eval.sh. For the bottle category from NOCS-REAL275, the corresponding line is

    python misc/eval/eval.py --config config_track.yml --obj_config obj_info_nocs.yml --obj_category=1 --experiment_dir=../runs/1_bottle_rot
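
After eval.py finishes, the per-frame errors land in results/err.csv and results/err.pkl (see File Structure - Experiment Folder Structure below). A few lines of pandas give a first look; this is only a sketch, run from CAPTRA, assuming pandas is available in your environment, and it makes no assumptions about the column names.

    # Inspect the per-frame error table written by misc/eval/eval.py.
    # No column names are assumed; the script just reports what it finds.
    import pandas as pd

    err = pd.read_csv("../runs/1_bottle_rot/results/err.csv")
    print("columns:", err.columns.tolist())
    print(err.describe())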

File Structure

Overall Structure

The working directory should be organized as follows.

captra
├── CAPTRA		# this repository
├── data			# datasets
│   ├── nocs_data		# NOCS-REAL275
│   └── sapien_data	# synthetic dataset of articulated objects from SAPIEN
└── runs			# folders for individual experiments
    ├── 1_bottle_coord
    ├── 1_bottle_rot
    └── ...

Code Structure

Below is an overview of our code. Only the most relevant folders/files are shown.

CAPTRA
├── configs		# configuration files
│   ├── all_config		# experiment configs
│   ├── pointnet_config 	# pointnet++ configs (radius, etc)
│   ├── obj_config		# dataset configs
│   └── config.py		# parser
├── datasets	# data preprocessing & dataset definitions
│   ├── arti_data		# articulated data
│   │   └── ...
│   ├── nocs_data		# NOCS-REAL275 data
│   │   ├── ...
│   │   └── preproc_nocs	# prepare nocs data
│   └── ...			# utility functions
├── pose_utils		# utility functions for pose/bounding box computation
├── utils.py
├── misc		# evaluation and visualization
│   ├── eval
│   └── visualize
├── scripts		# scripts for training/testing
└── network		# main part
    ├── data		# torch dataloader definitions
    ├── models		# model definition
    │   ├── pointnet_lib
    │   ├── pointnet_utils.py
    │   ├── backbones.py
    │   ├── blocks.py		# the above defines backbone/building blocks
    │   ├── loss.py
    │   ├── networks.py		# defines CoordinateNet and RotationNet
    │   └── model.py		# defines models for training/tracking
    ├── trainer.py	# training agent
    ├── parse_args.py		# parse arguments for train/test
    ├── test.py		# test
    ├── train.py	# train
    └── train_nocs_mix.py	# finetune with a mixture of synthetic/real data

Experiment Folder Structure

For each experiment, a dedicated folder in captra/runs is organized as follows.

1_bottle_rot
├── log		# training/testing log files
│   └── log.txt
├── ckpt	# model checkpoints
│   ├── model_0001.pt
│   └── ...
└── results
    ├── data*		# per-trajectory raw network outputs 
    │   ├── bottle_shampoo_norm_scene_4.pkl
    │   └── ...
    ├── err.csv**	# per-frame error	
    └── err.pkl**	# per-frame error
*: generated after testing with --save
**: generated after running misc/eval/eval.py
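
The pickled outputs can be inspected in a similar way. The sketch below (run from the captra directory) assumes the files are standard Python pickles and reuses the example filename from the tree above; if the pickles store tensors or repository-specific classes, the corresponding modules need to be importable.

    # Peek at a raw per-trajectory result saved after testing with --save.
    import pickle

    path = "runs/1_bottle_rot/results/data/bottle_shampoo_norm_scene_4.pkl"
    with open(path, "rb") as f:
        result = pickle.load(f)
    print(type(result))
    if isinstance(result, dict):
        print(list(result.keys()))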

Dataset Folder Structure

nocs_data
├── nocs_model_corners		# instance bounding box information	
├── nocs_full		 	# original NOCS data, organized in frames (not object-centric)
│   ├── real_test
│   │   ├── scene_1
│   │   └── ...
│   ├── real_train
│   ├── train
│   └── val			
├── instance_list*		# collects each instance's occurrences in nocs_full/*/
├── render*			# per-instance segmented data for training
├── preproc**			# cached data
└── splits**			# data lists for train/test
*: generated after data preprocessing
**: generated during training/testing

sapien_data
├── urdf			# instance URDF models
├── render_seq			# testing trajectories
├── render**			# single-frame training/validation data
├── preproc_seq*		# cached testing trajectory data
├── preproc**			# cached training data
└── splits*			# data lists for train/test
*: generated during training/testing
**: only required for training

Acknowledgements

This implementation is based on the following repositories. We thank the authors for open-sourcing their great work!

Comments
  • CUDA kernel failed : no kernel image is available for execution on the device

    CUDA kernel failed : no kernel image is available for execution on the device

    Hi, thank you for releasing such a wonderful work! But I got the error "CUDA kernel failed : no kernel image is available for execution on the device" during both testing and training, and I have checked my environment without finding any problem. (Screenshots of the testing and training errors, the environment settings, and a terminal check were attached to the issue.)

    opened by CNJianLiu 6
  • Question about the Performance on NOCS-REAL275 (Tab 1 & Tab 8)

    Question about the Performance on NOCS-REAL275 (Tab 1 & Tab 8)

    Thanks for sharing good work.

    I have some questions related to the performance on NOCS-REAL275 (Tab 1 & Tab 8)

    Q1. Can I ask why the reported CASS performance at 5°, 5cm differs from the original CASS paper (23.5 vs. 29.44)? Can you explain how you obtained these results, including the mIoU, R_err, and T_err metrics? Also, the reported 6-PACK performance at 5°, 5cm is worse than in the original paper (33.3 vs. 28.92).

    Q2. Can you explain the Oracle ICP and how it differs from the ICP in 6-PACK? The 6-PACK paper also reported ICP, and it performs close to NOCS on the 5°, 5cm metric. Can you explain why your Oracle ICP is worse than NOCS?

    Q3. As I understand it, your method adds the same random noise (scale = 0.02, rot = 5°, trans = 3cm) to the ground-truth initial pose during training and inference. Have you tried different parameter settings (e.g., rot = 5°, trans = 5cm)? Also, can you explain where the rotation-noise code is implemented? I only found the translation and scale noise in the code below. https://github.com/HalfSummer11/CAPTRA/blob/d98158222f5f6bc44687d75309caf03300e185e1/datasets/nocs_data/nocs_data_process.py#L31

    opened by taeyeop-lee 6
  • GPU setup and training time

    GPU setup and training time

    Hello, thank you for releasing such a wonderful work! Recently I have been reading the paper and studying the code, but I have several questions about the training setup.

    1. What is the GPU setup, and how long does it take to train on the NOCS dataset and the SAPIEN dataset respectively? I have tried to train the RotationNet using a 3090 GPU; however, it takes ~2 hours per category on the NOCS dataset, which seems very time-consuming.
    2. How many epochs are used to train the CoordinateNet and the RotationNet respectively on the two datasets?
    opened by luzzou 4
  • Camera Parameters

    Camera Parameters

    Hi, thanks for your work. I am wondering how to obtain the camera parameters of the BMVC dataset, or how to obtain the segmentation masks for it.

    opened by DC1991 2
  • Question about the experimental results

    Question about the experimental results

    Hello! I noticed that some results in Table 1 seem to differ from those reported in the original papers. For example, the NOCS results match those reported in the 6-PACK paper but differ from the ones in the NOCS paper itself; the 6-PACK results also differ from the original paper, as do the CASS results. Are the results in Table 1 recomputed by you, and how does your computation differ from the original NOCS evaluation? Or did you use the adjusted ground-truth poses provided by 6-PACK?

    Looking forward to your reply.

    opened by Bingo-1996 2
  • What's the purpose of the perturbed_part?

    What's the purpose of the perturbed_part?

    https://github.com/HalfSummer11/CAPTRA/blob/1eab3fa7a0307e4f658e4401c5ca75436838c8e8/network/models/model.py#L414

    Hi, thanks for sharing the code.

    It seems that perturbed_part is never used later. I'm curious what these noise-perturbed poses are for. Thanks!

    opened by ray8828 1
  • Alignment of GT points from SAPIEN

    Alignment of GT points from SAPIEN

    Hi, thanks for sharing the work!

    I'm developing a new model based on your code and trying to transform the NPCS point cloud to the camera point cloud using the ground-truth poses, but I found a slight misalignment in the dataset. Could you help check whether my transformation is correct?

    _input = self.feed_dict[i]["points"].squeeze(0).detach().cpu().numpy().transpose()
    _cam_pts_center = self.feed_dict[i]['points_mean'].squeeze(0).detach().cpu().numpy().transpose()
    _cam_pts = _input + _cam_pts_center
    np.savetxt("../debug/input_pts.txt", _cam_pts)
    
    # try to use poses to trans gt npcs to camera pts
    _R = gt_part["rotation"].squeeze(0).detach().cpu().numpy()  # K,3,3
    _t = gt_part["translation"].squeeze(0).detach().cpu().numpy()  # K,3,1
    _s = gt_part["scale"].squeeze(0).detach().cpu().numpy()  # K
    
    K = _R.shape[0]
    for k in range(K):
        _m = _gt_labels == k
        _part_gt_npcs = _gt_npcs[_m].T
        _part_npcs_in_cam = ((_s[k] * _R[k] @ _part_gt_npcs) + _t[k]).T
        np.savetxt(f"../debug/part_{k}_cam.txt", _part_npcs_in_cam)
    

    Then I get point clouds like the following (two screenshots were attached to the issue): the green points are the camera point cloud, and the other colors are the points transformed from the NPCS point cloud using the ground truth.

    Thanks

    opened by ray8828 1
  • Question about the training process!

    Question about the training process!

    Hi, I have trained the RotationNet for 30 epochs and the CoordinateNet for 20 epochs on the bottle category of the NOCS dataset. However, during training, many losses were output as "nan" (screenshots for RotationNet and CoordinateNet were attached to the issue). What causes this? Will it affect the trained model?

    opened by CNJianLiu 0
  • Question about testing frames in NOCS-REAL275

    Question about testing frames in NOCS-REAL275

    Hello, authors! Thank you for your great work! In your paper, you mention that the testing split of NOCS-REAL275 contains 3200 frames in total. However, the published dataset I downloaded contains only 2745 frames. Could you please tell me where the extra data comes from? Looking forward to your reply.

    opened by shanice-l 0
  • Question about training model

    Question about training model

    Hi, I am sorry to bother you again. There is still a problem when I use my own trained model for testing (screenshot attached). So I added "sys.setrecursionlimit(1000000)" in test.py, and then a new error occurred (screenshot attached). But when I use your pre-trained model for testing and evaluation, everything is fine. I found that the sizes of your pre-trained models are about 8.9M and 8.0M for 1_bottle_rot and 1_bottle_coordnet respectively, but my trained models are about 26.5M and 21.2M. How can I get a model like your pre-trained model_0000.pt? (I did not change any code except the epoch setting.) Looking forward to your reply!

    opened by CNJianLiu 0
  • 2 Fixes in Setup

    2 Fixes in Setup

    Hey Yijia, I tried setting up CAPTRA on my machine. Thank you for that really thorough README! While setting up I encountered two small problems, which are solved by

    • changing conda env create to conda create, according to https://github.com/conda/conda/issues/3859
    • removing pointnet2 from requirements.txt since it's installed manually

    Cheers, -Nick

    opened by SuperN1ck 0
  • Train on more GPUs

    Train on more GPUs

    Hi authors! What a wonderful work! I've found that training takes a long time, so I'm wondering whether I can use a method like torch.nn.DataParallel to train on multiple GPUs. Looking forward to your reply!

    opened by Neal2020GitHub 0