Official Pytorch implementation of RePOSE (ICCV2021)

Related tags

Deep Learning RePOSE
Overview

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link]

overview

Abstract

We present RePOSE, a fast iterative refinement method for 6D object pose estimation. Prior methods perform refinement by feeding zoomed-in input and rendered RGB images into a CNN and directly regressing an update of a refined pose. Their runtime is slow due to the computational cost of CNN, which is especially prominent in multiple-object pose refinement. To overcome this problem, RePOSE leverages image rendering for fast feature extraction using a 3D model with a learnable texture. We call this deep texture rendering, which uses a shallow multi-layer perceptron to directly regress a view-invariant image representation of an object. Furthermore, we utilize differentiable Levenberg-Marquardt (LM) optimization to refine a pose fast and accurately by minimizing the feature-metric error between the input and rendered image representations without the need of zooming in. These image representations are trained such that differentiable LM optimization converges within few iterations. Consequently, RePOSE runs at 92 FPS and achieves state-of-the-art accuracy of 51.6% on the Occlusion LineMOD dataset - a 4.1% absolute improvement over the prior art, and comparable result on the YCB-Video dataset with a much faster runtime.

Prerequisites

  • Python >= 3.6
  • Pytorch == 1.9.0
  • Torchvision == 0.10.0
  • CUDA == 10.1

Downloads

Installation

  1. Set up the python environment:
    $ pip install torch==1.9.0 torchvision==0.10.0
    $ pip install Cython==0.29.17
    $ sudo apt-get install libglfw3-dev libglfw3
    $ pip install -r requirements.txt
    
    # Install Differentiable Renderer
    $ cd renderer
    $ python3 setup.py install
    
  2. Compile cuda extensions under lib/csrc:
    ROOT=/path/to/RePOSE
    cd $ROOT/lib/csrc
    export CUDA_HOME="/usr/local/cuda-10.1"
    cd ../ransac_voting
    python setup.py build_ext --inplace
    cd ../camera_jacobian
    python setup.py build_ext --inplace
    cd ../nn
    python setup.py build_ext --inplace
    cd ../fps
    python setup.py
    
  3. Set up datasets:
    $ ROOT=/path/to/RePOSE
    $ cd $ROOT/data
    
    $ ln -s /path/to/linemod linemod
    $ ln -s /path/to/linemod_orig linemod_orig
    $ ln -s /path/to/occlusion_linemod occlusion_linemod
    
    $ cd $ROOT/data/model/
    $ unzip pretrained_models.zip
    
    $ cd $ROOT/cache/LinemodTest
    $ unzip ape.zip benchvise.zip .... phone.zip
    $ cd $ROOT/cache/LinemodOccTest
    $ unzip ape.zip can.zip .... holepuncher.zip
    

Testing

We have 13 categories (ape, benchvise, cam, can, cat, driller, duck, eggbox, glue, holepuncher, iron, lamp, phone) on the LineMOD dataset and 8 categories (ape, can, cat, driller, duck, eggbox, glue, holepuncher) on the Occlusion LineMOD dataset. Please choose the one category you like (replace ape with another category) and perform testing.

Evaluate the ADD(-S) score

  1. Generate the annotation data:
    python run.py --type linemod cls_type ape model ape
    
  2. Test:
    # Test on the LineMOD dataset
    $ python run.py --type evaluate --cfg_file configs/linemod.yaml cls_type ape model ape
    
    # Test on the Occlusion LineMOD dataset
    $ python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape
    

Visualization

  1. Generate the annotation data:
    python run.py --type linemod cls_type ape model ape
    
  2. Visualize:
    # Visualize the results of the LineMOD dataset
    python run.py --type visualize --cfg_file configs/linemod.yaml cls_type ape model ape
    
    # Visualize the results of the Occlusion LineMOD dataset
    python run.py --type visualize --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape
    

Citation

@InProceedings{Iwase_2021_ICCV,
    author    = {Iwase, Shun and Liu, Xingyu and Khirodkar, Rawal and Yokota, Rio and Kitani, Kris M.},
    title     = {RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3303-3312}
}

Acknowledgement

Our code is largely based on clean-pvnet and our rendering code is based on neural_renderer. Thank you so much for making these codes publicly available!

Contact

If you have any questions about the paper and implementation, please feel free to email me ([email protected])! Thank you!

Comments
  • KeyError: 'R_all'

    KeyError: 'R_all'

    Hi, very nice work. And Here when I run your visulization code, it shows: image It seems that it has no key of 'R_all'.

    And could you please share about your training codes? Thanks!

    opened by pyni 11
  • How train linemod or ycb-v?

    How train linemod or ycb-v?

    Hi, thanks your work, this is a meaningful and interesting. but I want to know that the code update work has been completed? How can I start training?

    opened by johnbhlm 6
  • Can this work with purely synthetic training data (and no texture map)?

    Can this work with purely synthetic training data (and no texture map)?

    I like how this approach does not require a texture of the object's 3D model. As you mention in the paper, this is a typical situation, especially for objects that are challenging to scan.

    In some cases, there are also no real images of the object to train on. In these cases, synthetic data with domain randomization is often used to render images with random textures applied to the object's 3D model.

    If I understand correctly, RePOSE trains on real images of the object. Have you done any experimentation with just domain-randomized synthetic data? Any intuition on whether this will work?

    opened by murrdpirate 5
  • Occ-Linemod initial results

    Occ-Linemod initial results

    Hi, this is a great work. As you metioned in the paper, you did experiments on Occ-Linemod, using the pvnet results as the initial pose. Can you please share the pvnet initial result? Any format is okay, Thanks.

    opened by YangHai-1218 3
  • Question about the PVNet result

    Question about the PVNet result

    Hi,

    Thank you for your excellent work! I am a little confused about the PVNet's initial pose for LINEMOD and LINEMOD Occlusion. How could I find them?

    Also, for the initial pose result in YCB-V based on PoseCNN you used, how could I find it?

    Best, Rui

    opened by 63445538 3
  • Questions about requirements.txt

    Questions about requirements.txt

    Hi! Thanks for your great works

    I have one question.

    I got this error from "pip install -r requirements.txt" ERROR: Could not find a version that satisfies the requirement soft-renderer==1.0.0 (from versions: none) ERROR: No matching distribution found for soft-renderer==1.0.0

    and I cannot find any package name "soft-renderer" in pypi either.

    What should I do?

    opened by parkjaewoo0611 2
  • Question about installing the Neural-Renderer

    Question about installing the Neural-Renderer

    Hello, thanks for sharing such great work!

    So I am able to use neural-renderer example code to generate rendering results with torch==1.2 installed. While the REPOSE's prerequisite is Pytorch == 1.9.0, and if I install neural-renderer under torch==1.2 and upgrade to torch==1.9 then I will encounter error: import neural_renderer.cuda.load_textures as load_textures_cuda ImportError: /usr/local/lib/python3.7/dist-packages/neural_renderer/cuda/load_textures.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail37_typeMetaDataInstance_preallocated_32E when running neural-renderer example code. How should I solve this conflict?

    opened by YangXJ95 1
  • Question about

    Question about "pip install -r requirements.txt"

    Hi, thank you for your awesome work! But when I try to run the command "pip install -r requirements.txt", and it just shows that

    "ERROR: [email protected]:sh8/rdopt.git@7601bca4818a03ef1ace1e0c1df396ccec56003f#egg=camera_jacobian is not a valid editable requirement. It should either be a path to a local project or a VCS URL (beginning with bzr+http, bzr+https, bzr+ssh, bzr+sftp, bzr+ftp, bzr+lp, bzr+file, git+http, git+https, git+ssh, git+git, git+file, hg+file, hg+http, hg+https, hg+ssh, hg+static-http, svn+ssh, svn+http, svn+https, svn+svn, svn+file)."

    How could I fix this? Thank you very much!

    opened by ldylab 1
  • hello, the problem about driller metric in linemod dataset!

    hello, the problem about driller metric in linemod dataset!

    I have ran the code, and test the ADD(-S) in every object in linemod dataset(not occ), the cache file(pvnet result) of object driller seems to be wrong. The test metric ADD(-S) of driller is just 41.76%. Could you please upload the cache file of object driller again?

    opened by wwwwwlllllllllll 0
Owner
Shun Iwase
Carnegie Mellon University, Robotics Institute
Shun Iwase
Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021) This repository is the official P

Jingyun Liang 159 Dec 30, 2022
Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021) This repository is the official PyTorc

Jingyun Liang 139 Dec 29, 2022
Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

A Unified Objective for Novel Class Discovery This is the official repository for the paper: A Unified Objective for Novel Class Discovery Enrico Fini

Enrico Fini 118 Dec 26, 2022
This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021.

PyTorch implementation of DAQ This is an official implementation of the paper "Distance-aware Quantization", accepted to ICCV2021. For more informatio

CV Lab @ Yonsei University 36 Nov 4, 2022
This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is accepted to ICCV2021.

GMPQ: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation This is the pytorch implementation for the paper: Generalizable Mix

null 18 Sep 2, 2022
PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation

StructDepth PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimat

SJTU-ViSYS 112 Nov 28, 2022
A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network The official code of VisionLAN (ICCV2021). VisionLAN successfully a

null 81 Dec 12, 2022
Official code for "Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021".

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer. ICCV2021. Introduction We proposed a novel model training paradi

Lucas 103 Dec 14, 2022
Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

CSRA This is the official code of ICCV 2021 paper: Residual Attention: A Simple But Effective Method for Multi-Label Recoginition Demo, Train and Vali

null 163 Dec 22, 2022
Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

M3D-VTON: A Monocular-to-3D Virtual Try-On Network Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network" Paper | Suppl

null 109 Dec 29, 2022
[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

CTR-GCN This repo is the official implementation for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition. The pap

Yuxin Chen 148 Dec 16, 2022
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
Dynamic Attentive Graph Learning for Image Restoration, ICCV2021 [PyTorch Code]

Dynamic Attentive Graph Learning for Image Restoration This repository is for GATIR introduced in the following paper: Chong Mou, Jian Zhang, Zhuoyuan

Jian Zhang 84 Dec 9, 2022
Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging, ICCV2021 [PyTorch Code]

Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging, ICCV2021 [PyTorch Code]

Jian Zhang 20 Oct 24, 2022
Implementation of ICCV2021(Oral) paper - VMNet: Voxel-Mesh Network for Geodesic-aware 3D Semantic Segmentation

VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation Created by Zeyu HU Introduction This work is based on our paper VMNet: Voxel-Mes

HU Zeyu 82 Dec 27, 2022
Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

Implicit Internal Video Inpainting Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation paper | project

null 202 Dec 30, 2022
source code of “Visual Saliency Transformer” (ICCV2021)

Visual Saliency Transformer (VST) source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, an

null 89 Dec 21, 2022
HiFT: Hierarchical Feature Transformer for Aerial Tracking (ICCV2021)

HiFT: Hierarchical Feature Transformer for Aerial Tracking Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, and Yiming Li Our paper is Accepted by ICCV 2

Intelligent Vision for Robotics in Complex Environment 55 Nov 23, 2022
Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment (ICCV2021)

Seeing Dynamic Scene in the Dark: High-Quality Video Dataset with Mechatronic Alignment This is a pytorch project for the paper Seeing Dynamic Scene i

DV Lab 21 Nov 28, 2022