This repository contains the code of the ICCV 2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Overview

SO-Pose

This repository contains the code of the ICCV 2021 paper SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation. The work builds directly on GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation; we analyze and leverage self-occlusion for 6D pose estimation.

Datasets

The code is based on the released code of GDR-Net (the GDR-Net code is already included in this repository). The structure of the datasets is the same.

Since we need ground-truth 2D-3D matching and self-occlusion annotations, we provide generation scripts in ./gdrn_selfocc_modeling/tools; please refer to generate_*.py. Note that public renderers (e.g. EGL, Glumpy) may introduce rendering noise, so the inherent relation between P (2D-3D matching) and Q (self-occlusion) is not guaranteed. If you use a renderer for efficiency, please make sure that P and Q lie on the same line, i.e. on the same viewing ray through the pixel.
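
If you generate P and Q with your own renderer, one simple (unofficial) sanity check is to transform both points into the camera frame and verify that they are collinear with the camera center. The sketch below is only an illustration under the assumption of an object-to-camera pose (R, t); the function name and tolerance are ours and are not part of the provided tools.

    # Minimal sketch (not part of the official tools): verify that a surface
    # point P and its self-occlusion point Q lie on the same viewing ray.
    import numpy as np

    def on_same_ray(P_obj, Q_obj, R, t, tol=1e-3):
        """P_obj, Q_obj: 3D points in the object frame, shape (3,).
        R (3x3), t (3,): object-to-camera rotation and translation."""
        P_cam = R @ np.asarray(P_obj, dtype=float) + t  # camera-frame coordinates
        Q_cam = R @ np.asarray(Q_obj, dtype=float) + t
        # Both rays start at the camera origin, so the two points lie on the
        # same line iff their direction vectors are (nearly) parallel.
        residual = np.linalg.norm(np.cross(P_cam, Q_cam))
        residual /= np.linalg.norm(P_cam) * np.linalg.norm(Q_cam) + 1e-12
        return residual < tol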

Training and Testing

Please directly run ./gdrn_selfocc_modeling/main_gdrn.py for training and testing.

Important parameters include:

config-file: the path to the configuration file.

resume: if 'True', continue the training process from the last checkpoint.

eval-only: if 'True', directly evaluate the model.
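
For example, a typical invocation might look like the sketch below. The config path is the LM config referenced elsewhere in this repository; whether resume/eval-only are passed as bare flags or with explicit 'True' values should be checked against main_gdrn.py.

    # Train with one of the provided configs (adjust the path to your setup).
    python ./gdrn_selfocc_modeling/main_gdrn.py \
        --config-file configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py

    # Evaluate a trained model with the same config.
    python ./gdrn_selfocc_modeling/main_gdrn.py \
        --config-file configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py \
        --eval-only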

Trained Models

The trained models can be downloaded here. Please unzip the trained models into the directory specified in the configuration file. An example output of the evaluation on LMO is provided.

If you find the code useful, please cite the following papers:

[1]Wang, G., Manhardt, F., Tombari, F., & Ji, X. (2021). GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16611-16621).

[2]Di, Y., Manhardt, F., Wang, G., Ji, X., Navab, N., & Tombari, F. (2021). SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation. arXiv preprint arXiv:2108.08367.

Comments
  • question about implementation of 2D cross layer consistency


    Hi,

    Thanks again for your great work. I have a question about the implementation of the 2D consistency loss: https://github.com/shangbuhuan13/SO-Pose/blob/a3a61d2c97b1084a4754d6c12e45e16d85809729/core/gdrn_selfocc_modeling/losses/crosstask_projection_loss.py#L158

    I am confused about why the loss is divided by 572.3. In datasets/BOP_DATASETS/lm/camera.json I see that the camera information is:

    {
      "cx": 325.2611,
      "cy": 242.04899,
      "depth_scale": 1.0,
      "fx": 572.4114,
      "fy": 573.57043,
      "height": 480,
      "width": 640
    }
    

    Also, will this impact the YCBV dataset, since it has different camera intrinsic parameters? Thanks!

    opened by RuyiLian 8
  • question about backbone in experiment configs for LM dataset


    Hi,

    Thanks for your great work! I have a question about the backbone for the LM dataset. In your paper, you say "As backbone we leverage ResNet34 [6] for all experiments on the LM dataset", but the config files in this repo seem different:

           BACKBONE=dict(
                FREEZE=False,
                PRETRAINED="mmcls://resnet50_v1d",
                INIT_CFG=dict(
                    _delete_=True,
                    type="mm/ResNetV1d",
                    depth=50,
                    in_channels=3,
                    out_indices=(3,),
                ),
            ),
    

    and the output feature dimension is 2048, not 512 as in GDR-Net.

    Could you help to clarify this? Thanks!

    opened by RuyiLian 6
  • Missing lib.egl_renderer


    Hello, could the authors provide lib.egl_renderer again? The previous cloud-drive links have all expired. Many thanks. https://github.com/shangbuhuan13/SO-Pose/blob/b1cfa9c20bfbb0b4ccd1e8c421e392958628aa4f/core/gdrn_selfocc_modeling/tools/lmo/lmo_2_vis_poses.py#:~:text=from%20lib.egl_renderer.egl_renderer_v3%20import%20EGLRenderer

    opened by jiaming3 3
  • Could you provide your lib.egl_renderer for visualizing the results? Thanks.


    As the title says, could the authors provide the lib.egl_renderer used for visualizing the results? Many thanks. https://github.com/shangbuhuan13/SO-Pose/blob/b1cfa9c20bfbb0b4ccd1e8c421e392958628aa4f/core/gdrn_selfocc_modeling/tools/lmo/lmo_2_vis_poses.py#:~:text=from%20lib.egl_renderer.egl_renderer_v3%20import%20EGLRenderer

    opened by liuicheng123 3
  • Is `TRAIN2=("lmo_pbr_train")` in the config file necessary?


    Hello! I plan to reproduce the experiments with the gdrn_selfocc_multistep_40E.py config file, but after generating xyz, lm/train_pbr/xyz_crop is already very large, and I am about to generate train_pbr/Q0, which I estimate will require about 3 TB of storage. So I would like to ask whether TRAIN2=("lmo_pbr_train",), in the config file is necessary?

    opened by flyinghu123 2
  • During training the CPU cannot be fully utilized; what can I do?


    image

    As shown in the image, the CPU cannot be fully utilized, so training is very slow. Is there a solution for this? I am not sure whether it is caused by the code being built on detectron2 or by something else. Thanks for your help. I have been training for many days; even after setting multiple num_workers the CPU is still not fully utilized and training is very slow. Is this related to the dependence on detectron2, or is there another cause? It looks like similar issues are also mentioned in your code.

    opened by liuicheng123 2
  • How to evaluate the trained model on YCBV?


    Could you provide a brief script showing how to evaluate the trained model on the YCBV dataset? I found it hard to run the evaluation based on the current description. Thanks.

    opened by DecaYale 1
  • Some questions about generate_pbr_P_fast.py


    Sorry to be a bother. When I run generate_pbr_P_fast.py, it raises an error, "'ref' has no attribute 'lm_full'", with the traceback pointing to line 110. How can I fix it?

    opened by micki-37 0
  • Models link not working


    Hi! Thanks for open-sourcing the code! I cannot download the models from the link: https://drive.google.com/file/d/136ExcMykxsVVSzOiGQVYspq1fx9Hjd6R/view?usp=sharing (the error is: "Sorry, the file you have requested does not exist."). Does it work?

    opened by Daniil-Osokin 0
  • Some problems encountered when generating xyz_crop


    Hello, I have previously been studying the related GDR-Net work, but I have never managed to run the program that generates xyz_crop. In the README of this open-source code you mention that generate_P.py corresponds to the 2D-3D matching ground truth. Is what is generated here the xyz_crop required by GDR-Net? If so, should I run generate_pbr_P.py or generate_pbr_P_fast.py? And is the generated xyz_crop consistent with what GDR-Net requires?

    opened by micki-37 2
  • linemod results with resnet50 or resnet34


    Hi, I have a small question about the results on the LineMOD dataset.

    I see that in the paper, the backbone should be ResNet34, however, in the codebase, it seems like ResNet50 (https://github.com/shangbuhuan13/SO-Pose/blob/a3a61d2c97b1084a4754d6c12e45e16d85809729/configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py#L53).

    I also ran the LineMOD experiments with exactly the config file in the repo, getting a result of ADI.10 ~95.5 with ResNet50. So I would like to confirm with you: what is the backbone used for the LineMOD results?

    opened by GUOShuxuan 1
  • Some error questions about generate_P_fast.py


    Sorry to be a bother. When I run this Python file, it raises "AssertionError: /home/fxj/SO-Pose-main/datasets/BOP_DATASETS/lm/train_pbr/xyz_crop/000000/000000_000001-xyz.pkl". Where can I find "000000_000001-xyz.pkl"?

    opened by micki-37 4