This repository contains the code of the ICCV 2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Overview

SO-Pose

This repository contains the code of the ICCV 2021 paper SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation. The work builds directly on GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation; we analyze and leverage self-occlusion for 6D pose estimation.

Datasets

The code is based on the released code of GDR-Net (the GDR-Net code is already included in this repository). The structure of the datasets is the same.

Since we need ground-truth 2D-3D matching and self-occlusion annotations, we provide generation scripts in ./gdrn_selfocc_modeling/tools; please refer to generate_*.py. Note that public renderers (e.g. EGL, Glumpy) may introduce rendering noise, so the inherent relation between P (2D-3D matching) and Q (self-occlusion) is not guaranteed. If you use a renderer for efficiency, please make sure that P and Q lie on the same line, i.e. on the same viewing ray through the pixel.
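
If you generate P and Q with your own renderer, one simple (unofficial) sanity check is to transform both points into the camera frame and verify that they are collinear with the camera center. The sketch below is only an illustration under the assumption of an object-to-camera pose (R, t); the function name and tolerance are ours and are not part of the provided tools.

    # Minimal sketch (not part of the official tools): verify that a surface
    # point P and its self-occlusion point Q lie on the same viewing ray.
    import numpy as np

    def on_same_ray(P_obj, Q_obj, R, t, tol=1e-3):
        """P_obj, Q_obj: 3D points in the object frame, shape (3,).
        R (3x3), t (3,): object-to-camera rotation and translation."""
        P_cam = R @ np.asarray(P_obj, dtype=float) + t  # camera-frame coordinates
        Q_cam = R @ np.asarray(Q_obj, dtype=float) + t
        # Both rays start at the camera origin, so the two points lie on the
        # same line iff their direction vectors are (nearly) parallel.
        residual = np.linalg.norm(np.cross(P_cam, Q_cam))
        residual /= np.linalg.norm(P_cam) * np.linalg.norm(Q_cam) + 1e-12
        return residual < tol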

Training and Testing

Please directly run ./gdrn_selfocc_modeling/main_gdrn.py for training and testing.

Important parameters include:

config-file: the path to the configuration file.

resume: if 'True', continue the training process from the last checkpoint.

eval-only: if 'True', directly evaluate the model.
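
For example, a typical invocation might look like the sketch below. The config path is the LM config referenced elsewhere in this repository; whether resume/eval-only are passed as bare flags or with explicit 'True' values should be checked against main_gdrn.py.

    # Train with one of the provided configs (adjust the path to your setup).
    python ./gdrn_selfocc_modeling/main_gdrn.py \
        --config-file configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py

    # Evaluate a trained model with the same config.
    python ./gdrn_selfocc_modeling/main_gdrn.py \
        --config-file configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py \
        --eval-only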

Trained Models

The trained models can be downloaded here. Please unzip the trained models into the directory specified in the configuration file. An example output of the evaluation on LMO is provided.

If you find the code useful, please cite the following papers:

[1]Wang, G., Manhardt, F., Tombari, F., & Ji, X. (2021). GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 16611-16621).

[2]Di, Y., Manhardt, F., Wang, G., Ji, X., Navab, N., & Tombari, F. (2021). SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation. arXiv preprint arXiv:2108.08367.

Comments
  • question about implementation of 2D cross layer consistency


    Hi,

    Thanks again for your great work. I have a question about the implementation of the 2D consistency loss: https://github.com/shangbuhuan13/SO-Pose/blob/a3a61d2c97b1084a4754d6c12e45e16d85809729/core/gdrn_selfocc_modeling/losses/crosstask_projection_loss.py#L158

    I am confused about why the loss is divided by 572.3. In datasets/BOP_DATASETS/lm/camera.json I see that the camera information is:

    {
      "cx": 325.2611,
      "cy": 242.04899,
      "depth_scale": 1.0,
      "fx": 572.4114,
      "fy": 573.57043,
      "height": 480,
      "width": 640
    }
    

    Also, will this impact the YCBV dataset, since it has different camera intrinsic parameters? Thanks!

    opened by RuyiLian 8
  • question about backbone in experiment configs for LM dataset


    Hi,

    Thanks for your great work! I have a question about the backbone for the LM dataset. In your paper, you say "As backbone we leverage ResNet34 [6] for all experiments on the LM dataset", but the config files in this repo seem different:

           BACKBONE=dict(
                FREEZE=False,
                PRETRAINED="mmcls://resnet50_v1d",
                INIT_CFG=dict(
                    _delete_=True,
                    type="mm/ResNetV1d",
                    depth=50,
                    in_channels=3,
                    out_indices=(3,),
                ),
            ),
    

    and the output feature dimension is 2048, not 512 as in GDR-Net.

    Could you help to clarify this? Thanks!

    opened by RuyiLian 6
  • Missing lib.egl_renderer


    Hello, could the authors provide lib.egl_renderer again? The previous cloud-drive links have all expired. Many thanks. https://github.com/shangbuhuan13/SO-Pose/blob/b1cfa9c20bfbb0b4ccd1e8c421e392958628aa4f/core/gdrn_selfocc_modeling/tools/lmo/lmo_2_vis_poses.py#:~:text=from%20lib.egl_renderer.egl_renderer_v3%20import%20EGLRenderer

    opened by jiaming3 3
  • Could you provide your lib.egl_renderer for visualizing the results? Thanks.


    As the title says, could the authors provide the lib.egl_renderer used for visualizing the results? Many thanks. https://github.com/shangbuhuan13/SO-Pose/blob/b1cfa9c20bfbb0b4ccd1e8c421e392958628aa4f/core/gdrn_selfocc_modeling/tools/lmo/lmo_2_vis_poses.py#:~:text=from%20lib.egl_renderer.egl_renderer_v3%20import%20EGLRenderer

    opened by liuicheng123 3
  • Is `TRAIN2=("lmo_pbr_train")` in the config file necessary?


    Hello! I plan to reproduce the experiments with the gdrn_selfocc_multistep_40E.py config file, but after generating xyz, lm/train_pbr/xyz_crop is already very large, and I am about to generate train_pbr/Q0, which I estimate will require about 3 TB of storage. So I would like to ask whether TRAIN2=("lmo_pbr_train",), in the config file is necessary?

    opened by flyinghu123 2
  • During training the CPU cannot be fully utilized; what can I do?


    image

    As shown in the image, the CPU cannot be fully utilized, so training is very slow. Is there a solution for this? I am not sure whether it is caused by the code being built on detectron2 or by something else. Thanks for your help. I have been training for many days; even after setting multiple num_workers the CPU is still not fully utilized and training is very slow. Is this related to the dependence on detectron2, or is there another cause? It looks like similar issues are also mentioned in your code.

    opened by liuicheng123 2
  • How to evaluate the trained model on YCBV?


    Could you provide a brief script showing how to evaluate the trained model on the YCBV dataset? I found it hard to run the evaluation based on the current description. Thanks.

    opened by DecaYale 1
  • Some questions about generate_pbr_P_fast.py


    Sorry to be a bother. When I run generate_pbr_P_fast.py, it raises an error, "'ref' has no attribute 'lm_full'", with the traceback pointing to line 110. How can I fix it?

    opened by micki-37 0
  • Models link not working


    Hi! Thanks for open-sourcing the code! I cannot download the models from the link: https://drive.google.com/file/d/136ExcMykxsVVSzOiGQVYspq1fx9Hjd6R/view?usp=sharing (the error is: "Sorry, the file you have requested does not exist."). Does it work?

    opened by Daniil-Osokin 0
  • Some problems encountered when generating xyz_crop


    Hello, I have previously been studying the related GDR-Net work, but I have never managed to run the program that generates xyz_crop. In the README of this open-source code you mention that generate_P.py corresponds to the 2D-3D matching ground truth. Is what is generated here the xyz_crop required by GDR-Net? If so, should I run generate_pbr_P.py or generate_pbr_P_fast.py? And is the generated xyz_crop consistent with what GDR-Net requires?

    opened by micki-37 2
  • linemod results with resnet50 or resnet34


    Hi, I have a small question about the results on the LineMOD dataset.

    I see that in the paper, the backbone should be ResNet34, however, in the codebase, it seems like ResNet50 (https://github.com/shangbuhuan13/SO-Pose/blob/a3a61d2c97b1084a4754d6c12e45e16d85809729/configs/gdrn_selfocc/lm/gdrn_2rothead_multistep_02CT.py#L53).

    I also ran the LineMOD experiments with exactly the config file in the repo, getting a result of ADI.10 ~95.5 with ResNet50. So I would like to confirm with you: what is the backbone used for the LineMOD results?

    opened by GUOShuxuan 1
  • Some error questions about generate_P_fast.py


    Sorry to be a bother. When I run this Python file, it raises "AssertionError: /home/fxj/SO-Pose-main/datasets/BOP_DATASETS/lm/train_pbr/xyz_crop/000000/000000_000001-xyz.pkl". Where can I find "000000_000001-xyz.pkl"?

    opened by micki-37 4