GUPNet - Geometry Uncertainty Projection Network for Monocular 3D Object Detection

Yan Lu

Last update: Dec 28, 2022


Overview

GUPNet

This is the official implementation of "Geometry Uncertainty Projection Network for Monocular 3D Object Detection".

Citation

If you find our work useful in your research, please consider citing:

@article{lu2021geometry,
  title={Geometry Uncertainty Projection Network for Monocular 3D Object Detection},
  author={Lu, Yan and Ma, Xinzhu and Yang, Lei and Zhang, Tianzhu and Liu, Yating and Chu, Qi and Yan, Junjie and Ouyang, Wanli},
  journal={arXiv preprint arXiv:2107.13774},
  year={2021}
}

Usage

Train

Download the KITTI dataset from the KITTI website, including the left color images, camera calibration matrices, and training labels.

Clone this project and then go to the code directory:

git clone https://github.com/SuperMHP/GUPNet.git
cd GUPNet/code

We trained the model in the following environment:

Python 3.6
PyTorch 1.1
CUDA 9.0

You can build the environment easily by installing the requirements:

pip install -r requirements.yml
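
Since results are reported for one specific environment, it can help to record what is actually installed before training. The snippet below is a minimal sketch, not part of the repository: describe_environment is a hypothetical helper name, and it reports the PyTorch and CUDA versions only if PyTorch can be imported.

```python
import sys


def describe_environment():
    """Collect the Python, PyTorch and CUDA versions visible to this interpreter."""
    info = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # optional: only reported when PyTorch is installed

        info["pytorch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        info["pytorch"] = None
        info["cuda"] = None
    return info


if __name__ == "__main__":
    print(describe_environment())
```

Comparing this output with the versions listed above is a quick first check when reproduced numbers differ from the reported ones.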

Train the model:

CUDA_VISIBLE_DEVICES=0,1,2 python tools/train_val.py

Evaluate

After training, the model directly outputs the detection files for evaluation, so in that case you can skip this step. If you want to test a given checkpoint instead, set the "resume" entry of the "tester" section in code/experiments/config.yaml to that checkpoint and then run:

python tools/train_val.py -e

After that, use the KITTI evaluation devkit to evaluate the results (for details, refer to FrustumPointNet):

g++ evaluate_object_3d_offline_apXX.cpp -o evaluate_object_3d_offline_apXX
../../tools/kitti_eval/evaluate_object_3d_offline_apXX KITTI_LABEL_DIR ./output

We also provide the trained checkpoint that achieved the best multi-category performance on the validation set. It can be downloaded here. Its performance is as follows:

Models	Car@IoU=0.7			Pedestrian@IoU=0.5			Cyclist@IoU=0.5
	Easy	Mod	Hard	Easy	Mod	Hard	Easy	Mod	Hard
original paper	22.76%	16.46%	13.72%	-	-	-	-	-	-
released checkpoint	23.19%	16.23%	13.57%	11.29%	7.05%	6.36%	9.49%	5.01%	4.14%

Test (this section will be made more automatic in the future)

Change the training set to the trainval set (in code/lib/helpers/dataloader_helper.py), and then change the input of the evaluation function to the test set (in code/tools/train_val.py).

Compress the output files into a zip file (note that the zip file must NOT contain any root directory):

cd outputs/data
zip -r submission.zip .

Submit this file to the KITTI page (you need to register an account).
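
The compression step above can also be done from Python. The following is a minimal sketch under stated assumptions: make_submission is a hypothetical helper that is not part of the repository, and it assumes the detection results are .txt files sitting directly in one directory. Each file is stored under its bare file name, so the archive contains no root directory.

```python
import zipfile
from pathlib import Path


def make_submission(result_dir, zip_path):
    """Zip every .txt result file so that it sits at the archive root."""
    result_dir = Path(result_dir)
    # Collect the files first so the archive itself is never picked up.
    files = sorted(
        p for p in result_dir.iterdir() if p.is_file() and p.suffix == ".txt"
    )
    with zipfile.ZipFile(str(zip_path), "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # arcname drops the directory part, so no root folder is stored.
            zf.write(str(path), arcname=path.name)
    return [p.name for p in files]


# Example call (paths are illustrative):
# make_submission("outputs/data", "outputs/data/submission.zip")
```

Listing the archive with zipfile.ZipFile(...).namelist() before uploading is an easy way to confirm that the entries carry no directory prefix.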

We also provide our checkpoint trained on the trainval set. You can download it from here. Its performance is as follows (KITTI page):

Models	Car@IoU=0.7			Pedestrian@IoU=0.5			Cyclist@IoU=0.5
	Easy	Mod	Hard	Easy	Mod	Hard	Easy	Mod	Hard
original paper	20.11%	14.20%	11.77%	14.72%	9.53%	7.87%	4.18%	2.65%	2.09%
released checkpoint	22.26%	15.02%	13.12%	14.95%	9.76%	8.41%	5.58%	3.21%	2.66%

Other related notes

The released code is set to train on multiple categories by default. If you would like to train on a single category (Car), please modify code/experiments/config.yaml. Single-category training can lead to higher performance on Car.
This implementation includes some tricks that are not described in the paper. Please feel free to ask about them in the issues. I will also explain their principles in the supplementary materials.
The code cannot completely remove randomness because it uses some functions that have no reproducible implementation (e.g. ROI align). The performance may therefore fluctuate to some degree, which is normal for this project.
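
For the randomness note above, the usual way to reduce (not remove) run-to-run jitter is to seed every random number generator in use. This is a minimal sketch rather than the repository's own setup: seed_everything is a hypothetical helper name, NumPy and PyTorch are seeded only if they are installed, and operators without a deterministic implementation, such as ROI align, remain nondeterministic regardless.

```python
import random


def seed_everything(seed=0):
    """Seed the Python, NumPy and PyTorch generators where available."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
        # Ask cuDNN for deterministic kernels and disable autotuning.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # the Python generator repeats after re-seeding
```

Even with fixed seeds, some variation between runs should be expected here, as the note above explains.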

Contact

If you have any questions about this project, please feel free to contact [email protected].

Comments

Much lower AP_3D compared to AP_BEV

Thanks for your great work!

I used the released code to retrain the network, but the results are strange. I obtained:

AP_BEV: [29.223935810025136, 21.975801299792906, 19.0762136467218]
AP_3D: [17.863160821111062, 12.961739635817185, 10.802839248636912]

The AP_BEV is fine, but the AP_3D is considerably low. I tried three times and got similar results.

opened by SPengLiang 11
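The principle behind the Calibration.flip() function

Hello author. In the horizontal-flip data augmentation, the camera calibration information changes when the image is flipped. In your code this is handled by the calib.flip(img_size) call in kitti.py, but I do not quite understand why the function builds the cos_matrix matrix and uses singular value decomposition to solve for the coefficients. Could you point me to the mathematical principle behind this function? Looking forward to your reply. Thank you very much!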

opened by kwong292521 4
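
On the flip question above, here is a minimal sketch of why a horizontal flip changes the calibration at all. It assumes an ideal pinhole camera with no translation column in the projection matrix, and it assumes the flip maps pixel column u to W - 1 - u. Under those assumptions, mirroring the image is equivalent to negating the camera-frame x coordinate and moving the principal point from cu to W - 1 - cu. The function names and numeric values are hypothetical, and this is not the repository's Calibration.flip, whose cos_matrix and SVD steps are not covered here.

```python
def project(x, y, z, f, cu, cv):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    return f * x / z + cu, f * y / z + cv


def flipped_principal_point(cu, width):
    """Principal point column after mirroring the image (u -> width - 1 - u)."""
    return width - 1 - cu


# Illustrative values only, not taken from a real KITTI calibration file.
f, cu, cv, width = 700.0, 600.0, 180.0, 1242
x, y, z = 2.0, 1.0, 20.0

u, v = project(x, y, z, f, cu, cv)
# Mirror the scene (x -> -x) and use the flipped principal point.
u_flip, v_flip = project(-x, y, z, f, flipped_principal_point(cu, width), cv)

print(u, u_flip, width - 1 - u)
```

The projected column of the mirrored point matches the mirrored column of the original projection, while the row is unchanged.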
Questions about evaluate results

Hi @SuperMHP, thanks for your code! To evaluate mAP|40, I used the cpp script you provided. Looking at the output, what is the difference between car_detection_ground and car_detection?

opened by Senwang98 3
About training under different versions of PyTorch and CUDA

Thanks for your great work! I am training the code under PyTorch 1.10 and CUDA 11.0, because I do not have a GPU that supports the environment listed in the README. However, I got a much lower AP40 moderate result: 13.69, compared with 16.23 for the released checkpoint. Do you have any idea why the performance deteriorates so sharply under a different environment? Thanks very much.

opened by shangbuhuan13 3
Why can't the bias depth be a negative number?

bias_depth = 1. / (depth_net_out[:,0:1].sigmoid() + 1e-6) - 1. ∈ (0, +inf), but bias_depth needs to be negative when geo_depth is larger than the real depth.

opened by lfydegithub 3
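
To make the range in the question above concrete, the expression can be evaluated for a few raw network outputs. This is a minimal pure-Python sketch of the quoted formula only: bias_depth here is a hypothetical stand-alone function, not the repository's code, and it says nothing about how the value is combined with other depth terms.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bias_depth(raw, eps=1e-6):
    """The quoted mapping: 1 / (sigmoid(raw) + eps) - 1."""
    return 1.0 / (sigmoid(raw) + eps) - 1.0


for raw in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(raw, bias_depth(raw))
```

The mapping decreases monotonically from very large values toward zero as the raw output grows, so it is effectively non-negative; only the eps term allows it to dip below zero, by roughly 1e-6 at most.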
Is val mAP=16.46 reported in paper only trained with 'car' category?

Hi @SuperMHP, thanks for your work. I would like to know whether the mAP of 16.46 was obtained by training on the Car category only.

I trained GUPNet on ['Car','Pedestrian','Cyclist']. Unfortunately, after three training runs I could only reach 15.5 at best, so I wonder how you trained a model with mAP=16.46. (Is a KITTI training variation of more than 1 point too large?) Looking forward to your reply.

opened by Senwang98 2
Question about the Calibration.flip

Hello, you have done a great job! I am somewhat confused about the lib.datasets.kitti_utils.Calibration.flip function, which is used in your KITTI dataloader. I am not sure what it does or what it returns. When I use the flipped calib to back-project points from the flipped image into the camera coordinate system, I get very wrong results. Did you write the lib.datasets.kitti_utils.Calibration class yourself, or is it based on another codebase? Thank you very much!

opened by yfthu 2
Why not h3d_std * X + h3d_mean?

https://github.com/SuperMHP/GUPNet/blob/f4e26601ba5551065c36ab7ecf17f6915015d311/code/lib/models/gupnet.py#L165

According to this, h3d should be equal to:

h3d = self.mean_size[cls_ids[mask].long()] + size3d_offset * h3d_std

Thanks.
opened by lfydegithub 2
The difference between original paper and released ckpt

Dear author,

Thanks for your wonderful work. I am following your repo to build a 3D detection framework. Could you tell me the difference between the results of the original paper and those of the released checkpoint (especially on the test set), and what causes the performance gap between them?

opened by aqianwang 2
Zero score on detection

Hey! Great work!

I was running train_val.py in evaluation mode with the model you provided. However, all the detection scores are 0. Any idea why this might be happening?

Thanks!

opened by pk1996 1
Question About the AP in test set

Hello, thanks for your great work! We hope to follow up on it. We retrained your code on the KITTI train split (3DOP) and evaluated it on the val set, obtaining a Car AP of [22.698555, 15.741446, 13.477293], which is close to the result reported in your paper. We then used this checkpoint (trained on the train split) to run inference on the KITTI test set and submitted the result to the KITTI benchmark.

The 3D AP on the test set is very low (Car 3D AP: 14.93%, 10.15%, 8.22%). I wonder whether this is normal, considering that the model was trained on the train split rather than on trainval.

We would like to reproduce your reported results on the KITTI test set, that is, a Car AP close to 22.26%, 15.02%, 13.12% (your released checkpoint) or 20.11%, 14.20%, 11.77% (your original paper). Besides switching to the trainval set, what else do we need to do? And when training on trainval, how do I choose the best checkpoint, given that the val set is included in trainval? Thank you very much for your reply!

opened by yfthu 1

