EPro-PnP
EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
In CVPR 2022 (Oral). [paper]
Hansheng Chen*^{1,2}, Pichao Wang†^{2}, Fan Wang^{2}, Wei Tian†^{1}, Lu Xiong^{1}, Hao Li^{2}
^{1}Tongji University, ^{2}Alibaba Group
*Part of work done during an internship at Alibaba Group.
†Corresponding Authors: Pichao Wang, Wei Tian.
Introduction
EPro-PnP is a probabilistic Perspective-n-Points (PnP) layer for end-to-end 6DoF pose estimation networks. Broadly speaking, it is a continuous counterpart of the widely used categorical Softmax layer, and is theoretically generalizable to other learning models with nested optimization.
Given the layer input, an N-point correspondence set consisting of 3D object coordinates, 2D image coordinates, and 2D weights, a conventional PnP solver searches for an optimal pose (a rigid transformation in SE(3)) that minimizes the weighted reprojection error. Previous work tries to backpropagate through the PnP operation, yet the optimal pose is inherently non-differentiable due to the inner arg min operation. This leads to convergence issues if all the components of the correspondence set must be learned by the network.
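The weighted reprojection error that a conventional PnP solver minimizes can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: it uses only the Python standard library, assumes a pinhole camera with an intrinsic matrix `K`, and restricts the rotation to the z-axis for brevity. The helper names (`rotation_z`, `project`, `reprojection_cost`) and all numeric values are hypothetical.

```python
import math

def rotation_z(angle):
    """3x3 rotation about the z-axis (a simple stand-in for a general rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def project(R, t, K, x3d):
    """Move an object-frame point into the camera frame, then apply pinhole projection."""
    xc = [sum(R[r][c] * x3d[c] for c in range(3)) + t[r] for r in range(3)]
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return (u, v)

def reprojection_cost(R, t, K, x3d, x2d, w2d):
    """Half the sum of squared reprojection residuals, weighted per image axis."""
    cost = 0.0
    for p3, p2, w in zip(x3d, x2d, w2d):
        u, v = project(R, t, K, p3)
        cost += (w[0] * (u - p2[0])) ** 2 + (w[1] * (v - p2[1])) ** 2
    return 0.5 * cost

# Toy data (assumed values): four object points observed from a known pose.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
x3d = [(0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1), (-0.1, -0.1, 0.0)]
w2d = [(1.0, 1.0)] * len(x3d)
R_true, t_true = rotation_z(0.2), (0.0, 0.0, 2.0)
x2d = [project(R_true, t_true, K, p) for p in x3d]

cost_at_true = reprojection_cost(R_true, t_true, K, x3d, x2d, w2d)
cost_at_offset = reprojection_cost(rotation_z(0.3), t_true, K, x3d, x2d, w2d)
print(cost_at_true, cost_at_offset)
```

The cost vanishes at the pose that generated the 2D points and grows as the pose moves away from it. A PnP solver returns the pose at the minimum of this function, and that selection step is what blocks straightforward backpropagation.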
In contrast, our probabilistic PnP layer outputs a posterior distribution of pose, whose probability density can be derived for proper backpropagation. The distribution is approximated via Monte Carlo sampling. With EPro-PnP, the correspondences can be learned from scratch altogether by minimizing the KL divergence between the predicted and target pose distributions.
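To make the Monte Carlo idea concrete, the sketch below treats the pose as a single scalar and takes the unnormalized posterior density to be exp(-cost(y)). Assuming a sharply peaked target distribution at the ground-truth pose, the KL divergence reduces (up to a constant) to the cost at the ground truth plus the log of the normalizing integral, and that integral is what sampling estimates. This toy uses plain importance sampling with one fixed Gaussian proposal, which is a simplification and not necessarily the sampling scheme used in the released code. The quadratic `toy_cost` and all function names are hypothetical.

```python
import math
import random

def log_normalizer_is(cost, proposal_mean, proposal_std, num_samples, rng):
    """Estimate log of the integral of exp(-cost(y)) dy by importance sampling."""
    log_weights = []
    for _ in range(num_samples):
        y = rng.gauss(proposal_mean, proposal_std)
        # Log-density of the Gaussian proposal at the drawn sample.
        log_q = (-0.5 * ((y - proposal_mean) / proposal_std) ** 2
                 - math.log(proposal_std * math.sqrt(2.0 * math.pi)))
        log_weights.append(-cost(y) - log_q)
    # Log-mean-exp, shifted by the maximum for numerical stability.
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights) / num_samples)

def kl_loss(cost, y_gt, log_z):
    """KL loss up to a constant: cost at the ground-truth pose plus log-normalizer."""
    return cost(y_gt) + log_z

def toy_cost(y):
    """Assumed quadratic cost: the posterior is Gaussian with mean 0.3 and std 0.5."""
    return 0.5 * ((y - 0.3) / 0.5) ** 2

rng = random.Random(0)
log_z = log_normalizer_is(toy_cost, 0.0, 2.0, 20000, rng)
exact_log_z = math.log(0.5 * math.sqrt(2.0 * math.pi))  # closed form for this toy cost
loss_near_mode = kl_loss(toy_cost, 0.3, log_z)
loss_far_from_mode = kl_loss(toy_cost, 1.0, log_z)
print(log_z, exact_log_z, loss_near_mode, loss_far_from_mode)
```

Because the loss is an ordinary function of the cost values, every quantity the cost depends on receives a gradient, which is the property that makes learning the correspondences from scratch feasible.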
Models
We release two distinct networks trained with EPro-PnP:

EPro-PnP-6DoF for 6DoF pose estimation

EPro-PnP-Det for 3D object detection
Use EPro-PnP in Your Own Model
We provide a demo on the usage of the EPro-PnP layer.
Citation
If you find this project useful in your research, please consider citing:
@inproceedings{epropnp,
author = {Hansheng Chen and Pichao Wang and Fan Wang and Wei Tian and Lu Xiong and Hao Li},
title = {EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
}