Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

Last update: Jan 7, 2023

Overview

Pyramid R-CNN

This repository is a reproduction of Pyramid R-CNN for 3D object detection.

The code is mainly based on OpenPCDet.

Introduction

We provide code and training configurations of Pyramid-V/PV on the KITTI and Waymo Open datasets. Checkpoints will not be released. The dataset organization is the same as in PCDet.

Requirements

The code is tested in the following environment:

Ubuntu 18.04
Python 3.6
PyTorch 1.5
CUDA 10.1
OpenPCDet v0.3.0
spconv v1.2.1
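
A quick way to compare a local setup against the versions listed above is sketched below. It is only an illustration: the helper names `parse_version` and `compare_to_tested` are hypothetical and not part of this repository, and only Python, PyTorch, and CUDA are checked.

```python
import platform
from itertools import takewhile

# Versions copied from the environment list above.
TESTED = {"python": (3, 6), "torch": (1, 5), "cuda": (10, 1)}


def parse_version(text):
    """Turn a string such as '1.5.0+cu101' into (1, 5), keeping major.minor only."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def compare_to_tested(name, found):
    """Return a one-line report comparing a found version to the tested one."""
    want = TESTED[name]
    status = "matches" if parse_version(found) == want else "differs from"
    return "{} {} {} tested version {}".format(
        name, found, status, ".".join(str(n) for n in want))


if __name__ == "__main__":
    print(compare_to_tested("python", platform.python_version()))
    try:
        import torch  # optional: only reported when PyTorch is installed
        print(compare_to_tested("torch", torch.__version__))
        if torch.version.cuda:
            print(compare_to_tested("cuda", torch.version.cuda))
    except ImportError:
        print("torch is not installed")
```

A "differs from" line does not necessarily mean the code will fail; it only flags a combination that was not tested here.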

Installation

a. Clone this repository.

git clone https://github.com/PointsCoder/Pyramid_R-CNN.git

b. Install the dependent libraries as follows:

Install the required Python libraries:

pip install -r requirements.txt

Install the SparseConv library. We use the implementation from [spconv].
- If you use PyTorch 1.1, make sure you install spconv v1.0 (commit 8da6f96) instead of the latest version.
- If you use PyTorch 1.3+, you need to install spconv v1.2. As mentioned by the author of spconv, you need to use their Docker image if you use PyTorch 1.4+.
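
The two rules above can be written down as a small lookup. This is a sketch only: `recommended_spconv` is a hypothetical helper, it encodes nothing beyond the two bullets, and it returns `None` for PyTorch versions the notes do not mention (for example 1.2).

```python
def recommended_spconv(torch_version):
    """Map a PyTorch version string to the spconv version named in the notes above."""
    major, minor = (int(p) for p in torch_version.split("+")[0].split(".")[:2])
    if (major, minor) == (1, 1):
        return "spconv v1.0 (commit 8da6f96)"
    if (major, minor) >= (1, 3):
        # The notes say PyTorch 1.4+ additionally requires the spconv docker image.
        note = " (use the spconv docker image)" if (major, minor) >= (1, 4) else ""
        return "spconv v1.2" + note
    return None  # not covered by the notes above


if __name__ == "__main__":
    for version in ("1.1.0", "1.3.1", "1.5.0+cu101"):
        print(version, "->", recommended_spconv(version))
```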

c. Compile CUDA operators by running the following command:

python setup.py develop

Training

We train all the models with 8 Tesla V100 GPUs (32 GB each), and all the configs (epochs, learning rate, batch size) are set for 8-GPU Distributed Data Parallel (DDP) training. Users may need to change these training parameters to run with a different number of GPUs or a different amount of GPU memory.
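
One common heuristic when changing the number of GPUs is to keep the per-GPU batch size fixed and scale the learning rate in proportion to the total batch size (the linear scaling rule). This repository does not state that it follows this rule, so treat the sketch below as an assumption to validate experimentally. The function name `rescale_for_gpus` and the example numbers are placeholders, not values taken from the provided configs.

```python
def rescale_for_gpus(lr, batch_size_per_gpu, old_gpus, new_gpus):
    """Return (new_total_batch, new_lr) under the linear scaling heuristic.

    Assumes the per-GPU batch size stays the same, so the total batch size
    changes with the GPU count and the learning rate is scaled by the same ratio.
    """
    if old_gpus < 1 or new_gpus < 1:
        raise ValueError("GPU counts must be positive")
    old_total = batch_size_per_gpu * old_gpus
    new_total = batch_size_per_gpu * new_gpus
    new_lr = lr * new_total / old_total
    return new_total, new_lr


if __name__ == "__main__":
    # Placeholder numbers: going from 8 GPUs to 4 halves the total batch and the LR.
    print(rescale_for_gpus(lr=0.01, batch_size_per_gpu=2, old_gpus=8, new_gpus=4))
```

If GPU memory is the limiting factor instead, the per-GPU batch size itself has to shrink, and the same proportional reasoning can be applied to the resulting total batch size.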

models

# pyramid_rcnn_pv.yaml: pyramid roi head on the point-voxel backbone
# pyramid_rcnn_v.yaml: pyramid roi head on the spconv u-net backbone

DDP training

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_train.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml

DDP testing

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_test.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml --eval_all
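
To run the same scripts on fewer GPUs, both the device list and the GPU count argument have to change together. The helper below is a hypothetical convenience, not part of the repository, that assembles the command string. It assumes the GPUs to use are numbered 0 to N-1 and only uses the flags shown in the commands above.

```python
def build_dist_command(script, num_gpus, cfg_file, extra_args=()):
    """Assemble a launch command in the same shape as the DDP commands above."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Assumption: use the first num_gpus devices, i.e. ids 0 .. num_gpus - 1.
    devices = ",".join(str(i) for i in range(num_gpus))
    parts = ["CUDA_VISIBLE_DEVICES=" + devices, "sh", script, str(num_gpus),
             "--cfg_file", cfg_file]
    parts.extend(extra_args)
    return " ".join(parts)


if __name__ == "__main__":
    cfg = "cfgs/waymo_models/pyramid_rcnn_pv.yaml"
    print(build_dist_command("scripts/dist_train.sh", 4, cfg))
    print(build_dist_command("scripts/dist_test.sh", 4, cfg, ["--eval_all"]))
```

Remember that the provided configs are tuned for 8 GPUs, so the training parameters may also need adjusting when the GPU count changes.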

Citation

If you find this project useful in your research, please consider citing:

@article{mao2021pyramid,
  title={Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection},
  author={Mao, Jiageng and Niu, Minzhe and Bai, Haoyue and Liang, Xiaodan and Xu, Hang and Xu, Chunjing},
  journal={ICCV},
  year={2021}
}

Comments

The problem about TransformerEncoderLayer.

Hello, thank you for your excellent work. Why is 'Normal' used rather than 'NoTr' in PyramidModule? After RoI-grid Attention, a TransformerEncoderLayer() is applied. What is its function? There is no ablation study on this in the paper.

if self.tr_mode == 'NoTr':
    pass
elif self.tr_mode == 'Normal':
    new_features = new_features.permute(1, 0, 2).contiguous()  # (L, B, C)
    pos_emb = self.pos_embeddings[i].unsqueeze(1).repeat(1, new_features.shape[1], 1)  # (L, B, C)
    new_features = self.transformer_encoders[i](new_features, pos=pos_emb)
    new_features = new_features.permute(1, 0, 2).contiguous()  # (B, L, C)
elif self.tr_mode == 'Residual':
    tr_new_features = new_features.permute(1, 0, 2).contiguous()  # (L, B, C)
    pos_emb = self.pos_embeddings[i].unsqueeze(1).repeat(1, tr_new_features.shape[1], 1)  # (L, B, C)
    tr_new_features = self.transformer_encoders[i](tr_new_features, pos=pos_emb)
    tr_new_features = tr_new_features.permute(1, 0, 2).contiguous()  # (B, L, C)
    new_features = new_features + tr_new_features
else:
    raise NotImplementedError
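
To make the question concrete, the three branches differ only in how the encoder output is combined with the input: 'NoTr' skips the encoder, 'Normal' replaces the features with the encoder output, and 'Residual' adds the encoder output to the input. The sketch below shows just that control flow with plain Python lists and a stand-in encoder. The function `apply_tr_mode` and the toy encoder are hypothetical, and the tensor permutes and positional embeddings from the snippet are deliberately left out.

```python
def apply_tr_mode(features, encoder, tr_mode):
    """Combine features with an encoder output according to tr_mode.

    features: list of floats standing in for the (B, L, C) feature tensor.
    encoder:  any callable mapping a list of floats to a list of the same length.
    """
    if tr_mode == "NoTr":
        return list(features)                # encoder is not applied
    if tr_mode == "Normal":
        return encoder(features)             # features are replaced
    if tr_mode == "Residual":
        encoded = encoder(features)          # skip connection around the encoder
        return [x + y for x, y in zip(features, encoded)]
    raise NotImplementedError(tr_mode)


def toy_encoder(values):
    """Stand-in for a transformer encoder layer: simply doubles every value."""
    return [2.0 * v for v in values]


if __name__ == "__main__":
    for mode in ("NoTr", "Normal", "Residual"):
        print(mode, apply_tr_mode([1.0, 2.0], toy_encoder, mode))
```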

opened by WangYaqi180 3

The code and training configuration for Pyramid-P

Thank you for sharing this amazing paper. I have been working on PointRCNN improvements recently. Could you please release the code and training configuration for Pyramid-P on the KITTI and Waymo Open datasets? I have tried to patch it myself but have been unsuccessful.

opened by xuheyang 2
Test models on Waymo

The paper says "We append another frame and use a larger voxel backbone" for testing. What does this mean? How can I append the frame, and which backbone should I use to reproduce the results on the test leaderboard?

opened by tdzdog 2
Question about Density-Aware Radius Prediction

Thanks for your great work.

I noticed that in the paper the grid feature is computed with the formulation below:

However, I cannot find the code corresponding to the multiplication by s(i, r). Could you help explain it?

opened by Eaphan 2
How to use the KITTI test_set to evaluate model?

Thanks for your great work! I want to use the KITTI test set to evaluate the model. I modified DATA_SPLIT and INFO_PATH in kitti_dataset.yaml, but the results show that all recalls are 0 and the mAP is not displayed.

opened by zhangwanjingjj 1
Hello, could you share the detection code you used for the KITTI submission with OpenPCDet? I am new to this and wrote my own code, but my submitted results are very low. I would like to use yours as a reference.

opened by libingDY 2
