Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

Pyramid R-CNN

This repository is a reproduction of Pyramid R-CNN for 3D object detection.

The code is mainly based on OpenPCDet.

Introduction

We provide code and training configurations for Pyramid-V and Pyramid-PV on the KITTI and Waymo Open datasets. Checkpoints will not be released. The dataset organization is the same as in OpenPCDet.

Requirements

The code was tested in the following environment:

  • Ubuntu 18.04
  • Python 3.6
  • PyTorch 1.5
  • CUDA 10.1
  • OpenPCDet v0.3.0
  • spconv v1.2.1

Installation

a. Clone this repository.

git clone https://github.com/PointsCoder/Pyramid_R-CNN.git

b. Install the required libraries:

  • Install the required Python libraries:
pip install -r requirements.txt
  • Install the SparseConv library. We use the implementation from [spconv].
    • If you use PyTorch 1.1, install spconv v1.0 (commit 8da6f96) instead of the latest version.
    • If you use PyTorch 1.3+, install spconv v1.2. As noted by the spconv author, you need to use their docker image if you use PyTorch 1.4+.
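
The version rules above can be summarized in a small helper. This is only a sketch: the function names `parse_major_minor` and `recommend_spconv` are hypothetical and not part of this repository, and PyTorch versions the list does not mention (for example 1.2) are reported as unspecified rather than guessed.

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.5.0+cu101'."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def recommend_spconv(torch_version: str) -> str:
    """Map a PyTorch version to the spconv advice given in this README."""
    version = parse_major_minor(torch_version)
    if version == (1, 1):
        return "spconv v1.0 (commit 8da6f96)"
    if version >= (1, 4):
        # The README asks for the spconv docker image from PyTorch 1.4 on.
        return "spconv v1.2 (use the spconv docker image)"
    if version >= (1, 3):
        return "spconv v1.2"
    # Anything else is not covered by the installation notes.
    return "unspecified in this README"


if __name__ == "__main__":
    for v in ("1.1.0", "1.3.1", "1.5.0+cu101"):
        print(v, "->", recommend_spconv(v))
```

In a real environment the argument would typically be `torch.__version__`.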

c. Compile CUDA operators by running the following command:

python setup.py develop

Training

We train all models on 8 Tesla V100 GPUs (32 GB each), and all configs (epochs, learning rate, batch size) are set for 8-GPU Distributed Data Parallel (DDP) training. Adjust these training parameters if you run with a different number of GPUs or a different amount of GPU memory.
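
When the number of GPUs changes, the total batch size changes with it, and the learning rate usually needs to follow. One common heuristic, which this repository does not prescribe, is the linear scaling rule: scale the learning rate in proportion to the total batch size. The sketch below assumes that heuristic. The function name and the example values (2 samples per GPU, learning rate 0.01) are hypothetical and should be replaced by the values in the config you actually use.

```python
def scale_learning_rate(base_lr: float,
                        base_gpus: int,
                        base_batch_per_gpu: int,
                        new_gpus: int,
                        new_batch_per_gpu: int) -> float:
    """Linearly rescale the learning rate with the total batch size."""
    if min(base_gpus, base_batch_per_gpu, new_gpus, new_batch_per_gpu) <= 0:
        raise ValueError("GPU counts and batch sizes must be positive")
    base_total = base_gpus * base_batch_per_gpu
    new_total = new_gpus * new_batch_per_gpu
    return base_lr * new_total / base_total


if __name__ == "__main__":
    # Hypothetical values: an 8-GPU config with 2 samples per GPU and LR 0.01,
    # moved to 2 GPUs with the same per-GPU batch size.
    print(scale_learning_rate(0.01, 8, 2, 2, 2))
```

Treat the result as a starting point only; the number of epochs and the warm-up schedule may also need tuning.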

  • models
# pyramid_rcnn_pv.yaml: pyramid roi head on the point-voxel backbone
# pyramid_rcnn_v.yaml: pyramid roi head on the spconv u-net backbone
  • DDP training
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_train.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml
  • DDP testing
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 sh scripts/dist_test.sh 8 --cfg_file cfgs/waymo_models/pyramid_rcnn_pv.yaml --eval_all
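
For a different number of GPUs, both `CUDA_VISIBLE_DEVICES` and the first argument of the launch script have to change together. The helper below is a hypothetical convenience, not part of this repository: it only assembles command strings in the same form as the ones shown above for a given list of GPU ids.

```python
from typing import Sequence


def build_dist_command(script: str,
                       gpu_ids: Sequence[int],
                       cfg_file: str,
                       extra_args: Sequence[str] = ()) -> str:
    """Assemble a launch command in the form used in this README."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    visible = ",".join(str(g) for g in gpu_ids)
    parts = [
        f"CUDA_VISIBLE_DEVICES={visible}",
        "sh", script, str(len(gpu_ids)),  # the script takes the GPU count first
        "--cfg_file", cfg_file,
        *extra_args,
    ]
    return " ".join(parts)


if __name__ == "__main__":
    # Example: train on 4 GPUs instead of 8.
    print(build_dist_command(
        "scripts/dist_train.sh",
        [0, 1, 2, 3],
        "cfgs/waymo_models/pyramid_rcnn_pv.yaml",
    ))
```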

Citation

If you find this project useful in your research, please consider citing:

@article{mao2021pyramid,
  title={Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection},
  author={Mao, Jiageng and Niu, Minzhe and Bai, Haoyue and Liang, Xiaodan and Xu, Hang and Xu, Chunjing},
  journal={ICCV},
  year={2021}
}
Comments
  • The problem about TransformerEncoderLayer.

    Hello, thank you for your excellent work. Why does PyramidModule use 'Normal' rather than 'NoTr'? After RoI-grid Attention, TransformerEncoderLayer() is applied. What is its function? There is no ablation study about this in the paper.

    if self.tr_mode == 'NoTr':
        pass
    elif self.tr_mode == 'Normal':
        new_features = new_features.permute(1, 0, 2).contiguous()  # (L, B, C)
        pos_emb = self.pos_embeddings[i].unsqueeze(1).repeat(1, new_features.shape[1], 1)  # (L, B, C)
        new_features = self.transformer_encoders[i](new_features, pos=pos_emb)
        new_features = new_features.permute(1, 0, 2).contiguous()  # (B, L, C)
    elif self.tr_mode == 'Residual':
        tr_new_features = new_features.permute(1, 0, 2).contiguous()  # (L, B, C)
        pos_emb = self.pos_embeddings[i].unsqueeze(1).repeat(1, tr_new_features.shape[1], 1)  # (L, B, C)
        tr_new_features = self.transformer_encoders[i](tr_new_features, pos=pos_emb)
        tr_new_features = tr_new_features.permute(1, 0, 2).contiguous()  # (B, L, C)
        new_features = new_features + tr_new_features
    else:
        raise NotImplementedError
    
    opened by WangYaqi180 3
  • The code and training configuration for Pyramid-P

    Thank you for sharing this amazing paper. I have been working on PointRCNN improvements recently. Could you please release the code and training configuration for Pyramid-P on the KITTI and Waymo Open datasets? I have tried to patch it together myself but have been unsuccessful.

    opened by xuheyang 2
  • Test models on Waymo

    The paper says "We append another frame and use a larger voxel backbone" for testing. What does this mean? How can I append the frame, and which backbone should I use to reproduce the results on the test leaderboard?

    opened by tdzdog 2
  • Question about Density-Aware Radius Prediction

    Thanks for your great work.

    I noticed that in the paper the grid feature is computed with the formulation below:

    [image: grid feature formulation from the paper]

    However, I cannot find the code corresponding to the multiplication by s(i, r). Could you help explain it?

    opened by Eaphan 2
  • How to use the KITTI test_set to evaluate model?

    Thanks for your great work! I want to use the KITTI test set to evaluate the model. I modified DATA_SPLIT and INFO_PATH in kitti_dataset.yaml, but the result shows that all the recalls are 0, and the mAP is not displayed.

    opened by zhangwanjingjj 1
  • Hello, could you share the detection code used to submit results to KITTI with OpenPCDet? I am new to this area and wrote my own code, but the submitted results score very low. I would like to use yours as a reference.

    opened by libingDY 2
Owner
I may not respond to issues quickly. Send me an e-mail if necessary.