Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer

Project Page | Paper

This repository contains the official source code and data for our paper:

Fast Point Transformer
Chunghyun Park, Yoonwoo Jeong, Minsu Cho, and Jaesik Park
POSTECH GSAI & CSE
CVPR, 2022, New Orleans.

An Overview of the proposed pipeline

Overview

This work introduces Fast Point Transformer, which consists of a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and the voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated with 3D semantic segmentation and 3D detection. The accuracy of our approach is competitive with the best voxel-based method, and our network achieves inference 129 times faster than the state-of-the-art Point Transformer, with a reasonable accuracy trade-off, in 3D semantic segmentation on the S3DIS dataset.
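
As a rough illustration of what "voxel hashing" means here, the sketch below groups points by integer voxel index in a hash table and looks up occupied neighbor voxels by key. It is a minimal pure-Python sketch under stated assumptions, not the repository's CUDA implementation: the function names `voxelize` and `occupied_neighbors`, the per-voxel centroid step, and the 3x3x3 neighborhood window are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Hash each point to its integer voxel index (assumed scheme: floor division)."""
    members = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(math.floor(c / voxel_size) for c in p)
        members[key].append(i)
    # The centroid keeps the continuous position of the points inside each voxel.
    centroids = {}
    for key, idx in members.items():
        centroids[key] = tuple(
            sum(points[i][d] for i in idx) / len(idx) for d in range(3)
        )
    return dict(members), centroids


def occupied_neighbors(members, key, radius=1):
    """Return occupied voxel keys inside a (2 * radius + 1)^3 window around key."""
    offsets = range(-radius, radius + 1)
    found = []
    for dx in offsets:
        for dy in offsets:
            for dz in offsets:
                candidate = (key[0] + dx, key[1] + dy, key[2] + dz)
                if candidate in members:  # constant-time hash lookup
                    found.append(candidate)
    return found


# Toy input in meters with a 4cm voxel size.
points = [(0.01, 0.01, 0.01), (0.03, 0.03, 0.03), (0.05, 0.01, 0.01), (0.50, 0.50, 0.50)]
members, centroids = voxelize(points, voxel_size=0.04)
print(occupied_neighbors(members, (0, 0, 0)))
```

The point of the hash table is that a neighbor query costs one dictionary lookup per offset, regardless of how many voxels the scene contains.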

Citation

If you find our code or paper useful, please consider citing our paper:

@inproceedings{park2022fast,
 title={{Fast Point Transformer}},
 author={Chunghyun Park and Yoonwoo Jeong and Minsu Cho and Jaesik Park},
 booktitle={Proceedings of the {IEEE/CVF} Conference on Computer Vision and Pattern Recognition (CVPR)},
 year={2022}
}

Experiments

1. S3DIS Area 5 test

We denote MinkowskiNet42 trained with this repository as MinkowskiNet42†. We use a voxel size of 4cm for both MinkowskiNet42† and our Fast Point Transformer.

| Model | Latency (sec) | mAcc (%) | mIoU (%) | Reference |
|---|---|---|---|---|
| PointTransformer | 18.07 | 76.5 | 70.4 | Codes from the authors |
| MinkowskiNet42† | 0.08 | 74.1 | 67.2 | Checkpoint |
|   + rotation average | 0.66 | 75.1 | 69.0 | - |
| FastPointTransformer | 0.14 | 76.6 | 69.2 | Checkpoint |
|   + rotation average | 1.13 | 77.6 | 71.0 | - |

2. ScanNetV2 validation

| Model | Voxel Size | mAcc (%) | mIoU (%) | Reference |
|---|---|---|---|---|
| MinkowskiNet42 | 2cm | - | 72.2 | Official GitHub |
| MinkowskiNet42† | 2cm | 81.4 | 72.1 | Checkpoint |
| FastPointTransformer | 2cm | 81.2 | 72.5 | Checkpoint |
| MinkowskiNet42† | 5cm | 76.3 | 67.0 | Checkpoint |
| FastPointTransformer | 5cm | 78.9 | 70.0 | Checkpoint |
| MinkowskiNet42† | 10cm | 70.8 | 60.7 | Checkpoint |
| FastPointTransformer | 10cm | 76.1 | 66.5 | Checkpoint |

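For readers unfamiliar with the two metrics reported above, the sketch below computes mean class accuracy (mAcc) and mean intersection-over-union (mIoU) from a confusion matrix using their standard definitions. The function name `segmentation_metrics` and the choice to skip classes that never occur are assumptions; the repository's evaluation code may handle such cases differently.

```python
def segmentation_metrics(confusion):
    """Return (mAcc, mIoU) in percent.

    confusion[i][j] is the number of points whose ground-truth class is i
    and whose predicted class is j.
    """
    num_classes = len(confusion)
    accuracies, ious = [], []
    for c in range(num_classes):
        true_positive = confusion[c][c]
        ground_truth = sum(confusion[c])                             # points labeled c
        predicted = sum(confusion[r][c] for r in range(num_classes)) # points predicted as c
        union = ground_truth + predicted - true_positive
        if ground_truth > 0:  # assumption: ignore classes absent from the labels
            accuracies.append(true_positive / ground_truth)
        if union > 0:         # assumption: ignore classes absent from labels and predictions
            ious.append(true_positive / union)
    m_acc = 100.0 * sum(accuracies) / len(accuracies)
    m_iou = 100.0 * sum(ious) / len(ious)
    return m_acc, m_iou


# Toy two-class example.
m_acc, m_iou = segmentation_metrics([[8, 2], [1, 9]])
print(f"mAcc = {m_acc:.1f}%, mIoU = {m_iou:.1f}%")
```

mIoU is never higher than mAcc for the same predictions because false positives enlarge the union but do not affect per-class accuracy, which matches the pattern in both tables.
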
Installation

This repository is developed and tested on

  • Ubuntu 18.04 and 20.04
  • Conda 4.11.0
  • CUDA 11.1
  • Python 3.8.13
  • PyTorch 1.7.1 and 1.10.0
  • MinkowskiEngine 0.5.4

Environment Setup

You can install the environment by using the provided shell script:

~$ git clone --recursive git@github.com:POSTECH-CVLab/FastPointTransformer.git
~$ cd FastPointTransformer
~/FastPointTransformer$ bash setup.sh fpt
~/FastPointTransformer$ conda activate fpt
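
After activating the environment, it can help to confirm that the key packages match the versions listed above. The following is a minimal sketch rather than part of the repository: the helper name `check_environment` is hypothetical, and it assumes the import names `torch` and `MinkowskiEngine` are also the installed distribution names.

```python
import importlib.util
from importlib import metadata

# Versions listed in the Installation section; keys are import names.
TESTED = {"torch": ("1.7.1", "1.10.0"), "MinkowskiEngine": ("0.5.4",)}


def check_environment(tested=TESTED):
    """Return {package: status string} without importing the packages."""
    status = {}
    for name, versions in tested.items():
        if importlib.util.find_spec(name) is None:
            status[name] = "not installed"
            continue
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = "installed, version unknown"
            continue
        # Builds may carry a local suffix such as "+cu111", so compare prefixes.
        matches = any(installed.startswith(v) for v in versions)
        status[name] = f"{installed} ({'tested' if matches else 'untested'})"
    return status


for name, state in check_environment().items():
    print(f"{name}: {state}")
```

An "untested" result does not necessarily mean the code will fail, only that the version differs from those the repository reports.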

Training & Evaluation

First, download the datasets (ScanNetV2 and S3DIS) and preprocess them as follows:

(fpt) ~/FastPointTransformer$ python src/data/preprocess_scannet.py # you need to modify the data path
(fpt) ~/FastPointTransformer$ python src/data/preprocess_s3dis.py # you need to modify the data path

Then, place the provided metadata of each dataset (src/data/meta_data) alongside the preprocessed dataset, following the structure below:

${data_dir}
├── scannetv2
│   ├── meta_data
│   │   ├── scannetv2_train.txt
│   │   ├── scannetv2_val.txt
│   │   └── ...
│   └── scannet_processed
│       ├── train
│       │   ├── scene0000_00.ply
│       │   ├── scene0000_01.ply
│       │   └── ...
│       └── test
└── s3dis
    ├── meta_data
    │   ├── area1.txt
    │   ├── area2.txt
    │   └── ...
    └── s3dis_processed
        ├── Area_1
        │   ├── conferenceRoom_1.ply
        │   ├── conferenceRoom_2.ply
        │   └── ...
        ├── Area_2
        └── ...
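
Before training, the layout above can be checked with a few lines of Python. This is a sketch, not a script shipped with the repository: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and only the directories explicitly shown in the tree are checked (entries elided with `...` are not).

```python
import tempfile
from pathlib import Path

# Sub-directories shown in the tree above, relative to ${data_dir}.
EXPECTED_DIRS = (
    "scannetv2/meta_data",
    "scannetv2/scannet_processed/train",
    "scannetv2/scannet_processed/test",
    "s3dis/meta_data",
    "s3dis/s3dis_processed",
)


def missing_dirs(data_dir):
    """List the expected sub-directories that do not exist under data_dir."""
    root = Path(data_dir)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Demo on a throwaway directory; call missing_dirs on your real data_dir instead.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_dirs(tmp)            # nothing exists yet
    for d in EXPECTED_DIRS:
        (Path(tmp) / d).mkdir(parents=True)
    after = missing_dirs(tmp)             # everything is in place

print("missing before:", before)
print("missing after:", after)
```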

After that, you can train and evaluate a model by using the provided Python scripts (train.py and eval.py) with the configuration files in the config directory. For example, you can train and evaluate Fast Point Transformer with a voxel size of 4cm on the S3DIS dataset via the following commands:

(fpt) ~/FastPointTransformer$ python train.py config/s3dis/train_fpt.gin
(fpt) ~/FastPointTransformer$ python eval.py config/s3dis/eval_fpt.gin {checkpoint_file} # use -r option for rotation averaging.
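
The idea behind rotation averaging can be sketched as follows: run the model on several rotated copies of the scene and average the per-point class probabilities. This is an illustrative sketch, not the code behind the `-r` option. The use of 8 evenly spaced rotations about the vertical axis is an assumption (suggested only by the roughly 8x higher latency of the "+ rotation average" rows in the S3DIS table), as are averaging probabilities rather than logits and the names `rotate_z`, `rotation_average`, and `predict`.

```python
import math


def rotate_z(points, angle):
    """Rotate (x, y, z) points about the z-axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def rotation_average(predict, points, num_rotations=8):
    """Average per-point class probabilities over evenly spaced z-rotations.

    predict is any callable mapping a list of points to a list of
    per-point probability lists (a stand-in for the real network).
    """
    total = None
    for k in range(num_rotations):
        probs = predict(rotate_z(points, 2.0 * math.pi * k / num_rotations))
        if total is None:
            total = [list(row) for row in probs]
        else:
            for acc, row in zip(total, probs):
                for j, p in enumerate(row):
                    acc[j] += p
    return [[v / num_rotations for v in row] for row in total]


# Toy stand-in model: class 0 if the point is above z = 0, else class 1.
def toy_predict(points):
    return [[1.0, 0.0] if z > 0 else [0.0, 1.0] for _, _, z in points]


scene = [(1.0, 0.0, 1.0), (0.0, 1.0, -1.0)]
print(rotation_average(toy_predict, scene))
```

Because each rotated copy requires a full forward pass, the cost grows linearly with the number of rotations, which is the trade-off visible in the latency column.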

Consistency Score

First, generate predictions via the following command:

(fpt) ~/FastPointTransformer$ python -m src.cscore.prepare {checkpoint_file} -m {model_name} -v {voxel_size} # This takes hours.

Then, you can calculate the consistency score (CScore) with:

(fpt) ~/FastPointTransformer$ python -m src.cscore.calculate {prediction_dir} # This takes seconds.

3D Object Detection using VoteNet

Please refer to this repository.

Acknowledgement

Our code is based on the MinkowskiEngine. We also thank Hengshuang Zhao for providing the code of Point Transformer. If you use our model, please consider citing them as well.

Comments
  • How to get the access to wandb?

    An error is reported when training the dataset, saying "Error while calling W&B API: permission denied (<Response [403]>)". I checked the documentation of wandb and it says I don't have the project permissions. I want to know how to get the permission, thanks for your help!

    opened by Tomoki-0526 5
  • questions

    Thanks for the great work! I have some questions.

    1. I wonder where the GPU memory goes in FastTrans, because I only have a GTX 1080 with 12 GB. Did you test the model on a small-scale task such as the ShapeNet dataset? Can I apply the lightweight self-attention to get a better feature embedding for ShapeNet classification on my 1080 machine? Maybe I could use a smaller FastTrans?
    2. I notice that the voxel size is used for data augmentation. If I use the model for ShapeNet classification, where the normalized coordinates lie between -1 and 1, can I remove the data augmentation and just use the original coordinates? Or how can I choose a feasible voxel size?
    opened by zouwenqin 4
  • how to download the dataset?

    Before calling the script preprocess_s3dis.py, I need to download the data manually, right? Or is it downloaded automatically by preprocess_s3dis.py? If it is downloaded automatically, how do I download the dataset?

    opened by ZoangX 3
  • Why δ(vi-vj) is O(KD)?

    Hi! The section of your paper on reducing space complexity says that the space complexity of δrel(vi-vj) is O(KD). I can't understand this. I think there are K neighbors for each voxel, so why is it not O(IKD)? I hope you can help me. Thanks!

    opened by artofstate 3
  • import cuda_sparse_ops

    Thanks to the author for such quality code. I have a question about the sparse_ops.py file: import cuda_sparse_ops keeps reporting errors. How can I solve it? And how do I run setup.py in src.cuda_ops?

    opened by AArutoria 2
  • [Question] hardware used and training time

    Hi

    The paper mentions an inference time extremely reduced compared to the original PointTransformer, but I was also curious about the time it took to train the model. What GPUs did you use and how much time did it take to train compared to PointTransformer ?

    Thx a lot

    opened by QuanticDisaster 2
  • Training Log

    Hello! Thank you for your awesome work! I find that it takes 20 hours on an A100 to train your model. Could I have a look at your training log on S3DIS?

    opened by aoligei178 1
  • Could it utilize the inter-frame information?

    Hi, thanks for your great work. I have a question about the attention block. For example, if I set batch size = 2, how can I find the query voxels around both frames?

    opened by artofstate 1
  • How to get access to wandb?

    It occurs when I start to train on the S3DIS dataset via the command "python train.py config/s3dis/train_fpt.gin". I tried installing the module using pip, but it said "No matching distribution found for cuda_sparse_ops", and I didn't find any solution on the Internet. Is it because something went wrong with my installation?

    opened by Tomoki-0526 0