Code for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021)

Hesper

Last update: Jan 5, 2023

Related tags

Deep Learning STRL

Overview

Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

This is the official code implementation of the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021).

Checklist

Self-supervised Pre-training Framework

BYOL
SimCLR

Downstream Tasks

Shape Classification
Semantic Segmentation
Indoor Object Detection
Outdoor Object Detection

Installation

The code was tested with the following environment: Ubuntu 18.04, Python 3.7, PyTorch 1.7.1, torchvision 0.8.2, and CUDA 11.1.

For self-supervised pre-training, run the following commands:

git clone https://github.com/yichen928/STRL.git
cd STRL
pip install -r requirements.txt

For downstream tasks, please refer to the Downstream Tasks section.
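
To compare a local setup against the tested environment listed above, a small helper along the following lines may be useful. This is only a sketch: the function name `version_mismatches` and the constant `TESTED_VERSIONS` are hypothetical and not part of the repository, and the version numbers are simply copied from the sentence above.

```python
# Versions the README reports as tested (keys are the Python package names).
TESTED_VERSIONS = {"torch": "1.7.1", "torchvision": "0.8.2"}


def version_mismatches(installed, tested=TESTED_VERSIONS):
    """Return (package, installed, tested) triples that differ from the tested set.

    `installed` maps package names to version strings (or None if absent).
    Anything after a "+" in a version string (a local build tag) is ignored.
    """
    mismatches = []
    for name, wanted in tested.items():
        have = installed.get(name)
        base = have.split("+")[0] if have else None
        if base != wanted:
            mismatches.append((name, have, wanted))
    return mismatches
```

With PyTorch installed, it could be called as `version_mismatches({"torch": torch.__version__, "torchvision": torchvision.__version__})`; an empty list means the two packages match the tested versions. Other versions may still work, since the README only states what was tested.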

Datasets

Please download the datasets from the following links:

ShapeNet: https://drive.google.com/uc?id=1sJd5bdCg9eOo3-FYtchUVlwDgpVdsbXB
ModelNet40: https://shapenet.cs.stanford.edu/media/modelnet40_normal_resampled.zip
ScanNet (subset): Please follow the instructions on the official website. The 25k-frame subset is sufficient for our model.

Make sure to put the files in the following structure:

|-- ROOT
|	|-- BYOL
|		|-- data
|			|-- modelnet40_normal_resampled_cache
|			|-- shapenet57448xyzonly.npz
|			|-- scannet
|				|-- scannet_frames_25k
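
A misplaced dataset is an easy mistake to make, so it may help to check the layout before training. The sketch below tests only for the three paths shown in the tree above; the names `EXPECTED_PATHS` and `missing_dataset_paths` are hypothetical, and the check does not inspect file contents.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_PATHS = (
    "BYOL/data/modelnet40_normal_resampled_cache",
    "BYOL/data/shapenet57448xyzonly.npz",
    "BYOL/data/scannet/scannet_frames_25k",
)


def missing_dataset_paths(root):
    """Return the expected dataset paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_dataset_paths(".")` from the repository root returns an empty list when all three entries are present, and otherwise names the ones that are missing.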

Pre-training

BYOL framework

Please run the following command:

python BYOL/train.py

To switch between backbone architectures, edit the config file BYOL/config/config.yaml. The currently supported backbones are BYOL-pointnet-cls, BYOL-dgcnn-cls, BYOL-dgcnn-semseg, and BYOL-votenet-detection.

Pre-trained Models

You can find the checkpoints of the pre-training and downstream tasks in our Google Drive.

Linear Evaluation

For the PointNet or DGCNN classification backbones, you can evaluate the learned representation with a linear SVM classifier by running one of the following commands:

For PointNet:

python BYOL/evaluate_pointnet.py -w /path/to/your/pre-trained/checkpoints

For DGCNN:

python BYOL/evaluate_dgcnn.py -w /path/to/your/pre-trained/checkpoints
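
The evaluation scripts themselves are not reproduced here. As a rough illustration of the linear-evaluation idea (features from a frozen encoder, with a linear max-margin classifier fitted on top), the following dependency-free sketch may help. It is not the scripts' implementation: it swaps in a plain subgradient-descent solver for the hinge loss instead of an SVM library, handles only two classes, and all names (`train_linear_svm`, `predict`) and hyperparameters are hypothetical.

```python
def train_linear_svm(features, labels, epochs=50, lr=0.1, reg=0.01):
    """Fit a binary linear classifier on fixed features.

    Minimises an L2-regularised hinge loss by per-sample subgradient steps.
    `features` is a list of equal-length float lists; `labels` are +1 or -1.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Weight decay step coming from the L2 regulariser.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Sample violates the margin: move the hyperplane towards it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1
```

In this protocol the encoder weights are never updated. Only the linear classifier is trained, so its accuracy reflects how linearly separable the pre-trained features are.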

Downstream Tasks

Checkpoints Transformation

You can convert the pre-trained checkpoints for use in the different downstream tasks by running:

For VoteNet:

python BYOL/transform_ckpt_votenet.py --input_path /path/to/your/pre-trained/checkpoints --output_path /path/to/the/transformed/checkpoints

For other backbones:

python BYOL/transform_ckpt.py --input_path /path/to/your/pre-trained/checkpoints --output_path /path/to/the/transformed/checkpoints
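
What the transformation scripts do internally is not described here. Conversions of this kind commonly keep only the backbone weights and rename their keys so that a downstream model can load them. The sketch below shows that general idea on a plain dictionary; the function name `extract_backbone` and the prefix `online_network.encoder.` used in the usage note are assumptions for illustration, not the repository's actual key names.

```python
def extract_backbone(state_dict, prefix):
    """Keep entries whose key starts with `prefix`, with the prefix stripped.

    Works on any mapping of parameter names to values, so it can be tried
    on a plain dict before being applied to a real checkpoint.
    """
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }
```

For example, `extract_backbone(ckpt, "online_network.encoder.")` would turn a key such as `online_network.encoder.conv1.weight` into `conv1.weight` and drop everything outside that prefix. For a real checkpoint, inspect the stored key names first, and prefer the provided scripts over this sketch.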

Fine-tuning and Evaluation for Downstream Tasks

For fine-tuning and evaluation on downstream tasks, please refer to the corresponding repositories listed below. We sincerely thank all these authors for their nice work!

Classification: WangYueFt/dgcnn
Semantic Segmentation: AnTao97/dgcnn.pytorch
Indoor Object Detection: facebookresearch/votenet

Citation

If you find our paper or code useful for your research, please cite the following paper:

@article{huang2021spatio,
  title={Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds},
  author={Huang, Siyuan and Xie, Yichen and Zhu, Song-Chun and Zhu, Yixin},
  journal={arXiv preprint arXiv:2109.00179},
  year={2021}
}

Comments

Dataloader returned 0 length

Sorry to bother you, but I get an error when running train.py. ShapeNet is used as the pre-training dataset and ModelNet40 as the evaluation dataset. Both datasets are downloaded and placed in the directory structure described in the README. The full traceback is as follows.

Traceback (most recent call last):
  File "/home/jiangfan/workspace/codespace/STRL/BYOL/train.py", line 67, in <module>
    main()
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/hydra/main.py", line 20, in decorated_main
    run_hydra(
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 171, in run_hydra
    hydra.run(
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 82, in run
    return run_job(
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/hydra/plugins/common/utils.py", line 109, in run_job
    ret.return_value = task_function(task_cfg)
  File "/home/jiangfan/workspace/codespace/STRL/BYOL/train.py", line 63, in main
    trainer.fit(model)
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 997, in fit
    results = self.dp_train(model)
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/distrib_parts.py", line 270, in dp_train
    result = self.run_pretrain_routine(model)
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1185, in run_pretrain_routine
    self.reset_val_dataloader(ref_model)
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/data_loading.py", line 343, in reset_val_dataloader
    self.num_val_batches, self.val_dataloaders = self._reset_eval_dataloader(model, 'val')
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/data_loading.py", line 303, in _reset_eval_dataloader
    num_batches = len(dataloader) if _has_len(dataloader) else float('inf')
  File "/home/jiangfan/miniconda3/envs/pytorch2/lib/python3.8/site-packages/pytorch_lightning/trainer/data_loading.py", line 58, in _has_len
    raise ValueError('Dataloader returned 0 length.'
ValueError: Dataloader returned 0 length. Please make sure that your Dataloader at least returns 1 batch

opened by xjiangfan 8

weight model error

This weight file raises an error when it is loaded into the model.

RuntimeError: unexpected EOF, expected 9781136 more bytes. The file might be corrupted. terminate called without an active exception

https://drive.google.com/file/d/1NzFKTyw_TfZQon902IqEnldODYgWKvlG/view?usp=sharing

opened by fukexue 5

Experiment result in dgcnn-cls

Thanks for your great work! I tried to run the training experiment with DGCNN on the classification task, but I only got around 70% accuracy in SVM evaluation and 89.6% in fine-tuning evaluation. During training, my val_loss dropped below 0.2 after 10 epochs, whereas the checkpoint you provided has a val_loss of 0.513 at epoch 51. Is there something wrong with my config? Thanks.

optimizer:
  weight_decay: 0.01
  lr: 1e-3
  type: adam

network: DGCNN
dataset: ShapeNet # ShapeNet, ShapeNetPart, ModelNet40, ScanNet
num_points: 2048 # 2048 for ShapeNet, 4096 for ModelNet40, 4096 for ScanNet
epochs: 100
batch_size: 32
acc_batches: 1
transform_mode: both

decay_rate: 0.996
mlp_hidden_size: 4096
projection_size: 256

k: 40
emb_dims: 1024
window_length: 3
dropout: 0.5
num_workers: 32

resume_ckpt:

opened by tsbiosky 3

Question about the PVRCNN model in outdoor scenes

Hi! Thanks for your great work on pre-training models for point clouds. I have a question about the outdoor-scene model, PVRCNN. Unlike the indoor-scene model VoteNet, PVRCNN is a more complex two-stage model. So my question is: where is the max pooling applied in PVRCNN, in the 3D backbone or the 2D backbone? Thanks for your reply!

opened by lichengwei-code 2

semseg

Hello,

thanks for your work. I have a question. For the indoor semantic segmentation downstream task, it seems the pretrain task is still a classification task? Am I right?

opened by zhangzihui247 1

Owner

Hesper

