Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL)

Overview

Scribble-Supervised LiDAR Semantic Segmentation

Dataset and code release for the paper Scribble-Supervised LiDAR Semantic Segmentation, CVPR 2022 (ORAL).
Authors: Ozan Unal, Dengxin Dai, Luc Van Gool

Abstract: Densely annotating LiDAR point clouds remains too expensive and time-consuming to keep up with the ever-growing volume of data. While current literature focuses on fully-supervised performance, efficient methods that take advantage of realistic weak supervision have yet to be explored. In this paper, we propose using scribbles to annotate LiDAR point clouds and release ScribbleKITTI, the first scribble-annotated dataset for LiDAR semantic segmentation. Furthermore, we present a pipeline to reduce the performance gap that arises when using such weak annotations. Our pipeline comprises three stand-alone contributions that can be combined with any LiDAR semantic segmentation model to achieve up to 95.7% of the fully-supervised performance while using only 8% labeled points.


News

[2022-04] We release our training code with the Cylinder3D backbone.
[2022-03] Our paper is accepted to CVPR 2022 for an ORAL presentation!
[2022-03] We release ScribbleKITTI, the first scribble-annotated dataset for LiDAR semantic segmentation.


ScribbleKITTI

[teaser figure]

We annotate the train-split of SemanticKITTI, which is based on KITTI and consists of 10 sequences, 19130 scans, and 2349 million points. ScribbleKITTI contains 189 million labeled points, corresponding to only 8.06% of the total point count. We choose SemanticKITTI for its current wide use and established benchmark, and we retain the same 19 classes to encourage an easy transition towards research on scribble-supervised LiDAR semantic segmentation.

Our scribble labels can be downloaded here (118.2MB).

Data organization

The data is organized in the format of SemanticKITTI. The dataset can be used with any existing dataloader by simply changing the label directory from labels to scribbles (see the loading sketch below the directory tree).

sequences/
    ├── 00/
    │   ├── scribbles/
    │   │     ├── 000000.label
    │   │     └── 000001.label
    ├── 01/
    ├── 02/
    .
    .
    └── 10/
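
Below is a minimal sketch of reading a scan together with its scribble labels, assuming the standard SemanticKITTI binary layout (four float32 values x, y, z, remission per point, and one uint32 per point whose lower 16 bits hold the semantic label and whose upper 16 bits hold the instance id). The paths are illustrative:

    import numpy as np

    def load_scan(bin_path):
        # Each point is stored as four float32 values: x, y, z, remission.
        return np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)

    def load_labels(label_path):
        # SemanticKITTI packs the semantic label into the lower 16 bits and
        # the instance id into the upper 16 bits of one uint32 per point.
        raw = np.fromfile(label_path, dtype=np.uint32)
        return raw & 0xFFFF, raw >> 16

    points = load_scan("data/sequences/00/velodyne/000000.bin")
    semantic, instance = load_labels("data/sequences/00/scribbles/000000.label")
    # Points not covered by a scribble keep the unlabeled id 0.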

Scribble-Supervised LiDAR Semantic Segmentation

[pipeline figure]

We develop a novel learning method for 3D semantic segmentation that directly exploits scribble annotated LiDAR data. We introduce three stand-alone contributions that can be combined with any 3D LiDAR segmentation model: a teacher-student consistency loss on unlabeled points, a self-training scheme designed for outdoor LiDAR scenes, and a novel descriptor that improves pseudo-label quality.

Specifically, we first introduce a weak form of supervision from unlabeled points via a consistency loss. Secondly, we strengthen this supervision by fixing the confident predictions of our model on the unlabeled points and employing self-training with pseudo-labels. The standard self-training strategy is, however, very prone to confirmation bias due to the long-tailed class distribution inherent in autonomous driving scenes and the large variation of point density across different ranges inherent in LiDAR data. To combat these, we develop a class-range-balanced pseudo-labeling strategy that uniformly samples target labels across all classes and ranges. Finally, to improve the quality of our pseudo-labels, we augment the input point cloud with a novel descriptor that provides each point with a semantic prior about its local surroundings at multiple resolutions.

Combining these contributions with the mean teacher framework, our scribble-based pipeline achieves up to 95.7% of the performance of fully supervised training while using only 8% labeled points.
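
For illustration, here is a minimal sketch of the two mean-teacher ingredients: an exponential-moving-average (EMA) weight update for the teacher, and a consistency term on the unlabeled points. The KL-divergence form matches the released code (see the consistency-loss discussion in the comments below); all names are illustrative:

    import torch
    import torch.nn.functional as F

    def consistency_loss(student_logits, teacher_logits, unlabeled_mask):
        # The teacher provides soft targets; gradients only reach the student.
        log_p_student = F.log_softmax(student_logits[unlabeled_mask], dim=1)
        p_teacher = F.softmax(teacher_logits[unlabeled_mask], dim=1).detach()
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean")

    @torch.no_grad()
    def ema_update(teacher, student, alpha=0.99):
        # Teacher weights track an exponential moving average of the student.
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(alpha).add_(s, alpha=1.0 - alpha)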

Installation

For the installation, we recommend setting up a virtual environment:

python -m venv ~/venv/scribblekitti
source ~/venv/scribblekitti/bin/activate
pip install -r requirements.txt

Furthermore, install the following dependencies:

Data Preparation

Please follow the instructions from SemanticKITTI to download the dataset, including the KITTI Odometry point cloud data. Download our scribble annotations and unzip them in the same directory. Each sequence in the train-set (00-07, 09-10) should then contain the velodyne, labels and scribbles directories.

Move the sequences folder into a new directory called data/. Alternatively, edit the dataset: root_dir field of each config file to point to the sequences folder.

Training

The training of our method requires three steps, as illustrated in the figure above: (1) training, where we utilize the PLS descriptors and the mean teacher framework to generate high-quality pseudo-labels; (2) pseudo-labeling, where we fix the trained teacher model's predictions in a class-range-balanced manner; (3) distillation, where we train on the generated pseudo-labels.
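
To make the PLS idea concrete, the sketch below builds per-point histograms of nearby scribble labels on grids at several resolutions and appends them as extra input features. The grid shapes, resolutions, and normalization are illustrative assumptions, not the repository's exact configuration; labels are assumed mapped to 1..num_classes with 0 = unlabeled:

    import numpy as np

    def pls_descriptor(points, labels, num_classes,
                       resolutions=((10, 10, 4), (20, 20, 8))):
        # points: (N, 3) coordinates, labels: (N,) scribble labels with
        # 0 = unlabeled. Returns an (N, len(resolutions) * num_classes) array.
        feats = []
        mins, maxs = points.min(0), points.max(0)
        labeled = labels > 0
        for res in resolutions:
            res = np.array(res)
            # Quantize every point into a voxel at this resolution.
            idx = ((points - mins) / (maxs - mins + 1e-6) * (res - 1)).astype(int)
            keys = np.ravel_multi_index(idx.T, res)
            # Accumulate a label histogram per voxel from the labeled points.
            hist = np.zeros((np.prod(res), num_classes))
            np.add.at(hist, (keys[labeled], labels[labeled] - 1), 1.0)
            # Normalize per voxel; voxels without any scribbles stay all-zero.
            sums = hist.sum(1, keepdims=True)
            hist = np.divide(hist, sums, out=np.zeros_like(hist), where=sums > 0)
            feats.append(hist[keys])
        return np.concatenate(feats, axis=1)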

Step 1 can be trained as follows. The checkpoint for the trained first-stage model can be downloaded here. (The resulting model shows slight improvements over the model presented in the paper, reaching 86.38% mIoU on the fully-labeled train-set.)

python train.py --config_path config/training.yaml --dataset_config_path config/semantickitti.yaml

For Step 2, we first need to save the intermediate results of our trained teacher model.
Warning: This step will initially create a save file training_results.h5 (27GB). This file can be deleted after generating the pseudo-labels.

python save.py --config_path config/training.yaml --dataset_config_path config/semantickitti.yaml --checkpoint_path STEP1/CKPT/PATH --save_dir SAVE/DIR

Next, we find the optimum threshold for each class-annulus pairing and generate pseudo-labels in a class-range-balanced manner. The pseudo-labels will be saved in the same root directory as the scribble labels, but under a new folder called crb. The pseudo-labels generated by our model can be downloaded here.

python crb.py --config_path config/crb.yaml --dataset_config_path config/semantickitti.yaml --save_dir SAVE/DIR
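
For intuition, here is a minimal sketch of class-range-balanced selection: predictions are binned by predicted class and range annulus, a per-bin confidence threshold keeps the same top fraction p in every bin, and all remaining points stay ignored. The shapes, quantile rule, and parameter names are illustrative assumptions, not the exact crb.py implementation:

    import numpy as np

    def crb_pseudo_labels(pred, conf, ranges, num_classes, annuli, p=0.5,
                          ignore=0):
        # pred/conf: per-point argmax class and its softmax confidence,
        # ranges: per-point distance to the sensor, annuli: bin edges in m.
        pseudo = np.full_like(pred, ignore)
        bin_ids = np.digitize(ranges, annuli)
        for c in range(num_classes):
            for r in range(len(annuli) + 1):
                mask = (pred == c) & (bin_ids == r)
                if not mask.any():
                    continue
                # Per-bin threshold: keep the most confident fraction p, so
                # every (class, annulus) pair contributes pseudo-labels.
                thresh = np.quantile(conf[mask], 1.0 - p)
                pseudo[mask & (conf >= thresh)] = c
        return pseudo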

Step 3 can be trained as follows. The resulting model state_dict can be downloaded here (61.25% mIoU).

python train.py --config_path config/distillation.yaml --dataset_config_path config/semantickitti.yaml

Evaluation

The final model, as well as the provided checkpoints for the distillation step, can be evaluated on the SemanticKITTI validation set as follows. Evaluating the model separately is not necessary when doing in-house training, as evaluation takes place within the training script after every epoch. The best teacher mIoU is given by the val_best_miou metric in W&B.

python evaluate.py --config_path config/distillation.yaml --dataset_config_path config/semantickitti.yaml --ckpt_path STEP2/CKPT/PATH

Quick Access for Download Links:


Citation

If you use our dataset or our work in your research, please cite:

@InProceedings{Unal_2022_CVPR,
    author    = {Unal, Ozan and Dai, Dengxin and Van Gool, Luc},
    title     = {Scribble-Supervised LiDAR Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022},
}

Acknowledgements

We would additionally like to thank the authors of the open-source codebase Cylinder3D.

Comments
  • Error while creating Pseudo-Labels

    Hi, I would like to get ScribbleKITTI running on my own data. For this I have already created my own scribble labels. Because I don't have point labels, I downloaded the checkpoint of the first step and ran save.py with it. That worked smoothly. But when I run crb.py with the generated h5-file, I get the following error:

      Determining global threshold k^(c,r)...
      0%|                    | 0/19130 [00:00<?, ?it/s]
      Traceback (most recent call last):
      File "crb.py", line 65, in <module>
        mask = pred[bin_mask] == j
      IndexError: boolean index did not match indexed array along dimension 0; dimension is 1 but corresponding boolean dimension is 124668
    

    At first I thought that maybe it is because of my data. But the error also occurs when I use your scribbles. I have not changed anything in the code. Here is a link to the h5-file I created with your labels after downloading the step1-checkpoint. I just thought I would ask. Maybe it's something trivial that I haven't noticed.

    Thanks in advance!

    Best Regards Leon

    opened by LeonRuddat 4
  • Segmentation Performance with Partially Annotated Data

    Thank you for open-sourcing your annotated data and code!

    Regarding Table 2 in your paper, I have a question about the segmentation performance of Cylinder3D and SparseConv-UNet (Ref [18] in your reference).

    The results under the 10% frame split are 46.8% for Cylinder3D and 43.9% for SparseConv-UNet. I have recently run experiments on Cylinder3D with the same number of labeled training frames (1913 out of 19130) and got much higher results (55%+). I am using the latest version of Cylinder3D from here. I use the exact configurations provided by the authors except for init_size, which I replaced with 16 (originally set as 32). I would like to know how exactly you implemented this and what the potential cause for such a huge performance difference is. Thanks!

    opened by ldkong1205 4
  • dataloader issue

    Hi,

    Thanks for sharing your work! Right now i am running scribblekitti on my own data and experience the following issue:

    When I run step 1 with train.py, the progress bar shows exactly twice the number of frames that are in the training folders. But shouldn't it display the sum of the frames inside the training and validation folders? Also, it seems the model is validating on the labels inside the training folders, not on the labels inside the validation folder.

    In dataloader/semantickitti.py, the 'Baseline' class has the method 'load_file_paths', which is called during the initialization of the 'Baseline' class as well as during the initialization of 'PLSCylindricalMT'.

    Could the issue be that the method is called twice, or am I missing a point here?

    Baseline:

        self.split, self.config = split, config
        self.root_dir = self.config['root_dir']
        assert(os.path.isdir(self.root_dir))
        label_directory = 'scribbles' if 'label_directory' not in config.keys() \
                                      else config['label_directory']
        self.label_directory = label_directory if split == 'train' else 'labels'
        self.load_file_paths(split, self.label_directory)
    

    PLSCylindricalMT:

        self.load_file_paths('train', self.label_directory)
        self.nclasses = nclasses
        self.bin_sizes = self.config['bin_size']
    
    opened by Alexmsit 2
  • Change parser args in evaluate.py according to documentation.

    Hey @ouenal, I have found a little name mismatch: the parser flag in evaluate.py did not match the documentation. Not a big deal of course, but I thought I would correct it. :) Leon

    opened by LeonRuddat 1
  • Consistency loss

    Hi, thanks for your great work! I have a question about the consistency loss between the teacher and the student on the unlabeled points. In the code you use the KL-divergence, but in the paper (formula 3) it's something different. To me, formula 3 looks like a soft version of cross-entropy, but the minus sign is missing. Or should it be the KL-divergence (https://pytorch.org/docs/stable/generated/torch.nn.KLDivLoss.html) and you forgot some part of it? Or am I missing something?
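
    (Editorial note: the two objectives differ only by a term that is constant with respect to the student. With teacher distribution $p_t$ and student distribution $p_s$,

    $$\mathrm{KL}(p_t \,\|\, p_s) = \underbrace{-\sum_c p_t(c)\,\log p_s(c)}_{\text{soft cross-entropy } H(p_t,\, p_s)} - H(p_t),$$

    so when the teacher is detached from the computation graph, minimizing the KL-divergence and minimizing the soft cross-entropy produce identical student gradients.)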

    opened by MaekTec 1
  • Training speed

    Thanks for your amazing work! I have a question about the training time.

    My GPU is a single Tesla V100 32G. Under your training settings, each iteration consumes around 1.5-2s in STEP 1, and the training time for each epoch is close to 15-16h; repeated over 75 epochs, this seems very time-consuming.

    Would you mind sharing your device setup and the details of the time consumed during training (like the iteration time and the whole training pipeline)?

    opened by jasonwjw 1
  • Batch size bigger than 1

    Thanks so much for your excellent work and code. I have a question about the dataloader code. Is it possible to set the batch size bigger than 1 (e.g., 4, 8 or 16)? When I tried to set a bigger batch_size in training.yaml, the code failed with the error "RuntimeError: stack expects each tensor to be equal size, but got [124266, 3] at entry 0 and [112695, 3] at entry 1". (An editorial workaround sketch follows the config excerpt below.)

    # in `training.yaml`
    train_dataloader:
      batch_size: 4       # default is 1
      shuffle: True
      num_workers: 4
    
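    (Editorial note: the error comes from the default collate function trying to stack point clouds with different point counts. A custom collate_fn that keeps the per-scan tensors in lists sidesteps it; below is a minimal sketch with a toy stand-in dataset and illustrative names, not the repository's dataloader.)

        import torch
        from torch.utils.data import Dataset, DataLoader

        class ToyScans(Dataset):
            # Stand-in for a LiDAR dataset: scans have varying point counts.
            def __len__(self):
                return 4

            def __getitem__(self, i):
                n = 100 + 10 * i
                return torch.rand(n, 3), torch.randint(0, 19, (n,))

        def list_collate(batch):
            # Keep ragged per-scan tensors in lists instead of stacking them.
            points, labels = zip(*batch)
            return list(points), list(labels)

        loader = DataLoader(ToyScans(), batch_size=4, collate_fn=list_collate)
        points, labels = next(iter(loader))  # lists of 4 ragged tensors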

    Thanks in advance!

    opened by l1997i 1
  • Sparse annotations for other datasets

    Thanks for your great work! It really helps me a lot. Do you have any plans to provide sparse annotations for other datasets such as nuScenes or SemanticPOSS?

    opened by songw-zju 1