Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

SangHun

Last update: Dec 27, 2022

Overview

SML (ICCV 2021, Oral): Official PyTorch Implementation

This repository provides the official PyTorch implementation of the following paper:

Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation
Sanghun Jung* (KAIST AI), Jungsoo Lee* (KAIST AI), Daehoon Gwak (KAIST AI)
Sungha Choi (LG AI Research), and Jaegul Choo (KAIST AI) (*: equal contribution)
ICCV 2021 (Oral)

Paper: arxiv

YouTube Video (English): YouTube

Abstract: Identifying unexpected objects on roads in semantic segmentation (e.g., identifying dogs on roads) is crucial in safety-critical applications. Existing approaches use images of unexpected objects from external datasets or require additional training (e.g., retraining segmentation networks or training an extra network), which necessitate a non-trivial amount of labor intensity or lengthy inference time. One possible alternative is to use prediction scores of a pre-trained network such as the max logits (i.e., maximum values among classes before the final softmax layer) for detecting such objects. However, the distribution of max logits of each predicted class is significantly different from each other, which degrades the performance of identifying unexpected objects in urban-scene segmentation. To address this issue, we propose a simple yet effective approach that standardizes the max logits in order to align the different distributions and reflect the relative meanings of max logits within each predicted class. Moreover, we consider the local regions from two different perspectives based on the intuition that neighboring pixels share similar semantic information. In contrast to previous approaches, our method does not utilize any external datasets or require additional training, which makes our method widely applicable to existing pre-trained segmentation models. Such a straightforward approach achieves a new state-of-the-art performance on the publicly available Fishyscapes Lost & Found leaderboard with a large margin.
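The standardization step described in the abstract can be illustrated with a minimal sketch. It is written in plain Python for a single pixel and is not the repository's implementation; the function name `standardized_max_logit` and the per-class statistics below are hypothetical, and the local-region processing mentioned in the abstract is not covered.

```python
from typing import Dict, Sequence, Tuple


def standardized_max_logit(
    logits: Sequence[float],
    class_mean: Dict[int, float],
    class_std: Dict[int, float],
) -> Tuple[int, float]:
    """Return (predicted class, standardized max logit) for one pixel."""
    # The predicted class is the one with the largest logit.
    pred = max(range(len(logits)), key=lambda c: logits[c])
    max_logit = logits[pred]
    # Standardize with the statistics of the predicted class only.
    sml = (max_logit - class_mean[pred]) / class_std[pred]
    return pred, sml


# Hypothetical per-class statistics of max logits (made up for illustration).
class_mean = {0: 10.0, 1: 4.0}
class_std = {0: 2.0, 1: 1.0}

# Pixel A has the larger raw max logit (6.0 vs 5.0), yet it is unusually
# low for class 0, so its standardized score is lower than pixel B's.
pred_a, sml_a = standardized_max_logit([6.0, 1.0], class_mean, class_std)
pred_b, sml_b = standardized_max_logit([1.0, 5.0], class_mean, class_std)
print(pred_a, sml_a)  # 0 -2.0
print(pred_b, sml_b)  # 1 1.0
```

The toy values show why raw max logits are hard to compare across classes: a value that looks confident in absolute terms can still be far below what is typical for its predicted class.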

Code Contributors

Sanghun Jung [Website] [LinkedIn] [Google Scholar] (KAIST AI)
Jungsoo Lee [Website] [LinkedIn] [Google Scholar] (KAIST AI)

Concept Video

Click the figure to watch the YouTube video of our paper!

PyTorch Implementation

Installation

Clone this repository and install the dependencies.

git clone https://github.com/shjung13/Standardized-max-logits.git
cd Standardized-max-logits
pip install -r requirements.txt

Cityscapes data directory

cityscapes
 └ leftImg8bit_trainvaltest
   └ leftImg8bit
     └ train
     └ val
     └ test
 └ gtFine_trainvaltest
   └ gtFine
     └ train
     └ val
     └ test

OoD data directory

Fishyscapes (OoD Dataset)
 └ leftImg8bit_trainvaltest
   └ leftImg8bit
     └ val
 └ gtFine_trainvaltest
   └ gtFine
     └ val
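Before running the evaluation, it may help to confirm that the OoD dataset root matches the layout above. The following is a small sketch, not part of the repository; the helper name `missing_dirs` and the example root path `./fishyscapes` are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Sequence

# Sub-directories expected under the OoD dataset root, taken from the tree above.
EXPECTED_OOD_DIRS = (
    "leftImg8bit_trainvaltest/leftImg8bit/val",
    "gtFine_trainvaltest/gtFine/val",
)


def missing_dirs(root: str, expected: Sequence[str] = EXPECTED_OOD_DIRS) -> List[str]:
    """Return the expected sub-directories that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_dir()]


# Hypothetical dataset root; replace it with your own path.
print(missing_dirs("./fishyscapes"))
```

An empty list means both `val` folders are in place; any printed entries are the paths that still need to be created or populated.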

How to Run

Train the segmentation model

CUDA_VISIBLE_DEVICES=0,1 ./scripts/train_r101_os8.sh

Obtain statistics from training samples

CUDA_VISIBLE_DEVICES=0 ./scripts/calc_stat_r101_os8.sh
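Conceptually, this step gathers the mean and variance of the max logits for each predicted class over the training samples. The sketch below shows that idea in plain Python on a handful of toy pixels; it is an illustration under that assumption, not the contents of the script, and the function name `max_logit_stats` is hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance
from typing import Dict, List, Sequence, Tuple


def max_logit_stats(
    pixel_logits: Sequence[Sequence[float]],
) -> Tuple[Dict[int, float], Dict[int, float]]:
    """Per-class mean and (population) variance of max logits."""
    grouped: Dict[int, List[float]] = defaultdict(list)
    for logits in pixel_logits:
        # Group each pixel's max logit under its predicted class.
        pred = max(range(len(logits)), key=lambda c: logits[c])
        grouped[pred].append(logits[pred])
    mean = {c: fmean(values) for c, values in grouped.items()}
    var = {c: pvariance(values) for c, values in grouped.items()}
    return mean, var


# Toy "training" pixels with two classes (values made up for illustration).
pixels = [[2.0, 0.0], [4.0, 1.0], [0.0, 3.0]]
mean, var = max_logit_stats(pixels)
print(mean)  # {0: 3.0, 1: 3.0}
print(var)   # {0: 1.0, 1: 0.0}
```

A real run would accumulate these statistics over every pixel of the training set with tensor operations rather than Python lists.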

Evaluate on Out-of-Distribution dataset

Download the pretrained model here, create the directory "<Directory Home>/pretrained", and place the model in it.

CUDA_VISIBLE_DEVICES=0 python eval.py --ood_dataset_path <path_to_OoD_dataset>
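The results quoted in the Comments section include AUROC, AUPRC, and FPR@TPR95. As a rough illustration of the last metric, the sketch below computes the false positive rate at the threshold where 95% of OoD pixels are detected. It is a plain-Python sketch, not the code in eval.py; the function name `fpr_at_tpr` and the toy scores are hypothetical, and it assumes the anomaly score is the negated standardized max logit.

```python
import math
from typing import Sequence


def fpr_at_tpr(
    scores: Sequence[float], labels: Sequence[int], target_tpr: float = 0.95
) -> float:
    """False positive rate at the threshold that reaches `target_tpr`.

    scores: anomaly score per pixel, higher means more likely OoD.
    labels: 1 for OoD pixels, 0 for in-distribution pixels.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one OoD and one in-distribution pixel")
    # Smallest number of top-scoring OoD pixels that reaches the target TPR.
    k = max(1, math.ceil(target_tpr * len(positives)))
    threshold = positives[k - 1]
    # In-distribution pixels at or above the threshold are false positives.
    false_positives = sum(1 for s in negatives if s >= threshold)
    return false_positives / len(negatives)


# Toy example: a low standardized max logit suggests an unexpected object,
# so the anomaly score is taken as its negation (an assumption of this sketch).
sml = [-2.5, -1.8, -0.9, 0.4, 1.2, 0.8, -0.5, 1.5]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [-v for v in sml]
print(fpr_at_tpr(scores, labels))  # 0.25
```

Lower values are better: the metric reports how many in-distribution pixels are wrongly flagged once the threshold is loose enough to catch 95% of the OoD pixels.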

Quantitative / Qualitative Evaluation

Fishyscapes Leaderboard

Identified OoD pixels (colored white)

Fishyscapes Leaderboard

Our result is also available at fishyscapes.com.

Citation

@InProceedings{Jung_2021_ICCV,
    author    = {Jung, Sanghun and Lee, Jungsoo and Gwak, Daehoon and Choi, Sungha and Choo, Jaegul},
    title     = {Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {15425-15434}
}

Acknowledgments

We deeply appreciate Hermann Blum and the Fishyscapes team for their sincere help in providing the baseline performances and helping our team update our model on the Fishyscapes Leaderboard. Our PyTorch implementation is heavily derived from NVIDIA segmentation and RobustNet. Thanks to the NVIDIA implementations.

Comments

Cannot reproduce results on the Fishyscapes Static dataset

Hi, thanks for your contribution! I am currently having trouble reproducing the reported results on the Fishyscapes Static dataset.

I use the provided pre-trained model "r101_os8_base_cty.pth" and get exactly the same results on Fishyscapes Lost & Found as reported in the paper, and roughly the same results on the Road Anomaly dataset (difference < 1%). However, when I simply change "ood_dataset_path" to the Fishyscapes Static dataset, I get the following statistics, which differ from the values reported in the paper. Could you please explain this?

AUROC score: 0.9547932440646849
AUPRC score: 0.51792152018834
FPR@TPR95: 0.2243785053302718

opened by gaozhitong 3

Fishyscapes OoD Dataset download

Hello,

First of all, nice work. I am trying to run your evaluation code, for which I have to download the Fishyscapes OoD dataset and place it in the data directory you specify. I tried installing the bdlb package and downloading the dataset through bdlb.load(benchmark="fishyscapes"), but the resulting data format and directory layout differ from yours. Could you provide your data preprocessing script or point me to how you did it? Thanks a lot!

BR, Yifan

opened by zhuyifan1993 3

About pretrained resnet101

if pretrained:
    # model.load_state_dict(model_zoo.load_url(model_urls['resnet101']))
    print("########### pretrained ##############")
    # model.load_state_dict(torch.load('./pretrained/resnet101-imagenet.pth', map_location="cpu"))
    mynn.forgiving_state_restore(model, torch.load('./pretrained/resnet101-imagenet.pth', map_location="cpu"))

In line 339 of Resnet.py, './pretrained/resnet101-imagenet.pth' is not provided. How can I get it?

opened by gangweiX 2

Visualize the final prediction

Dear Sanghun Jung,

Thank you so much for your amazing work. I was wondering if you could provide the source code that visualizes b) Unexpected detected and c) final prediction, as shown in the paper? Thank you, and I look forward to hearing from you!

opened by Sundragon1993 1

code to produce Figure 5 in the paper

Hi

Thanks for the great work. I was looking at Figure 5 (unexpected objects detected at TPR 95) and was wondering whether you could provide the code to generate such a figure. The figure looks amazing and I would like to use a similar one in my paper. Thanks so much for the help.

opened by tianyu0207 1

What's the difference between stats/cityscapes_mean.npy and stats/cityscapes_mean_reported.npy?

Hi, as the title says, I noticed that the code contains two mean/var statistics files in the stats directory. I am wondering what the difference is.

opened by gaozhitong 0
