Official PyTorch implementation of RobustNet (CVPR 2021 Oral)

Overview

RobustNet (CVPR 2021 Oral): Official Project Webpage

Code and pretrained models will be released soon.

This repository provides the official PyTorch implementation of the following paper:

RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening
Sungha Choi* (LG AI Research), Sanghun Jung* (KAIST AI), Huiwon Yun (Sogang Univ.)
Joanne T. Kim (Korea Univ.), Seungryong Kim (Korea Univ.), Jaegul Choo (KAIST AI) (*: equal contribution)
CVPR 2021, Accepted as Oral Presentation

Paper: arXiv

Abstract: Enhancing the generalization performance of deep neural networks in the real world (i.e., unseen domains) is crucial for safety-critical applications such as autonomous driving. To address this issue, this paper proposes a novel instance selective whitening loss to improve the robustness of segmentation networks on unseen domains. Our approach disentangles the domain-specific style and domain-invariant content encoded in higher-order statistics (i.e., feature covariance) of the feature representations and selectively removes only the style information causing domain shift. As shown in the figure below, our method provides reasonable predictions for (a) low-illuminated, (b) rainy, and (c) unexpected new scene images. These types of images are not included in the training dataset, where the baseline shows a significant performance drop, contrary to ours. Being simple but effective, our approach improves the robustness of various backbone networks without additional computational cost. We conduct extensive experiments in urban-scene segmentation and show the superiority of our approach over existing work.

Code Contributors

Sungha Choi (LG AI Research), Sanghun Jung (KAIST AI)

PyTorch Implementation

Installation

Coming soon.

Acknowledgments

Our PyTorch implementation is heavily derived from NVIDIA segmentation and HANet. Thanks to the NVIDIA implementations.

Comments
  • Question about the results and a bug

    Hi, thanks for your great work!

    I ran the code for MobileNet, but my results are much lower than the reported ones.

    I got the following results:

    MobileNet-Base, mIoU = 20.12%

    MobileNet-IBN, mIoU = 27.41%

    MobileNet-ISW, mIoU = 23.24%

    All results are from the last checkpoint, evaluated on Cityscapes.

    I also found an error when loading the dataset:

    https://github.com/shachoi/RobustNet/blob/55e385a6600482793cf64544217678196dbcd3b8/datasets/gtav.py#L438

    This line causes "TypeError: 'bool' object is not subscriptable" in Python 3.

    So I changed:

    mask_copy[(mask == np.array(k))[:,:,0] & (mask == np.array(k))[:,:,1] & (mask == np.array(k))[:,:,2]] = v

    to

    sel_mask = np.logical_and(mask[:, :, 0] == np.array(k)[0], mask[:, :, 1] == np.array(k)[1], mask[:, :, 2] == np.array(k)[2])
    mask_copy[sel_mask] = v
    

    for all the dataset loaders.

    I am not sure if this causes the low results. Could you help me to address this problem? Thanks!
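
A note on the replacement quoted above: `np.logical_and` takes only two input arrays, and a third positional argument is treated as the `out` buffer, so the third-channel comparison is overwritten rather than combined with the other two. The following is a minimal sketch of a three-channel match, assuming `mask` is an H x W x 3 color label image. The function name `color_to_trainid`, the `color_map` dict, and the ignore value 255 are illustrative assumptions, not code taken from the repository.

```python
import numpy as np


def color_to_trainid(mask, color_map, ignore_label=255):
    """Map an H x W x 3 color label image to an H x W array of class ids.

    All names here are hypothetical; this is not the repository's loader.
    """
    mask = np.asarray(mask)
    out = np.full(mask.shape[:2], ignore_label, dtype=np.uint8)
    for color, class_id in color_map.items():
        # A pixel matches only if all three channels equal the color.
        sel_mask = np.all(mask == np.asarray(color), axis=-1)
        out[sel_mask] = class_id
    return out


# Tiny 2 x 2 example: pixel (0, 1) shares the first two channels with the
# color (128, 64, 128) but differs in the third.
demo = np.array(
    [[[128, 64, 128], [128, 64, 0]],
     [[0, 0, 0], [128, 64, 128]]],
    dtype=np.uint8,
)
ids = color_to_trainid(demo, {(128, 64, 128): 0, (0, 0, 0): 1})

# The three-argument call: the third array is used as `out`, so its own
# comparison is discarded and pixel (0, 1) is wrongly selected.
k = np.array([128, 64, 128])
three_arg = np.logical_and(
    demo[:, :, 0] == k[0], demo[:, :, 1] == k[1], demo[:, :, 2] == k[2]
)
```

If the three-argument form was used in every dataset loader, pixels whose first two channels match a class color would be relabeled regardless of the third channel, which could plausibly affect the reported mIoU. `np.logical_and.reduce([...])` over the three comparisons would be an equivalent alternative to `np.all(..., axis=-1)`.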

    opened by Making24 14
  • about the directory structures

    Thank you for your outstanding work. You only give the directory structure for Cityscapes, not for GTAV, SYNTHIA, BDD, etc. Can you provide them?

    opened by SvenSu 9
  • Low results for RobustNet with ResNet-101 backbone

    Hi, I tried the ResNet-101 experiment for ISW by training on GTAV (I adapted the script from city_isw_r101.sh). However, the results are similar to those of ResNet-50. For example, the mIoU on Cityscapes is 36%.

    Do you have any results for ResNet-101 trained on GTAV? And how should I train ISW with ResNet-101 on GTAV?

    opened by Making24 7
  • Running RobustNet with ResNet-101 backbone and DeepLab v2 decoder

    Hi,

    Thanks for such nice work! I was wondering if it is possible to reproduce the results of Fig. 9, where you compare with DA methods. Did you change the backbone to ResNet-101 + DeepLab v2 to compare with those UDA methods (Appendix A.1)? Or did you downscale their backbone to ResNet-50 and change their decoder to DeepLab v3+?

    If you changed your backbone to ResNet-101 + DeepLab v2, is it possible to run an experiment with this setting? What steps should I take?

    opened by fabriziojpiva 6
  • Questions about the paper

    Since the authors of the paper are Korean, I originally wrote my questions in Korean. If they need to be in English, please let me know and I will rewrite them.

    Hello. I really enjoyed reading your excellent paper, 'RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening'. I have two questions about its content and would appreciate your answers.

    1. Inspecting the elements of the covariance matrix under photometric transformation (i.e., color jittering)

    Given an original image and its photometrically transformed counterpart (two images in total), I would like to compute the covariance matrix of the feature map for each image and directly inspect the values of the two covariance matrices. I would like to see for myself that, as stated in the paper, the sensitive covariances change greatly with illumination, while the insensitive covariances are unaffected by the photometric transformation and respond to scene structure. Is there an implementation for this?

    2. Question about differences in the feature-covariance matrix

    My own guess is that a typical segmentation model with a ResNet backbone would show no difference in the elements of the feature-covariance matrix under photometric transformation, and that only models whose layers apply whitening, such as Instance Normalization or Loss_DWT, would show such differences. Is this correct? Or do ordinary models also show the covariance differences described in the paper?

    Thank you! I love LG A.I Research.
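
One way to inspect the two covariance matrices from the first question is sketched below. It is a minimal NumPy sketch under stated assumptions, not the authors' code: `feature_covariance` and `covariance_variance` are hypothetical helper names, the feature maps are random stand-ins for activations extracted from a network, and the per-element variance over the original/transformed pair is one reading of how covariances that react to a photometric transformation could be located.

```python
import numpy as np


def feature_covariance(feat):
    """Channel covariance (C x C) of a C x H x W feature map.

    Each channel is mean-centered over its H*W spatial positions.
    """
    c = feat.shape[0]
    x = feat.reshape(c, -1).astype(np.float64)
    x = x - x.mean(axis=1, keepdims=True)
    return x @ x.T / x.shape[1]


def covariance_variance(feat_a, feat_b):
    """Element-wise variance of the two covariance matrices.

    Large entries mark channel pairs whose covariance changes between
    the two inputs; entries near zero mark pairs that stay the same.
    """
    cov_a = feature_covariance(feat_a)
    cov_b = feature_covariance(feat_b)
    mean = 0.5 * (cov_a + cov_b)
    return 0.5 * ((cov_a - mean) ** 2 + (cov_b - mean) ** 2)


# Random stand-in features (assumption): 4 channels on an 8 x 8 grid.
rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))
# Crude stand-in for a photometric change: scale every activation by 2.
feat_jittered = 2.0 * feat

var = covariance_variance(feat, feat_jittered)
```

With real data, `feat` and `feat_jittered` would be the same layer's activations for an image and its color-jittered copy, and `var` could be thresholded or clustered to separate strongly changing entries from stable ones.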

    opened by DeepHM 3
  • Inference on BDD and Mapillary

    Thanks for your great work. I have successfully tested your trained ResNet-50 model on Cityscapes, but I noticed that there are no test loaders for the other datasets. I tried using the val loader of Mapillary for inference, but the results, especially the composed images, do not seem consistent with the validation mean_iou (0.5+). Could you update eval.py or tell me how to evaluate on other datasets? Thanks a lot.
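
For reference when comparing numbers across datasets, mean IoU is commonly computed from a confusion matrix accumulated over the whole evaluation set. The following is a minimal sketch, assuming integer label maps and an ignore value of 255. The function names are illustrative and this is not the repository's eval.py.

```python
import numpy as np


def confusion_matrix(pred, label, num_classes, ignore_label=255):
    """Count (label, prediction) pairs, skipping ignored pixels."""
    valid = label != ignore_label
    idx = num_classes * label[valid].astype(np.int64) + pred[valid].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(hist):
    """Mean of per-class IoU = TP / (TP + FP + FN); absent classes are skipped."""
    tp = np.diag(hist)
    denom = hist.sum(axis=0) + hist.sum(axis=1) - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / denom
    return np.nanmean(iou)


# Tiny example with two classes and one ignored pixel.
label = np.array([[0, 0], [1, 255]])
pred = np.array([[0, 1], [1, 1]])
hist = confusion_matrix(pred, label, num_classes=2)
miou = mean_iou(hist)
```

When evaluating on a dataset other than the training one, the label ids of both datasets need to be mapped to the same class set before the confusion matrix is accumulated; otherwise the score and the visualized predictions can disagree.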

    opened by JingjunYi 3
  • Uniform dataset

    Hi,

    I came across the Uniform dataset in your implementation, for example the class GTAVUniform in datasets/gtav.py. This uniform dataset appears to be the default dataset for your baselines, but I haven't found any information about it in the README or the paper. Could you help me understand what it is for?

    Thanks!

    opened by azshue 3
  • Questions about inference on Mapillary.

    Hi,

    I have a question about inference on the Mapillary dataset. I found that the image resolution in Mapillary is not fixed. So how did you test the model on Mapillary? Did you resize all images to the same resolution or use a sliding window? I am confused about this.

    opened by zzzqzhou 2
  • Question about train split in GTAV dataset.

    Dear author, in your paper you mention that the GTAV train set has 12403 images. However, your split file 'split_data/gtav_split_train.txt' contains only 12388 image names. Is this a mistake?

    opened by zzzqzhou 2
  • Do you have the code for a single GPU?

    Thank you so much for your great work! I am very interested in it and am trying your code. However, I only have one GPU. After modifying the code and training the model, I found that the baseline performance dropped significantly. Do you have the code for a single GPU? I would be very grateful if you could provide it.

    opened by werweassd 2
  • Can not obtain the reported baseline performance

    Thank you very much for this great work! I am trying to train the baseline model with your code, but I cannot obtain the reported baseline performance.

    I train the model on the Cityscapes train set and test it on BDD-100K. The baseline performance is only 42.7, which is about 2.3 mIoU lower than your baseline (44.96).

    opened by fanq15 2
  • hello

    Hey! I saw your email and was able to work on it for today, so from what I saw in the docs it seems you have to be in an additional method when you create your camera, I'm not exactly sure what camera you are using or how it should interact with the rest of the code so for now I replaced the issue file "OpenCVPipeline.java" with this file from the OpenCV documentation. I would consult those further to find out the exact issue you're having. I hope this can help!

    opened by thisishusseinali 0
  • opencv

    Hello, I am facing an incompatibility issue between OpenCV and PyTorch. Could you tell me which PyTorch version you used and which command you used to install OpenCV?

    opened by Yussef93 0
Owner
Sungha Choi
Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Convolutional Hough Matching Networks This is the implementation of the paper "Convolutional Hough Matching Network" by J. Min and M. Cho. Implemented

Juhong Min 70 Nov 22, 2022
Pytorch implementation for "Adversarial Robustness under Long-Tailed Distribution" (CVPR 2021 Oral)

Adversarial Long-Tail This repository contains the PyTorch implementation of the paper: Adversarial Robustness under Long-Tailed Distribution, CVPR 20

Tong WU 88 Oct 27, 2022
(CVPR 2022 Oral) Official implementation for "Surface Representation for Point Clouds"

RepSurf - Surface Representation for Point Clouds [CVPR 2022 Oral] By Haoxi Ran* , Jun Liu, Chengjie Wang ( * : corresponding contact) The pytorch off

Haoxi Ran 257 Nov 26, 2022
Official MegEngine implementation of CREStereo(CVPR 2022 Oral).

[CVPR 2022] Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation This repository contains MegEngine implementation of ou

MEGVII Research 293 Nov 21, 2022
Official pytorch implementation of "Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization" ACMMM 2021 (Oral)

Feature Stylization and Domain-aware Contrastive Loss for Domain Generalization This is an official implementation of "Feature Stylization and Domain-

null 22 Sep 22, 2022
Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Bidirectional Projection Network for Cross Dimension Scene Understanding CVPR 2021 (Oral) [ Project Webpage ] [ arXiv ] [ Video ] Existing segmentatio

Hu Wenbo 133 Nov 14, 2022
Pytorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)

Large-Scale Long-Tailed Recognition in an Open World [Project] [Paper] [Blog] Overview Open Long-Tailed Recognition (OLTR) is the author's re-implemen

Zhongqi Miao 759 Nov 26, 2022
Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

End-to-End Optimization of Scene Layout Code release for: End-to-End Optimization of Scene Layout CVPR 2020 (Oral) Project site, Bibtex For help conta

Andrew Luo 39 Sep 21, 2022
Official repository for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'21, Oral Presentation)

Official PyTorch Implementation for HOTR: End-to-End Human-Object Interaction Detection with Transformers (CVPR'2021, Oral Presentation) HOTR: End-to-

Kakao Brain 114 Nov 28, 2022
Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

nvdiffrec Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D

NVIDIA Research Projects 1.4k Nov 29, 2022
Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

Cheng Zhang 66 Nov 16, 2022
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video Project Page | Paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Mon

ZJU3DV 1.4k Nov 28, 2022
Dynamic Slimmable Network (CVPR 2021, Oral)

Dynamic Slimmable Network (DS-Net) This repository contains PyTorch code of our paper: Dynamic Slimmable Network (CVPR 2021 Oral). Architecture of DS-

Changlin Li 195 Nov 13, 2022
[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis [arxiv|pdf|v

Yinan He 78 Oct 24, 2022
[CVPR 2021 Oral] Variational Relational Point Completion Network

VRCNet: Variational Relational Point Completion Network This repository contains the PyTorch implementation of the paper: Variational Relational Point

PL 120 Nov 20, 2022
Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Single-view robot pose and joint angle estimation via render & compare Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic CVPR: Conference on C

Yann Labbé 51 Oct 14, 2022
Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)

Quasi-Dense Tracking This is the official implementation of paper Quasi-Dense Similarity Learning for Multiple Object Tracking. We present a trailer th

ETH VIS Research Group 321 Nov 17, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 178 Nov 23, 2022
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption, CVPR 2021 (Oral)

TAP: Text-Aware Pre-training TAP: Text-Aware Pre-training for Text-VQA and Text-Caption by Zhengyuan Yang, Yijuan Lu, Jianfeng Wang, Xi Yin, Dinei Flo

Microsoft 61 Nov 14, 2022