Shape-aware Convolutional Layer (ShapeConv)

Overview

PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Introduction

We design a Shape-aware Convolutional (ShapeConv) layer to explicitly model shape information and thereby improve RGB-D semantic segmentation accuracy. Specifically, we decompose the depth feature into a shape component and a value component, after which two learnable weights are introduced to handle the shape and the value with differentiation. Extensive experiments on three challenging indoor RGB-D semantic segmentation benchmarks, i.e., NYU-Dv2 (13 and 40 categories), SUN RGB-D, and SID, demonstrate the effectiveness of ShapeConv when it is employed over five popular architectures.
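
This decomposition can be illustrated with a minimal PyTorch sketch (our simplification, not the repository's implementation: the names are made up and the two learnable weights are reduced to scalars). Each unfolded patch is split into its mean, the value component, and its deviation from the mean, the shape component; each part is re-scaled by its own weight before the convolution weights are applied.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapeConvSketch(nn.Module):
    # Illustrative sketch only; class and attribute names are not from the repo.
    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.k, self.padding = k, padding
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch * k * k) * 0.01)
        self.w_value = nn.Parameter(torch.ones(1))  # learnable weight for the value component
        self.w_shape = nn.Parameter(torch.ones(1))  # learnable weight for the shape component

    def forward(self, x):
        n, c, h, w = x.shape
        patches = F.unfold(x, self.k, padding=self.padding)  # (N, C*k*k, H*W)
        patches = patches.view(n, c, self.k * self.k, -1)
        value = patches.mean(dim=2, keepdim=True)            # value component: per-patch mean
        shape = patches - value                              # shape component: deviation from mean
        mixed = self.w_value * value + self.w_shape * shape  # re-weight the two components
        mixed = mixed.view(n, c * self.k * self.k, -1)
        out = self.weight @ mixed                            # convolution as a matmul over patches
        return out.view(n, -1, h, w)

Since the re-weighting is linear in the patch, it can be folded into the convolution weights at inference time, which is why ShapeConv adds no test-time cost.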

Usage

Installation

  1. Requirements
  • Linux
  • Python 3.6+
  • PyTorch 1.7.0 or higher
  • CUDA 10.0 or higher

We have tested the following versions of OS and software:

  • OS: Ubuntu 16.04.6 LTS
  • CUDA: 10.0
  • PyTorch: 1.7.0
  • Python: 3.6.9
  2. Install dependencies:
pip install -r requirements.txt
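
To confirm the environment matches the tested versions, a quick sanity check (our suggestion, not part of the repository):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"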

Dataset

Download the official dataset and convert it to a format appropriate for this project. See here.

Or download the converted dataset.

Evaluation

  1. Model

    Download a trained model and put it in the folder ./model_zoo. See all trained models here.

  2. Config

    Edit the config file in ./config. The config files in ./config correspond to the model files in ./models.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES. CUDA_VISIBLE_DEVICES specifies which GPUs should be visible to a CUDA application, e.g., inference.gpu_id = "0,1,2,3".
    2. Set dataset_root = path_to_dataset, where path_to_dataset is the path of the dataset, e.g., dataset_root = "/home/shape_conv/nyu_v2". (An illustrative config excerpt appears at the end of this Evaluation section.)
  3. Run

    1. For distributed evaluation, run:
    ./tools/dist_test.sh config_path checkpoint_path gpu_num
    • config_path is the path of the config file;
    • checkpoint_path is the path of the model file;
    • gpu_num is the number of GPUs used; note that gpu_num <= len(inference.gpu_id).

    E.g., to evaluate a ShapeConv model on NYU-V2 (40 categories), run:

    ./tools/dist_test.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py model_zoo/nyu40_deeplabv3plus_resnext101_shape.pth 4
    2. For non-distributed evaluation, run:
    python tools/test.py config_path checkpoint_path
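
    E.g., mirroring the distributed command above:

    python tools/test.py configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py model_zoo/nyu40_deeplabv3plus_resnext101_shape.pth

    The config excerpt referenced in the Config step above would look roughly like this (a minimal sketch; the field layout is assumed and the actual files in ./config may organize these settings differently):

    # illustrative config excerpt (layout assumed; values from this README)
    inference = dict(
        gpu_id="0,1,2,3",  # GPUs visible to the CUDA application
    )
    dataset_root = "/home/shape_conv/nyu_v2"  # path to the converted dataset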

Train

  1. Config

    Edit the config file in ./config.

    1. Set inference.gpu_id = CUDA_VISIBLE_DEVICES.

      E.g., inference.gpu_id = "0,1,2,3".

    2. Set dataset_root = path_to_dataset.

      E.g., dataset_root = "/home/shape_conv/nyu_v2".

  2. Run

    1. For distributed training, run:
    ./tools/dist_train.sh config_path gpu_num

    E.g., to train a ShapeConv model on NYU-V2 (40 categories) with 4 GPUs, run:

    ./tools/dist_train.sh configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py 4
    2. For non-distributed training, run:
    python tools/train.py config_path
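
    E.g., the single-process counterpart of the distributed command above:

    python tools/train.py configs/nyu/nyu40_deeplabv3plus_resnext101_shape.py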

Result

For more results, please see the model zoo.

NYU-V2 (40 categories)

Architecture | Backbone | MS & Flip | ShapeConv | mIoU
DeepLabv3plus | ResNeXt-101 | False | False | 48.9%
DeepLabv3plus | ResNeXt-101 | False | True | 50.2%
DeepLabv3plus | ResNeXt-101 | True | False | 50.3%
DeepLabv3plus | ResNeXt-101 | True | True | 51.3%

SUN-RGBD

Architecture | Backbone | MS & Flip | ShapeConv | mIoU
DeepLabv3plus | ResNet-101 | False | False | 46.9%
DeepLabv3plus | ResNet-101 | False | True | 47.6%
DeepLabv3plus | ResNet-101 | True | False | 47.6%
DeepLabv3plus | ResNet-101 | True | True | 48.6%

SID (Stanford Indoor Dataset)

Architecture | Backbone | MS & Flip | ShapeConv | mIoU
DeepLabv3plus | ResNet-101 | False | False | 54.55%
DeepLabv3plus | ResNet-101 | False | True | 60.6%

Acknowledgments

This repo was developed based on vedaseg.


Comments
  • Scales for Test

    Hello,

    Thanks for the great work!

    I am confused about the scales in the configs. Why do you use multiple scales for testing instead of scale = 1?

    For example, in configs/nyu/nyu40: tta=dict(scales=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75], …

    Can you provide some explanation? Thanks!

    opened by Zongwei97 2
  • Thank you very much for your work. I am very interested in it. I hope you can help explain the meaning of this line in the script test_runner.py, line 56:

    pred_rgb[label == 255] = np.array((0, 0, 0))

    opened by 1725917163 2
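
    As context (hedged, since the authors did not reply here): the line is NumPy boolean-mask assignment, painting black every pixel whose label equals 255, which is typically the ignore index. A tiny self-contained illustration:

    import numpy as np

    pred_rgb = np.full((2, 2, 3), 128, dtype=np.uint8)  # a fake rendered prediction
    label = np.array([[255, 1], [1, 255]])              # 255 marks ignored pixels
    pred_rgb[label == 255] = np.array((0, 0, 0))        # black out ignored pixels
    print(pred_rgb[0, 0])                               # [0 0 0]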
  • Suggest to loosen the dependency on albumentations

    Hi, your project ShapeConv (commit id: 25bee65af4952c10ed4e24f6556765654e56575f) requires "albumentations==0.4.1" in its dependencies. After analyzing the source code, we found that the following version of albumentations is also suitable, i.e., albumentations 0.4.0, since none of the functions you use from the package, directly (6 APIs: albumentations.augmentations.functional.scale, albumentations.core.transforms_interface.to_tuple, albumentations.augmentations.functional.pad_with_params, albumentations.core.transforms_interface.DualTransform.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.PadIfNeeded.__init__) or indirectly (propagating to 14 of albumentations's internal APIs and 2 outside APIs), changed between these versions, so your usage is unaffected.

    Therefore, we believe it is quite safe to loosen your dependency on albumentations from "albumentations==0.4.1" to "albumentations>=0.4.0,<=0.4.1". This will improve the applicability of ShapeConv and reduce the possibility of dependency conflicts with other projects.

    May I open a pull request to further loosen the dependency on albumentations?

    By the way, could you please tell us whether such an automatic dependency-analysis tool might help make dependency maintenance easier during your development?

    opened by Agnes-U 0
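
    The suggested relaxation would amount to this line in requirements.txt (our restatement of the proposal above):

    albumentations>=0.4.0,<=0.4.1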
Owner

Hanchao Leng