PSANet: Point-wise Spatial Attention Network for Scene Parsing, ECCV2018.

Overview

PSANet: Point-wise Spatial Attention Network for Scene Parsing (under construction)

by Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, and Jiaya Jia; details are available on the project page.

Introduction

This repository is built for PSANet and contains the source code for the PSA module along with the related evaluation code. For installation, please merge the related layers into the PSPNet repository and follow the description there (tested with CUDA 7.0/7.5 + cuDNN v4).

PyTorch Version

A highly optimized PyTorch codebase for semantic segmentation is available in the semseg repository, including full training and testing code for PSPNet and PSANet.
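For readers who prefer PyTorch, the snippet below is a minimal, simplified sketch of the 'collect' branch of point-wise spatial attention. It is not the released implementation (the official layers predict an over-complete, roughly (2H-1) x (2W-1), map per position and then crop it); the module name, layer sizes, and the softmax normalization are illustrative assumptions.

    # Simplified sketch of a "collect"-style point-wise spatial attention block.
    # Assumed names and normalization; not the official PSA module.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SimplePSACollect(nn.Module):
        def __init__(self, in_channels, mid_channels, feat_h, feat_w):
            super().__init__()
            self.feat_h, self.feat_w = feat_h, feat_w
            self.reduce = nn.Conv2d(in_channels, mid_channels, kernel_size=1)
            # Each position predicts one attention weight per spatial location.
            self.attention = nn.Conv2d(mid_channels, feat_h * feat_w, kernel_size=1, bias=False)

        def forward(self, x):
            b, c, h, w = x.shape
            assert (h, w) == (self.feat_h, self.feat_w), "attention size is tied to the feature size"
            attn = self.attention(F.relu(self.reduce(x)))  # (b, h*w, h, w)
            attn = attn.view(b, h * w, h * w)              # rows: source positions j, columns: target positions i
            attn = F.softmax(attn, dim=1)                  # normalize over sources (an assumption)
            feats = x.view(b, c, h * w)                    # (b, c, h*w)
            out = torch.bmm(feats, attn)                   # y_i = sum_j a_{j,i} * x_j
            return out.view(b, c, h, w)

For a 2048-channel 60 x 60 feature map, SimplePSACollect(2048, 512, 60, 60) returns a tensor of the same shape; note that the intermediate (H*W) x (H*W) attention matrix grows quadratically with the feature resolution, which is why the feature size matters in practice.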

Usage

  1. Clone the repository recursively:

    git clone --recursive https://github.com/hszhao/PSANet.git
  2. Merge the Caffe layers into the PSPNet repository:

    Point-wise spatial attention: pointwise_spatial_attention_layer.hpp/cpp/cu and caffe.proto.

  3. Build Caffe and matcaffe:

    cd $PSANET_ROOT/PSPNet
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
    cd ..
  4. Evaluation:

    • Evaluation code is in folder 'evaluation'.

    • Download the trained models and put them in the related dataset folder under 'evaluation/model'; refer to 'README.md'.

    • Modify the related paths in 'eval_all.m':

      Mainly the variables 'data_root' and 'eval_list'; your image list for evaluation should be similar to those in the folder 'evaluation/samplelist' if you use this evaluation code structure.

    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  5. Results:

    Predictions will be saved in the folder 'evaluation/mc_result', and the expected scores are listed below:

    (mIoU/pAcc. stands for mean IoU and pixel accuracy; 'ss' and 'ms' denote single-scale and multi-scale testing. A minimal sketch of how these two metrics are computed is given after the tables.)

    ADE20K:

    | network   | training data | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|---------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | train         | val          | 41.92/80.17     | 42.97/80.92     | a8e884 |
    | PSANet101 | train         | val          | 42.75/80.71     | 43.77/81.51     | ab5e56 |

    VOC2012:

    | network   | training data          | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|------------------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | train_aug              | val          | 77.24/94.88     | 78.14/95.12     | d5fc37 |
    | PSANet101 | train_aug              | val          | 78.51/95.18     | 79.77/95.43     | 5d8c0f |
    | PSANet101 | COCO + train_aug + val | test         | -/-             | 85.7/-          | 3c6a69 |

    Cityscapes:

    | network   | training data         | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|-----------------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | fine_train            | fine_val     | 76.65/95.99     | 77.79/96.24     | 25c06a |
    | PSANet101 | fine_train            | fine_val     | 77.94/96.10     | 79.05/96.30     | 3ac1bf |
    | PSANet101 | fine_train            | fine_test    | -/-             | 78.6/-          | 3ac1bf |
    | PSANet101 | fine_train + fine_val | fine_test    | -/-             | 80.1/-          | 1dfc91 |
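
    For reference, mean IoU and pixel accuracy can be derived from a per-class confusion matrix. The snippet below is a minimal, generic NumPy sketch of that computation; it is not the MATLAB evaluation code shipped in 'evaluation/', and the function name is an assumption.

      # Minimal sketch: mIoU and pixel accuracy from a confusion matrix.
      # Generic illustration only; not the evaluation code in 'evaluation/'.
      import numpy as np

      def scores_from_confusion(conf):
          """conf[i, j] counts pixels of ground-truth class i predicted as class j."""
          tp = np.diag(conf).astype(np.float64)
          gt = conf.sum(axis=1).astype(np.float64)    # pixels per ground-truth class
          pred = conf.sum(axis=0).astype(np.float64)  # pixels per predicted class
          union = gt + pred - tp
          iou = tp / np.maximum(union, 1)             # per-class IoU, guarding empty classes
          miou = iou[gt > 0].mean()                   # average only over classes present in the GT
          pacc = tp.sum() / max(conf.sum(), 1)        # overall pixel accuracy
          return miou, pacc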
  6. Demo video:

    • Video processed by PSANet (with PSPNet) on the BDD dataset for drivable-area segmentation: Video.

Citation

If PSANet is useful for your research, please consider citing:

@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Questions

Please contact '[email protected]' or '[email protected]'

Comments
  • Confusion about the Attention Map Generation of distribute branch

    Hi, @hszhao

    It seems that the collect branch and the distribute branch are the same (as shown in Fig. 3). Could you give the equation of the distribute branch (like Eq. 9 for the collect branch in the paper)?

    opened by KamInNg 7
  • How to deal with images with different input size?

    It seems that the collect and distribute operations require the training and testing images to have the same input feature size. So how should images with different input sizes be handled?

    opened by manmanCover 6
  • When will you release the PyTorch version?

    Hello, could you please release your PyTorch version? I keep trying to do this with Caffe, but I always run into bugs. As a newcomer to deep learning, it's really difficult for me, so please release your PyTorch version. Thank you very much.

    opened by EchoAmor 2
  • How to merge layers into PSPNet?

    Hi! I have merged the layers you mentioned into PSPNet, but when I run make, the following errors appear. Can you tell me how to fix this?

    I also want to know whether I can train this model because you didn't mention it in this repository. Thanks very much!

    Here are some errors:

    src/caffe/layers/pointwise_spatial_attention_layer.cpp: In member function ‘virtual void caffe::PointwiseSpatialAttentionLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&)’:
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:12:3: error: ‘PointwiseSpatialAttentionParameter’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:71:3: error: ‘PointwiseSpatialAttentionParameter_PSAType_COLLECT’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:71:3: error: ‘PointwiseSpatialAttentionParameter_PSAType_DISTRIBUTE’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp: In function ‘void caffe::PSAForward_buffer_mask_collect_cpu(int, int, int, int, int, int, int, const Dtype*, Dtype*)’:
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:116:51: error: there are no arguments to ‘max’ that depend on a template parameter, so a declaration of ‘max’ must be available [-fpermissive]
    src/caffe/layers/pointwise_spatial_attention_layer.cu(201): error: class "caffe::LayerParameter" has no member "pointwise_spatial_attention_param"
      detected during instantiation of "void caffe::PointwiseSpatialAttentionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" (220): here
    8 errors detected in the compilation of "/tmp/tmpxft_00004960_00000000-19_pointwise_spatial_attention_layer.compute_52.cpp1.ii".
    make: *** [.build_release/cuda/src/caffe/layers/pointwise_spatial_attention_layer.o] Error 1

    opened by EchoAmor 1
  • How to visualize the mask as shown in subsection 4.5?

    Hi, I am confused about how to visualize the mask predicted by PSANet as described in subsection 4.5. The predicted attention map has a size of (H, W, H*W); how do I get the final mask shown in Fig. 6? Best regards.

    opened by wangq95 1
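
    One way to inspect such a map, assuming the attention tensor is laid out as (H*W, H, W) with the channel index enumerating points of interest, is to pick a point and take its channel as an H x W mask. The layout and names below are assumptions for illustration, not the released code.

      # Hedged sketch: read out the attention mask for one chosen point.
      # Assumes attn has shape (H*W, H, W); the actual layout may differ.
      import numpy as np
      import matplotlib.pyplot as plt

      def point_attention_mask(attn, y, x):
          hw, h, w = attn.shape
          assert hw == h * w
          return attn[y * w + x]  # (H, W) mask associated with point (y, x)

      # Hypothetical usage: overlay the (upsampled) mask on the input image.
      # mask = point_attention_mask(attn, y=30, x=40)
      # plt.imshow(mask, cmap='jet'); plt.colorbar(); plt.show()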
  • evaluate error

    When I evaluate the psanet50_voc2012_465.prototxt net using your pretrained psanet50_voc2012_d5fc37.caffemodel, there are some errors:

      F0920 09:00:23.963999 5519 net.cpp:829] Cannot copy param 0 weights from layer 'PSA_COLLECT_fc2'; shape mismatch. Source param shape is 13689 512 1 1 (7008768); target param shape is 3481 512 1 1 (1782272). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer

    But the 'PSA_COLLECT_fc2' layer's output channel is 3481 in your psanet50_voc2012_465.prototxt:

      layer {
        name: "PSA_COLLECT_fc2"
        type: "Convolution"
        bottom: "PSA_COLLECT_fc1"
        top: "PSA_COLLECT_fc2"
        param {
          lr_mult: 10
          decay_mult: 1
        }
        convolution_param {
          num_output: 3481  # 59*59
          kernel_size: 1
          stride: 1
          weight_filler {
            type: "msra"
          }
          bias_term: false
        }
      }

    opened by 994374821 1
  • HoleConvolution

    I have some confusion about the 'HoleConvolution' layer when trying to evaluate your pretrained model. I did not find the definition of 'HoleConvolution', which appears in the 'prototxt' file. Where is the 'HoleConvolution' layer defined? Thanks.

    opened by 994374821 1
  • about the gpu memory consumed

    Hello. Thanks for the code. According to the paper, during inference there will be two tensors of size b * (2*h*w) * h * w in the PSA module; will this consume too much memory?

    opened by chenchr 1
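
    As a rough, back-of-the-envelope illustration of the tensor size mentioned above (batch size 1, a 60 x 60 feature map, and float32 storage are assumed values, not figures taken from this repository):

      # Rough memory estimate for one b x (2*h*w) x h x w float32 tensor.
      b, h, w = 1, 60, 60                    # assumed batch size and feature size
      elements = b * (2 * h * w) * h * w     # 1 * 7200 * 60 * 60 = 25,920,000
      bytes_per_tensor = elements * 4        # 4 bytes per float32 element
      print(f"{bytes_per_tensor / 1024**2:.1f} MiB per tensor")  # ~98.9 MiB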
Code for "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds", CVPR 2021

PV-RAFT This repository contains the PyTorch implementation for paper "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clou

Yi Wei 43 Dec 5, 2022
Edge-aware Guidance Fusion Network for RGB-Thermal Scene Parsing

EGFNet Edge-aware Guidance Fusion Network for RGB-Thermal Scene Parsing Dataset and Results Test maps: Baidu Netdisk, extraction code: zust Citation @ARTICLE{ author={Zhou,

ShaohuaDong 10 Dec 8, 2022
Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

GSAN Introduction Code for paper GSAN: Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving, wh

YE Luyao 6 Oct 27, 2022
PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features Overview This repository is the Pytorch implementation of PRIN/SPRIN: On Extracting P

Yang You 17 Mar 2, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022
Unofficial implementation of Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segmentation

Point-Unet This is an unofficial implementation of the MICCAI 2021 paper Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segment

Namt0d 9 Dec 7, 2022
This is an official implementation of "Polarized Self-Attention: Towards High-quality Pixel-wise Regression"

Polarized Self-Attention: Towards High-quality Pixel-wise Regression This is an official implementation of: Huajun Liu, Fuqiang Liu, Xinyi Fan and Don

DeLightCMU 212 Jan 8, 2023
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

null 29 Oct 1, 2022
[PyTorch] Official implementation of CVPR2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency". https://arxiv.org/abs/2103.05465

PointDSC repository PyTorch implementation of PointDSC for CVPR'2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency",

null 153 Dec 14, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 8, 2023
A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video

null 45 Nov 29, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 8, 2023
Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

MIT CSAIL Computer Vision 424 Dec 1, 2022
The open source code of SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation.

SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation(ICPR 2020) Overview This code is for the paper: Spatial Attention U-Net for Retinal V

Changlu Guo 151 Dec 28, 2022
Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Twins: Revisiting the Design of Spatial Attention in Vision Transformers Very recently, a variety of vision transformer architectures for dense predic

null 482 Dec 18, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Lixiang Ru 33 Dec 12, 2022
Neural Scene Graphs for Dynamic Scene (CVPR 2021)

Implementation of Neural Scene Graphs, that optimizes multiple radiance fields to represent different objects and a static scene background. Learned representations can be rendered with novel object compositions and views.

null 151 Dec 26, 2022
A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

Keren Ye 35 Nov 20, 2022