PSANet: Point-wise Spatial Attention Network for Scene Parsing, ECCV2018.

Overview

PSANet: Point-wise Spatial Attention Network for Scene Parsing (under construction)

by Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, and Jiaya Jia; details are available on the project page.

Introduction

This repository is built for PSANet and contains the source code for the PSA module along with the related evaluation code. For installation, please merge the related layers into the PSPNet repository and follow the description there (tested with CUDA 7.0/7.5 + cuDNN v4).

PyTorch Version

A highly optimized PyTorch codebase for semantic segmentation is available in the semseg repository, including full training and testing code for PSPNet and PSANet.
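For readers who prefer PyTorch, the snippet below is a minimal, simplified sketch of the 'collect' branch of point-wise spatial attention. It is not the released implementation (the official layers predict an over-complete, roughly (2H-1) x (2W-1), map per position and then crop it); the module name, layer sizes, and the softmax normalization are illustrative assumptions.

    # Simplified sketch of a "collect"-style point-wise spatial attention block.
    # Assumed names and normalization; not the official PSA module.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SimplePSACollect(nn.Module):
        def __init__(self, in_channels, mid_channels, feat_h, feat_w):
            super().__init__()
            self.feat_h, self.feat_w = feat_h, feat_w
            self.reduce = nn.Conv2d(in_channels, mid_channels, kernel_size=1)
            # Each position predicts one attention weight per spatial location.
            self.attention = nn.Conv2d(mid_channels, feat_h * feat_w, kernel_size=1, bias=False)

        def forward(self, x):
            b, c, h, w = x.shape
            assert (h, w) == (self.feat_h, self.feat_w), "attention size is tied to the feature size"
            attn = self.attention(F.relu(self.reduce(x)))  # (b, h*w, h, w)
            attn = attn.view(b, h * w, h * w)              # rows: source positions j, columns: target positions i
            attn = F.softmax(attn, dim=1)                  # normalize over sources (an assumption)
            feats = x.view(b, c, h * w)                    # (b, c, h*w)
            out = torch.bmm(feats, attn)                   # y_i = sum_j a_{j,i} * x_j
            return out.view(b, c, h, w)

For a 2048-channel 60 x 60 feature map, SimplePSACollect(2048, 512, 60, 60) returns a tensor of the same shape; note that the intermediate (H*W) x (H*W) attention matrix grows quadratically with the feature resolution, which is why the feature size matters in practice.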

Usage

  1. Clone the repository recursively:

    git clone --recursive https://github.com/hszhao/PSANet.git
  2. Merge the Caffe layers into the PSPNet repository:

    Point-wise spatial attention: pointwise_spatial_attention_layer.hpp/cpp/cu and caffe.proto.

  3. Build Caffe and matcaffe:

    cd $PSANET_ROOT/PSPNet
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
    cd ..
  4. Evaluation:

    • Evaluation code is in folder 'evaluation'.

    • Download the trained models and put them in the related dataset folder under 'evaluation/model'; refer to 'README.md'.

    • Modify the related paths in 'eval_all.m':

      Mainly the variables 'data_root' and 'eval_list'; your image list for evaluation should be similar to those in the folder 'evaluation/samplelist' if you use this evaluation code structure.

    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  5. Results:

    Predictions will be saved in the folder 'evaluation/mc_result', and the expected scores are listed below:

    (mIoU/pAcc. stands for mean IoU and pixel accuracy; 'ss' and 'ms' denote single-scale and multi-scale testing. A minimal sketch of how these two metrics are computed is given after the tables.)

    ADE20K:

    | network   | training data | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|---------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | train         | val          | 41.92/80.17     | 42.97/80.92     | a8e884 |
    | PSANet101 | train         | val          | 42.75/80.71     | 43.77/81.51     | ab5e56 |

    VOC2012:

    | network   | training data          | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|------------------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | train_aug              | val          | 77.24/94.88     | 78.14/95.12     | d5fc37 |
    | PSANet101 | train_aug              | val          | 78.51/95.18     | 79.77/95.43     | 5d8c0f |
    | PSANet101 | COCO + train_aug + val | test         | -/-             | 85.7/-          | 3c6a69 |

    Cityscapes:

    | network   | training data         | testing data | mIoU/pAcc. (ss) | mIoU/pAcc. (ms) | md5sum |
    |-----------|-----------------------|--------------|-----------------|-----------------|--------|
    | PSANet50  | fine_train            | fine_val     | 76.65/95.99     | 77.79/96.24     | 25c06a |
    | PSANet101 | fine_train            | fine_val     | 77.94/96.10     | 79.05/96.30     | 3ac1bf |
    | PSANet101 | fine_train            | fine_test    | -/-             | 78.6/-          | 3ac1bf |
    | PSANet101 | fine_train + fine_val | fine_test    | -/-             | 80.1/-          | 1dfc91 |
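
    For reference, mean IoU and pixel accuracy can be derived from a per-class confusion matrix. The snippet below is a minimal, generic NumPy sketch of that computation; it is not the MATLAB evaluation code shipped in 'evaluation/', and the function name is an assumption.

      # Minimal sketch: mIoU and pixel accuracy from a confusion matrix.
      # Generic illustration only; not the evaluation code in 'evaluation/'.
      import numpy as np

      def scores_from_confusion(conf):
          """conf[i, j] counts pixels of ground-truth class i predicted as class j."""
          tp = np.diag(conf).astype(np.float64)
          gt = conf.sum(axis=1).astype(np.float64)    # pixels per ground-truth class
          pred = conf.sum(axis=0).astype(np.float64)  # pixels per predicted class
          union = gt + pred - tp
          iou = tp / np.maximum(union, 1)             # per-class IoU, guarding empty classes
          miou = iou[gt > 0].mean()                   # average only over classes present in the GT
          pacc = tp.sum() / max(conf.sum(), 1)        # overall pixel accuracy
          return miou, pacc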
  6. Demo video:

    • Video processed by PSANet (with PSPNet) on the BDD dataset for drivable-area segmentation: Video.

Citation

If PSANet is useful for your research, please consider citing:

@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Questions

Please contact '[email protected]' or '[email protected]'

Comments
  • Confusion about the Attention Map Generation of distribute branch

    Hi, @hszhao

    It seems that the collect branch and the distribute branch are the same (as shown in Fig. 3). Could you give the equation of the distribute branch (like Eq. 9 for the collect branch in the paper)?

    opened by KamInNg 7
  • How to deal with images with different input size?

    It seems that the collect and distribute operations require the training and testing images to have the same input feature size. So how should images with different input sizes be handled?

    opened by manmanCover 6
  • When will you release the PyTorch version?

    Hello, could you please release your PyTorch version? I keep trying to do this with Caffe, but I always run into bugs. As a newcomer to deep learning, it's really difficult for me, so please release your PyTorch version. Thank you very much.

    opened by EchoAmor 2
  • How to merge layers into PSPNet?

    Hi! I have merged the layers you mentioned into PSPNet, but when I run make, the following errors appear. Can you tell me how to fix this?

    I also want to know whether I can train this model because you didn't mention it in this repository. Thanks very much!

    Here are some errors:

    src/caffe/layers/pointwise_spatial_attention_layer.cpp: In member function ‘virtual void caffe::PointwiseSpatialAttentionLayer<Dtype>::LayerSetUp(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&)’:
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:12:3: error: ‘PointwiseSpatialAttentionParameter’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:71:3: error: ‘PointwiseSpatialAttentionParameter_PSAType_COLLECT’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:71:3: error: ‘PointwiseSpatialAttentionParameter_PSAType_DISTRIBUTE’ was not declared in this scope
    src/caffe/layers/pointwise_spatial_attention_layer.cpp: In function ‘void caffe::PSAForward_buffer_mask_collect_cpu(int, int, int, int, int, int, int, const Dtype*, Dtype*)’:
    src/caffe/layers/pointwise_spatial_attention_layer.cpp:116:51: error: there are no arguments to ‘max’ that depend on a template parameter, so a declaration of ‘max’ must be available [-fpermissive]
    src/caffe/layers/pointwise_spatial_attention_layer.cu(201): error: class "caffe::LayerParameter" has no member "pointwise_spatial_attention_param"
      detected during instantiation of "void caffe::PointwiseSpatialAttentionLayer<Dtype>::Backward_gpu(const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &, const std::vector<__nv_bool, std::allocator<__nv_bool>> &, const std::vector<caffe::Blob<Dtype> *, std::allocator<caffe::Blob<Dtype> *>> &) [with Dtype=double]" (220): here
    8 errors detected in the compilation of "/tmp/tmpxft_00004960_00000000-19_pointwise_spatial_attention_layer.compute_52.cpp1.ii".
    make: *** [.build_release/cuda/src/caffe/layers/pointwise_spatial_attention_layer.o] Error 1

    opened by EchoAmor 1
  • How to visualize the mask as shown in subsection 4.5?

    Hi, I am confused about how to visualize the mask predicted by PSANet as described in subsection 4.5. The predicted attention map has a size of (H, W, H*W); how do I get the final mask shown in Fig. 6? Best regards.

    opened by wangq95 1
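
    One way to inspect such a map, assuming the attention tensor is laid out as (H*W, H, W) with the channel index enumerating points of interest, is to pick a point and take its channel as an H x W mask. The layout and names below are assumptions for illustration, not the released code.

      # Hedged sketch: read out the attention mask for one chosen point.
      # Assumes attn has shape (H*W, H, W); the actual layout may differ.
      import numpy as np
      import matplotlib.pyplot as plt

      def point_attention_mask(attn, y, x):
          hw, h, w = attn.shape
          assert hw == h * w
          return attn[y * w + x]  # (H, W) mask associated with point (y, x)

      # Hypothetical usage: overlay the (upsampled) mask on the input image.
      # mask = point_attention_mask(attn, y=30, x=40)
      # plt.imshow(mask, cmap='jet'); plt.colorbar(); plt.show()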
  • evaluate error

    When I evaluate the psanet50_voc2012_465.prototxt net using your pretrained psanet50_voc2012_d5fc37.caffemodel, there are some errors:

      F0920 09:00:23.963999 5519 net.cpp:829] Cannot copy param 0 weights from layer 'PSA_COLLECT_fc2'; shape mismatch. Source param shape is 13689 512 1 1 (7008768); target param shape is 3481 512 1 1 (1782272). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer

    But the 'PSA_COLLECT_fc2' layer's output channel is 3481 in your psanet50_voc2012_465.prototxt:

      layer {
        name: "PSA_COLLECT_fc2"
        type: "Convolution"
        bottom: "PSA_COLLECT_fc1"
        top: "PSA_COLLECT_fc2"
        param {
          lr_mult: 10
          decay_mult: 1
        }
        convolution_param {
          num_output: 3481  # 59*59
          kernel_size: 1
          stride: 1
          weight_filler {
            type: "msra"
          }
          bias_term: false
        }
      }

    opened by 994374821 1
  • HoleConvolution

    I have some confusion about the 'HoleConvolution' layer when trying to evaluate your pretrained model. I did not find the definition of 'HoleConvolution', which appears in the 'prototxt' file. Where is the 'HoleConvolution' layer defined? Thanks.

    opened by 994374821 1
  • about the gpu memory consumed

    Hello. Thanks for the code. According to the paper, during inference there will be two tensors of size b * (2*h*w) * h * w in the PSA module; will this consume too much memory?

    opened by chenchr 1
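
    As a rough, back-of-the-envelope illustration of the tensor size mentioned above (batch size 1, a 60 x 60 feature map, and float32 storage are assumed values, not figures taken from this repository):

      # Rough memory estimate for one b x (2*h*w) x h x w float32 tensor.
      b, h, w = 1, 60, 60                    # assumed batch size and feature size
      elements = b * (2 * h * w) * h * w     # 1 * 7200 * 60 * 60 = 25,920,000
      bytes_per_tensor = elements * 4        # 4 bytes per float32 element
      print(f"{bytes_per_tensor / 1024**2:.1f} MiB per tensor")  # ~98.9 MiB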
Code for "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds", CVPR 2021

PV-RAFT This repository contains the PyTorch implementation for paper "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clou

Yi Wei 43 Dec 5, 2022
Edge-aware Guidance Fusion Network for RGB-Thermal Scene Parsing

EGFNet Edge-aware Guidance Fusion Network for RGB-Thermal Scene Parsing Dataset and Results Test maps: Baidu Netdisk, extraction code: zust Citation @ARTICLE{ author={Zhou,

ShaohuaDong 10 Dec 8, 2022
Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

GSAN Introduction Code for paper GSAN: Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving, wh

YE Luyao 6 Oct 27, 2022
PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features

PRIN/SPRIN: On Extracting Point-wise Rotation Invariant Features Overview This repository is the Pytorch implementation of PRIN/SPRIN: On Extracting P

Yang You 17 Mar 2, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022
Unofficial implementation of Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segmentation

Point-Unet This is an unofficial implementation of the MICCAI 2021 paper Point-Unet: A Context-Aware Point-Based Neural Network for Volumetric Segment

Namt0d 9 Dec 7, 2022
This is an official implementation of "Polarized Self-Attention: Towards High-quality Pixel-wise Regression"

Polarized Self-Attention: Towards High-quality Pixel-wise Regression This is an official implementation of: Huajun Liu, Fuqiang Liu, Xinyi Fan and Don

DeLightCMU 212 Jan 8, 2023
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

null 29 Oct 1, 2022
[PyTorch] Official implementation of CVPR2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency". https://arxiv.org/abs/2103.05465

PointDSC repository PyTorch implementation of PointDSC for CVPR'2021 paper "PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency",

null 153 Dec 14, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 8, 2023
A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild A pytorch implementation of the CVPR2021 paper "VSPW: A Large-scale Dataset for Video

null 45 Nov 29, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 8, 2023
Development kit for MIT Scene Parsing Benchmark

Development Kit for MIT Scene Parsing Benchmark [NEW!] Our PyTorch implementation is released in the following repository: https://github.com/hangzhao

MIT CSAIL Computer Vision 424 Dec 1, 2022
The open source code of SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation.

SA-UNet: Spatial Attention U-Net for Retinal Vessel Segmentation(ICPR 2020) Overview This code is for the paper: Spatial Attention U-Net for Retinal V

Changlu Guo 151 Dec 28, 2022
Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Twins: Revisiting the Design of Spatial Attention in Vision Transformers Very recently, a variety of vision transformer architectures for dense predic

null 482 Dec 18, 2022
Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts The rapid progress in 3D scene understanding has come with growing dem

Facebook Research 182 Dec 30, 2022
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Lixiang Ru 33 Dec 12, 2022
Neural Scene Graphs for Dynamic Scene (CVPR 2021)

Implementation of Neural Scene Graphs, that optimizes multiple radiance fields to represent different objects and a static scene background. Learned representations can be rendered with novel object compositions and views.

null 151 Dec 26, 2022
A weakly-supervised scene graph generation codebase. The implementation of our CVPR2021 paper ``Linguistic Structures as Weak Supervision for Visual Scene Graph Generation''

README.md shall be finished soon. WSSGG 0 Overview 1 Installation 1.1 Faster-RCNN 1.2 Language Parser 1.3 GloVe Embeddings 2 Settings 2.1 VG-GT-Graph

Keren Ye 35 Nov 20, 2022