Pyramid Scene Parsing Network, CVPR2017.

Pyramid Scene Parsing Network

by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia. Details are on the project page.

Introduction

This repository is for 'Pyramid Scene Parsing Network', which ranked 1st place in the ImageNet Scene Parsing Challenge 2016. The code is modified from the Caffe version of DeepLab v2, with the Caffe version from yjxiong used for evaluation. We merge the batch normalization layer named 'bn_layer' from the former into the latter, while keeping the original 'batch_norm_layer' in the latter unchanged for compatibility. The difference is that 'bn_layer' contains four parameters ('slope, bias, mean, variance') while 'batch_norm_layer' contains two ('mean, variance'). Some of the evaluation code is borrowed from MIT Scene Parsing.
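To make the parameter difference concrete, here is a minimal sketch in plain Python. It assumes the standard batch-normalization formula and that 'slope' and 'bias' act as a per-channel scale and shift; the function names and the `eps` value are illustrative assumptions and are not taken from the Caffe sources.

```python
import math

EPS = 1e-5  # assumed stabilizing constant; the layers' actual value is not stated here


def batch_norm_two_param(x, mean, variance, eps=EPS):
    """Normalize with stored statistics only ('mean, variance')."""
    return (x - mean) / math.sqrt(variance + eps)


def bn_four_param(x, slope, bias, mean, variance, eps=EPS):
    """Normalize, then apply an affine transform ('slope, bias, mean, variance')."""
    return slope * batch_norm_two_param(x, mean, variance, eps) + bias


if __name__ == "__main__":
    x, mean, variance = 3.0, 1.0, 4.0
    print(batch_norm_two_param(x, mean, variance))
    print(bn_four_param(x, slope=2.0, bias=0.5, mean=mean, variance=variance))
```

Under this reading, a two-parameter layer would need a separate scale layer after it to express what the four-parameter layer does in one step.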

PyTorch Version

Highly optimized PyTorch codebases for semantic segmentation are available in the repo semseg, including full training and testing code for PSPNet and PSANet.

Installation

For installation, please follow the instructions of Caffe and DeepLab v2. cuDNN v4 is required to enable cuDNN GPU acceleration. If you encounter an error related to 'matio', please download and install matio as required by 'DeepLab v2'.

The code has been tested successfully on Ubuntu 14.04 and 12.04 with CUDA 7.0.

Usage

  1. Clone the repository:

    git clone https://github.com/hszhao/PSPNet.git
  2. Build Caffe and matcaffe:

    cd $PSPNET_ROOT
    cp Makefile.config.example Makefile.config
    vim Makefile.config
    make -j8 && make matcaffe
  3. Evaluation:

    • Evaluation code is in folder 'evaluation'.
    • Download trained models and put them in folder 'evaluation/model':
    • Modify the related paths in 'eval_all.m':
      • Mainly the variables 'data_root' and 'eval_list'. If you use this evaluation code structure, your image list should be similar to those in folder 'evaluation/samplelist'.
      • Matlab 'parfor' evaluation is used and the default GPU IDs are [0:3]. Modify variable 'gpu_id_array' if needed. We assume that the number of images is divisible by the number of GPUs; if not, you can pad your image list, or switch to single-GPU evaluation by setting 'gpu_id_array' to a single ID and changing 'parfor' to a 'for' loop.
    cd evaluation
    vim eval_all.m
    • Run the evaluation scripts:
    ./run.sh
    
  4. Results:

    Prediction results will be saved in folder 'evaluation/mc_result', and the expected scores are listed below.

    (Single-scale testing is denoted 'ss' and multi-scale testing is denoted 'ms'.)

    • PSPNet50 on ADE20K valset (mIoU/pAcc): 41.68/80.04 (ss) and 42.78/80.76 (ms)
    • PSPNet101 on VOC2012 testset (mIoU): 85.41 (ms)
    • PSPNet101 on cityscapes valset (mIoU/pAcc): 79.70/96.38 (ss) and 80.91/96.59 (ms)
  5. Demo video:

    Video processed by PSPNet101 on cityscapes dataset:

    Merge with colormap on side: Video1

    Alpha blending with value as 0.5: Video2
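The padding advice in step 3 (making the image count divisible by the number of GPUs) can be sketched as follows. This is an illustration only: the helper names are hypothetical, padding by repeating the last entry is one possible choice, and the contiguous per-GPU split is an assumption about how the work might be divided, not a description of 'eval_all.m'.

```python
def pad_image_list(image_list, num_gpus):
    """Repeat the last entry until the length is a multiple of num_gpus."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not image_list:
        return []
    padding = (num_gpus - len(image_list) % num_gpus) % num_gpus
    return list(image_list) + [image_list[-1]] * padding


def split_for_gpus(image_list, num_gpus):
    """Split the padded list into num_gpus contiguous, equally sized chunks."""
    padded = pad_image_list(image_list, num_gpus)
    chunk = len(padded) // num_gpus
    return [padded[i * chunk:(i + 1) * chunk] for i in range(num_gpus)]


if __name__ == "__main__":
    images = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
    for gpu_id, chunk in enumerate(split_for_gpus(images, 4)):
        print(gpu_id, chunk)
```

Padded entries are evaluated twice, so any scores should be computed over the original list only.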

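The scores in step 4 are reported as mIoU and pAcc. As a reminder of what these measure, here is a small sketch using the usual definitions computed from a confusion matrix. The function names are hypothetical, and details such as skipping classes that never occur are assumptions; the repository's MATLAB evaluation code may handle them differently.

```python
def confusion_matrix(gt, pred, num_classes):
    """cm[g][p] counts pixels with ground-truth class g predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        cm[g][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(row[c] for row in cm) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(pixel_accuracy(cm), mean_iou(cm))
```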
Citation

If PSPNet is useful for your research, please consider citing:

@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}

Questions

Please contact '[email protected]'

Comments
  • matio.h could not found

    Hi,

    I came across the error "matio.h could not found" while running make -j8. I also checked src/caffe/util/ and there is no matio.h in either your version of Caffe or the BVLC version of Caffe.

    The following is Error Message:

    CXX src/caffe/util/matio_io.cpp
    src/caffe/util/matio_io.cpp:10:19: fatal error: matio.h: No such file or directory
    compilation terminated.
    Makefile:575: recipe for target '.build_release/src/caffe/util/matio_io.o' failed
    make: *** [.build_release/src/caffe/util/matio_io.o] Error 1

    Best Wishes~

    opened by zhengdixin 13
  • Can this version be compiled by CUDA 8.0?

    Hi, I found that there are some errors when compiling with CUDA 8.0.

    1 error detected in the compilation of "/tmp/tmpxft_00000a2e_00000000-5_domain_transform_forward_only_layer.cpp4.ii".
    make: *** [.build_release/cuda/src/caffe/layers/domain_transform_forward_only_layer.o] Error 1
    make: *** Waiting for unfinished jobs....
    
    

    This problem can also be seen when compiling DeepLab with CUDA 8.0.

    Do you have any solution for this problem? Thank you.

    opened by bearpaw 12
  • Define class (ColorMap) for my Data

    Dear all.

    I want to train PSPNet with my own data. Where and how can I define the classes (labels)? For example: RGB(123,123,123) is for clothes, RGB(200,123,150) is for desk, and so on.

    Thanks so much.

    opened by ThienAnh 7
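One common way to approach the question above, sketched under assumptions: keep a table from RGB color to class index and convert color-coded label images into index maps before training. The two colors come from the question; the helper name, the ignore value of 255, and the nested-list image representation are hypothetical and not part of this repository.

```python
# Colors taken from the question; class indices are arbitrary examples.
COLOR_TO_CLASS = {
    (123, 123, 123): 0,  # clothes
    (200, 123, 150): 1,  # desk
}
IGNORE_LABEL = 255  # assumed value for pixels whose color is not in the table


def rgb_label_to_index(rgb_image, color_to_class=COLOR_TO_CLASS, ignore=IGNORE_LABEL):
    """Convert rows of (R, G, B) pixels into rows of integer class indices."""
    return [
        [color_to_class.get(tuple(pixel), ignore) for pixel in row]
        for row in rgb_image
    ]


if __name__ == "__main__":
    label = [
        [(123, 123, 123), (200, 123, 150)],
        [(0, 0, 0), (123, 123, 123)],
    ]
    print(rgb_label_to_index(label))
```

The inverse table (class index to color) serves as the colormap for visualizing predictions.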
  • MATLAB crashes using cityscapes trained model

    I have been trying to test pspnet101_VOC2012.caffemodel, pspnet101_cityscapes.caffemodel, and pspnet50_ADE20K.caffemodel with my set of input images of size 1280*600. Each time I run eval_all.m, MATLAB crashes with the error shown in the screenshots below.

    [Screenshots from 2017-05-27 10-45-47 and 2017-05-27 10-46-04]

    I don't understand what the actual problem is. I am running it on a GPU with the configuration shown here:

    [Screenshot from 2017-05-27 10-54-29]

    I have set up my Makefile.config with OpenMPI.

    On trying to run it with pspnet101_VOC2012.caffemodel with a sample of images I am able to run it without crashing, but the result saved in /home/PSPNet/evaluation/mc_result/VOC2012/test/pspnet101_473/color shows images with all pixel values 11.

    Please let me know what is wrong, and guide me to successfully test PSPNet with my data set.

    opened by uu-ujwalkumar06 6
  • Evaluation with VOC2012

    Hi As per step 3. evaluation,

    1. I downloaded VOC2012 dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar.
    2. updated the path in eval_all.m. (inside /evaluation/samplelist)
    3. I also used the existing VOC2012_test.txt
    4. While running the ./run.sh, I got the below error.

    Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
    Error using importdata (line 137)
    Unable to open file.

    Error in eval_sub (line 3)
    list = importdata(fullfile(data_root,eval_list));

    Error in eval_all (line 71)
    parfor i = 1:gpu_num %change 'parfor' to 'for' if singe GPU testing is used


    One thing I observed is that the filenames listed in VOC2012_test.txt are missing from the path VOC2012/JPEGImages.

    Ex: The first 5 lines of the VOC2012_test.txt file are:

    /JPEGImages/2008_000006.jpg
    /JPEGImages/2008_000011.jpg
    /JPEGImages/2008_000012.jpg
    /JPEGImages/2008_000018.jpg
    /JPEGImages/2008_000024.jpg

    But the available files in VOC2012/JPEGImages are like:

    /JPEGImages/2008_000002.jpg
    /JPEGImages/2008_000003.jpg
    /JPEGImages/2008_000007.jpg
    /JPEGImages/2008_000008.jpg
    /JPEGImages/2008_000009.jpg

    It is unclear why files such as 6, 11, 12, 18, 24, ... are missing from VOC2012/JPEGImages. Do I need to download VOC2012 from some other place? Please let me know.

    FYI, I tried downloading ADE20K, but the folder structure is totally different and I did not know how to modify ADE20K_val.txt without a lot of manual work, so I gave up.

    opened by gopi77 5
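A quick way to diagnose this kind of mismatch is to check every entry of the image list against the data root before starting the evaluation. The sketch below is a hypothetical helper, not part of the repository; it assumes one entry per line, with the image path as the first whitespace-separated field and a leading '/' as in the list quoted above.

```python
import os


def missing_entries(data_root, eval_list_path):
    """Return the list entries whose image file does not exist under data_root."""
    missing = []
    with open(eval_list_path) as list_file:
        for line in list_file:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            relative = fields[0]
            # Entries start with '/', so strip it before joining with the root.
            path = os.path.join(data_root, relative.lstrip("/"))
            if not os.path.isfile(path):
                missing.append(relative)
    return missing


if __name__ == "__main__":
    # Both paths are placeholders; point them at your own data and list.
    root = "VOC2012"
    eval_list = os.path.join("samplelist", "VOC2012_test.txt")
    if os.path.isfile(eval_list):
        for entry in missing_entries(root, eval_list):
            print("missing:", entry)
```

If every entry is reported missing, the list and the downloaded archive most likely cover different splits of the dataset.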
  • How to make it compatible with CUDNN v5

    When I try to find why I cannot compile PSPNet with CUDNN v5 successfully, I find that the author is changing the README to say it is compatible with CUDNN v4... So, can anybody give me a hint how to modify the code to make it compatible with CUDNN v5?

    opened by zhixy 4
  • runtest failed

    ...
    ./include/caffe/test/test_gradient_check_util.hpp:175: Failure
    The difference between computed_gradient and estimated_gradient is 0.033336639404296875, which exceeds threshold_ * scale, where
    computed_gradient evaluates to 1.9999799728393555,
    estimated_gradient evaluates to 1.9666433334350586, and
    threshold_ * scale evaluates to 0.00019999798678327352.
    debug: (top_id, top_data_id, blob_id, feat_id)=0,119,3,119; feat = 0.96758121252059937; objective+ = 2.1408424377441406; objective- = 2.1015095710754395
    ...
    ...
    [  FAILED  ] BatchNormLayerTest/2.TestGradient, where TypeParam = caffe::GPUDevice<float> (3116 ms)
    [ RUN      ] BatchNormLayerTest/2.TestForward
    src/caffe/test/test_batch_norm_layer.cpp:75: Failure
    The difference between 1 and var is 111861.75, which exceeds kErrorBound, where
    1 evaluates to 1,
    var evaluates to 111862.75, and
    kErrorBound evaluates to 0.0010000000474974513.
    src/caffe/test/test_batch_norm_layer.cpp:75: Failure
    The difference between 1 and var is 118616.2578125, which exceeds kErrorBound, where
    1 evaluates to 1,
    var evaluates to 118617.2578125, and
    kErrorBound evaluates to 0.0010000000474974513.
    [  FAILED  ] BatchNormLayerTest/2.TestForward, where TypeParam = caffe::GPUDevice<float> (2 ms)
    [ RUN      ] BatchNormLayerTest/2.TestForwardInplace
    src/caffe/test/test_batch_norm_layer.cpp:119: Failure
    The difference between 1 and var is 114000.234375, which exceeds kErrorBound, where
    1 evaluates to 1,
    var evaluates to 114001.234375, and
    kErrorBound evaluates to 0.0010000000474974513.
    src/caffe/test/test_batch_norm_layer.cpp:119: Failure
    The difference between 1 and var is 91060.625, which exceeds kErrorBound, where
    1 evaluates to 1,
    var evaluates to 91061.625, and
    kErrorBound evaluates to 0.0010000000474974513.
    [  FAILED  ] BatchNormLayerTest/2.TestForwardInplace, where TypeParam = caffe::GPUDevice<float> (1 ms)
    [----------] 3 tests from BatchNormLayerTest/2 (3119 ms total)
    
    [----------] 8 tests from SliceLayerTest/3, where TypeParam = caffe::GPUDevice<double>
    [ RUN      ] SliceLayerTest/3.TestSliceAcrossChannels
    [       OK ] SliceLayerTest/3.TestSliceAcrossChannels (2 ms)
    ...
    ...
    [       OK ] ContrastiveLossLayerTest/0.TestGradientLegacy (127 ms)
    [----------] 4 tests from ContrastiveLossLayerTest/0 (266 ms total)
    
    [----------] 1 test from LayerFactoryTest/0, where TypeParam = caffe::CPUDevice<float>
    [ RUN      ] LayerFactoryTest/0.TestCreateLayer
    *** Aborted at 1489289221 (unix time) try "date -d @1489289221" if you are using GNU date ***
    PC: @     0x7fd3894b5962 cfree
    *** SIGSEGV (@0x1f9) received by PID 11176 (TID 0x7fd393215740) from PID 505; stack trace: ***
        @     0x7fd38980c390 (unknown)
        @     0x7fd3894b5962 cfree
        @     0x7fd38a1302e1 deallocate()
        @     0x7fd38a18c0e0 caffe::DenseCRFLayer<>::DeAllocateAllData()
        @     0x7fd38a190e66 caffe::DenseCRFLayer<>::~DenseCRFLayer()
        @     0x7fd38a1914a9 caffe::DenseCRFLayer<>::~DenseCRFLayer()
        @           0x71fa18 caffe::LayerFactoryTest_TestCreateLayer_Test<>::TestBody()
        @           0x8b5e43 testing::internal::HandleExceptionsInMethodIfSupported<>()
        @           0x8af45a testing::Test::Run()
        @           0x8af5a8 testing::TestInfo::Run()
        @           0x8af685 testing::TestCase::Run()
        @           0x8b095f testing::internal::UnitTestImpl::RunAllTests()
        @           0x8b0c83 testing::UnitTest::Run()
        @           0x46645d main
        @     0x7fd389452830 __libc_start_main
        @           0x46d7e9 _start
        @                0x0 (unknown)
    Makefile:526: recipe for target 'runtest' failed
    make: *** [runtest] Segmentation fault
    

    Please help! My setup: CUDA 8, cuDNN 5, Ubuntu 16, GTX 950M.

    opened by lawpdas 3
  • error: function "atomicAdd(double *, double)" has already been defined

    Hi @hszhao, after I ran cp Makefile.config.example Makefile.config and then make -j9, I hit this problem: ./include/caffe/common.cuh(9): error: function "atomicAdd(double *, double)" has already been defined. How can I fix it? Thanks.

    opened by zimenglan-sysu-512 3
  • Caffe make error

    Getting the following error when compiling Caffe with make:

    ./include/caffe/common.cuh(9): error: function "atomicAdd(double *, double)" has already been defined
    
    1 error detected in the compilation of "/tmp/tmpxft_00003eec_00000000-5_domain_transform_layer.cpp4.ii".
    

    Any solutions?

    opened by BAILOOL 2
  • libmatio.so.2: cannot open shared object file

    I downloaded and installed matio with the following commands:

    ./configure
    make
    sudo make install
    

    I also set MATLAB_DIR := /usr/local/MATLAB/R2014b in the Makefile.config file. Then I compiled PSPNet using

    make -j8 && make matcaffe

    I got the success message

    MEX matlab/+caffe/private/caffe_.cpp
    Building with 'g++'.
    Warning: You are using gcc version '5.4.1'. The version of gcc is not supported. The version currently supported with MEX is '4.7.x'. For a list of currently supported compilers see: http://www.mathworks.com/support/compilers/current_release.
    MEX completed successfully.
    

    However, when I run the eval_all.m file, I got the error

    Invalid MEX-file '/home/john/PSPNet/matlab/+caffe/private/caffe_.mexa64': libmatio.so.2: cannot open shared object file: No such file or directory
    Error in caffe.reset_all (line 5)
    caffe_('reset');
    Error in eval_sub (line 20)
    caffe.reset_all();
    Error in eval_all (line 72)  eval_sub(data_name,data_root,eval_list,model_weights,model_deploy,fea_cha,base_size,crop_size,data_class,data_colormap, ..
    
    How can I fix it? Thank you
    
    opened by mjohn123 2
  • [FIXED] Why are scale_factors used to scale pixel values?

    I read the function TransformImgAndSeg, which contains the following code.

    // perform scaling
    if (scale_factors_.size() > 0) {
      int scale_ind = Rand(scale_factors_.size());
      Dtype scale   = scale_factors_[scale_ind];
      
      if (scale != 1) {
        img_height *= scale;
        img_width  *= scale;
        ...
    

    scale is a random scale_factor defined in transform_param, used to scale the size of the image. This makes perfect sense.

    However, I noticed that the scale is also used to scale the pixel values in the same function.

    if (has_mean_file) {
      int mean_index = (c * img_height + h_off + h) * img_width + w_off + w;
      transformed_data[top_index] =
        (pixel - mean[mean_index]) * scale;
    } else {
        if (has_mean_values) {
          transformed_data[top_index] =
            (pixel - mean_values_[c]) * scale;
        } else {
          transformed_data[top_index] = pixel * scale;
        }
    }
    

    I think we should scale the pixel values using the scale in transform_param instead of this random scale_factor, right?

    opened by jianchao-li 1
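The distinction raised in this question can be sketched as two separate quantities: a randomly chosen spatial scale that resizes the image, and a fixed value scale applied to pixel intensities after mean subtraction. The sketch reflects the reading proposed in the question rather than confirmed repository behavior, and all function names are hypothetical.

```python
import random


def pick_spatial_scale(scale_factors, rng=random):
    """Randomly choose one resize factor; fall back to 1.0 if none are configured."""
    return rng.choice(scale_factors) if scale_factors else 1.0


def scaled_size(height, width, spatial_scale):
    """Resize the image dimensions, truncating to integers as the C++ code does."""
    return int(height * spatial_scale), int(width * spatial_scale)


def normalize_pixel(pixel, mean_value, value_scale):
    """Subtract the mean, then multiply by the fixed value scale (not the resize factor)."""
    return (pixel - mean_value) * value_scale


if __name__ == "__main__":
    s = pick_spatial_scale([0.5, 0.75, 1.0, 1.25])
    print(s, scaled_size(100, 200, s))
    print(normalize_pixel(128.0, 104.0, value_scale=1.0))
```

Keeping the two scales in separate variables makes it harder to reuse the random resize factor for intensities by accident.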
  • Reason to have a fixed inference size (473x473)

    Hi, Thank you for sharing the code and trained models! I have a question specific to the demo in your PyTorch implementation. As I understand, you are using a base_size = 512 to which any incoming image is resized regardless of the input dimensions while maintaining the original aspect ratio. Then the inference is run in a grid fashion on crops of 473x473.

    My question is: why have a fixed crop_size or base_size at all? There is no fully connected layer in the architecture, so why do we need this fixed size? Is this an arbitrary choice of numbers, or is there a solid reason for it?

    Thank you for your time! Best, Touqeer

    opened by TouqeerAhmad 1
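The grid-style inference described in the question can be sketched as follows. The values base_size = 512 and crop size 473 come from the question; the rule that the longer side is resized to base_size, the 2/3 stride, and the function names are assumptions made for illustration and may not match the PyTorch implementation.

```python
import math


def resize_to_base(height, width, base_size=512):
    """Scale so the longer side equals base_size, keeping the aspect ratio."""
    scale = base_size / max(height, width)
    return round(height * scale), round(width * scale)


def crop_origins(length, crop_size=473, stride=None):
    """Start offsets of crops that together cover [0, length) along one axis."""
    if stride is None:
        stride = math.ceil(crop_size * 2 / 3)  # assumed overlap between crops
    if length <= crop_size:
        return [0]  # a single crop; the image would be padded up to crop_size
    count = math.ceil((length - crop_size) / stride) + 1
    # Clamp the last crop so it ends exactly at the image border.
    return [min(i * stride, length - crop_size) for i in range(count)]


if __name__ == "__main__":
    h, w = resize_to_base(600, 1280)
    print(h, w, crop_origins(h), crop_origins(w))
```

Predictions from overlapping crops would then be averaged where they overlap.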
  • Questions about PSPNet.

    Thank you for uploading your code. It is very helpful to understand PSPNet. I have two questions about your paper.

    1. You wrote

    we use a pretrained ResNet model with the dilated network strategy to extract the feature map. The final feature map size is 1/8 of the input image.

    in the paper. But I think the feature map size is 1/16 when you use ResNet50. Do you use only first 3 blocks of ResNet50?

    2. You wrote

    Then we directly upsample the low-dimension feature maps to get the same size feature as the original feature map via bilinear interpolation. Finally, different levels of features are concatenated as the final pyramid pooling global feature.

    in Section 3.2 in the paper. I understand we have to concatenate resized different levels of features and feature map extracted by ResNet 50. But after that, the image size is 1/8 of the input image. How did you resize them to the same image size as input image?

    opened by kazucmpt 6
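On the first question, the 1/8 figure follows from how strides multiply. The sketch below assumes the standard ResNet stride layout (conv1, max-pool, then four residual stages) and that the dilated network strategy replaces the stride of the last two residual stages with dilation; the names are hypothetical and the repository's network definition is not quoted here.

```python
# Assumed stride layout: conv1, max-pool, then four residual stages.
RESNET_STRIDES = [2, 2, 1, 2, 2, 2]


def output_stride(stage_strides, num_dilated_stages):
    """Overall downsampling when the last stages use dilation instead of stride."""
    kept = stage_strides[:len(stage_strides) - num_dilated_stages]
    total = 1
    for stride in kept:
        total *= stride
    return total


if __name__ == "__main__":
    for dilated in (0, 1, 2):
        print(dilated, output_stride(RESNET_STRIDES, dilated))
```

With no dilated stages the same layout downsamples by 32, with one by 16, and with two by 8.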
  • math_functions.cu:375 [Check failed: status== CURAND_STATUS_SUCCESS (201 VS. 0) CURAND_STATUS_LAUNCH_FAILURE]

    math_functions.cu:375 [Check failed: status== CURAND_STATUS_SUCCESS (201 VS. 0) CURAND_STATUS_LAUNCH_FAILURE] so I can not train the model. Can you help me? Looking forward to your reply!

    opened by Alice-kenan 0
  • Problem with evaluation

    Hi,

    Thanks for your work. Is it possible to evaluate your results with the Cityscapes evaluation tool? It has some requirements on the value of every pixel.

    Thanks!

    opened by YuShen1116 0