Implementation for the paper "ShelfNet for fast semantic segmentation"

Overview

ShelfNet-lightweight for paper (ShelfNet for fast semantic segmentation)

  • This repo contains the implementation of ShelfNet-lightweight models for real-time semantic segmentation on Cityscapes.
  • For real-time tasks, we achieved 74.8% mIoU on the Cityscapes dataset, with a speed of 59.2 FPS (BiSeNet: 61.7 FPS at 74.7% mIoU on a GTX 1080Ti GPU).
  • For non-real-time tasks, we achieved 79.0% mIoU on the Cityscapes test set with a ResNet34 backbone, surpassing other models (PSPNet and BiSeNet) that use larger ResNet50 or ResNet101 backbones.
  • For the non-lightweight ShelfNet implementation, refer to another ShelfNet repo.
  • This branch contains the results of the Cityscapes experiment; for results on PASCAL, see the branch pascal.
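
The mIoU figures above are mean intersection-over-union scores. As a rough illustration of the metric (this is not the repo's evaluation code), the sketch below computes mIoU from a confusion matrix. The function name `mean_iou` and the toy matrix are made up for the example, and skipping classes with an empty union is an assumption about how absent classes should be handled.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: ground truth, columns: prediction)."""
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                # class c pixels predicted as another class
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # other pixels predicted as class c
        union = tp + fp + fn
        if union == 0:
            # Assumption: a class absent from both labels and predictions is skipped
            continue
        ious.append(tp / union)
    # With no valid class at all the mean is undefined, so NaN is returned
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    toy = [[8, 2], [1, 9]]  # hypothetical 2-class confusion matrix
    print(f"mIoU = {mean_iou(toy):.4f}")
```

If no class has any labelled or predicted pixel, this sketch returns NaN, which is one way a NaN mIoU can arise on a custom dataset.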

This repo is based on two implementations: Implementation 1 and Implementation 2. Training takes about 24 hours on 2 GTX 1080Ti GPUs.

Results

Cityscapes results

Link to results on Cityscapes test set

ShelfNet18-lw real-time: https://www.cityscapes-dataset.com/anonymous-results/?id=b2cc8f49fc3267c73e6bb686425016cb152c8bc34fc09ac207c81749f329dc8d
ShelfNet34-lw non real-time: https://www.cityscapes-dataset.com/anonymous-results/?id=c0a7c8a4b64a880a715632c6a28b116d239096b63b5d14f5042c8b3280a7169d

Data Preparation

Download the finely labelled dataset from the Cityscapes server and decompress it into the ./data folder.
You might need to modify the data path here and here.

$ mkdir -p data
$ mv /path/to/leftImg8bit_trainvaltest.zip data
$ mv /path/to/gtFine_trainvaltest.zip data
$ cd data
$ unzip leftImg8bit_trainvaltest.zip
$ unzip gtFine_trainvaltest.zip
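
After unzipping, a quick sanity check of the folder layout can save a failed training run. The sketch below assumes the standard Cityscapes layout (`leftImg8bit` and `gtFine`, each with `train`, `val` and `test` subfolders) under `data`; the function name `missing_cityscapes_dirs` is hypothetical and not part of this repo.

```python
import os


def missing_cityscapes_dirs(root="data"):
    """Return the expected Cityscapes sub-directories that are missing under root."""
    expected = [
        os.path.join(root, kind, split)
        for kind in ("leftImg8bit", "gtFine")
        for split in ("train", "val", "test")
    ]
    return [path for path in expected if not os.path.isdir(path)]


if __name__ == "__main__":
    missing = missing_cityscapes_dirs("data")
    print("layout OK" if not missing else "missing: " + ", ".join(missing))
```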

Two models and the pretrained weights

We provide two models, ShelfNet18 with 64 base channels for real-time semantic segmentation, and ShelfNet34 with 128 base channels for non-real-time semantic segmentation.
Pretrained weights for ShelfNet18 and ShelfNet34.

Requirements

PyTorch 1.1
python3
scikit-image
tqdm

How to run

Change into the model folder (cd ShelfNet18_realtime or cd ShelfNet34_non_realtime)

training

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py

evaluate on validation set (Create a folder called res; this folder is created automatically if you train the model. Put the checkpoint in the res folder, and make sure the checkpoint name and dataset path match evaluate.py. By default, change the checkpoint name to model_final.pth)

python evaluate.py
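
Placing a checkpoint where evaluate.py looks for it by default can be sketched as follows. The helper name `stage_checkpoint` and the source file name in the usage line are hypothetical; only the `res` folder and the `model_final.pth` name come from the instructions above.

```python
import os
import shutil


def stage_checkpoint(src, res_dir="res", name="model_final.pth"):
    """Copy a checkpoint into res_dir under the default name expected at evaluation time."""
    os.makedirs(res_dir, exist_ok=True)  # create res/ if training has not created it yet
    dst = os.path.join(res_dir, name)
    shutil.copyfile(src, dst)
    return dst
```

For example, `stage_checkpoint("/path/to/checkpoint.pth")` would leave a copy at `res/model_final.pth`.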

Running speed

test running speed of ShelfNet18-lw

python test_speed.py

You can test the running speed at a different input image shape by modifying here
You can test the running speed of different models by modifying here
The running speed is an average over 100 single forward passes, so the measured speed may vary between runs. The code returns the mean running time by default.
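
The averaging idea can be sketched independently of the model. The code below is not test_speed.py; the function name `mean_forward_time`, the warm-up runs and the `sync` hook are assumptions added for illustration. When timing a GPU model, passing `torch.cuda.synchronize` as `sync` makes the timer wait for queued kernels to finish.

```python
import time


def mean_forward_time(forward_fn, n_runs=100, warmup=10, sync=None):
    """Return the mean wall-clock time in seconds of n_runs calls to forward_fn."""
    for _ in range(warmup):
        forward_fn()  # warm-up calls are not timed (an assumption, not taken from the repo)
    if sync is not None:
        sync()        # e.g. torch.cuda.synchronize for GPU models
    start = time.perf_counter()
    for _ in range(n_runs):
        forward_fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / n_runs


if __name__ == "__main__":
    # A stand-in workload; replace the lambda with a real forward pass
    mean_s = mean_forward_time(lambda: sum(range(10000)))
    print(f"{mean_s * 1e3:.3f} ms per pass, {1.0 / mean_s:.1f} FPS")
```

FPS is the reciprocal of the mean time per pass, so small timing fluctuations translate directly into FPS differences between runs.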

Comments
  • results on cityscapes

    Hi, I tried to run the test code on the Cityscapes dataset and changed crop_size in option.py to 768, but the mean IoU was only about 60%. Do I need to change base_size as well? Or do you know how I can achieve the results reported in your paper? I used the pretrained ResNet50 you provided. Thanks a lot!

    opened by noahzn 7
  • Compile Errors

    RuntimeError: Error building extension 'enclib_gpu': [1/4] :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/encoding_kernel.cu -o encoding_kernel.cuda.o FAILED: encoding_kernel.cuda.o :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/encoding_kernel.cu -o encoding_kernel.cuda.o /bin/sh: 1: :/usr/local/cuda-9.0/bin/nvcc: not found [2/4] :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/syncbn_kernel.cu -o syncbn_kernel.cuda.o FAILED: syncbn_kernel.cuda.o :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include 
-I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/syncbn_kernel.cu -o syncbn_kernel.cuda.o /bin/sh: 1: :/usr/local/cuda-9.0/bin/nvcc: not found [3/4] :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/roi_align_kernel.cu -o roi_align_kernel.cuda.o FAILED: roi_align_kernel.cuda.o :/usr/local/cuda-9.0/bin/nvcc -DTORCH_EXTENSION_NAME=enclib_gpu -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/TH -I/home/ayx/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/include/THC -I:/usr/local/cuda-9.0/include -I/home/ayx/anaconda3/envs/pytorch/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 --compiler-options '-fPIC' -std=c++11 -c /home/ayx/ShelfNet-citys/encoding/lib/gpu/roi_align_kernel.cu -o roi_align_kernel.cuda.o /bin/sh: 1: :/usr/local/cuda-9.0/bin/nvcc: not found ninja: build stopped: subcommand failed.

    opened by aiyanxiao 5
  • Bad results on custom training data

    I trained the ShelfNet real-time model on a custom dataset with 4 classes (road, curb, sidewalk, driveable-fallback), using 4.5k training images and 80k training iterations. For the other classes I kept the id as 255. I used the following command for training: CUDA_VISIBLE_DEVICES=0 python3 -m torch.distributed.launch --nproc_per_node=2 train.py and got the following result:

    [images: frame9875_leftImg8bit, lkdfkhdf] I am wondering why the result is so bad. The val mIoU is NaN. Any suggestion is greatly appreciated. Thank you.

    opened by poornimajd 4
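  • About the Cityscapes test results in the paper

    Hello author, I have read your code and paper. With ShelfNet50, both single-scale and multi-scale evaluation on the Cityscapes val set give me results similar to those in your paper. However, I noticed an issue: the methods you compare against all report their Cityscapes performance at 2048x1024, whereas in your code the Cityscapes base_size is set to 1024, and at test time the image is resized so that its longest side equals base_size multiplied by scale. Whenever scale is less than 2, the image fed to the network is therefore always smaller than 2048x1024, which means the results are not comparable with methods that measure segmentation performance at 2048x1024. What was your reasoning here? Sorry to bother you.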

    opened by luuuyi 3
  • shelfnet share weights?

    I really appreciate your work. I am trying to re-implement it, but in your PyTorch code I cannot find any shared-weights operation, although it is mentioned in your paper.
    I notice that the shared-weights block should appear in every stage of ShelfNet's decoder side, yet the in/out channels of the conv2d layers are not the same, so how can I share the weights of the residual block?

    opened by freedomsb 3
  • Tips for better training

    Dear @juntang-zhuang,

    First of all, thank you for this repo. I am trying to use it to train ShelfNet on the Mapillary Vistas Dataset (here you can find my fork). I have succeeded in training the real-time version of ShelfNet; however, the results are pretty bad even after 270000 epochs. The mIoU reached is 34.06%, whereas in this paper they say that they were able to achieve 49.2% with your model. I have already tried to contact the authors, but I got no response.
    Therefore I wanted to ask whether you could give me some tips to improve the training and achieve better results.

    Thank you in advance.

    Best, Micaela

    opened by mive93 2
  • How to load pretrained weights for training

    Hi, great work @juntang-zhuang. I have trained the model on a custom dataset (say dataset1), and I also have another dataset (say dataset2) that is similar to dataset1, so I want to use the weights trained on dataset1 as the initial weights for training on dataset2. In the code I could not find where exactly the pretrained weights should be loaded for training. Any suggestion is appreciated. Thank you.

    opened by poornimajd 2
  • comparison to GridNet

    opened by stubborn-dwarf 2
  • RuntimeError: storage has wrong size: expected -4885659930368473377 got 589824

    Hi, Zhuang. I ran into a problem in /ShelfNet18_realtime/train.py

    if it % 1000 == 0:
        ## dump the final model
        save_pth = osp.join(respth, 'shelfnet_model_it_%d.pth'%it)
        # net.cpu()
        # state = net.module.state_dict() if hasattr(net, 'module') else net.state_dict()
        # if dist.get_rank() == 0: torch.save(state, save_pth)
        torch.save(net.module.state_dict(),save_pth)
    
        if it % 1000 == 0 and it > 0:
            evaluate(checkpoint=save_pth)
    

    When this code block reaches the line evaluate(checkpoint=save_pth), the problem appears in /ShelfNet18_realtime/evaluate.py

    def evaluate(respth='./res', dspth='/home/cjj/datasets/CityScapes/Fine', checkpoint=None):
        ...
        if checkpoint is None:
            save_pth = osp.join(respth, 'model_final.pth')
        else:
            save_pth = checkpoint
        net.load_state_dict(torch.load(save_pth))
        net.cuda()
        net.eval()
        ...
    

    and the line net.load_state_dict(torch.load(save_pth)) raises an error like this: RuntimeError: storage has wrong size: expected -4885659930368473377 got 589824

    Have you met this problem before? Thx

    opened by J-JunChen 1
  • TypeError: __init__() got an unexpected keyword argument 'find_unused_parameters'

    Thanks for your work! I have a problem while running the ShelfNetlw code. The environment I use is torch 1.0.0 and python3.6, and I run the command CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py

    The error is: TypeError: __init__() got an unexpected keyword argument 'find_unused_parameters'. So I tried deleting line 76 of train.py (find_unused_parameters=True), but then I ran into the situation shown here: [image: 1]

    What confuses me is that DistributedDataParallel in PyTorch 1.0.0 does not have the parameter "find_unused_parameters", yet leaving out find_unused_parameters causes trouble.

    Can someone tell me how to solve it? Preferably without changing the PyTorch version.

    opened by mzmzdcr 1
  • Some question about pascal voc test set

    Hi, I find your code useful, but one thing confuses me. When you test on the PASCAL VOC val set, you use the _val_sync_transform function to crop and pad the image, but to test the network's performance on the PASCAL VOC test set, we should transform the image back to the original image size. Could you share what I should do? Thanks for your code.

    opened by pcl111 1
  • Get the confidence of the segmented class during test time

    Hello, I am trying to figure out whether there is a way to make the ShelfNet network also output the confidence of the segmented class at test time, that is, how accurate the network thinks the segmentation output is. Any suggestion is greatly appreciated. Thank you.

    opened by poornimajd 4
  • ImportError: No module named 'inplace_abn'

    Thank you for the great work! I ran into the error No module named 'inplace_abn' when I run python3 evaluate.py. After installing inplace_abn from the official repo, it still does not work.

    Do you have any advice? THX and have a nice day!

    opened by shanjiuvspikaqiu 5
  • Functionality difference between pytorch batchnorm and synchronised batchnorm

    Hi, thanks a lot for sharing the code. I wanted to export an ONNX model, so I replaced all the synchronized batch norm layers with PyTorch's batch norm. However, I observed a huge drop in accuracy (~20%). When I dug deeper, I realized that inside the batch-norm kernel you take the absolute value of the weights and add eps to it. This is functionally different from PyTorch's batch norm.

    What is the reason behind this slightly different implementation of batch norm? Does it help in training, or something else?

    opened by debapriyamaji 3
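
The difference described in the last comment can be illustrated with scalar arithmetic. This sketch assumes, based only on that comment and not on the kernel source, that the custom layer scales by |weight| + eps while a standard batch norm scales by the raw weight. The function names `bn_standard` and `bn_abs_weight` and the input values are made up for the example.

```python
import math


def bn_standard(x, mean, var, weight, bias, eps=1e-5):
    """Standard batch-norm affine transform for a single value."""
    return weight * (x - mean) / math.sqrt(var + eps) + bias


def bn_abs_weight(x, mean, var, weight, bias, eps=1e-5):
    """Variant assumed from the comment: the scale is |weight| + eps."""
    return (abs(weight) + eps) * (x - mean) / math.sqrt(var + eps) + bias


if __name__ == "__main__":
    x, mean, var = 2.0, 1.0, 1.0
    # A negative learned weight flips the sign only in the standard layer
    print(bn_standard(x, mean, var, weight=-0.5, bias=0.0))
    print(bn_abs_weight(x, mean, var, weight=-0.5, bias=0.0))
```

Under this assumption, copying the trained weights into a standard layer unchanged alters the output of every channel with a negative weight. Substituting `abs(weight) + eps` for the weight before copying would make the two transforms agree.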
Owner
Juntang Zhuang