PyTorch code for semantic segmentation using ERFNet

Overview

ERFNet (PyTorch version)

This code is a toolbox that uses PyTorch for training and evaluating the ERFNet architecture for semantic segmentation.

For the original Torch version, please go HERE

NOTE: This PyTorch version achieves slightly better results than the Torch version used in the paper: 72.1 IoU on the Cityscapes val set and 69.8 IoU on the test set.

Example segmentation

Publications

If you use this software in your research, please cite our publications:

"Efficient ConvNet for Real-time Semantic Segmentation", E. Romera, J. M. Alvarez, L. M. Bergasa and R. Arroyo, IEEE Intelligent Vehicles Symposium (IV), pp. 1789-1794, Redondo Beach (California, USA), June 2017. [Best Student Paper Award], [pdf]

"ERFNet: Efficient Residual Factorized ConvNet for Real-time Semantic Segmentation", E. Romera, J. M. Alvarez, L. M. Bergasa and R. Arroyo, Transactions on Intelligent Transportation Systems (T-ITS), December 2017. [pdf]

Packages

For instructions, please refer to the README in each folder:

  • train contains tools for training the network for semantic segmentation.
  • eval contains tools for evaluating/visualizing the network's output.
  • imagenet contains the script and model for pretraining ERFNet's encoder on ImageNet.
  • trained_models contains the trained models used in the papers. NOTE: the PyTorch models are slightly different from the Torch models.

Requirements:

  • The Cityscapes dataset: download "leftImg8bit" for the RGB images and "gtFine" for the labels. Note that for training you should use the "_labelTrainIds" files and not the "_labelIds" ones; you can download the Cityscapes scripts and use the converter to generate the trainIds from the labelIds (see the sketch after this list).
  • Python 3.6: if you don't have Python 3.6 on your system, I recommend installing it with Anaconda.
  • PyTorch: make sure to install the PyTorch version for Python 3.6 with CUDA support (the code has only been tested with CUDA 8.0).
  • Additional Python packages: numpy, matplotlib, Pillow, torchvision and visdom (optional, for the --visualize flag).
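
A minimal sketch of the labelIds-to-trainIds conversion, assuming the pip-installable cityscapesscripts package and its current module layout (the dataset path below is a placeholder):

# Sketch: generate the *_labelTrainIds.png files from the *_labelIds.png files
# with cityscapesScripts ("pip install cityscapesscripts"); adjust the path.
import os
os.environ["CITYSCAPES_DATASET"] = "/path/to/cityscapes"  # folder containing gtFine/

from cityscapesscripts.preparation.createTrainIdLabelImgs import main
main()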

In Anaconda you can install with:

conda install numpy matplotlib torchvision Pillow
conda install -c conda-forge visdom

If you use pip (make sure it is configured for Python 3.6), you can install with:

pip install numpy matplotlib torchvision Pillow visdom

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows for personal and research use only. For a commercial license please contact the authors. You can view a license summary here: http://creativecommons.org/licenses/by-nc/4.0/

Comments
  • CARLA Simulator - Semantic Segmentation

    Hello,

    Firstly, thanks for this amazing work.

    Secondly, I want to train the network on my own dataset from the CARLA Simulator. Are there any tips on how to adapt your implementation to my own dataset (with only 12 semantic classes)?

    opened by mhusseinsh 33
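
One hedged way to adapt the training code to a dataset like this: set NUM_CLASSES to 12 and swap in a Dataset that yields (image, label) pairs whose label pixels are already class indices. The sketch below is illustrative; the class name, folder layout, and file naming are assumptions, not part of this repo:

# Hypothetical minimal Dataset for a CARLA-style export: RGB images plus
# single-channel label PNGs whose pixel values are class indices 0..11.
import os
from PIL import Image
from torch.utils.data import Dataset

class CarlaDataset(Dataset):
    def __init__(self, root, subset="train", co_transform=None):
        self.root, self.subset = root, subset
        self.images = sorted(os.listdir(os.path.join(root, subset, "rgb")))
        self.co_transform = co_transform  # joint image/label transform, as in train/

    def __len__(self):
        return len(self.images)

    def __getitem__(self, i):
        name = self.images[i]
        img = Image.open(os.path.join(self.root, self.subset, "rgb", name)).convert("RGB")
        lbl = Image.open(os.path.join(self.root, self.subset, "labels", name))
        if self.co_transform is not None:
            img, lbl = self.co_transform(img, lbl)
        return img, lbl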
  • How to properly resume decoder training?

    Hi,

    I am trying to retrain the model on my own on the Cityscapes dataset, but only using 2 classes. The encoder training works fine, but I have problems with decoder training, which I think are caused by not attaching the trained encoder to the model properly.

    As far as I understand, there are two possibilities for decoder training.

    1. Use the encoder pretrained on imagenet

    For this I used the following commands from the documentation:

    python main_binary.py --savedir erfnet_training1 --datadir /home/datasets/cityscapes/ --num-epochs 150 --batch-size 6 --decoder --pretrainedEncoder "../trained_models/erfnet_encoder_pretrained.pth.tar"

    and

    python main_binary.py --savedir erfnet_training1 --datadir /home/datasets/cityscapes/ --num-epochs 150 --batch-size 6 --decoder --pretrainedEncoder "../trained_models/erfnet_encoder_pretrained.pth.tar" --resume

    I have not trained the model for all epochs, but the intermediate result looks fine (best Val-IoU after 85 epochs: 0.9495)

    2. Use an encoder trained on cityscapes

    The encoder training worked fine and resulted in a Val-IoU of 0.9471. However, I couldn't find any documentation on how to attach the pretrained encoder for decoder training. From the code, I gathered that the --pretrainedEncoder flag is only for the ImageNet encoder and that I should use --state, so I used

    python main_binary.py --savedir erfnet_training2 --datadir /home/datasets/cityscapes/ --num-epochs 150 --batch-size 6 --decoder --state "../save/erfnet_training2/model_best_enc.pth.tar"

    and, since the code says to only use --state for initializing:

    python main_binary.py --savedir erfnet_training2 --datadir /home/datasets/cityscapes/ --num-epochs 150 --batch-size 6 --decoder --resume

    However, this didn't work out and finished with a best Val-IoU of only 0.9441, so the whole network performs worse than the encoder alone. For comparison, I also tested training the decoder without initializing an encoder, which resulted in Val-IoU = 0.9461.

    So what is the correct way of training the decoder after encoder training has finished? Especially which arguments to combine with the --resume flag, since I cannot train the model in one go due to hardware availability.

    Thank You for your answer.

    opened by heumchri 27
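
A hedged sketch of one way to attach a Cityscapes-trained encoder to the full model before decoder training: load the encoder checkpoint's state dict into the combined network non-strictly, so the decoder keeps its fresh initialization. The checkpoint key names and the ERFNet constructor signature are assumptions about this repo's code:

# Sketch: initialize the encoder of a full encoder+decoder ERFNet from an
# encoder-only checkpoint, leaving the decoder weights randomly initialized.
import torch
from erfnet import ERFNet  # model definition assumed importable from train/

model = ERFNet(2)  # 2 classes, as in the question
checkpoint = torch.load("../save/erfnet_training2/model_best_enc.pth.tar",
                        map_location="cpu")
state = checkpoint.get("state_dict", checkpoint)
# drop any "module." prefix left by DataParallel, then load non-strictly so
# only the matching (encoder) parameters are copied
state = {k.replace("module.", "", 1): v for k, v in state.items()}
model.load_state_dict(state, strict=False)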
  • Data with Cityscapes does not work

    Hi, thanks for the code, but it seems the code does not currently work with the Cityscapes dataset. I'm using Python 3.6 with PyTorch version 0.2.0_4; it crashed after several training steps with:

    THCudaCheck FAIL: device-side assert triggered (update_grad_input_fn)
    RuntimeError: cuda runtime error (59): device-side assert triggered at /opt/conda/conda-bld/pytorch/work/torch/lib/THCUNN/generic/Threshold.cu:66

    opened by huanhuanxuxu 11
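
For what it's worth, this particular assert in Threshold.cu almost always traces back to a label value outside [0, NUM_CLASSES-1] reaching the loss (for example, labelIds used instead of trainIds). A hedged sanity check to run on CPU before training; the ignore_index default mirrors the repo's Cityscapes setup and is an assumption:

# Sketch: scan a dataset for label values the loss cannot handle.
import torch

def check_label_range(loader, num_classes, ignore_index=19):
    for _, targets in loader:
        bad = targets[(targets != ignore_index) & (targets >= num_classes)]
        if bad.numel():
            print("out-of-range label values:", torch.unique(bad).tolist())
            return False
    return True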
  • There is no model_best.pth after training finishes

    Hi, Eromera. There is --epochs-save to save the model every X epochs, but when I set it to a non-zero value, it doesn't save the model as I expected. Another issue: when training is finished, sometimes it saves model_best.pth and sometimes only model_encoder_best.pth. Thanks!

    opened by HyuanTan 9
  • transfer learning with erfnet

    hi all,

    I would like to do a transfer-learning project using ERFNet, but I have some questions about the training process.

    I have collected my own data (training : val : test = 7k : 1.5k : 1.5k images) and the dataset has 15 classes. If I train the model without pretrained ImageNet weights, how do I decide when to terminate the encoder training process?

    Thank you very much :)

    opened by ytzhao 8
  • Get stuck running the ERFNet model on Jetson TX2

    Hi, I want to run the ERFNet model on a Jetson TX2.

    • I installed PyTorch without problems, along with CUDA 9.0 and cuDNN 7.0 (verified with 'import torch' and 'nvcc --version').

    But when I try to run the ERFNet code, I get stuck with:

    "RuntimeError : cuda runtime error(7) : too many resources requested for launch at /home/nvidia/pytorch/aten/src/THCUNN/im2col.h"

    Please help me!

    opened by EthanCalvin 8
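
Not a confirmed fix, but the failing kernel is THCUNN's im2col convolution fallback; on resource-constrained boards like the TX2, routing convolutions through cuDNN (when PyTorch was built with it) sometimes avoids the kernel-launch limits. A hedged sketch:

# Sketch: prefer cuDNN convolution kernels over the THCUNN im2col fallback.
import torch

torch.backends.cudnn.enabled = True
torch.backends.cudnn.benchmark = True  # autotune kernels for the fixed input size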
  • Not able to reproduce validation set accuracy

    I am currently trying out ERFNet on the Cityscapes dataset. For that I use my own training script, but the exact same model implementation as yours.

    The best result I achieve is around 62% mIoU on the Cityscapes validation set when training from scratch. Now I am wondering if I am missing something during training, since your validation set results are around 69% mIoU for training from scratch (right?).

    What I do is:

    • Training Scale: 1024x512 (for testing: bilinear upsampling to 2048x1024)
    • Augmentation: Random translation x/y +-2px; rand. horizontal flipping; input normalization to [-1,1]
    • Class balancing with the weights from your script (the class train ids are the same as from the official cityscapes-scripts right? Or did you use a different train id distribution?)
    • Learning rate schedule with same lambda function as in your script
    • Start learning rate: 5e-4, weight decay: 1e-4
    • Batch size: 5 (you used 6, right? But I can't imagine that makes a huge difference)
    • Trained for 150 epochs, then picked the best-performing epoch (epoch 127 in my case) -> 62.3% mIoU on the val set (did you search for the best epoch, or did you achieve the results with simply the last epoch?)

    So do you maybe know if I am missing something that could explain my poor performance? Any help would be appreciated!

    opened by mbcel 7
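
For anyone comparing training setups: the repo's main.py decays the learning rate polynomially. A hedged sketch of that schedule (the exponent 0.9 and the Adam settings are my reading of the script, so treat them as assumptions):

# Sketch: Adam at 5e-4 with weight decay 1e-4 and a poly LR decay over 150 epochs.
import torch

model = torch.nn.Conv2d(3, 20, 3)  # stand-in; use ERFNet(num_classes) in practice
num_epochs = 150
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: (1 - epoch / num_epochs) ** 0.9)

for epoch in range(num_epochs):
    # ... train one epoch, evaluate ...
    scheduler.step()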
  • Implementation on PyTorch for Windows 10

    Hi Eromera, first, thanks for your inspiring work! I wanted to recreate your project with a conda package of PyTorch for Windows 10 x64, Anaconda3 (Python 3.6), CUDA 8.0 and pytorch-0.3.0. When I tried to start the training, I received an IndexError because a list index is out of range. The same error also appears when I try to evaluate the trained model on the validation set.

    (screenshots of the training and evaluation errors omitted)

    Do you think these kinds of errors are quick to fix, or do they appear because I am trying to deploy it on Windows?

    All the best, Max

    opened by MaximilianBoemer 6
  • Error when I train my own dataset

    Hi, thanks for sharing your work. I trained the model on the Cityscapes dataset and got the results the paper shows. Now I want to train the model on my own dataset, but I have met some issues. When I train on 2 classes (including background) and keep NUM_CLASSES=20 (same as the original code), the training process works fine but the prediction result looks strange (screenshot arriveroom_000002_000030_leftimg8bit omitted).

    When I change NUM_CLASSES=20 and def __init__(self, nClasses, ignoreIndex=0) in iouEval.py (because my background is 0), I get this in the encoder validation stage:

    ----- VALIDATING - EPOCH 1 -----
    Traceback (most recent call last):
      File "main.py", line 545, in <module>
        main(parser.parse_args())
      File "main.py", line 499, in main
        model = train(args, model, True)  #Train encoder
      File "main.py", line 334, in train
        iouEvalVal.addBatch(outputs.max(1)[1].unsqueeze(1).data, targets.data)
      File "/media/holly/Code/Segmentation/ERFNet/erfnet_pytorch/train/iouEval.py", line 41, in addBatch
        x_onehot = x_onehot[:, :self.ignoreIndex]
    ValueError: result of slicing is an empty tensor

    When I change NUM_CLASSES=2 and def __init__(self, nClasses, ignoreIndex=19), I get this in the decoder stage:

    ========== DECODER TRAINING ===========
    /DataSet/DSHolly/DataAll/SegmentationLikeCityScapes_room/leftImg8bit/train
    /DataSet/DSHolly/DataAll/SegmentationLikeCityScapes_room/leftImg8bit/val
    <class 'criterion.CrossEntropyLoss2d'>
    ----- TRAINING - EPOCH 1 -----
    LEARNING RATE: 0.0005
    THCudaCheck FAIL file=/pytorch/torch/lib/THCUNN/generic/Threshold.cu line=66 error=59 : device-side assert triggered
    THCudaCheck FAIL file=/pytorch/torch/lib/THCUNN/generic/Threshold.cu line=66 error=59 : device-side assert triggered
    Traceback (most recent call last):
      File "main.py", line 541, in <module>
        main(parser.parse_args())
      File "main.py", line 514, in main
        model = train(args, model, False)  #Train decoder
      File "main.py", line 260, in train
        loss.backward()
      File "/media/holly/Code/.pyenv/versions/Python3.6.3ERFNet/lib/python3.6/site-packages/torch/autograd/variable.py", line 156, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
      File "/media/holly/Code/.pyenv/versions/Python3.6.3ERFNet/lib/python3.6/site-packages/torch/autograd/__init__.py", line 98, in backward
        variables, grad_variables, retain_graph)
      File "/media/holly/Code/.pyenv/versions/Python3.6.3ERFNet/lib/python3.6/site-packages/torch/autograd/function.py", line 91, in apply
        return self._forward_cls.backward(self, *args)
      File "/media/holly/Code/.pyenv/versions/Python3.6.3ERFNet/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py", line 187, in backward
        return (backward_cls.apply(input, grad_output, ctx.additional_args, ctx._backend, ctx.buffers, *tensor_params) +
      File "/media/holly/Code/.pyenv/versions/Python3.6.3ERFNet/lib/python3.6/site-packages/torch/nn/_functions/thnn/auto.py", line 219, in backward_cls_forward
        update_grad_input_fn(ctx._backend.library_state, input, grad_output, grad_input, *gi_args)
    RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THCUNN/generic/Threshold.cu:66

    Are there any tips for training the model on my own dataset? Thanks!!

    opened by HyuanTan 6
  • DownsamplerBlock torch cat inconsistent tensor size

    Hi Eromera, I'm using my own dataset to train the model, but something seems wrong with the torch.cat operation in DownsamplerBlock:

    (screenshot of the torch.cat size-mismatch error omitted)

    It seems the results from conv and pool do not match for odd image sizes. Also, due to the repeated use of this DownsamplerBlock, every cat operation's inputs must have even sizes. I'm also curious about this cat operation: does it contribute to the final performance? I mean, why not just use the conv or the pool operation directly?

    opened by huanhuanxuxu 4
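
For context, the downsampler in question concatenates a stride-2 3x3 convolution with a 2x2 max-pool of the same input, an ENet-inspired design that adds pooled features at little extra cost. A sketch following the paper's description; with an odd H or W the conv yields ceil(H/2) rows but the pool yields floor(H/2), which is exactly the reported cat failure:

# Sketch of the DownsamplerBlock as described in the ERFNet paper: a stride-2
# 3x3 conv producing (noutput - ninput) channels, concatenated with a 2x2
# max-pool of the input, followed by batch norm and ReLU.
import torch
import torch.nn as nn

class DownsamplerBlock(nn.Module):
    def __init__(self, ninput, noutput):
        super().__init__()
        self.conv = nn.Conv2d(ninput, noutput - ninput, 3, stride=2, padding=1, bias=True)
        self.pool = nn.MaxPool2d(2, stride=2)
        self.bn = nn.BatchNorm2d(noutput, eps=1e-3)

    def forward(self, x):
        out = torch.cat([self.conv(x), self.pool(x)], dim=1)  # fails if H or W is odd
        return torch.relu(self.bn(out))

Since the encoder downsamples three times, padding or resizing inputs to multiples of 8 sidesteps the mismatch.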
  • How to compute class weight

    In (https://github.com/Eromera/erfnet_pytorch/blob/master/train/main.py#L92-L131), you give a different weight to every class when calculating the loss. Can you explain the method you used to compute these weights? I want to use it on other datasets. Thanks.

    opened by ywlng 3
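
The weights in main.py look like the inverse-log-frequency weighting proposed in the ENet paper, w_c = 1 / ln(k + p_c) with k = 1.02, where p_c is the pixel frequency of class c over the training set; that attribution is my assumption, not something the repo states. A sketch:

# Sketch: ENet-style class weights from per-class pixel counts.
import numpy as np

def class_weights(pixel_counts, k=1.02):
    p = pixel_counts / pixel_counts.sum()  # per-class pixel frequency
    return 1.0 / np.log(k + p)

counts = np.array([9e8, 1e8, 5e7])  # hypothetical pixel counts for 3 classes
print(class_weights(counts))        # rarer classes get larger weights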
  • Is "erfnet_pretrained.pth" trained from the ImageNet-pretrained encoder?

    Hi @Eromera, thank you for your nice work! By the way, in the trained_models directory, was erfnet_pretrained.pth trained from the ImageNet-pretrained encoder? And is it for segmentation on the Cityscapes dataset?

    How can I get ERFNet pretrained on ImageNet?

    opened by daeunni 0
  • I can't run this code correctly due to y_onehot.scatter_(1, y, 1).float() in iouEval.py

    Hello, I tried to run eval_iou.py, but there is an issue, as below:

    y_onehot_size: torch.Size([1, 23, 512, 1024])
    max(y): tensor(26)
    Traceback (most recent call last):
      File "eval_iou.py", line 153, in <module>
        main(parser.parse_args())
      File "eval_iou.py", line 94, in main
        iouEvalVal.addBatch(outputs.max(1)[1].unsqueeze(1).data, labels)
      File "/workspace/data/ERFNet2/erfnet_pytorch-master/eval/iouEval.py", line 58, in addBatch
        y_onehot.scatter_(1, y, 1).float()
    RuntimeError: index 26 is out of bounds for dimension 1 with size 23

    Why is that?

    opened by hisrg 1
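
The max(y) of 26 is telling: 26 is a raw Cityscapes labelId (trainIds only go up to 19), so the labels fed to the evaluator were probably never converted to the _labelTrainIds files. A hedged stopgap is to remap out-of-range ids to the ignore index before addBatch; the ignore value below is an assumption:

# Sketch: clamp unknown/raw label ids to the evaluator's ignore index so
# scatter_ stays in bounds. The proper fix is to evaluate on *_labelTrainIds.
def remap_labels(y, num_classes, ignore_index=19):
    y = y.clone()
    y[y >= num_classes] = ignore_index
    return y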
  • Encoder Pretrained Weights.

    Hello guys.

    I'm using the ERFNet encoder for a project of mine, and I had some issues because of the additional conv layer (output_conv). After getting different feature maps every time I ran the network, I realized that this layer doesn't have any pretrained weights, and I was getting different results because of random initialization. So now I have two questions:

    1. Do you have trained weights for this layer? If that's the case, could you share them with us? Of course I can use the previous layer as the encoder's output, but the large number of filters is really an issue in my project, so having 20 filters would be much better.

    2. Just out of curiosity, how was this additional layer trained? Did you use a scaled version of the original annotated image as the ground truth?

    Thank you very much in advance.

    opened by Luizerko 0
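
Regarding question 1, a hedged workaround while there are no pretrained weights for output_conv: take the 128-channel feature map from just before that layer. The attribute names below (initial_block, layers) are my guess at the repo's erfnet.py structure, so verify them against the actual model definition:

# Sketch: expose encoder features before the untrained output_conv projection.
import torch.nn as nn

class EncoderFeatures(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder

    def forward(self, x):
        x = self.encoder.initial_block(x)
        for layer in self.encoder.layers:
            x = layer(x)
        return x  # 128-channel features, output_conv deliberately skipped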
  • Inference comparison with the ENet

    In the ERFNet paper, the inference speed of ERFNet is compared with ENet; the ENet values in the paper were taken from the original ENet paper.

    Has anyone faced the same issue? In my case, ENet's inference speed is much slower than ERFNet's: for a 360x640 image, ENet runs at about 80 fps, whereas ERFNet runs at about 100 fps for the same resolution on an RTX 2060 GPU.

    Has anyone replicated the ENet results in terms of fps?

    opened by Abdul-Nasir11 0
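
Timing discrepancies like this are often methodology rather than model: GPU calls are asynchronous, so measurements need warm-up iterations and explicit synchronization. A hedged benchmarking sketch:

# Sketch: measure single-image FPS with warm-up and CUDA synchronization.
import time
import torch

def measure_fps(model, h=360, w=640, iters=100):
    model = model.eval().cuda()
    x = torch.randn(1, 3, h, w, device="cuda")
    with torch.no_grad():
        for _ in range(10):               # warm-up: cuDNN autotune, caches
            model(x)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()
    return iters / (time.time() - start)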
  • AssertionError: Error: model definition not found

    File "/home/dev__7/Desktop/jobIVs/snetlab-jobIV/erfnet_pytorch/train/main.py", line 403, in main
      assert os.path.exists(modelfile), "Error: model definition not found"
    AssertionError: Error: model definition not found

    opened by gaurang-dev7 0