TensorFlow implementation of ENet, trained on the Cityscapes dataset.

Overview

segmentation

TensorFlow implementation of ENet (https://arxiv.org/pdf/1606.02147.pdf) based on the official Torch implementation (https://github.com/e-lab/ENet-training) and the Keras implementation by PavlosMelissinos (https://github.com/PavlosMelissinos/enet-keras), trained on the Cityscapes dataset (https://www.cityscapes-dataset.com/).

  • YouTube video of results: https://youtu.be/HbPhvct5kvs

  • The results in the video can obviously be improved, but because of limited computing resources (personally funded Azure VM) I did not perform any further hyperparameter tuning.


You might get the error "No gradient defined for operation 'MaxPoolWithArgmax_1' (op type: MaxPoolWithArgmax)". To fix this, I had to add the following code to the file /usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_grad.py:

@ops.RegisterGradient("MaxPoolWithArgmax")
def _MaxPoolGradWithArgmax(op, grad, unused_argmax_grad):
  # Route the incoming gradient back through the pooling op using the stored argmax indices.
  return gen_nn_ops._max_pool_grad_with_argmax(
      op.inputs[0], grad, op.outputs[1],
      op.get_attr("ksize"), op.get_attr("strides"), padding=op.get_attr("padding"))
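
To verify that the gradient is now registered, here is a minimal sketch (assuming a TensorFlow 1.x environment like the one used in this repo; building the graph is enough, no session run is needed):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[1, 4, 4, 1])
pooled, argmax = tf.nn.max_pool_with_argmax(x, ksize=[1, 2, 2, 1],
                                            strides=[1, 2, 2, 1], padding="SAME")
# This call raises the "No gradient defined ..." LookupError if the fix above is missing:
grads = tf.gradients(tf.reduce_sum(pooled), [x])
print(grads)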

Documentation:

preprocess_data.py:

  • ASSUMES: that all Cityscapes training (validation) image directories have been placed in data_dir/cityscapes/leftImg8bit/train (data_dir/cityscapes/leftImg8bit/val) and that all corresponding ground truth directories have been placed in data_dir/cityscapes/gtFine/train (data_dir/cityscapes/gtFine/val). A layout sanity check is sketched after this list.
  • DOES: script for performing all necessary preprocessing of images and labels.
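
To double-check the assumed layout, a hypothetical sanity check (data_dir below is a placeholder; point it to wherever your data lives):

import os

data_dir = "/root/data"  # placeholder; adjust to your setup
for split in ["train", "val"]:
    for sub in ["leftImg8bit", "gtFine"]:
        d = os.path.join(data_dir, "cityscapes", sub, split)
        print(d, "OK" if os.path.isdir(d) else "MISSING")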

model.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: contains the ENet_model class.

utilities.py:

  • ASSUMES: -
  • DOES: contains a number of functions used in different parts of the project.

train.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: script for training the model.

run_on_sequence.py:

  • ASSUMES: that preprocess_data.py has already been run.
  • DOES: runs a model checkpoint (set in line 56) on all frames in a Cityscapes demo sequence directory (set in line 30) and creates a video of the result (a rough sketch of the video-writing step follows below).
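
As a rough illustration of the video-writing step only (not the script's actual code; assumes OpenCV 3+ and a hypothetical directory of already rendered frames):

import glob

import cv2

frame_paths = sorted(glob.glob("demo_sequence_results/*.png"))  # hypothetical frame directory
height, width = cv2.imread(frame_paths[0]).shape[:2]
writer = cv2.VideoWriter("result.avi", cv2.VideoWriter_fourcc(*"MJPG"), 20.0, (width, height))
for path in frame_paths:
    writer.write(cv2.imread(path))  # all frames must have the same size
writer.release()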

Training details:

  • In the paper the authors suggest that you first pretrain the encoder to categorize downsampled regions of the input images; I did, however, train the entire network from scratch.

  • Batch size: 4.

  • For all other hyperparameters I used the same values as in the paper (a sketch of the paper's class-weighting scheme is given after this list).

  • Training loss: [training loss plot]

  • Validation loss: [validation loss plot]

  • The results in the video above were obtained with the model at epoch 23, for which a checkpoint is included in segmentation/training_logs/best_model in the repo.
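
For reference, a minimal sketch (not the repository's exact code) of the class-weighted cross-entropy implied by the paper's setup; the weighting scheme w_class = 1 / ln(c + p_class) with c = 1.02 is the one proposed in the ENet paper, and the names onehot_labels, logits and class_probs below are illustrative:

import numpy as np
import tensorflow as tf

def class_weights_from_frequencies(class_probs, c=1.02):
    # ENet-paper weighting: w_class = 1 / ln(c + p_class), where p_class is the
    # pixel frequency of each class in the training set.
    return (1.0 / np.log(c + np.asarray(class_probs))).astype(np.float32)

def weighted_cross_entropy(onehot_labels, logits, class_weights):
    # Per-pixel weight = weight of that pixel's ground-truth class.
    weights = tf.reduce_sum(onehot_labels * class_weights, axis=-1)
    losses = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels, logits=logits)
    return tf.reduce_mean(losses * weights)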


Training on Microsoft Azure:

To train the model, I used an NC6 virtual machine on Microsoft Azure. Below I have listed what I needed to do to get started, and some things I found useful. For reference, my username was 'fregu856'. The following script (start_docker_image.sh, used further down) starts the TensorFlow GPU docker image:

#!/bin/bash

# DEFAULT VALUES
GPUIDS="0"                # which GPU(s) to expose to the container
NAME="fregu856_GPU"       # container name prefix

# Start the TensorFlow GPU image, exposing port 5584 and mounting the
# home directory at /root/ inside the container:
NV_GPU="$GPUIDS" nvidia-docker run -it --rm \
        -p 5584:5584 \
        --name "$NAME""$GPUIDS" \
        -v /home/fregu856:/root/ \
        tensorflow/tensorflow:latest-gpu bash
  • /root/ will now be mapped to /home/fregu856 (i.e., $ cd -- takes you to the regular home folder).

  • To start the image:

    • $ sudo sh start_docker_image.sh
  • To commit changes to the image:

    • Open a new terminal window.
    • $ sudo docker commit fregu856_GPU0 tensorflow/tensorflow:latest-gpu
  • To stop the image when it’s running:

    • $ sudo docker stop fregu856_GPU0
  • To exit the image without killing running code:

    • Ctrl-P + Q
  • To get back into a running image:

    • $ sudo docker attach fregu856_GPU0
  • To open more than one terminal window at the same time:

    • $ sudo docker exec -it fregu856_GPU0 bash
  • To install the needed software inside the docker image:

    • $ apt-get update
    • $ apt-get install nano
    • $ apt-get install sudo
    • $ apt-get install wget
    • $ sudo apt-get install libopencv-dev python-opencv
    • Commit changes to the image (otherwise, the installed packages will be removed at exit!)
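
After installing the packages (and committing), a quick way to check that TensorFlow actually sees the GPU inside the container (a minimal sketch, assuming the TensorFlow 1.x that the tensorflow:latest-gpu image shipped with at the time):

from tensorflow.python.client import device_lib

# Should list at least one GPU device if nvidia-docker passed the GPU through correctly.
print([d.name for d in device_lib.list_local_devices()])
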
Comments
  • Cityscapes Test Result

    Does your Cityscapes test dataset have labels? I have downloaded the labels from the official website, but they do not include the test dataset, so I can't get the test mean IoU.

    opened by InstantWindy 1
  • the test result is awful

    I use Python 3 and trained the model, but the test result is awful: I can't see the segmentation at all, although the validation images work. How can I solve this problem? Also, the saver shouldn't restore the ckpt file directly; in the V2 checkpoint format there will be four files. I solved the problem, thanks for your amazing work!

    opened by Fengmoon93 1
  • Dimension error

    Hello,

    When I try to run the code (Python 3.6.4 & TensorFlow 1.8.0), I get the following error during training.


    C:/Users/s01syed/PycharmProjects/ENet/train.py [4, 256, 512, 16] [4, 128, 256, 64] [4, 128, 256, 64] [4, 128, 256, 64] [4, 128, 256, 64] [4, 128, 256, 64] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 64, 128, 128] [4, 128, 256, 64] [4, 128, 256, 64] [4, 128, 256, 64] [4, 256, 512, 16] [4, 256, 512, 16] [4, 512, 1024, 20]

    Traceback (most recent call last):
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1567, in _create_c_op
        c_op = c_api.TF_FinishOperation(op_desc)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 20 and 3 for 'mul_254' (op: 'Mul') with input shapes: [4,512,1024,20], [3].

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "C:/Users/s01syed/PycharmProjects/ENet/train.py", line 27, in <module>
        batch_size=batch_size)
      File "C:\Users\s01syed\PycharmProjects\ENet\model.py", line 42, in __init__
        self.add_loss_op()
      File "C:\Users\s01syed\PycharmProjects\ENet\model.py", line 237, in add_loss_op
        weights = self.onehot_labels_ph*self.class_weights
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 979, in binary_op_wrapper
        return func(x, y, name=name)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1211, in _mul_dispatch
        return gen_math_ops.mul(x, y, name=name)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 5066, in mul
        "Mul", x=x, y=y, name=name)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
        op_def=op_def)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3392, in create_op
        op_def=op_def)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1734, in __init__
        control_input_ops)
      File "C:\Users\s01syed\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1570, in _create_c_op
        raise ValueError(str(e))
    ValueError: Dimensions must be equal, but are 20 and 3 for 'mul_254' (op: 'Mul') with input shapes: [4,512,1024,20], [3].

    Process finished with exit code 1


    I saw a similar error for Keras, which was caused by a version upgrade (https://github.com/datalogue/keras-attention/issues/1). I really want to run this code but I'm stuck; could you please have a look?

    Thanks :)

    UPDATE: Solved the error; it was caused by passing a wrong input to class_weights.

    opened by Saif-03 0
  • Modernize Python 2 code to get ready for Python 3

    Make the minimal, safe changes required to convert the repo's code to be syntax compatible with both Python 2 and Python 3. There might be other work required to complete the port to Python 3 but this is a minimal, safe first step.

    opened by cclauss 0
  • Validation Loss

    Hi @fregu856, despite training the model for 200 epochs, the training loss fluctuates between 4.9 and 5, and the validation loss fluctuates between 7.5 and 8.2. The model does not work on random images; it only works on the Cityscapes dataset. How can I reduce the loss, how can I calculate the accuracy and mIoU for the training and validation sets, and how can I make the model more generic?

    Thanking you in anticipation

    opened by Ysmita 0
  • How to load pretrained weights for training instead of training from scratch

    I have trained for 100 epochs and still need to continue training to reduce the error, but I am not interested in training from scratch. I have the weights from the 98th epoch of my training. How can I load these weights and continue training from the 98th epoch?

    opened by DikshitDHegde 0
  • Frozen version of the net

    Hi, thanks for your great implementation. I would be interested in getting a "frozen" version of the net.

    Could you add the frozen version of the graph to the repository, please? You can take a look here: https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/tools/freeze_graph.py or https://github.com/tensorflow/models/tree/master/research/slim#freezing-the-exported-graph

    Thank you in advance

    opened by LucasMahieu 0
  • Prediction problem

    I'm getting predictions for some classes that look like a pattern: the color is not completely filled in within the region for some classes. You can find the predicted image attached; could anyone please help me out?

    opened by Girisha10 2
Owner
Fredrik Gustafsson
PhD student whose research focuses on probabilistic deep learning for automotive computer vision applications.