TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

Overview

ICNet_tensorflow

This repo provides a TensorFlow-based implementation of the paper "ICNet for Real-Time Semantic Segmentation on High-Resolution Images" by Hengshuang Zhao et al. (ECCV'18).

The model generates a segmentation mask for every pixel in the image. It is based on ResNet50 with three branches in total serving as auxiliary paths; see the architecture figure for an illustration.

We provide both training and inference code in this repo. The pre-trained models we provide are converted from the Caffe weights in the official implementation.

News (2018.10.22 updated):

Now you can try ICNet on your own image online using the ModelDepot live demo!

Environment Setup

pip install tensorflow-gpu opencv-python jupyter matplotlib tqdm

Download Weights

We provide pre-trained weights for the Cityscapes and ADE20k datasets. You can download the weights with the following command (replace cityscapes with ade20k for the ADE20k weights):

python script/download_weights.py --dataset cityscapes

Download Dataset (Optional)

If you want to evaluate the provided weights or continue fine-tuning on the Cityscapes and ADE20k datasets, you need to download them; each dataset is obtained in a different way.

ADE20k dataset

Simply run the following command:

bash script/download_ADE20k.sh

Cityscapes dataset

You need to download the Cityscapes dataset from the official website first (you'll need to request access, which may take a couple of days).

Then convert the downloaded ground truth to the training format: follow the instructions to install cityscapesScripts, then run these commands:

export CITYSCAPES_DATASET=<cityscapes dataset path>
csCreateTrainIdLabelImgs
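
For orientation, the conversion step rewrites each ground-truth pixel from the full Cityscapes label ID to one of the 19 training IDs, and maps every other label to the ignore value 255. The sketch below illustrates that remapping on a tiny nested-list "image". The table reflects the label definitions in cityscapesScripts to the best of my knowledge, and the helper name `to_train_ids` is made up for this example; use csCreateTrainIdLabelImgs for the real conversion.

```python
# Illustrative only: label ID -> train ID for the 19 evaluated Cityscapes classes.
# Verify against the label table shipped with cityscapesScripts before relying on it.
ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    12: 3,   # wall
    13: 4,   # fence
    17: 5,   # pole
    19: 6,   # traffic light
    20: 7,   # traffic sign
    21: 8,   # vegetation
    22: 9,   # terrain
    23: 10,  # sky
    24: 11,  # person
    25: 12,  # rider
    26: 13,  # car
    27: 14,  # truck
    28: 15,  # bus
    31: 16,  # train
    32: 17,  # motorcycle
    33: 18,  # bicycle
}
IGNORE_LABEL = 255  # same value as 'ignore_label' in the cityscapes config


def to_train_ids(label_rows):
    """Hypothetical helper: remap a 2-D list of label IDs to train IDs."""
    return [[ID_TO_TRAIN_ID.get(pixel, IGNORE_LABEL) for pixel in row]
            for row in label_rows]


if __name__ == "__main__":
    tiny_label = [[7, 8, 0], [26, 24, 33]]
    print(to_train_ids(tiny_label))
```

If the ground truth of a custom dataset still holds raw label IDs rather than contiguous train IDs, a remapping of this kind is usually needed before training.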

Get started!

This repo provides three fully documented phases, so you can try training, evaluation, and inference on your own.

Inference on your own image

demo.ipynb shows the easiest way to run semantic segmentation on your own image.

At the end of demo.ipynb, you can test the speed of ICNet.

Here are some results from a Titan Xp with high-resolution images (1024x2048):
~0.037 s per image, which means we get ~27 fps (nearly the same as described in the paper).
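
The frame rate quoted above is simply the reciprocal of the per-image latency. A minimal sketch of that arithmetic, plus a generic timing helper, follows. `run_once` stands in for whatever callable runs one forward pass (for example a wrapper around a session run); it is an assumption for illustration and not something defined in this repo.

```python
import time


def fps_from_latency(seconds_per_image):
    """Frames per second implied by an average per-image latency."""
    return 1.0 / seconds_per_image


def average_latency(run_once, warmup=2, repeats=10):
    """Average wall-clock seconds per call of `run_once` (hypothetical callable).

    Warm-up calls are excluded because the first runs on a GPU usually
    include one-off setup cost.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(round(fps_from_latency(0.037), 1))
    # Stand-in workload so the sketch runs without TensorFlow.
    print(average_latency(lambda: sum(range(10000))))
```

Timing only the steady-state calls matters when comparing against the figure above, since a measurement that includes graph construction or the first run can be far slower.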

Evaluate on cityscapes/ade20k dataset

To get the results, you first need to download the dataset by following the steps mentioned above.
Then change the data_dir path in utils/config.py.

CITYSCAPES_DATA_DIR = '/data/cityscapes_dataset/cityscape/'
ADE20K_DATA_DIR = './data/ADEChallengeData2016/'

Cityscapes

Results are for the single-scale model on the Cityscapes validation set. (We have successfully reproduced the performance of the Caffe framework.)

Model          Accuracy        Model            Accuracy
train_30k      67.26%/67.7%    train_30k_bn     67.31%/67.7%
trainval_90k   80.90%          trainval_90k_bn  80.81%
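
Assuming the accuracy column reports mean intersection-over-union (mIoU), as the ADE20k result further down does, the metric can be sketched as follows. This is not the evaluation code used by evaluate.py; the function name and the toy data are illustrative, and pixels labelled 255 are skipped in line with the 'ignore_label' setting in the config.

```python
def mean_iou(pred, label, num_classes, ignore_label=255):
    """mIoU over flat lists of predicted and ground-truth class IDs.

    Pixels whose ground truth equals `ignore_label` are skipped, and classes
    that never occur in either list are left out of the average.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, label):
        if g == ignore_label:
            continue
        if p == g:
            # True positive: counts once towards intersection and union.
            intersection[g] += 1
            union[g] += 1
        else:
            # False positive for class p, false negative for class g.
            union[p] += 1
            union[g] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    prediction = [0, 0, 1, 1, 2, 2]
    truth = [0, 1, 1, 1, 2, 255]
    print(mean_iou(prediction, truth, num_classes=3))
```

In the toy example the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18.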

Run the following command to get the evaluation results:

python evaluate.py --dataset=cityscapes --filter-scale=1 --model=trainval

List of Args:

--model=train       - To select train_30k model
--model=trainval    - To select trainval_90k model
--model=train_bn    - To select train_30k_bn model
--model=trainval_bn - To select trainval_90k_bn model

ADE20k

Reaches 32.25% mIoU on the ADE20k validation set.

python evaluate.py --dataset=ade20k --filter-scale=2 --model=others

Note: to use the model we provide, set --filter-scale to 2.

Training on your own dataset

This implementation differs from the details described in the ICNet paper, since we did not reproduce the model compression part. Instead, we train on the half kernels directly.

In the original paper, the authors trained the model with full kernels and then applied a model-pruning technique to remove half of the kernels. Here we use --filter-scale to indicate whether the network is pruned.

For example, --filter-scale=1 <-> [h, w, 32] and --filter-scale=2 <-> [h, w, 64].
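
The relationship in the example above can be written as a one-line rule: the channel count is the base width multiplied by the filter scale, so --filter-scale=1 gives the half-width (32-channel) layer and --filter-scale=2 the full-width (64-channel) one. The sketch below encodes only that example; the names are hypothetical, and whether every layer in the network scales the same way is an assumption.

```python
BASE_CHANNELS = 32  # from the example above: --filter-scale=1 <-> [h, w, 32]


def scaled_channels(filter_scale, base_channels=BASE_CHANNELS):
    """Channel count implied by --filter-scale for a layer with `base_channels`."""
    if filter_scale not in (1, 2):
        raise ValueError("the text only describes --filter-scale 1 and 2")
    return base_channels * filter_scale


if __name__ == "__main__":
    for scale in (1, 2):
        print(scale, scaled_channels(scale))
```

Because the layer shapes differ between the two settings, weights saved for one filter scale cannot be loaded into a network built with the other.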

Step by Step

1. Change the configurations in utils/config.py.

cityscapes_param = {'name': 'cityscapes',
                    'num_classes': 19,
                    'ignore_label': 255,
                    'eval_size': [1025, 2049],
                    'eval_steps': 500,
                    'eval_list': CITYSCAPES_eval_list,
                    'train_list': CITYSCAPES_train_list,
                    'data_dir': CITYSCAPES_DATA_DIR}

2. Set the hyperparameters in train.py:

class TrainConfig(Config):
    def __init__(self, dataset, is_training,  filter_scale=1, random_scale=None, random_mirror=None):
        Config.__init__(self, dataset, is_training, filter_scale, random_scale, random_mirror)

    # Set pre-trained weights here (You can download weight using `python script/download_weights.py`) 
    # Note that you need to use "bnnomerge" version.
    model_weight = './model/cityscapes/icnet_cityscapes_train_30k_bnnomerge.npy'
    
    # Set hyperparameters here, you can get much more setting in Config Class, see 'utils/config.py' for details.
    LAMBDA1 = 0.16
    LAMBDA2 = 0.4
    LAMBDA3 = 1.0
    BATCH_SIZE = 4
    LEARNING_RATE = 5e-4

3. Run the following command, deciding whether to update the mean/variance and whether to train the beta/gamma variables.

python train.py --update-mean-var --train-beta-gamma \
      --random-scale --random-mirror --dataset cityscapes --filter-scale 2

Note: Be careful with --update-mean-var! This flag updates the moving mean and moving variance in the batch normalization layers, which requires a large batch size; otherwise it leads to poor results.
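
To see why a small batch hurts when the moving statistics are updated, here is a toy sketch of the usual exponential-moving-average update in batch normalization, together with a measurement of how noisy the batch mean is at two batch sizes. It is not the code path used by train.py: the decay value is illustrative, and activations are modelled as independent unit-variance Gaussian samples, which is a simplification.

```python
import random


def update_moving_stats(moving_mean, moving_var, batch, decay=0.9):
    """One exponential-moving-average update of batch-norm statistics.

    `decay` is an illustrative value, not the one used in this repo.
    """
    batch_mean = sum(batch) / len(batch)
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / len(batch)
    new_mean = decay * moving_mean + (1.0 - decay) * batch_mean
    new_var = decay * moving_var + (1.0 - decay) * batch_var
    return new_mean, new_var


def batch_mean_spread(batch_size, trials=2000, seed=0):
    """Standard deviation of the batch mean for unit-variance Gaussian samples."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(0.0, 1.0) for _ in range(batch_size)) / batch_size
             for _ in range(trials)]
    centre = sum(means) / trials
    return (sum((m - centre) ** 2 for m in means) / trials) ** 0.5


if __name__ == "__main__":
    print(update_moving_stats(0.0, 1.0, [1.0, 3.0]))
    # The batch mean is far noisier with 4 samples than with 64.
    print(batch_mean_spread(4), batch_mean_spread(64))
```

Each update pulls the stored statistics towards the current batch, so noisy batch estimates translate directly into noisy moving statistics at inference time.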

Result (inference with my own data)

Citation

@article{zhao2017icnet,
  author = {Hengshuang Zhao and
            Xiaojuan Qi and
            Xiaoyong Shen and
            Jianping Shi and
            Jiaya Jia},
  title = {ICNet for Real-Time Semantic Segmentation on High-Resolution Images},
  journal={arXiv preprint arXiv:1704.08545},
  year = {2017}
}

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

@article{zhou2016semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={arXiv preprint arXiv:1608.05442},
  year={2016}
}

If you find this implementation or the pre-trained models helpful, please consider citing:

@misc{Yang2018,
  author = {Hsuan-Kung, Yang},
  title = {ICNet-tensorflow},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/hellochick/ICNet-tensorflow}}
}
Comments
  • train the model

    Hello,

    I am interested in your work and would like to try using your code to train my dataset. Would you provide the training code and give me some suggestions about this?

    Thank you so much.

    opened by HoracceFeng 10
  • Error arises from the size of own training data

    Hello, I tried to train the model on my own dataset but ran into some problems. These are the steps I took:

    Note: the dataset has 32 classes, and the image size is 360*480.

    1. Changed "INPUT_SIZE" to 360*480 in train.py.
    2. Changed "NUM_CLASSES" to 32 in train.py.
    3. To use your pretrained model, I renamed "conv6_cls", "sub4_out" and "sub24_out" in class ICNet_BN, since they all depend on num_classes. Then I set "ignore_missing" to True on line 186 of train.py.
    4. Ran the command "python train.py --update-mean-var --train-beta-gamma", but it showed the error "ValueError: Dimension 1 in both shapes must be equal, but are 44 and 45 From merging shape 0 with other shapes. for 'sub12_sum' (op: 'AddN') with input shapes: [32,44,60,128], [32,45,60,128]. "
    5. Changed "INPUT_SIZE" back to 480*480 (though it does not make sense).
    6. Ran the command "python train.py --update-mean-var --train-beta-gamma" again, and training started successfully.
    7. Tried to use the newly trained checkpoint for inference with the command "python inference.py --img-path=./input/test.png --model=others", but it failed with an error (as I expected).

    It seems that your model can train on images of different sizes, and you have also updated the code to allow inference at different sizes. I am still working through the code and have tried to explain my thinking clearly. Any comments would be much appreciated. Thanks in advance for your help.

    opened by YiSyuanChen 7
  • picture path in ade20k dataset seems not right

    When I run this command:

    python evaluate.py --dataset=cityscapes --filter-scale=2 --model=others

    I get:

    ValueError: Failed to find file: /home/media/data/ade20k/images/validation/ADE_val_00000001.jpg

    But in fact the file is elsewhere (inside the "validation" folder there are several subfolders that arrange the pictures in alphabetical order):

    [media@localhost validation]$ find . -name 'ADE_val_00000001.jpg'
    ./a/abbey/ADE_val_00000001.jpg

    I downloaded the full dataset (ADE20K_2016_07_26.zip) from http://groups.csail.mit.edu/vision/datasets/ADE20K/.

    After I copied all the pictures into one folder, it shows:

    ValueError: Failed to find file: /home/media/data/ade20k/annotations/validation/ADE_val_00000001.png

    but this png file doesn't exist.

    Is this a bug, or did I do something wrong?

    opened by BeyondHeaven 6
  • class number issues

    In train.py, for CITYSCAPES_DATASET, NUM_CLASSES is set to 19 (line 29). On line 57 we can see parser.add_argument("--num-classes", type=int, default=NUM_CLASSES, help="Number of classes to predict (including background).") So it seems that --num-classes should include the background.

    However, in tools.py we can see that label_colours (line 6) includes 19 classes, without a background class. That means that if we include a background class, NUM_CLASSES should be 20.

    So, is there an explanation for that? Or am I missing something?

    opened by ifangcheng 6
  • Different eval metrics for model with/without bn

    Hi,

    When I evaluate on cityscapes using train_30k and train_30k_bnnomerge models, I am getting different mIOU of 65.6% and 59.3% respectively. As per my understanding, they should ideally give the same results. Am I missing something?

    Thanks

    help wanted 
    opened by alasin 6
  • Inference Time is too high

    Hi, I took your code directly, modified only the dataset paths, and ran evaluate.py to measure the inference time. I am getting an inference time of 4 sec (as opposed to ~0.04 sec) on a Tesla X GPU. Could you please point out what the reason for this might be?

    Thanks, Sudhir

    opened by skrya 6
  • What's your output node of your model file ?help.

    Hi friends, I am a postgraduate student. I am trying to port this model to my Android examples. I added the following code to your inference.py to save a ckpt model file: saver=tf.train.Saver() saver.save(sess,'checkpoint/ICNet.ckpt'). Then I used a 'CkptToPb.py' script that I wrote myself to convert the ckpt model to a pb file. But I need to know the output node of your model, and I cannot find anything obvious about it, so I need your help. Could you upload your 'model.ckpt' to your git? I would really like to compare it with mine to find out what is wrong. I really appreciate your kindness. Thank you very much.

    opened by wangyarui 6
  • Own dataset training produces low loss, but unsatisfying results

    Hi.

    I got training working and used icnet_cityscapes_trainval_90k.npy with --filter-scale=1. The loss drops quite quickly to 0.08 in ~2000 epochs, which I presumed was quite good. But when I run evaluation or inference, the output is garbage, as seen in the example.

    I had to change the number of classes; perhaps that is the error, or have I labelled them incorrectly?

    Currently my image has shape (480, 870, 3), the gt has shape (480, 870), and the gt values are in [2,5,8,10, etc], belonging to the corresponding cityscapes classes.

    My own theory is that the pretrained model is slowly transitioning to my classes, which is why the output is mixed, but that would not explain the constant low loss.

    Do you have some other ideas perhaps?

    Regards, Tamme

    opened by Tamme 5
  • JPG image size error

    I tried to run inference on my own .jpg images of different sizes (e.g. 312*492, 264*385, 1435*832) and got an InvalidArgument error like this: (inputs to operation sub12_sum of type AddN must have the same size and shape. input 0: [1,128,256,128]!=input 1: [1,49,33,128])

    It seems that the code does preprocess the input by resizing it to 1024*2048, so I am confused.

    opened by ifangcheng 5
  • About RGB-BGR conversion in preprocess() in inference and evaluate

    Hi there, I have a question concerning the preprocessing part in evaluate.py and inference.py: The channels get swapped from RGB to BGR, but I cannot find such a swapping in train.py. Why is it applied and did I miss it in train.py?

    opened by Sparkofska 4
  • Increase in accuracy after fixing #41 ?

    Hi @hellochick. After fixing the bug referenced in #41 , did you get a bump in accuracy too? The Trainval_bn model gives me around 80% class IoU. However I am not sure if there is some mistake.

    opened by rydeldcosta 4
  • Previous project

    Hello, has the following error from your previous project been solved? File "C:\ProgramFilesMyself\Anaconda3\envs\garbage2\lib\site-packages\tensorflow\python\client\session.py", line 1095, in _run 'Cannot interpret feed_dict key as Tensor: ' + e.args[0]) TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("input_1:0", shape=(?, 456, 456, 3), dtype=float32) is not an element of this graph. I saw that you ran into the same problem under the garbage-classification project, so I came here to ask you about it.

    opened by ziduGithub 0
  • Assign requires shapes of both tensors to match. lhs shape= [13] rhs shape= [150]

    I have made the following changes:

    • changed restore_var = tf.global_variables() to restore_var = [v for v in tf.global_variables() if 'conv6_cls' not in v.name]
    • updated INPUT_SIZE to '512, 512'
    • updated NUM_CLASSES to 13
    • filter_scale=2
    • model_type = 'others'

    When I run python train.py --dataset others, I get these errors:

    raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [13] rhs shape= [150] [[node save/Assign_382 (defined at /home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py:79) ]]

    Errors may have originated from an input operation. Input Source operations connected to node save/Assign_382: sub24_out/biases (defined at /home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py:145)

    Original stack trace for 'save/Assign_382': File "train.py", line 167, in main() File "train.py", line 146, in main train_net.restore(cfg.model_weight, restore_var) File "/home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py", line 79, in restore loader = tf.train.Saver(var_list=tf.global_variables()) File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 825, in init self.build() File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 837, in build self._build(self._filename, build_save=True, build_restore=True) File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 875, in _build build_restore=build_restore) File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 508, in _build_internal restore_sequentially, reshape) File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 350, in _AddRestoreOps assign_ops.append(saveable.restore(saveable_ten

    opened by shakey-cuimiao 0
  • bad results of voc2012

    As you said, when the mean/var update is enabled, a large batch is needed; otherwise the results are very bad. That is my situation now: my results are very bad. What batch size do you use when training? I'm training on VOC. Does the dataset have any effect? @hellochick I hope for your reply, thanks.

    opened by AishuaiYao 0
  • Training over-fitting after every epochs

    I used train.py to train on my relabelled ade20k dataset (150 classes reduced to 5) but couldn't get a good result. mIoU always peaks at 57%.

    My training parameters:

    • Batch Size: 32

    After plotting the training loss and validation loss, I found that the loss pattern is suspicious. The model overfits regularly after every epoch, and I assume that is why it can't reach a higher mIoU. I have no idea why.

    opened by songshan0321 0