TensorFlow-based implementation of "ICNet for Real-Time Semantic Segmentation on High-Resolution Images".

HsuanKung Yang

Last update: Nov 27, 2022

Related tags

Deep Learning real-time tensorflow semantic-segmentation cityscapes ade20k icnet

Overview

ICNet_tensorflow

This repo provides a TensorFlow-based implementation of the paper "ICNet for Real-Time Semantic Segmentation on High-Resolution Images" by Hengshuang Zhao et al. (ECCV'18).

The model generates a segmentation mask for every pixel in the image. It is based on ResNet50 with three branches as auxiliary paths; see the architecture below for illustration.

We provide both training and inference code in this repo. The pre-trained models we provide are converted from the Caffe weights in the official implementation.

News (2018.10.22 updated):

You can now try ICNet on your own image online using the ModelDepot live demo!

Environment Setup
Download Weights
Download Dataset
- ade20k
- cityscapes
Get Started!

Environment Setup

pip install tensorflow-gpu opencv-python jupyter matplotlib tqdm

Download Weights

We provide pre-trained weights for the cityscapes and ADE20k datasets. You can download the weights with the following command (use --dataset ade20k for the ADE20k weights):

python script/download_weights.py --dataset cityscapes

Download Dataset (Optional)

If you want to evaluate the provided weights or continue fine-tuning on the cityscapes and ade20k datasets, you need to download them. Each dataset is obtained differently.

ADE20k dataset

Simply run the following command:

bash script/download_ADE20k.sh

Cityscapes dataset

You need to download the Cityscapes dataset from the official website first (you'll need to request access, which may take a couple of days).

Then convert the downloaded ground truth to the training format by following the instructions to install cityscapesScripts and then running these commands:

export CITYSCAPES_DATASET=<cityscapes dataset path>
csCreateTrainIdLabelImgs

Get started!

This repo provides three fully documented phases, so you can try training, evaluation, and inference on your own.

Inference on your own image

demo.ipynb shows the easiest example of running semantic segmentation on your own image.

At the end of demo.ipynb, you can test the speed of ICNet.

Here are some results from a Titan Xp with high-resolution images (1024x2048):
~0.037 s per image, which means we can get ~27 fps (nearly the same as described in the paper).
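
The fps figure above is just the reciprocal of the per-image latency (1 / 0.037 s is roughly 27). The following is a minimal timing sketch, not the code used in demo.ipynb; `measure_fps` and `fake_inference` are hypothetical names, and the sleep is a stand-in for a real forward pass on one image.

```python
import time


def measure_fps(run_once, warmup=2, iterations=10):
    """Return (seconds_per_image, fps) for a zero-argument callable."""
    for _ in range(warmup):
        run_once()  # warm-up runs are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    seconds_per_image = (time.perf_counter() - start) / iterations
    return seconds_per_image, 1.0 / seconds_per_image


def fake_inference():
    # Hypothetical stand-in workload; replace with one forward pass of the model.
    time.sleep(0.005)


sec, fps = measure_fps(fake_inference)
print("%.4f s per image, ~%.1f fps" % (sec, fps))
```

Excluding a few warm-up runs matters on a GPU, where the first calls usually include one-off setup costs that would otherwise inflate the average.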

Evaluate on cityscapes/ade20k dataset

To get the results, you first need to follow the steps mentioned above to download the dataset.
Then change the data_dir path in config.py.

CITYSCAPES_DATA_DIR = '/data/cityscapes_dataset/cityscape/'
ADE20K_DATA_DIR = './data/ADEChallengeData2016/'

Cityscapes

Performance of the single-scale model on the cityscapes validation dataset. (We have successfully reproduced the same performance as the Caffe framework.)

Model	Accuracy	Model	Accuracy
train_30k	67.26%/67.7%	train_30k_bn	67.31%/67.7%
trainval_90k	80.90%	trainval_90k_bn	80.81%

Run the following command to get the evaluation results:

python evaluate.py --dataset=cityscapes --filter-scale=1 --model=trainval

List of Args:

--model=train       - To select train_30k model
--model=trainval    - To select trainval_90k model
--model=train_bn    - To select train_30k_bn model
--model=trainval_bn - To select trainval_90k_bn model

ADE20k

Reaches 32.25% mIoU on the ADE20k validation set.

python evaluate.py --dataset=ade20k --filter-scale=2 --model=others

Note: to use the model provided by us, set filter-scale to 2.
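
For readers unfamiliar with the metric, mean IoU averages the per-class intersection-over-union. The sketch below assumes the standard definition and is not the code in evaluate.py; `mean_iou` is a hypothetical helper, and the ignore label of 255 is taken from the cityscapes configuration shown later in this README.

```python
def mean_iou(preds, labels, num_classes, ignore_label=255):
    """Mean IoU over flat sequences of predicted and ground-truth class ids.

    Predictions are assumed to lie in [0, num_classes).
    """
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        if t == ignore_label:
            continue  # ignored pixels do not contribute to the score
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


labels = [0, 0, 1, 1, 255, 2]
preds = [0, 1, 1, 1, 0, 2]
print(mean_iou(preds, labels, num_classes=3))
```

In the toy example the fifth pixel is labeled 255, so its prediction is skipped entirely.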

Training on your own dataset

This implementation differs from the details described in the ICNet paper, since I did not reproduce the model compression part. Instead, we train on the half kernels directly.

In the original paper, the authors trained the model with full kernels and then applied a model-pruning technique to remove half of the kernels. Here we use --filter-scale to denote whether the network is pruned or not.

For example, --filter-scale=1 <-> [h, w, 32] and --filter-scale=2 <-> [h, w, 64].
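
The mapping in the example above can be written as a small helper. This is only a sketch of the relationship stated here, assuming the channel count scales linearly with the flag and that 1 and 2 are the only meaningful values; `scaled_channels` and `BASE_CHANNELS` are hypothetical names, not identifiers from this repo.

```python
BASE_CHANNELS = 32  # channel count at --filter-scale=1, per the example above


def scaled_channels(filter_scale, base=BASE_CHANNELS):
    """Channel count for a layer with `base` channels at filter scale 1."""
    if filter_scale not in (1, 2):
        # Assumption: only the half-kernel (1) and full-kernel (2) settings exist.
        raise ValueError("filter_scale must be 1 or 2, got %r" % (filter_scale,))
    return base * filter_scale


for scale in (1, 2):
    print("--filter-scale=%d -> [h, w, %d]" % (scale, scaled_channels(scale)))
```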

Step by Step

1. Change the configurations in utils/config.py.

cityscapes_param = {'name': 'cityscapes',
                    'num_classes': 19,
                    'ignore_label': 255,
                    'eval_size': [1025, 2049],
                    'eval_steps': 500,
                    'eval_list': CITYSCAPES_eval_list,
                    'train_list': CITYSCAPES_train_list,
                    'data_dir': CITYSCAPES_DATA_DIR}

2. Set the hyperparameters in train.py:

class TrainConfig(Config):
    def __init__(self, dataset, is_training,  filter_scale=1, random_scale=None, random_mirror=None):
        Config.__init__(self, dataset, is_training, filter_scale, random_scale, random_mirror)

    # Set pre-trained weights here (You can download weight using `python script/download_weights.py`) 
    # Note that you need to use "bnnomerge" version.
    model_weight = './model/cityscapes/icnet_cityscapes_train_30k_bnnomerge.npy'
    
    # Set hyperparameters here, you can get much more setting in Config Class, see 'utils/config.py' for details.
    LAMBDA1 = 0.16
    LAMBDA2 = 0.4
    LAMBDA3 = 1.0
    BATCH_SIZE = 4
    LEARNING_RATE = 5e-4

3. Run the following command, deciding whether to update the mean/variance and whether to train the beta/gamma variables.

python train.py --update-mean-var --train-beta-gamma \
      --random-scale --random-mirror --dataset cityscapes --filter-scale 2

Note: Be careful with --update-mean-var! Using this flag means you will update the moving mean and moving variance in the batch normalization layers. This needs a large batch size; otherwise it will lead to bad results.
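
To see why a small batch hurts here, the toy sketch below mimics how batch normalization typically maintains its moving statistics as an exponential moving average of per-batch estimates. It is a one-dimensional illustration, not the TensorFlow implementation: the function names are hypothetical and the decay of 0.9 is an illustrative value, not the one used in this repo.

```python
import random


def update_moving_stats(moving_mean, moving_var, batch, decay=0.9):
    """One exponential-moving-average update from a batch of scalar activations."""
    n = len(batch)
    batch_mean = sum(batch) / n
    batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
    new_mean = decay * moving_mean + (1 - decay) * batch_mean
    new_var = decay * moving_var + (1 - decay) * batch_var
    return new_mean, new_var


def spread_of_batch_means(batch_size, trials=2000, seed=0):
    """Standard deviation of batch means for activations drawn from N(0, 1)."""
    rng = random.Random(seed)
    means = [
        sum(rng.gauss(0, 1) for _ in range(batch_size)) / batch_size
        for _ in range(trials)
    ]
    mu = sum(means) / trials
    return (sum((m - mu) ** 2 for m in means) / trials) ** 0.5


# Smaller batches give noisier per-batch means, so the moving statistics
# they feed are noisier as well.
print("batch size 4: ", spread_of_batch_means(4))
print("batch size 64:", spread_of_batch_means(64))
```

The per-batch estimates scatter more widely at batch size 4 than at 64, which is the noise that ends up in the moving mean and variance when --update-mean-var is used with a small batch.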

Result (inference with my own data)

Citation

@article{zhao2017icnet,
  author = {Hengshuang Zhao and
            Xiaojuan Qi and
            Xiaoyong Shen and
            Jianping Shi and
            Jiaya Jia},
  title = {ICNet for Real-Time Semantic Segmentation on High-Resolution Images},
  journal={arXiv preprint arXiv:1704.08545},
  year = {2017}
}

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

@article{zhou2016semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={arXiv preprint arXiv:1608.05442},
  year={2016}
}

If you find this implementation or the pre-trained models helpful, please consider citing:

@misc{Yang2018,
  author = {Hsuan-Kung, Yang},
  title = {ICNet-tensorflow},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/hellochick/ICNet-tensorflow}}
}

Comments

train the model

Hello,

I am interested in your work and would like to try using your code to train my dataset. Would you provide the training code and give me some suggestions about this?

Thank you so much.

opened by HoracceFeng 10
Error arises from the size of own training data
Hello, I tried to use my own dataset to train the model, but ran into some problems. These are the steps I took:

Note: The dataset has 32 classes, and the image size is 360*480.

1. Changed "INPUT_SIZE" to 360*480 in train.py.

2. Changed "NUM_CLASSES" to 32 in train.py.

3. To use your pretrained model, I modified the names of "conv6_cls", "sub4_out" and "sub24_out" in class ICNet_BN, since they all depend on num_classes. Then I set "ignore_missing" to True in line 186 of train.py.

4. Ran the command "python train.py --update-mean-var --train-beta-gamma", but it showed the error "ValueError: Dimension 1 in both shapes must be equal, but are 44 and 45 From merging shape 0 with other shapes. for 'sub12_sum' (op: 'AddN') with input shapes: [32,44,60,128], [32,45,60,128]. "

5. Changed "INPUT_SIZE" back to 480*480 (though it does not make sense).

6. Ran the command "python train.py --update-mean-var --train-beta-gamma" again, and training started successfully.

7. Tried to use the checkpoint just trained for inference, so I ran the command "python inference.py --img-path=./input/test.png --model=others", but it gave an error (as I expected).

It seems that your model can train on images of different sizes, and you have also updated the code to allow inference at different sizes. I am still figuring out the code, and I have tried to explain my thinking clearly. Any comments will be much appreciated. Thank you very much for your help in advance.
opened by YiSyuanChen 7
Picture path in ade20k dataset seems not right

When I run this command:

python evaluate.py --dataset=cityscapes --filter-scale=2 --model=others

I get:

ValueError: Failed to find file: /home/media/data/ade20k/images/validation/ADE_val_00000001.jpg

but in fact the path is different (inside the "validation" folder there are several folders that arrange the pictures in alphabetical order):

[media@localhost validation]$ find . -name 'ADE_val_00000001.jpg'
./a/abbey/ADE_val_00000001.jpg

I downloaded it from http://groups.csail.mit.edu/vision/datasets/ADE20K/ (the full dataset, ADE20K_2016_07_26.zip).

After I copy all the pictures into one folder, it shows:

ValueError: Failed to find file: /home/media/data/ade20k/annotations/validation/ADE_val_00000001.png

but this png file doesn't exist.

Is this a bug, or did I do something wrong?

opened by BeyondHeaven 6
Class number issues

In train.py, for CITYSCAPES_DATASET, NUM_CLASSES is set to 19 (in line 29). In line 57 we can see parser.add_argument("--num-classes", type=int, default=NUM_CLASSES, help="Number of classes to predict (including background).") So it seems that --num-classes should be "including background".

However, in tools.py we can see that label_colours (in line 6) includes 19 classes, without a background class. This means that if we include the background class, NUM_CLASSES should be 20.

So, is there an explanation for that? Or am I missing something?

opened by ifangcheng 6
Different eval metrics for model with/without bn

Hi,

When I evaluate on cityscapes using train_30k and train_30k_bnnomerge models, I am getting different mIOU of 65.6% and 59.3% respectively. As per my understanding, they should ideally give the same results. Am I missing something?

Thanks
help wanted

opened by alasin 6
Inference Time is too high

Hi, I have taken your code directly and just modified the path to the datasets, and then ran evaluate.py to measure the inference time. I am getting an inference time of 4 s (as opposed to ~0.04 s) on a Tesla X GPU. Could you please point out what could be causing this?

Thanks, Sudhir

opened by skrya 6
What's the output node of your model file? Help.

Hi, friends, I am a postgraduate student. I am trying to port this model to my Android examples. I added the following code to your inference.py to save your ckpt model file:

saver=tf.train.Saver()
saver.save(sess,'checkpoint/ICNet.ckpt')

Then I used a 'CkptToPb.py' script that I wrote myself to convert the ckpt model to a pb file. But I need to know the output node of your model, and I cannot find anything obvious about it, so I need your help. Can you upload your 'model.ckpt' to your git? I really want to compare it with mine to find out what is wrong. I really appreciate your kindness, thank you very much. Thanks again sincerely.

opened by wangyarui 6
Own dataset training produces low loss, but unsatisfying results

Hi.

I got training working and used icnet_cityscapes_trainval_90k.npy with --filter-scale=1. The loss drops quite quickly to 0.08 in ~2000 epochs, which I presumed was quite good. But when running evaluate or inference, the output is garbage, as seen in the example.

I had to change the number of classes; perhaps that could be the error, or have I labeled them incorrectly?

Currently my image has shape (480, 870, 3), the gt shape is (480, 870), and the gt values are in [2,5,8,10, etc], belonging to the corresponding cityscapes classes.

My own theory would be that the pretrained model is slowly transitioning to my classes and that is why the output is mixed, but that would not explain the constant low loss.

Do you have some other ideas perhaps?

Regards, Tamme

opened by Tamme 5
JPG image size error

I tried to run inference on my own .jpg images with different sizes (e.g. 312*492, 264*385, 1435*832), and I got an InvalidArgument error like this: (inputs to operation sub12_sum of type AddN must have the same size and shape. input 0: [1,128,256,128]!=input 1: [1,49,33,128])

It seems that the code does have a preprocessing step that resizes the input image to 1024*2048, so I am confused.

opened by ifangcheng 5
About RGB-BGR conversion in preprocess() in inference and evaluate

Hi there, I have a question concerning the preprocessing part in evaluate.py and inference.py: The channels get swapped from RGB to BGR, but I cannot find such a swapping in train.py. Why is it applied and did I miss it in train.py?

opened by Sparkofska 4
Increase in accuracy after fixing #41 ?

Hi @hellochick. After fixing the bug referenced in #41 , did you get a bump in accuracy too? The Trainval_bn model gives me around 80% class IoU. However I am not sure if there is some mistake.

opened by rydeldcosta 4
Previous project

Hello, did you ever solve this error from your previous project: File "C:\ProgramFilesMyself\Anaconda3\envs\garbage2\lib\site-packages\tensorflow\python\client\session.py", line 1095, in _run 'Cannot interpret feed_dict key as Tensor: ' + e.args[0]) TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("input_1:0", shape=(?, 456, 456, 3), dtype=float32) is not an element of this graph. I saw that you ran into the same problem under the garbage classification project, so I came here to ask you about it.

opened by ziduGithub 0
Assign requires shapes of both tensors to match. lhs shape= [13] rhs shape= [150]

I have changed restore_var = tf.global_variables() to restore_var = [v for v in tf.global_variables() if 'conv6_cls' not in v.name], updated INPUT_SIZE to '512, 512', updated NUM_CLASSES to 13, and set filter_scale=2 and model_type = 'others'. When I run python train.py --dataset others, I get this error:

raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [13] rhs shape= [150]
[[node save/Assign_382 (defined at /home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py:79) ]]

Errors may have originated from an input operation.
Input Source operations connected to node save/Assign_382:
sub24_out/biases (defined at /home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py:145)

Original stack trace for 'save/Assign_382':
File "train.py", line 167, in <module>
main()
File "train.py", line 146, in main
train_net.restore(cfg.model_weight, restore_var)
File "/home/sherry/cuimiao/Fabric_defect_detection/ICNet-tensorflow/network.py", line 79, in restore
loader = tf.train.Saver(var_list=tf.global_variables())
File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 825, in __init__
self.build()
File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 837, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 875, in _build
build_restore=build_restore)
File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 508, in _build_internal
restore_sequentially, reshape)
File "/home/sherry/.local/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 350, in _AddRestoreOps
assign_ops.append(saveable.restore(saveable_ten

opened by shakey-cuimiao 0
Bad results on voc2012

As you said, after enabling the mean-var parameter, you need a big batch, otherwise the result is very bad. That is my situation now: my results are very bad. What batch size do you use when training? I'm training on voc. Does the dataset have any effect? @hellochick I hope for your reply, thanks

opened by AishuaiYao 0
Training over-fitting after every epoch
I use train.py to train on my relabeled ade20k dataset (150 to 5 classes) but couldn't get a good result. mIoU always peaks at 57%.

My training parameters:

Batch Size: 32

After plotting out the train loss and val loss graph, I found that the loss pattern is suspicious, it is shown below. Model is overfitting regularly after every epoch and I assume that's the reason why my model can't hit a higher mIoU. I have no idea why.
opened by songshan0321 0

