LinkNet - This repository contains our Torch7 implementation of the network developed by us at e-Lab.

e-Lab

Last update: Nov 11, 2022

Related tags

Deep Learning LinkNet

Overview

LinkNet

This repository contains our Torch7 implementation of LinkNet, the network developed by us at e-Lab. For further details, see our blogpost or read the article LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation.

Dependencies:

Torch7 : you can follow our installation steps specified here
VideoDecoder : video decoder for Torch that uses the avcodec library.
Profiler : use it to calculate the number of parameters, the number of operations, and the forward pass time of any network trained using Torch.

Currently the network can be trained on two datasets:

Datasets	Input Resolution	# of classes
CamVid (cv)	768x576	11
Cityscapes (cs)	1024x512	19

To download both datasets, follow the link provided above. The training script first resizes both datasets; if you want, you can cache the resized data using the --cachepath option. For the CamVid dataset, the available video data is first split into train/validate/test sets. This is done by the prepCamVid.lua file, and dataDistributionCV.txt contains the details of how the CamVid dataset is split. These steps run automatically before the network is trained.

LinkNet performance on both of the above datasets:

Datasets	Best IoU	Best iIoU
Cityscapes	76.44	60.78
CamVid	69.10	55.83

Pretrained models and confusion matrices for both datasets can be found in the latest release.

Files/folders and their usage:

main.lua : main file
opts.lua : contains all the input options used by the training script
data : data loaders for loading datasets
models : all the model architectures are defined here
train.lua : loading of models and error calculation
test.lua : calculate testing error and save confusion matrices

There are three model files present in models folder:

model.lua : our LinkNet architecture
model-res-dec.lua : LinkNet with a residual connection in each of the decoder blocks. This slightly improves the results, but we had to use bilinear interpolation in the residual connection, because of which we were not able to run our trained model on the TX1.
nobypass.lua : this architecture does not use any link between encoder and decoder. You can use this model to verify whether connecting the encoder and decoder modules actually improves performance.

A sample command to train network is given below:

th main.lua --datapath /Datasets/Cityscapes/ --cachepath /dataCache/cityscapes/ --dataset cs --model models/model.lua --save /Models/cityscapes/ --saveTrainConf --saveAll --plot

License

This software is released under a creative commons license which allows for personal and research use only. For a commercial license please contact the authors. You can view a license summary here: http://creativecommons.org/licenses/by-nc/4.0/

Comments

memory consuming

The model reads the whole dataset into memory, which is too memory-consuming. It may be better to read the dataset list and iterate over that list during training.
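The suggestion above can be sketched as follows. This is a minimal Python illustration of the idea only, not the repository's Lua data loader; `iter_batches` and `load_fn` are hypothetical names introduced for this example.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(paths: Sequence[str],
                 batch_size: int,
                 load_fn: Callable[[str], T]) -> Iterator[List[T]]:
    """Yield batches, loading each sample only when its batch is requested."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(paths), batch_size):
        # Only the files belonging to the current batch are loaded.
        yield [load_fn(p) for p in paths[start:start + batch_size]]


# Toy usage: the "loader" just records which paths were touched.
touched: List[str] = []


def fake_load(path: str) -> str:
    touched.append(path)
    return path.upper()


batches = iter_batches(["a.png", "b.png", "c.png"], 2, fake_load)
first_batch = next(batches)
print(first_batch, touched)
```

After the first `next` call only the first two files have been read, so peak memory depends on the batch size rather than on the dataset size.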

opened by mingminzhen 7

Training on camvid dataset

Hi. I can't reproduce your results on the CamVid dataset. What learning rate and number of training epochs did you use in your training, and are your published results on the validation or the test set?

opened by vietdoan 4

Torch: not enough memory (17GB)

Hi, all

When I run : th main.lua --datapath /data2/cityscapes_dataset/leftImg8bit/all_train_images/ --cachepath /data2/cityscapes_dataset/leftImg8bit/dataCache/ --dataset cs --model models/model.lua --save save_models/cityscapes/ --saveTrainConf --saveAll --plot

I got "Torch: not enough memory: you tried to allocate 17GB" error (details)

It's strange because the paper mentions that the network was trained on a Titan X, which has 12GB of memory. Why does the network consume 17GB when running?

Any suggestions to fix this issue?

Thanks!
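One possible reading of the 17GB figure is that it is host memory for the whole resized dataset held as a single float tensor, rather than GPU memory for the network. The arithmetic below is only a sketch of that guess: the image count (2975, the size of the standard Cityscapes training split), the 3 channels, and the 4-byte floats are assumptions that are not stated in this report.

```python
# Assumed values (not taken from the report above).
NUM_IMAGES = 2975          # standard Cityscapes training split
CHANNELS = 3               # RGB
HEIGHT, WIDTH = 512, 1024  # Cityscapes input resolution used for training
BYTES_PER_VALUE = 4        # 32-bit float


def dataset_bytes(num_images: int, channels: int, height: int,
                  width: int, bytes_per_value: int) -> int:
    """Size in bytes of one dense tensor holding every image."""
    return num_images * channels * height * width * bytes_per_value


total = dataset_bytes(NUM_IMAGES, CHANNELS, HEIGHT, WIDTH, BYTES_PER_VALUE)
print(f"{total} bytes = {total / 2**30:.2f} GiB")
```

Under these assumptions the images alone take about 17.4 GiB, which is close to the reported allocation.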

opened by amiltonwong 3

Fine Tuning

Hi,

Is there any possibility to fine-tune this model on a custom dataset with a different number of classes? As far as I know, the pre-trained weights must also be available for that.

opened by MyVanitar 3

Model input/output details?

Hi,

I'm having a hell of a time trying to understand what the model is expecting in terms of input and output. I'm trying to use this model in an iOS project, so I need to convert the model to Apple's CoreML format.

Image input questions:

For image pixel values: 0-255, 0-1, -1-1?

RGB or BGR?

Color bias?

Prediction output:

Looks like the shape is # of classes, width, height?

Predictions are positive floats from 0-100?

So far I'm having the best luck with these specifications:

import torch
from torch2coreml import convert
from torch.utils.serialization import load_lua

model = load_lua("model-cs-IoU-cpu.net")
input_shape = (3, 512, 1024)
coreml_model = convert(
    model,
    [input_shape],
    input_names=['inputImage'],
    output_names=['outputImage'],
    image_input_names=['inputImage'],
    preprocessing_args={
        'image_scale': 2/255.0
    }
)
coreml_model.save("/home/sean/Downloads/Final/model-cs-IoU.mlmodel")
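For the pixel-range question, it may help to look at what a scale of 2/255 does on its own. The sketch below assumes the converter applies a linear `pixel * image_scale + bias` step per channel; `preprocess` is a hypothetical helper, and the bias of -1 is an illustrative value that the conversion call above does not set.

```python
def preprocess(pixel: float, image_scale: float, bias: float = 0.0) -> float:
    """Assumed linear preprocessing: pixel * image_scale + bias."""
    return pixel * image_scale + bias


scale = 2 / 255.0

# Scale only: 8-bit pixel values land in [0, 2].
scaled_range = (preprocess(0, scale), preprocess(255, scale))

# Scale plus a hypothetical bias of -1: values land in [-1, 1].
centered_range = (preprocess(0, scale, -1.0), preprocess(255, scale, -1.0))

print(scaled_range, centered_range)
```

Whether the pretrained network expects either of these ranges is not confirmed anywhere in this thread.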
opened by seantempesta 2

About IoU

Hi, @codeAC29
I cannot obtain the high IoU in my training. I looked into your code and found that the IoU is computed via averageValid, but this actually computes the mean class accuracy. The IoU should be the value of averageUnionValid. Did you notice the difference, and did you obtain 76% IoU using averageUnionValid?

Sorry for the trouble. For convenience, I refer to the definitions of averageValid and averageUnionValid here.
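The difference between the two quantities can be illustrated on a toy confusion matrix. This is a plain Python sketch under the usual definitions (per-class accuracy as TP / (TP + FN), per-class IoU as TP / (TP + FN + FP)), not the Torch implementation itself; the function names and the matrix values are made up for the example.

```python
from typing import List, Sequence


def per_class_accuracy(conf: Sequence[Sequence[int]]) -> List[float]:
    """Correct pixels of a class divided by all pixels labelled as that class."""
    return [row[i] / sum(row) for i, row in enumerate(conf)]


def per_class_iou(conf: Sequence[Sequence[int]]) -> List[float]:
    """Intersection over union per class: TP / (TP + FN + FP)."""
    n = len(conf)
    col_sums = [sum(conf[r][c] for r in range(n)) for c in range(n)]
    return [conf[i][i] / (sum(conf[i]) + col_sums[i] - conf[i][i])
            for i in range(n)]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


# Rows are ground-truth classes, columns are predicted classes.
# Toy numbers; every class is assumed to occur at least once.
confusion = [[8, 2],
             [1, 9]]

mean_accuracy = mean(per_class_accuracy(confusion))
mean_iou = mean(per_class_iou(confusion))
print(round(mean_accuracy, 4), round(mean_iou, 4))
```

For this matrix the mean class accuracy is 0.85 while the mean IoU is about 0.739, so reporting the former in place of the latter overstates the score.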

opened by qqning 2

Error while running linknet main file

Hi, I am getting this error while running main.py: RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.cuda.FloatTensor for argument 2 'target'. Please help me out. Also, when I try to run the trained models I run into an error. I am using PyTorch to run the .net files, but I am not able to load them; the error shown is: name cs is not defined. It is a model, so why does it have a variable named cs (here cs represents Cityscapes) in it?

opened by Tharun98 0

Model fails for input size other than multiples of 32 (for depth of 4)

Hi, if the input image size is not a multiple of 32, there is a size mismatch error when adding the outputs of encoder3 and decoder4. For example, for an input image size of 1000x2000, the output of encoder3 is 63x125 while the output of decoder4 is 64x126. The adjust parameter of the SpatialFullConvolution layer is needed only if the input image size is a multiple of 2^(n+1), where n is the encoder depth. For other image sizes, the adjust parameter depends on the image size. In this example, the network works if the adjust parameter is zero in decoders 3 and 4. Please clarify whether this network works only for sizes that are multiples of 2^(n+1). Thanks.
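The size arithmetic behind this report can be reproduced with the standard convolution and transposed-convolution output formulas. The layer hyperparameters below (a 7x7 stride-2 initial convolution, a 3x3 stride-2 max pool, a stride-1 first encoder, and 3x3 stride-2 layers elsewhere) are assumptions based on a ResNet-18 style encoder; they are not taken from the model files, although they do reproduce the sizes quoted above.

```python
from typing import List


def conv_out(size: int, kernel: int, stride: int, pad: int) -> int:
    """Output length of a convolution or pooling layer along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size: int, kernel: int, stride: int, pad: int, adj: int) -> int:
    """Output length of a transposed (full) convolution along one dimension."""
    return (size - 1) * stride - 2 * pad + kernel + adj


def encoder_sizes(size: int) -> List[int]:
    """Lengths after encoder1..encoder4 under the assumed hyperparameters."""
    s = conv_out(size, 7, 2, 3)   # initial 7x7 convolution, stride 2
    s = conv_out(s, 3, 2, 1)      # 3x3 max pool, stride 2
    sizes = [s]                   # encoder1 keeps the resolution (stride 1)
    for _ in range(3):            # encoder2..encoder4 each halve it
        s = conv_out(s, 3, 2, 1)
        sizes.append(s)
    return sizes


for length in (1000, 2000, 1024):
    enc = encoder_sizes(length)
    dec4_adj1 = deconv_out(enc[3], 3, 2, 1, adj=1)
    dec4_adj0 = deconv_out(enc[3], 3, 2, 1, adj=0)
    print(length, "encoder3:", enc[2],
          "decoder4 (adj=1):", dec4_adj1, "decoder4 (adj=0):", dec4_adj0)
```

For lengths 1000 and 2000 this gives encoder3 outputs of 63 and 125 against decoder4 outputs of 64 and 126 with adj=1, matching the mismatch described above, while for 1024 both are 64.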

opened by Tharun98 1

How about the image resolution?

Hi, I am reproducing LinkNet. I have a question about the input and output image resolutions you used when computing the FLOPs. My FLOPs and running speed differ from the results reported in your paper.

opened by ycszen 5

linknet architecture

I am trying to build LinkNet in Caffe. Could you please help me with the questions below:
1) I found that there are 5 downsampling and 6 upsampling steps by 2. If the numbers of upsampling and downsampling steps differ (6 vs. 5), how can we get the same output shape as the input? Reference: https://arxiv.org/pdf/1707.03718.pdf
2) How many iterations did you run to get proper results?
3) To match the encoder and decoder output shapes, I used a crop layer before Eltwise instead of adding an extra row or column. Will it make any difference?

opened by vishnureghu007 7

Error while training

I got the CamVid dataset as specified in the README file and installed video-decoder.

I entered the following command to start training: th main.lua --datapath ./data/CamVid/ --cachepath ./dataCache/CamV/ --dataset cv --model ./models/model.lua --save ./Models/CamV/ --saveTrainConf --saveAll --plot

And I got the following error,

Preparing CamVid dataset for data loader
Filenames and their role found in: ./misc/dataDistributionCV.txt

Getting input images and labels for: 01TP_extract.avi
/home/jayp/torch/install/bin/luajit: /home/jayp/torch/install/share/lua/5.1/trepl/init.lua:389: /home/jayp/torch/install/share/lua/5.1/trepl/init.lua:389: error loading module 'libvideo_decoder' from file '/home/jayp/torch/install/lib/lua/5.1/libvideo_decoder.so':
/home/jayp/torch/install/lib/lua/5.1/libvideo_decoder.so: undefined symbol: avcodec_get_frame_defaults
stack traceback:
[C]: in function 'error'
/home/jayp/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
main.lua:34: in main chunk
[C]: in function 'dofile'
...jayp/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk

I would really appreciate it if anyone could help me with this.

Thank You!

opened by jay98 4

Releases(v1.0)

v1.0(Aug 7, 2017)

Contains trained models for cityscapes and camvid dataset.
Source code(tar.gz)
Source code(zip)
v1_0.zip(327.85 MB)

Owner

e-Lab

