Fully convolutional networks for semantic segmentation

Overview

FCN-semantic-segmentation

Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], removes the fully connected layers, and adds transposed convolution layers with skip connections from lower layers. Initialises upsampling convolutions with bilinear interpolation filters and zeros the final (classification) layer.

Uses an independent cross-entropy loss per class. Trained with SGD with momentum, plus weight decay only on convolutional weights. Calculates and plots class-wise and mean intersection-over-union. Checkpoints the network every epoch.

Note: This code does not achieve great results (achieves ~40 IoU fairly quickly, but converges there). Contributions to fix this are welcome! The goal of this repo is to provide strong, simple and efficient baselines for semantic segmentation using the FCN method, so this shouldn't be restricted to using ResNet 34 etc.

Requirements

Instructions

  1. Install all of the required software. To feasibly run the training, CUDA is needed. The crop size and batch size can be tailored to your GPU memory (the default crop and batch sizes use ~10GB of GPU RAM).
  2. Register on the Cityscapes website to access the dataset.
  3. Download and extract the training/validation RGB data (leftImg8bit_trainvaltest) and ground truth data (gtFine_trainvaltest).
  4. Run python main.py <options>.

First a Dataset object is set up, returning the RGB inputs, one-hot targets (for independent classification) and label targets. During training, the images are randomly cropped and horizontally flipped. Testing calculates IoU scores and produces a subset of coloured predictions that match the coloured ground truth.

References

[1] Fully convolutional networks for semantic segmentation
[2] Deep Residual Learning for Image Recognition

Comments
  • RuntimeError: CUDA out of memory

    RuntimeError: CUDA out of memory

    Hi Kaixhin. The following error occurred when I ran the code: RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 4.00 GiB total capacity; 1.40 GiB already allocated; 0 bytes free; 16.43 MiB cached)

    My computer's graphics card has only 4GB of memory, and I have changed the size of batch to1. I would really appreciate it if you could give me some advice.

    opened by jouming 4
  • Wrong action activation and functional loss

    Wrong action activation and functional loss

    Hey, thanks for the starter code. I found out the reason why: "This code does not achieve great results (achieves ~40 IoU fairly quickly, but converges there)"

    Following the repo in https://github.com/fyu/drn

    In the main.py line 47 I would change the loss to self.crit = nn.NLLLoss2d(ignore_index=255).cuda() and line 70, I would change to output = F.log_softmax(self.net(input)).
    Correspondingly the line 67 should change to for i, (input, _, target,) in enumerate(train_loader): that is using the third item of the tuple which is not the one-hot-encoded one.

    At least by changing the code, the network continues improving the IOU further (and the change of loss function and output activation make more sense to me).

    Hope I made it clear. Cheers

    opened by stevenwudi 4
  • More details

    More details

    Hi Kaixhin,

    I just want to try your demo. But I am new to FCN model. Can you give more details on how to process data, and how to run the demo successfully.

    I met the same problem on pytorch that the train accuracy is low when train object detection. And I solved the problem with a few tricks. I am very interest in your code. And will be happy figure out your problem.

    opened by marvis 3
  • Reduced Classes

    Reduced Classes

    I have tried to reduce Labels 20 classes to 3 classes, we can get results predected as class 0. Do you have any ideas??

    Labels: -1 license plate, 0 unlabeled, 1 ego vehicle, 2 rectification border, 3 out of roi, 4 static, 5 dynamic, 6 ground, 7 road, 8 sidewalk, 9 parking, 10 rail track, 11 building, 12 wall, 13 fence, 14 guard rail, 15 bridge, 16 tunnel, 17 pole, 18 polegroup, 19 traffic light, 20 traffic sign, 21 vegetation, 22 terrain, 23 sky, 24 person, 25 rider, 26 car, 27 truck, 28 bus, 29 caravan, 30 trailer, 31 train, 32 motorcycle, 33 bicycle num_classes = 20 full_to_train = {-1: 19, 0: 19, 1: 19, 2: 19, 3: 19, 4: 19, 5: 19, 6: 19, 7: 0, 8: 1, 9: 19, 10: 19, 11: 2, 12: 3, 13: 4, 14: 19, 15: 19, 16: 19, 17: 5, 18: 19, 19: 6, 20: 7, 21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 29: 19, 30: 19, 31: 16, 32: 17, 33: 18} train_to_full = {0: 7, 1: 8, 2: 11, 3: 12, 4: 13, 5: 17, 6: 19, 7: 20, 8: 21, 9: 22, 10: 23, 11: 24, 12: 25, 13: 26, 14: 27, 15: 28, 16: 31, 17: 32, 18: 33, 19: 0} full_to_colour = {0: (0, 0, 0), 7: (128, 64, 128), 8: (244, 35, 232), 11: (70, 70, 70), 12: (102, 102, 156), 13: (190, 153, 153), 17: (153, 153, 153), 19: (250, 170, 30), 20: (220, 220, 0), 21: (107, 142, 35), 22: (152, 251, 152), 23: (70, 130, 180), 24: (220, 20, 60), 25: (255, 0, 0), 26: (0, 0, 142), 27: (0, 0, 70), 28: (0, 60,100), 31: (0, 80, 100), 32: (0, 0, 230), 33: (119, 11, 32)}

    question 
    opened by debuops 2
  • Implementation and Test

    Implementation and Test

    Thank you for providing the code but it seems not giving good results with cityscapes and also with my own data set. The mean IOU I am getting is around 0,2% can you explain this?

    opened by mellou01 1
  • billinear interpolation filters

    billinear interpolation filters

    Hi, @Kaixhin ,

    Have you appended the billinear interpolation filters in your fcn implementation? I cannot found this layer in your code. Is that the reason for lower performance ?

    opened by amiltonwong 1
  • AttributeError: 'NoneType' object has no attribute 'weight'

    AttributeError: 'NoneType' object has no attribute 'weight'

    After following all the instructions I am getting this error and it seems to be a problem in the model.py file init.constant(self.conv10.weight, 0) # Zero init AttributeError: 'NoneType' object has no attribute 'weight'

    opened by mellou01 0
  • ResNet50

    ResNet50

    Hi@Kaixhin, I am trying to improve the resnet34 to ResNet50, but I met the reshape problems File "/data2/FCN-semantic-segmentation_new/model.py", line 92, in forward x = self.relu(self.bn8(self.conv8(x + x2))) RuntimeError: The size of tensor a (256) must match the size of tensor b (64) at non-single

    Cloud you please help me?

    opened by buaaswf 2
Owner
Kai Arulkumaran
Researcher, programmer, DJ, transhumanist.
Kai Arulkumaran
Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation) Download Synthia dataset The model uses

null 32 Sep 21, 2022
Another pytorch implementation of FCN (Fully Convolutional Networks)

FCN-pytorch-easiest Trying to be the easiest FCN pytorch implementation and just in a get and use fashion Here I use a handbag semantic segmentation f

Y. Dong 158 Dec 21, 2022
PyTorch Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

pytorch-fcn PyTorch implementation of Fully Convolutional Networks. Requirements pytorch >= 0.2.0 torchvision >= 0.1.8 fcn >= 6.1.5 Pillow scipy tqdm

Kentaro Wada 1.6k Jan 7, 2023
Another pytorch implementation of FCN (Fully Convolutional Networks)

FCN-pytorch-easiest Trying to be the easiest FCN pytorch implementation and just in a get and use fashion Here I use a handbag semantic segmentation f

Y. Dong 158 Dec 21, 2022
BMW TechOffice MUNICH 148 Dec 21, 2022
PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Shape-aware Convolutional Layer (ShapeConv) PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentatio

Hanchao Leng 82 Dec 29, 2022
End-to-End Object Detection with Fully Convolutional Network

This project provides an implementation for "End-to-End Object Detection with Fully Convolutional Network" on PyTorch.

null 472 Dec 22, 2022
The official PyTorch implementation of the paper: *Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing." *.

F-Clip — Fully Convolutional Line Parsing This repository contains the official PyTorch implementation of the paper: *Xili Dai, Xiaojun Yuan, Haigang

Xili Dai 115 Dec 28, 2022
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

null 111 Dec 27, 2022
Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

null 39 Aug 2, 2021
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network This repository is the official implementation of Speech Separati

Kai Li (李凯) 116 Nov 9, 2022
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

Jiwoon Ahn 337 Dec 15, 2022
U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

Christopher Ley 1 Jan 6, 2022
Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP Abstract: We introduce a method that allows to automatically se

Daniil Pakhomov 134 Dec 19, 2022
TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

This project is a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

yifan liu 147 Dec 3, 2022
Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

ADE20k Semantic segmentation with MAE Getting started Install the mmsegmentation

null 97 Dec 17, 2022
CoSMA: Convolutional Semi-Regular Mesh Autoencoder. From Paper "Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes"

Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes Implementation of CoSMA: Convolutional Semi-Regular Mesh Autoencoder arXiv p

Fraunhofer SCAI 10 Oct 11, 2022