Fully convolutional networks for semantic segmentation

Kai Arulkumaran

Last update: Dec 25, 2022

Overview

FCN-semantic-segmentation

Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], removes the fully connected layers, and adds transposed convolution layers with skip connections from lower layers. Initialises upsampling convolutions with bilinear interpolation filters and zeros the final (classification) layer.
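The bilinear initialisation mentioned above can be sketched in plain Python. This is a minimal illustration of the usual FCN-style bilinear filter, not the repository's own code; the kernel size of 4 (commonly paired with stride 2 and padding 1 for 2x upsampling) is an assumption, since the README does not state the layer sizes.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation filter as nested lists."""
    factor = (size + 1) // 2
    # Centre of the kernel in index coordinates (half-pixel offset for even sizes).
    centre = factor - 1 if size % 2 == 1 else factor - 0.5
    # 1-D triangular (tent) profile; the 2-D filter is its outer product.
    profile = [1 - abs(i - centre) / factor for i in range(size)]
    return [[row * col for col in profile] for row in profile]


# Assumed configuration: a 4 x 4 kernel for 2x upsampling.
kernel = bilinear_kernel(4)
for row in kernel:
    print(row)
```

In a PyTorch model, a filter like this would typically be copied into the weight of each transposed convolution so that channel i maps only to channel i, which makes the layer start out as plain bilinear upsampling.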

Uses an independent cross-entropy loss per class. Trained with SGD with momentum, plus weight decay only on convolutional weights. Calculates and plots class-wise and mean intersection-over-union. Checkpoints the network every epoch.
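As a rough illustration of the class-wise and mean intersection-over-union metrics, the sketch below accumulates a confusion matrix over flattened predictions and labels. The function names and the `ignore_index` parameter are hypothetical and chosen for this example only; the repository may compute the metric differently.

```python
def confusion_matrix(preds, targets, num_classes, ignore_index=None):
    """Count (target, prediction) pairs; rows are targets, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, targets):
        if t == ignore_index:
            continue  # skip pixels that should not count towards the score
        matrix[t][p] += 1
    return matrix


def class_iou(matrix):
    """IoU per class: TP / (TP + FP + FN); NaN for classes that never occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious):
    """Average over classes that have a defined IoU."""
    valid = [v for v in ious if v == v]  # NaN is the only value not equal to itself
    return sum(valid) / len(valid)


# Toy example with three classes and six pixels.
preds = [0, 0, 1, 1, 2, 2]
targets = [0, 1, 1, 1, 2, 0]
ious = class_iou(confusion_matrix(preds, targets, num_classes=3))
print(ious, mean_iou(ious))
```

Accumulating one confusion matrix over the whole validation set, rather than averaging per-image scores, is the usual way to report IoU on Cityscapes-style benchmarks.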

Note: This code does not achieve great results (achieves ~40 IoU fairly quickly, but converges there). Contributions to fix this are welcome! The goal of this repo is to provide strong, simple and efficient baselines for semantic segmentation using the FCN method, so this shouldn't be restricted to using ResNet 34 etc.

Requirements

Instructions

1. Install all of the required software. To feasibly run the training, CUDA is needed. The crop size and batch size can be tailored to your GPU memory (the default crop and batch sizes use ~10GB of GPU RAM).
2. Register on the Cityscapes website to access the dataset.
3. Download and extract the training/validation RGB data (leftImg8bit_trainvaltest) and ground truth data (gtFine_trainvaltest).
4. Run python main.py <options>.

First a Dataset object is set up, returning the RGB inputs, one-hot targets (for independent classification) and label targets. During training, the images are randomly cropped and horizontally flipped. Testing calculates IoU scores and produces a subset of coloured predictions that match the coloured ground truth.
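The data pipeline described above can be sketched without any framework. The example below is only an illustration under simplifying assumptions: images and label maps are nested lists, the crop is square, and the helper names `random_crop_and_flip` and `one_hot` are made up for this sketch. The key point is that the same crop window and flip decision must be applied to the image and its labels.

```python
import random


def random_crop_and_flip(image, labels, crop_size, rng=random):
    """Apply one shared random crop and horizontal flip to an image/label pair."""
    height, width = len(labels), len(labels[0])
    top = rng.randint(0, height - crop_size)   # inclusive bounds
    left = rng.randint(0, width - crop_size)
    flip = rng.random() < 0.5

    def transform(grid):
        rows = [row[left:left + crop_size] for row in grid[top:top + crop_size]]
        return [row[::-1] for row in rows] if flip else rows

    return transform(image), transform(labels)


def one_hot(labels, num_classes):
    """Turn an H x W map of class indices into num_classes binary H x W planes."""
    return [[[1 if v == c else 0 for v in row] for row in labels]
            for c in range(num_classes)]


# Toy data: each "pixel" stores its own coordinates so alignment can be checked.
rng = random.Random(0)
image = [[(y, x) for x in range(6)] for y in range(5)]
labels = [[(y * 6 + x) % 3 for x in range(6)] for y in range(5)]
crop_img, crop_lbl = random_crop_and_flip(image, labels, crop_size=3, rng=rng)
targets = one_hot(crop_lbl, num_classes=3)
```

The one-hot planes are what an independent per-class loss consumes, while the plain index map is what an IoU computation or a softmax-based loss would use.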

References

[1] Fully convolutional networks for semantic segmentation
[2] Deep Residual Learning for Image Recognition

Comments

RuntimeError: CUDA out of memory

Hi Kaixhin. The following error occurred when I ran the code: RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 4.00 GiB total capacity; 1.40 GiB already allocated; 0 bytes free; 16.43 MiB cached)

My computer's graphics card has only 4GB of memory, and I have already reduced the batch size to 1. I would really appreciate it if you could give me some advice.

opened by jouming 4
Wrong action activation and functional loss

Hey, thanks for the starter code. I found out the reason why: "This code does not achieve great results (achieves ~40 IoU fairly quickly, but converges there)"

Following the repo in https://github.com/fyu/drn

In main.py, I would change line 47 (the loss) to self.crit = nn.NLLLoss2d(ignore_index=255).cuda() and line 70 to output = F.log_softmax(self.net(input)).
Correspondingly, line 67 should change to for i, (input, _, target) in enumerate(train_loader): so that it uses the third item of the tuple, the label target, rather than the one-hot-encoded one.

With these changes, the network at least continues to improve the IoU (and the new loss function and output activation make more sense to me).
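To make the suggested change concrete, here is a framework-free sketch of what a log-softmax followed by a negative log-likelihood loss with an ignore index computes per pixel. It illustrates the technique only and is not the code from either repository; the function names are hypothetical. In recent PyTorch versions the equivalent would normally be written with nn.NLLLoss or nn.CrossEntropyLoss, both of which accept ignore_index.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one pixel's class scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs_per_pixel, targets, ignore_index=255):
    """Mean negative log-likelihood over pixels whose target is not ignored."""
    total, count = 0.0, 0
    for log_probs, t in zip(log_probs_per_pixel, targets):
        if t == ignore_index:
            continue  # e.g. unlabeled pixels do not contribute to the loss
        total -= log_probs[t]
        count += 1
    return total / count if count else 0.0


# Three pixels, three classes; the last pixel is ignored.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1.0, 5.0, 1.0]]
targets = [0, 2, 255]
loss = nll_loss([log_softmax(px) for px in logits], targets)
print(loss)
```

Unlike an independent per-class loss on one-hot targets, this formulation makes the classes compete for each pixel, which matches how the final prediction is usually taken as an argmax over classes.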

Hope I made it clear. Cheers

opened by stevenwudi 4
More details

Hi Kaixhin,

I just want to try your demo, but I am new to FCN models. Can you give more details on how to process the data and how to run the demo successfully?

I met the same problem in PyTorch: the training accuracy was low when training object detection, and I solved it with a few tricks. I am very interested in your code and would be happy to help figure out your problem.

opened by marvis 3
Reduced Classes

I have tried to reduce the labels from 20 classes to 3 classes, but all results are predicted as class 0. Do you have any ideas?

Labels: -1 license plate, 0 unlabeled, 1 ego vehicle, 2 rectification border, 3 out of roi, 4 static, 5 dynamic, 6 ground, 7 road, 8 sidewalk, 9 parking, 10 rail track, 11 building, 12 wall, 13 fence, 14 guard rail, 15 bridge, 16 tunnel, 17 pole, 18 polegroup, 19 traffic light, 20 traffic sign, 21 vegetation, 22 terrain, 23 sky, 24 person, 25 rider, 26 car, 27 truck, 28 bus, 29 caravan, 30 trailer, 31 train, 32 motorcycle, 33 bicycle

num_classes = 20
full_to_train = {-1: 19, 0: 19, 1: 19, 2: 19, 3: 19, 4: 19, 5: 19, 6: 19, 7: 0, 8: 1, 9: 19, 10: 19, 11: 2, 12: 3, 13: 4, 14: 19, 15: 19, 16: 19, 17: 5, 18: 19, 19: 6, 20: 7, 21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 29: 19, 30: 19, 31: 16, 32: 17, 33: 18}
train_to_full = {0: 7, 1: 8, 2: 11, 3: 12, 4: 13, 5: 17, 6: 19, 7: 20, 8: 21, 9: 22, 10: 23, 11: 24, 12: 25, 13: 26, 14: 27, 15: 28, 16: 31, 17: 32, 18: 33, 19: 0}
full_to_colour = {0: (0, 0, 0), 7: (128, 64, 128), 8: (244, 35, 232), 11: (70, 70, 70), 12: (102, 102, 156), 13: (190, 153, 153), 17: (153, 153, 153), 19: (250, 170, 30), 20: (220, 220, 0), 21: (107, 142, 35), 22: (152, 251, 152), 23: (70, 130, 180), 24: (220, 20, 60), 25: (255, 0, 0), 26: (0, 0, 142), 27: (0, 0, 70), 28: (0, 60, 100), 31: (0, 80, 100), 32: (0, 0, 230), 33: (119, 11, 32)}
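For readers attempting a similar reduction, the sketch below builds the three mapping tables consistently from one grouping. The grouping itself (road/sidewalk, person/rider, car/truck/bus) and the extra catch-all class are hypothetical choices made for illustration, mirroring how the 20-class mapping above reserves its last index for everything else. It does not diagnose the reported problem, but checking which training IDs actually occur after remapping is a cheap sanity test.

```python
# Hypothetical grouping of Cityscapes label IDs into three training classes.
groups = {
    0: [7, 8],        # road, sidewalk
    1: [24, 25],      # person, rider
    2: [26, 27, 28],  # car, truck, bus
}
CATCH_ALL = len(groups)        # index 3 collects every other label
num_classes = len(groups) + 1  # three real classes plus the catch-all

# Every full label ID (-1..33) must map to some training ID.
full_to_train = {label_id: CATCH_ALL for label_id in range(-1, 34)}
for train_id, label_ids in groups.items():
    for label_id in label_ids:
        full_to_train[label_id] = train_id

# One representative full ID per training ID, used for colouring predictions.
train_to_full = {train_id: ids[0] for train_id, ids in groups.items()}
train_to_full[CATCH_ALL] = 0  # unlabeled


def remap(label_map):
    """Convert an H x W map of full label IDs into training IDs."""
    return [[full_to_train[v] for v in row] for row in label_map]


print(remap([[7, 26], [11, -1]]))
```

If the remapped ground truth contains only one training ID, or if num_classes does not match the number of output channels in the network, predictions collapsing to a single class would not be surprising.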

opened by debuops 2
Implementation and Test

Thank you for providing the code, but it does not seem to give good results on Cityscapes or on my own dataset. The mean IoU I am getting is around 0.2%. Can you explain this?

opened by mellou01 1
Bilinear interpolation filters

Hi @Kaixhin,

Have you included the bilinear interpolation filters in your FCN implementation? I cannot find this layer in your code. Is that the reason for the lower performance?

opened by amiltonwong 1
AttributeError: 'NoneType' object has no attribute 'weight'

After following all the instructions I am getting this error, and it seems to be a problem in the model.py file:

init.constant(self.conv10.weight, 0) # Zero init
AttributeError: 'NoneType' object has no attribute 'weight'

opened by mellou01 0
ResNet50

Hi @Kaixhin, I am trying to upgrade the ResNet34 backbone to ResNet50, but I ran into a shape mismatch:

File "/data2/FCN-semantic-segmentation_new/model.py", line 92, in forward
x = self.relu(self.bn8(self.conv8(x + x2)))
RuntimeError: The size of tensor a (256) must match the size of tensor b (64) at non-single

Could you please help me?
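One plausible reading of this error, offered as an assumption rather than a confirmed diagnosis, is that the decoder still expects ResNet-34 channel widths. ResNet-34 uses basic blocks (expansion 1), whereas ResNet-50 uses bottleneck blocks (expansion 4), so every stage outputs four times as many channels. The small sketch below tabulates the stage widths; the dictionary and function names are made up for this example.

```python
# Block expansion factors of the standard ResNet variants.
BLOCK_EXPANSION = {"resnet18": 1, "resnet34": 1, "resnet50": 4, "resnet101": 4}
# Base widths of the four residual stages.
BASE_WIDTHS = [64, 128, 256, 512]


def stage_channels(arch):
    """Output channels of the four residual stages for a given ResNet variant."""
    return [width * BLOCK_EXPANSION[arch] for width in BASE_WIDTHS]


for arch in ("resnet34", "resnet50"):
    print(arch, stage_channels(arch))
```

Under that reading, a skip connection taken from the first ResNet-50 stage carries 256 channels while the upsampling path built for ResNet-34 produces 64, so the transposed convolutions (or 1x1 projections on the skips) would need their channel counts adjusted before the tensors can be added.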

opened by buaaswf 2

Owner

Kai Arulkumaran

Recall Loss for Semantic Segmentation (This repo implements the paper: Recall Loss for Semantic Segmentation)

Another pytorch implementation of FCN (Fully Convolutional Networks)

PyTorch Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

Another pytorch implementation of FCN (Fully Convolutional Networks)

This repository allows you to anonymize sensitive information in images/videos. The solution is fully compatible with the DL-based training/inference solutions that we already published/will publish for Object Detection and Semantic Segmentation.

PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

End-to-End Object Detection with Fully Convolutional Network

The official PyTorch implementation of the paper: Xili Dai, Xiaojun Yuan, Haigang Gong, Yi Ma. "Fully Convolutional Line Parsing."

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network.

Dewarping Document Image By Displacement Flow Estimation with Fully Convolutional Network

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

U-Net implementation ("Convolutional Networks for Biomedical Image Segmentation") using the Carvana Image Masking Dataset in PyTorch

Siamese-nn-semantic-text-similarity - A repository containing comprehensive Neural Networks based PyTorch implementations for the semantic text similarity task

Segmentation in Style: Unsupervised Semantic Image Segmentation with Stylegan and CLIP

TorchDistiller - a collection of the open source pytorch code for knowledge distillation, especially for the perception tasks, including semantic segmentation, depth estimation, object detection and instance segmentation.

Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

CoSMA: Convolutional Semi-Regular Mesh Autoencoder. From Paper "Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes"

This is the unofficial code of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes, which achieves a state-of-the-art trade-off between accuracy and speed on Cityscapes and CamVid without using inference acceleration or extra data
