Fully Convolutional DenseNets for semantic segmentation.

Last update: Nov 26, 2022

Overview

Introduction

This repo contains the code to train and evaluate FC-DenseNets as described in The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. We investigate the use of Densely Connected Convolutional Networks for semantic segmentation, and report state of the art results on datasets such as CamVid.

Installation

You need to install:

Theano, preferably the latest version
Lasagne
The dataset loader (see the Data section below)
(Recommended) The new Theano GPU backend, which makes compilation much faster

Data

The data loader is now available here: https://github.com/fvisin/dataset_loaders. Thanks a lot to Francesco Visin; please cite his work if you use his data loader. Some adaptations to the current code may be needed; I hope to find some time to modify it!

The data-loader we used for the experiments will be released later. If you do want to train models now, you need to create a function load_data which returns 3 iterators (for training, validation and test). When applying next(), the iterator returns two values X, Y where X is the batch of input images (shape= (batch_size, 3, n_rows, n_cols), dtype=float32) and Y the batch of target segmentation maps (shape=(batch_size, n_rows, n_cols), dtype=int32) where each pixel in Y is an int indicating the class of the pixel.

The iterators must also provide the following methods (so they are not plain Python iterators): get_n_classes (returns the number of classes), get_n_samples (returns the number of examples in the set), get_n_batches (returns the number of batches needed to see the entire set) and get_void_labels (returns a list of the classes treated as void). It might be easier to change the files train.py and test.py directly.
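As a rough illustration of the interface described above, here is a minimal sketch of such an iterator built on in-memory NumPy arrays. The class name ArrayIterator, the arguments of this load_data and the random placeholder data are assumptions made for illustration and are not part of the repository; only the four method names and the array shapes and dtypes come from the description above. Whether the void label counts toward get_n_classes in the authors' loader is not stated here, so the sketch simply uses one extra label value for void.

```python
import numpy as np


class ArrayIterator(object):
    """Hypothetical iterator over in-memory arrays (not the authors' loader)."""

    def __init__(self, images, labels, batch_size, n_classes, void_labels):
        self.images = images.astype(np.float32)  # (n_samples, 3, n_rows, n_cols)
        self.labels = labels.astype(np.int32)    # (n_samples, n_rows, n_cols)
        self.batch_size = batch_size
        self.n_classes = n_classes
        self.void_labels = list(void_labels)
        self.position = 0

    def get_n_classes(self):
        return self.n_classes

    def get_n_samples(self):
        return len(self.images)

    def get_n_batches(self):
        # Ceiling division: the last batch may be smaller than batch_size.
        return -(-self.get_n_samples() // self.batch_size)

    def get_void_labels(self):
        return self.void_labels

    def __iter__(self):
        return self

    def __next__(self):
        # Wrap around at the end of the set so next() can be called repeatedly.
        if self.position >= self.get_n_samples():
            self.position = 0
        start, stop = self.position, self.position + self.batch_size
        self.position = stop
        return self.images[start:stop], self.labels[start:stop]

    next = __next__  # Python 2 spelling of the same method


def load_data(n_samples=7, batch_size=3, n_rows=8, n_cols=10, n_classes=11):
    """Return train, validation and test iterators over random placeholder data."""
    rng = np.random.RandomState(0)

    def make_split():
        X = rng.rand(n_samples, 3, n_rows, n_cols)
        # Assumption: the label value n_classes marks void pixels.
        Y = rng.randint(0, n_classes + 1, size=(n_samples, n_rows, n_cols))
        return ArrayIterator(X, Y, batch_size, n_classes, void_labels=[n_classes])

    return make_split(), make_split(), make_split()


train_iter, val_iter, test_iter = load_data()
X, Y = next(train_iter)
print(X.shape, X.dtype, Y.shape, Y.dtype)
```

A real loader would read images from disk and shuffle the training set; the sketch only shows the shape of the contract that train.py and test.py appear to rely on.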

Run experiments

The architecture of the model is defined in FC-DenseNet.py. To train a model, you need to prepare a configuration file (folder config) in which all the parameters needed for creating and training your model are specified. DenseNets contain a lot of connections, which makes graph optimization difficult for Theano. We strongly recommend using the flags described below.

To train the FC-DenseNet103 model, use the command: THEANO_FLAGS='device=cuda,optimizer=fast_compile,optimizer_including=fusion' python train.py -c config/FC-DenseNet103.py -e experiment_name. All the logs of the experiment are stored in the folder experiment_name.

On a Titan X 12GB, for the model FC-DenseNet103 (see folder config), compilation takes around 400 sec, and one epoch takes around 120 sec for training and 40 sec for validation.

Use a pretrained model

We publish the weights of our model FC-DenseNet103. The metrics claimed in the paper (Jaccard and accuracy) can be verified by running THEANO_FLAGS='device=cuda,optimizer=fast_compile,optimizer_including=fusion' python test.py

About the "m" number in the paper

There is a small error with the "m" number in Table 2 of the paper (that you may understand when running the code!). All values from the bottleneck to the last block (880, 1072, 800 and 368) should be incremented by 16 (896, 1088, 816 and 384).

Here is how we compute this value, which represents the number of feature maps concatenated into the "stack":

First convolution: m = 48.
In the downsampling part + bottleneck, m[B] = m[B-1] + n_layers[B] * growth_rate (linear growth). First block: m = 48 + 4x16 = 112. Second block: m = 112 + 5x16 = 192. And so on until the bottleneck: m = 656 + 15x16 = 896.
In the upsampling part, m[B] is the sum of 3 terms: the m value corresponding to the same resolution in the downsampling part (skip connection), the number of feature maps from the upsampled block (n_layers[B-1] * growth_rate) and the number of feature maps in the new block (n_layers[B] * growth_rate). First upsampling: m = 656 + 15x16 + 12x16 = 1088. Second upsampling: m = 464 + 12x16 + 10x16 = 816. Third upsampling: m = 304 + 10x16 + 7x16 = 576. Fourth upsampling: m = 192 + 7x16 + 5x16 = 384. Fifth upsampling: m = 112 + 5x16 + 4x16 = 256.
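The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the counting rule: the function name feature_map_counts is hypothetical, and the per-block layer counts (4, 5, 7, 10, 12, then 15 in the bottleneck, then 12, 10, 7, 5, 4) are read off the worked examples above rather than from the configuration file.

```python
def feature_map_counts(n_layers_down, n_layers_bottleneck, n_layers_up,
                       first_conv=48, growth_rate=16):
    """Return the m values for the down path, the bottleneck and the up path."""
    m = first_conv
    down = []
    for n in n_layers_down:
        # Linear growth: every layer adds growth_rate feature maps to the stack.
        m += n * growth_rate
        down.append(m)
    bottleneck = m + n_layers_bottleneck * growth_rate

    up = []
    previous = n_layers_bottleneck
    for skip_m, n in zip(reversed(down), n_layers_up):
        # Skip connection + maps of the upsampled block + maps of the new block.
        up.append(skip_m + previous * growth_rate + n * growth_rate)
        previous = n
    return down, bottleneck, up


down, bottleneck, up = feature_map_counts([4, 5, 7, 10, 12], 15, [12, 10, 7, 5, 4])
print(down)        # [112, 192, 304, 464, 656]
print(bottleneck)  # 896
print(up)          # [1088, 816, 576, 384, 256]
```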

Comments

Pass individual images from a city scape

I am trying to run the model on my own images by passing them in a loop. I noticed that if you try to print the return value of the theano.function, the system aborts. It looks like the return value is not correct when calling only for predictions.

    net = cf.net
    weight_path = 'weights/FC-DenseNet103_weights.npz'
    net.restore(weight_path)
    # Compile test functions
    prediction = get_output(net.output_layer, deterministic=True, batch_norm_use_averages=False)
    g = theano.function([net.input_var], prediction)   
    while(True):
       ret_val, img = cam.read()
       h = np.shape(img)[1]/4
       w = np.shape(img)[0]/4  
       img = cv2.resize(img, (h, w))               
       X = np.transpose(img, (2, 0, 1))                       
       X = X[np.newaxis,:] #(1, 3, 180, 320))        
       g_X = np.argmax(g(X), axis = 1)
       print(g_X)   #  **************   prints all zeros....[0,0,0....0,0,0]        
       image = np.reshape(g_X, (h, w))
       cv2.imshow('image',img)              
       cv2.imshow('proc',image)  # ******** displays all white
       cv2.waitKey(1)
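Two details of the snippet above can be checked independently of the network, and the sketch below illustrates them with NumPy only. First, the Data section asks for float32 input, whereas camera frames are normally uint8. Second, np.shape(img)[1] is the number of columns, so np.reshape(g_X, (h, w)) builds an array of shape (n_cols, n_rows) instead of (n_rows, n_cols). The helper names frame_to_batch and labels_to_image are hypothetical, the placeholder arrays stand in for the camera frame and the network output, and no intensity scaling is applied because the preprocessing used by the authors is not stated here. This is not claimed to explain the all-zero predictions.

```python
import numpy as np


def frame_to_batch(frame):
    """Turn an (n_rows, n_cols, 3) frame into a (1, 3, n_rows, n_cols) float32 batch."""
    x = np.transpose(frame, (2, 0, 1)).astype(np.float32)
    return x[np.newaxis]


def labels_to_image(flat_labels, n_rows, n_cols, n_classes):
    """Reshape class indices to (n_rows, n_cols) and spread them over 0..255 for display."""
    label_map = np.asarray(flat_labels).reshape(n_rows, n_cols)
    step = 255 // max(n_classes - 1, 1)
    return (label_map * step).astype(np.uint8)


frame = np.zeros((180, 320, 3), dtype=np.uint8)  # placeholder for a resized camera frame
X = frame_to_batch(frame)
fake_labels = np.arange(180 * 320) % 11          # placeholder for np.argmax(g(X), axis=1)
image = labels_to_image(fake_labels, 180, 320, n_classes=11)
print(X.shape, X.dtype, image.shape, image.dtype)
```

Converting the label map to uint8 before display also avoids handing cv2.imshow an integer array of a wider type.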

opened by rnunziata 4

Running out of memory in fine tuning

Hi, first and foremost thank you very much for sharing your code! It is very insightful and I learned a lot from it. Unfortunately, when I try to reproduce the results of your paper I run out of memory while fine-tuning on the whole images (FC-DenseNet103, batch size 3, input dimension (3, 3, 360, 480)). My theano.config.floatX is float32 (you did not mention using float16 and your code also suggests float32). I am using CuDNN 5105 along with CUDA 8.0. My GPU is a Pascal Titan X (12GB VRAM). CnMem is disabled. With these settings, I cannot even train the network with a batch size of 2. Did you use a GPU with more VRAM or a specific Theano configuration? Thank you very much! Cheers, Fabian

opened by FabianIsensee 4

Comparing training time as sanity check

I implemented this model in Keras/Tensorflow and am finding that training is slower than I expected. (It's about 8 times slower per epoch than a simplified version of U-Net). This might be due to a greater number of sequential operations in the FC-DenseNet which can't be parallelized on the GPU.

As a sanity check, I wanted to compare the time per epoch that you reported with mine. In the README, it says that it took 120 secs per epoch. How many samples are there per epoch? I know there are 3 samples per minibatch, so how many minibatches are there per epoch? This number seems to come from iter.get_n_batches() (https://github.com/SimJeg/FC-DenseNet/blob/master/train.py#L24), which isn't in the repo. I'm training on a Tesla K80 using 256x256 images with 4096 samples per epoch and a batch size of 2 (i.e., 2048 minibatches), and it's taking 45 mins per epoch.

Thanks!

opened by lewfish 3

more confusion about data loaders

Trying to run test.py, I get the following error. I thought the data loader would download the data for me, since there is no reference on this site to where the test data can be obtained.

Installed with:
git clone --recursive https://github.com/fvisin/dataset_loaders.git

cat /usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/config.ini
[general]
datasets_local_path = /home/rjn/data

[camvid]
shared_path = /home/rjn/data/camvid/segnet/

[cityscapes]
shared_path = /home/rjn/data/cityscapes/

[kitti]
shared_path = /home/rjn/data/kitti/

[mscoco]
shared_path = /home/rjn/data/COCO/

[pascal_voc]
shared_path = /home/rjn/data/PASCAL-VOC/VOCdevkit/

[scene_parsing_MIT]
shared_path = /home/rjn/data/SceneParsingMIT/

error:

Unknown arguments: ['get_01c', 'seq_per_video', 'horizontal_flip', 'get_one_hot', 'crop_size']
The local path /home/rjn/data/camvid exist, but is outdated. I will replace the old files with the new ones...
Traceback (most recent call last):
  File "test.py", line 89, in <module>
    test(config_path, weight_path)
  File "test.py", line 30, in test
    _, _, iterator = load_data(cf.dataset, batch_size=batch_size)
  File "/home/rjn/opencv3-p3-code/classification_and_boxing/new_and_abandond/FC-DenseNet/data_loader.py", line 22, in load_data
    rng=rng)
  File "/usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/images/camvid.py", line 110, in __init__
    super(CamvidDataset, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/parallel_loader.py", line 304, in __init__
    shutil.copytree(self.shared_path, self.path)
  File "/usr/lib/python2.7/shutil.py", line 171, in copytree
    names = os.listdir(src)
OSError: [Errno 2] No such file or directory: '/home/rjn/data/camvid/segnet/'

opened by rnunziata 2

local variable 'not_void' referenced before assignment

I prepared the Camvid Dataloader and started the training run, but got the following error:

Number of Convolutional layers : 103
Number of parameters : 9425163
Compilation starts at 2017-03-13 11:42:25
Traceback (most recent call last):
  File "train.py", line 255, in <module>
    initiate_training(cf)
  File "train.py", line 223, in initiate_training
    train(cf)
  File "train.py", line 108, in train
    I, U, acc = theano_metrics(pred, net.target_var, n_classes, void_labels)
  File "/home/ubuntu/FC-DenseNet/metrics.py", line 33, in theano_metrics
    U = T.set_subtensor(U[i], T.sum(T.or_(y_true_i, y_pred_i) * not_void))
UnboundLocalError: local variable 'not_void' referenced before assignment

Any help is appreciated. Thanks :)
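One reading of this error, consistent with the remark in a later comment that the code assumes get_void_labels() returns a non-empty list, is that not_void is only assigned while looping over the void labels. The NumPy sketch below is not the repository's Theano code; the function name masked_intersection_union is hypothetical. It shows one way to build the mask so that it exists even when the list of void labels is empty.

```python
import numpy as np


def masked_intersection_union(y_pred, y_true, n_classes, void_labels):
    """Per-class intersection and union counts, ignoring pixels labelled as void."""
    # Start from an all-True mask so not_void is defined even with no void labels.
    not_void = np.ones(y_true.shape, dtype=bool)
    for label in void_labels:
        not_void &= (y_true != label)

    I = np.zeros(n_classes, dtype=np.int64)
    U = np.zeros(n_classes, dtype=np.int64)
    for i in range(n_classes):
        true_i = (y_true == i)
        pred_i = (y_pred == i)
        I[i] = np.sum(true_i & pred_i & not_void)
        U[i] = np.sum((true_i | pred_i) & not_void)
    return I, U


y_true = np.array([0, 1, 1, 2])
y_pred = np.array([0, 1, 0, 1])
print(masked_intersection_union(y_pred, y_true, n_classes=2, void_labels=[2]))
print(masked_intersection_union(y_pred, y_true, n_classes=3, void_labels=[]))
```

Dividing I by U for each class gives the per-class Jaccard index.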

opened by ps48 2

How long does it take to compile FC-Densenet103?

I'm running this on an i7 and it has been running for 48 hours and is still not finished. Has anyone successfully compiled it? I'm using the latest Theano with the libgpuarray backend.

opened by jrao1 2

config/FC-DenseNet103.py settings

I use images of 320x320 px with 2 classes (background and foreground) and no cropping. What should I set for the following variables?

train_crop_size: is train_crop_size = None OK, or should I use train_crop_size = (320, 320)?

input_shape (it is set in 2 places): is input_shape=(None, 3, 320, 320) OK, or should I use input_shape=(None, 3, None, None)?

n_classes: I have 2 classes, but your code assumes that the result of get_void_labels() is non-empty, so I return [255] from get_void_labels() (the value 255 is not used in my mask files) and increase n_classes by 1. So I set n_classes to 3. Is that correct?

opened by SergeMv 1

Error: AttributeError: Bad input argument to theano function with name "train.py:114" at index 1 (0-based).

Hi @SimJeg, I'm getting the error "'int' object has no attribute 'dtype'" during training. What could be the reason?

Traceback (most recent call last):
  File "train.py", line 259, in <module>
    initiate_training(cf)
  File "train.py", line 223, in initiate_training
    train(cf)
  File "train.py", line 144, in train
    history = batch_loop(train_iter, train_fn, epoch, 'train', history)
  File "train.py", line 32, in batch_loop
    loss, I, U, acc = f(X, Y[:, None, :, :])
  File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 813, in __call__
    allow_downcast=s.allow_downcast)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/type.py", line 124, in filter
    up_dtype = scal.upcast(self.dtype, data.dtype)
  File "/usr/local/lib/python2.7/dist-packages/theano/scalar/basic.py", line 79, in upcast
    rval = str(z.dtype)
AttributeError: Bad input argument to theano function with name "train.py:114" at index 1 (0-based).
Backtrace when that variable is created:

  File "train.py", line 254, in <module>
    cf = imp.load_source('config', arguments.config_path)
  File "config/FC-DenseNet103.py", line 32, in <module>
    dropout_p=0.2)
  File "/root/fcdn/FC-DenseNet.py", line 45, in __init__
    self.target_var = T.tensor4('target_var', dtype='int32') # target

'int' object has no attribute 'dtype'

opened by SergeMv 1

Question on concatenate ?

Hi, thanks for your code and paper. I have a question. As your paper says, your first dense block is DB (4 layers) + TD, m = 112, and your second dense block is DB (5 layers) + TD, m = 192. Why is m = 192? You may explain it as 112 + 5*16 = 192. However, in my opinion, the output of the first dense block (the input of the second dense block) has a different feature map size (rows and columns) from the second dense block, because there is a 2*2 pooling layer in the dense block, right? So it's hard to understand how you can concatenate 112 feature maps and 5*16 feature maps along the channel axis, since they have different numbers of rows and columns. Thank you in advance!
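One way to read the README's rule m[B] = m[B-1] + n_layers[B] * growth_rate in light of this question is that the 112 maps entering the second dense block have already been pooled by the preceding transition down, so they share their resolution with the 5*16 new maps. The NumPy sketch below only tracks shapes under that assumption; dense_block and transition_down are hypothetical stand-ins that produce zeros, not the repository's Lasagne layers, and the 32x32 input size is arbitrary.

```python
import numpy as np


def transition_down(x):
    """Stand-in for TD: keep the channel count, halve rows and cols (2x2 max pooling)."""
    b, c, h, w = x.shape
    return x.reshape(b, c, h // 2, 2, w // 2, 2).max(axis=(3, 5))


def dense_block(x, n_layers, growth_rate=16):
    """Stand-in for DB: each layer appends growth_rate maps at the input resolution."""
    stack = x
    for _ in range(n_layers):
        new_maps = np.zeros((x.shape[0], growth_rate) + x.shape[2:], dtype=x.dtype)
        stack = np.concatenate([stack, new_maps], axis=1)
    return stack


x = np.zeros((1, 48, 32, 32), dtype=np.float32)  # output of the first convolution
x = dense_block(x, 4)
print(x.shape)   # (1, 112, 32, 32)
x = transition_down(x)
print(x.shape)   # (1, 112, 16, 16)
x = dense_block(x, 5)
print(x.shape)   # (1, 192, 16, 16)
```

Under this reading, every concatenation happens between tensors of the same spatial size, and pooling only occurs between blocks.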

opened by huangh12 1

Versions of Theano, Lasagne and gpuarray

@SimJeg Can you please let us know the versions of Theano, Lasagne and gpuarray used for training? With the latest versions there is some mismatch. Thanks :)

opened by ps48 1

Problem of reproducing the results

Hi,

I was impressed by your excellent work and tried to implement a PyTorch version of it. Unfortunately, I have trouble reproducing your results.

I used the following settings:

Using FC-DenseNet-103

Using RMSprop with initial learning rate: 1e-3, decay rate: 0.995, weight decay: 1e-4

Pretraining the model on random 224x224 crops and fine-tuning it on the original size

Using patience = 100 for pretraining and 50 for fine-tuning. The maximum number of epochs is set to 750 as in this code.

Dropout rate = 0.2

Since there is no preprocessing code in this repository, I tried both normalizing the images and leaving them unnormalized.

I found that without dropout the model learns much better than with dropout. However, none of these settings reaches the accuracy reported in the paper. The validation accuracy is 0.9372 and the mIoU is 0.7025; however, the test accuracy is only 0.8932 and the test mIoU is 0.5790.

I am wondering which data preprocessing method you used. Is there anything wrong with my experiment procedure?

Also, I tried to run your code with my own implementation of the data loader (following your explanation of the data format). However, I got RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM. The message suggested using 'optimizer=None', but compilation then ran for more than 3 hours before I killed the job. FYI, I used the latest Theano and Lasagne with CUDA 8.0 and cuDNN 5.1. Do I have to use a different version?

Thanks in advance.

opened by felixgwu 1

The value of FLOPs?

Hi, do you still remember the FLOPs of the model when you ran FC-DenseNet? The reviewer of my paper asked me to report the number of parameters and the FLOPs, but I could not run the code because the dataset could not be loaded. I look forward to your reply, thank you very much @SimJeg

opened by xiaomixiaomi123zm 0

What about frontend model for feature extraction?

@SimJeg Thank you very much for your repository. I am a little confused about the frontend model. Which frontend model (such as ResNet or Inception) did you use? Could you please say something about it?

opened by AI-ML-Enthusiast 0

More than 3 bands/channels per image

I would like to use more than 3 channels/bands per image (not only RGB). How can I do that? What needs to be modified in order for the network to work?

opened by ntelo007 0

Accuracy and IoU reported in paper

Hi SimJeg

Are the accuracy and IoU reported in your paper measured on the validation set or the test set? I'm trying to reproduce your result, and I got 0.74 IoU on the validation set at 0.05 train loss, while the IoU on the test set is only 0.6, which is a significant drop. Can you confirm this for me?

Many thanks

opened by khiemkhanh98 0
how could I set the right path to run the code

I set

[general]
datasets_local_path = /home/GithubRepository/FC-DenseNet/datasets/

[camvid]
shared_path = /home/GithubRepository/Tensorflow-SegNet/CamVid/

The CamVid folder contains:

CamVid:
--test
--testannot
--train
--trainannot
--val
--valannot
--test.txt
--train.txt
--val.txt

When I run the code, the data in the CamVid folder is copied to datasets, but I get the error: No such file or directory: '/home/GithubRepository/FC-DenseNet/datasets/camvid/train/CamVid'

opened by 15757170756 0
