Fully Convolutional DenseNets for semantic segmentation.

Overview

Introduction

This repo contains the code to train and evaluate FC-DenseNets as described in The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. We investigate the use of Densely Connected Convolutional Networks for semantic segmentation and report state-of-the-art results on datasets such as CamVid.

Installation

You need to install :

Data

The data loader is now available here: https://github.com/fvisin/dataset_loaders. Thanks a lot to Francesco Visin; please cite his work if you use his data loader. Some adaptations to the current code may be needed; I hope to find some time to make them!


The data loader we used for the experiments will be released later. If you want to train models now, you need to create a function load_data which returns 3 iterators (for training, validation and test). When next() is applied, the iterator returns two values X, Y, where X is the batch of input images (shape=(batch_size, 3, n_rows, n_cols), dtype=float32) and Y is the batch of target segmentation maps (shape=(batch_size, n_rows, n_cols), dtype=int32); each pixel in Y is an int indicating the class of the pixel.

The iterators must also have the following methods (so they are not plain Python iterators): get_n_classes (returns the number of classes), get_n_samples (returns the number of examples in the set), get_n_batches (returns the number of batches necessary to see the entire set) and get_void_labels (returns a list containing the classes associated with void). It might be easier to directly change the files train.py and test.py.
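
A minimal sketch of an iterator that satisfies this interface is given below. The class name ArrayIterator, its constructor arguments, the wrap-around behaviour and the random placeholder data are assumptions made for illustration; they are not part of the repository. Whether get_n_classes should count the void class is not stated here, so the values 11 and [11] are placeholders only.

```python
import numpy as np


class ArrayIterator(object):
    """Hypothetical batch iterator over in-memory arrays (illustration only)."""

    def __init__(self, images, labels, batch_size, n_classes, void_labels):
        # images: (n_samples, 3, n_rows, n_cols); labels: (n_samples, n_rows, n_cols)
        self.images = images.astype('float32')
        self.labels = labels.astype('int32')
        self.batch_size = batch_size
        self.n_classes = n_classes
        self.void_labels = list(void_labels)
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        # Assumption: wrap around at the end of the set so next() never stops.
        if self.position >= self.get_n_samples():
            self.position = 0
        start, stop = self.position, self.position + self.batch_size
        self.position = stop
        return self.images[start:stop], self.labels[start:stop]

    next = __next__  # Python 2 spelling of the same method

    def get_n_classes(self):
        return self.n_classes

    def get_n_samples(self):
        return len(self.images)

    def get_n_batches(self):
        # Batches needed to see the entire set (the last one may be smaller).
        return int(np.ceil(self.get_n_samples() / float(self.batch_size)))

    def get_void_labels(self):
        return self.void_labels


def load_data(batch_size=3):
    """Return (train, validation, test) iterators built from placeholder data."""
    rng = np.random.RandomState(0)

    def make(n_samples):
        X = rng.rand(n_samples, 3, 32, 32)
        Y = rng.randint(0, 12, size=(n_samples, 32, 32))  # label 11 plays "void"
        return ArrayIterator(X, Y, batch_size, n_classes=11, void_labels=[11])

    return make(7), make(4), make(4)
```

Calling X, Y = next(train_iter) on the first returned iterator then yields one batch in the layout described above; real data would replace the random arrays.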

Run experiments

The architecture of the model is defined in FC-DenseNet.py. To train a model, you need to prepare a configuration file (folder config) in which all the parameters needed to create and train your model are specified. DenseNets contain a lot of connections, which makes graph optimization difficult for Theano. We strongly recommend using the flags described below.

To train the FC-DenseNet103 model, use the command:

    THEANO_FLAGS='device=cuda,optimizer=fast_compile,optimizer_including=fusion' python train.py -c config/FC-DenseNet103.py -e experiment_name

All the logs of the experiment are stored in the folder experiment_name.

On a Titan X 12GB, for the model FC-DenseNet103 (see folder config), compilation takes around 400 s, and one epoch takes around 120 s for training and 40 s for validation.

Use a pretrained model

We publish the weights of our model FC-DenseNet103. The metrics reported in the paper (Jaccard and accuracy) can be verified by running:

    THEANO_FLAGS='device=cuda,optimizer=fast_compile,optimizer_including=fusion' python test.py
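
For readers unfamiliar with the two metrics, the sketch below computes per-class intersection and union counts and the global accuracy in plain NumPy. It assumes that pixels whose target label is void are excluded from every count. The function name segmentation_metrics is hypothetical, and the Theano-based metrics code in the repository may differ in detail.

```python
import numpy as np


def segmentation_metrics(y_pred, y_true, n_classes, void_labels):
    """Return per-class intersection, per-class union and global accuracy.

    Assumption: pixels whose target label is in void_labels are ignored.
    """
    y_pred = np.asarray(y_pred)
    y_true = np.asarray(y_true)
    not_void = ~np.isin(y_true, void_labels)

    intersection = np.zeros(n_classes)
    union = np.zeros(n_classes)
    for c in range(n_classes):
        pred_c = (y_pred == c)
        true_c = (y_true == c)
        intersection[c] = np.sum(pred_c & true_c & not_void)
        union[c] = np.sum((pred_c | true_c) & not_void)

    accuracy = np.sum((y_pred == y_true) & not_void) / float(np.sum(not_void))
    return intersection, union, accuracy


# Toy example: 2 classes (0 and 1), label 2 is treated as void.
y_true = np.array([[0, 0, 1], [1, 2, 2]])
y_pred = np.array([[0, 1, 1], [1, 0, 1]])
I, U, acc = segmentation_metrics(y_pred, y_true, n_classes=2, void_labels=[2])
mean_jaccard = np.mean(I / U)  # mean over classes of intersection / union
print(I, U, acc, mean_jaccard)
```

In this toy example the two void pixels are skipped, so accuracy is computed over the remaining four pixels only.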

About the "m" number in the paper

There is a small error in the "m" number in Table 2 of the paper (which you may notice when running the code!). All values from the bottleneck to the last block (880, 1072, 800 and 368) should be incremented by 16 (896, 1088, 816 and 384).

Here is how we compute this value, which represents the number of feature maps concatenated into the "stack":

  • First convolution : m=48
  • In the downsampling part + bottleneck, m[B] = m[B-1] + n_layers[B] * growth_rate [linear growth]. First block : m = 48 + 4x16 = 112. Second block m = 112 + 5x16 = 192. Until the bottleneck : m = 656 + 15x16 = 896.
  • In the upsampling part, m[B] is the sum of 3 terms: the m value corresponding to the same resolution in the downsampling part (skip connection), the number of feature maps from the upsampled block (n_layers[B-1] * growth_rate) and the number of feature maps in the new block (n_layers[B] * growth_rate). First upsampling: m = 656 + 15x16 + 12x16 = 1088. Second upsampling: m = 464 + 12x16 + 10x16 = 816. Third upsampling: m = 304 + 10x16 + 7x16 = 576. Fourth upsampling: m = 192 + 7x16 + 5x16 = 384. Fifth upsampling: m = 112 + 5x16 + 4x16 = 256.
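
The arithmetic above can be checked with a short script. This is only a sketch: the function name compute_m is hypothetical, and the layer counts per block (4, 5, 7, 10, 12, then 15 in the bottleneck, then 12, 10, 7, 5, 4) are read off the worked values in the list above.

```python
def compute_m(first_conv=48, growth_rate=16,
              n_layers_down=(4, 5, 7, 10, 12),
              n_layers_bottleneck=15,
              n_layers_up=(12, 10, 7, 5, 4)):
    """Recompute the "m" values following the three rules listed above."""
    # Downsampling path: linear growth, m[B] = m[B-1] + n_layers[B] * growth_rate
    m = first_conv
    m_down = []
    for n in n_layers_down:
        m += n * growth_rate
        m_down.append(m)

    # Bottleneck follows the same linear rule.
    m_bottleneck = m + n_layers_bottleneck * growth_rate

    # Upsampling path: skip connection + upsampled block + new block.
    m_up = []
    previous = n_layers_bottleneck
    for skip, n in zip(reversed(m_down), n_layers_up):
        m_up.append(skip + previous * growth_rate + n * growth_rate)
        previous = n

    return m_down, m_bottleneck, m_up


m_down, m_bottleneck, m_up = compute_m()
print(m_down)        # [112, 192, 304, 464, 656]
print(m_bottleneck)  # 896
print(m_up)          # [1088, 816, 576, 384, 256]
```
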
Comments
  • Pass individual images from a city scape

    I am trying to run the model on my own images by passing them in a loop. I noticed that if you try to print the return value of the theano.function, the system aborts. It looks like the return value is not correct when the function is called only for predictions.

        net = cf.net
        weight_path = 'weights/FC-DenseNet103_weights.npz'
        net.restore(weight_path)
        # Compile test functions
        prediction = get_output(net.output_layer, deterministic=True, batch_norm_use_averages=False)
        g = theano.function([net.input_var], prediction)   
        while(True):
           ret_val, img = cam.read()
           h = np.shape(img)[1]/4
           w = np.shape(img)[0]/4  
           img = cv2.resize(img, (h, w))               
           X = np.transpose(img, (2, 0, 1))                       
           X = X[np.newaxis,:] #(1, 3, 180, 320))        
           g_X = np.argmax(g(X), axis = 1)
           print(g_X)   #  **************   prints all zeros....[0,0,0....0,0,0]        
           image = np.reshape(g_X, (h, w))
           cv2.imshow('image',img)              
           cv2.imshow('proc',image)  # ******** displays all white
           cv2.waitKey(1)
    
    
    opened by rnunziata 4
  • Running out of memory in fine tuning

    Hi, first and foremost thank you very much for sharing your code! It is very insightful and I learned a lot from it. Unfortunately, when I try to reproduce the results of your paper, I run into out-of-memory issues while fine-tuning on the whole images (FC-DenseNet103, batch size 3, input dimension (3, 3, 360, 480)). My theano.config.floatX is float32 (you did not mention using float16 and your code also suggests float32). I am using cuDNN 5105 along with CUDA 8.0. My GPU is a Pascal Titan X (12GB VRAM). CnMem is disabled. With these settings, I cannot even train the network with a batch size of 2. Did you use a GPU with more VRAM or a specific Theano configuration? Thank you very much! Cheers, Fabian

    opened by FabianIsensee 4
  • Comparing training time as sanity check

    I implemented this model in Keras/Tensorflow and am finding that training is slower than I expected. (It's about 8 times slower per epoch than a simplified version of U-Net). This might be due to a greater number of sequential operations in the FC-DenseNet which can't be parallelized on the GPU.

    As a sanity check, I wanted to compare the time per epoch that you reported with mine. In the README, it says that it took 120 secs per epoch. How many samples are there per epoch? I know there are 3 samples per minibatch, so how many minibatches are there per epoch? This number seems to come from iter.get_n_batches() (https://github.com/SimJeg/FC-DenseNet/blob/master/train.py#L24) which isn't in the repo. I'm training on a Tesla K80 using 256x256 images with 4096 samples per epoch with a batch size of 2 (i.e. 2048 minibatches), and it's taking 45 mins per epoch.

    Thanks!

    opened by lewfish 3
  • more confusion about data loaders

    I am trying to run test.py and get the following error. I thought the data loader would download the data for me, since there is no reference on this site to where the test data can be obtained.

    Installed with:
    git clone --recursive https://github.com/fvisin/dataset_loaders.git

    cat /usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/config.ini
    [general]
    datasets_local_path = /home/rjn/data
    
    [camvid]
    shared_path = /home/rjn/data/camvid/segnet/
    
    [cityscapes]
    shared_path = /home/rjn/data/cityscapes/
    
    [kitti]
    shared_path = /home/rjn/data/kitti/
    
    [mscoco]
    shared_path = /home/rjn/data/COCO/
    
    [pascal_voc]
    shared_path = /home/rjn/data/PASCAL-VOC/VOCdevkit/
    
    [scene_parsing_MIT]
    shared_path = /home/rjn/data/SceneParsingMIT/
    
    

    error:

    Unknown arguments: ['get_01c', 'seq_per_video', 'horizontal_flip', 'get_one_hot', 'crop_size']
    The local path /home/rjn/data/camvid exist, but is outdated. I will replace the old files with the new ones...
    Traceback (most recent call last):
      File "test.py", line 89, in <module>
        test(config_path, weight_path)
      File "test.py", line 30, in test
        _, _, iterator = load_data(cf.dataset, batch_size=batch_size)
      File "/home/rjn/opencv3-p3-code/classification_and_boxing/new_and_abandond/FC-DenseNet/data_loader.py", line 22, in load_data
        rng=rng)
      File "/usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/images/camvid.py", line 110, in __init__
        super(CamvidDataset, self).__init__(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/dataset_loaders-1.0.0-py2.7.egg/dataset_loaders/parallel_loader.py", line 304, in __init__
        shutil.copytree(self.shared_path, self.path)
      File "/usr/lib/python2.7/shutil.py", line 171, in copytree
        names = os.listdir(src)
    OSError: [Errno 2] No such file or directory: '/home/rjn/data/camvid/segnet/'
    
    
    opened by rnunziata 2
  • local variable 'not_void' referenced before assignment

    I prepared the Camvid Dataloader and started the training run, but got the following error:

    Number of Convolutional layers : 103
    Number of parameters : 9425163
    Compilation starts at 2017-03-13 11:42:25
    Traceback (most recent call last):
      File "train.py", line 255, in <module>
        initiate_training(cf)
      File "train.py", line 223, in initiate_training
        train(cf)
      File "train.py", line 108, in train
        I, U, acc = theano_metrics(pred, net.target_var, n_classes, void_labels)
      File "/home/ubuntu/FC-DenseNet/metrics.py", line 33, in theano_metrics
        U = T.set_subtensor(U[i], T.sum(T.or_(y_true_i, y_pred_i) * not_void))
    UnboundLocalError: local variable 'not_void' referenced before assignment

    Any help is appreciated. Thanks :)

    opened by ps48 2
  • How long does it take to compile FC-Densenet103?

    I'm running this on an i7 and it has been running for 48 hours, still not finished. Has anyone successfully compiled it? I'm using latest Theano with libgpuarray backend.

    opened by jrao1 2
  • config/FC-DenseNet103.py settings

    I use images of 320x320 px with 2 classes (background and foreground) and no cropping. What should I set for the following variables?

    • train_crop_size: is train_crop_size = None OK, or should I use train_crop_size = (320, 320)?

    • input_shape (it is set in 2 places): is input_shape=(None, 3, 320, 320) OK, or should I use input_shape=(None, 3, None, None)?

    • n_classes: I have 2 classes, but your code assumes that the result of get_void_labels() is non-empty, so I return [255] from get_void_labels() (the value 255 is not used in my mask files) and thus increase n_classes by 1. So I set n_classes to 3; is that correct?

    opened by SergeMv 1
  • Error: AttributeError: Bad input argument to theano function with name "train.py:114" at index 1 (0-based).

    Hi @SimJeg, I'm getting this error during training: "'int' object has no attribute 'dtype'". What could be the reason?

    Traceback (most recent call last):
      File "train.py", line 259, in <module>
        initiate_training(cf)
      File "train.py", line 223, in initiate_training
        train(cf)
      File "train.py", line 144, in train
        history = batch_loop(train_iter, train_fn, epoch, 'train', history)
      File "train.py", line 32, in batch_loop
        loss, I, U, acc = f(X, Y[:, None, :, :])
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 813, in __call__
        allow_downcast=s.allow_downcast)
      File "/usr/local/lib/python2.7/dist-packages/theano/tensor/type.py", line 124, in filter
        up_dtype = scal.upcast(self.dtype, data.dtype)
      File "/usr/local/lib/python2.7/dist-packages/theano/scalar/basic.py", line 79, in upcast
        rval = str(z.dtype)
    AttributeError: Bad input argument to theano function with name "train.py:114" at index 1 (0-based).
    Backtrace when that variable is created:

      File "train.py", line 254, in <module>
        cf = imp.load_source('config', arguments.config_path)
      File "config/FC-DenseNet103.py", line 32, in <module>
        dropout_p=0.2)
      File "/root/fcdn/FC-DenseNet.py", line 45, in __init__
        self.target_var = T.tensor4('target_var', dtype='int32') # target

    'int' object has no attribute 'dtype'

    opened by SergeMv 1
  • Question on concatenate ?

    Hi, thanks for your code and paper. I have a question. As your paper says, your first dense block is DB (4 layers) + TD, m = 112, and your second dense block is DB (5 layers) + TD, m = 192. Why m = 192? You may explain that 112 + 5*16 = 192. However, in my opinion, the output of the first dense block (i.e. the input of the second dense block) has a different feature size (rows and columns) from the second dense block, because there is a 2*2 pooling layer in the dense block, right? So it is hard to understand how you can concatenate 112 feature maps and 5*16 feature maps along the channel axis, since they have different numbers of rows and columns. Thank you in advance!

    opened by huangh12 1
  • Versions of Theano, Lasagne and gpuarray

    @SimJeg Can you please let us know the versions of Theano, Lasagne and gpuarray used for training? With the latest versions there is some mismatch. Thanks :)

    opened by ps48 1
  • Problem of reproducing the results

    Hi,

    I was impressed by your excellent work and tried to implement a PyTorch version of it. Unfortunately, I have trouble reproducing your results.

    I followed the setting as follows:

    • Using FC-DenseNet-103
    • Using RMSprop with initial learning rate: 1e-3, decay rate: 0.995, weight decay: 1e-4
    • Pretrain the model with randomly cropping to 224x224 and fine-tune it with the original size
    • Using patience = 100 for pretraining and 50 for fine-tuning. The maximum number of epoch is set to 750 as in this code.
    • Dropout rate = 0.2
    • Since there is no preprocessing code in this repository, I tried both with and without normalizing the images.

    I found that the model without dropout learns much better than the one with dropout. However, none of these settings reaches the accuracy reported in the paper. The validation accuracy is 0.9372 and the mIoU is 0.7025; however, the test accuracy is only 0.8932 and the test mIoU is 0.5790.

    I am wondering what is the data preprocessing method you used. Is there anything wrong with my experiment procedure?

    Also, I tried to run your code with my own implementation of the data loader (following your explanation of the data format). However, I got RuntimeError: error getting worksize: CUDNN_STATUS_BAD_PARAM. The error message suggested using 'optimizer=None', but the model took more than 3 hours to compile before I killed the job. FYI, I used the latest Theano and Lasagne with CUDA 8.0 and cuDNN 5.1. Do I have to use a different version?

    Thanks in advance.

    opened by felixgwu 1
  • The value of FLOPs?

    Hi, do you still remember the FLOPs of the model when you ran FC-DenseNet? The reviewer of my paper asked me to report the number of parameters and the FLOPs, but I failed to run the code because the dataset could not be loaded. I look forward to your reply, thank you very much @SimJeg

    opened by xiaomixiaomi123zm 0
  • What about frontend model for feature extraction?

    @SimJeg Thank you very much for your repository. I am a little confused about the frontend model. Which frontend model (such as ResNet or Inception) did you use? Could you please say something about it?

    opened by AI-ML-Enthusiast 0
  • More than 3 bands/channels per image

    I would like to use more than 3 channels/bands per image (not only RGB). How can I do that? What needs to be modified in order for the network to work?

    opened by ntelo007 0
  • Accuracy and IoU reported in paper

    Hi SimJeg

    Are the accuracy and IoU reported in your paper measured on the validation set or the test set? I'm trying to reproduce your results, and I got 0.74 IoU on the val set at 0.05 train loss, while the IoU on the test set is only 0.6, which is a significant drop. Can you confirm this for me?

    Many thanks

    opened by khiemkhanh98 0
  • how could I set the right path to run the code

    I set:

    [general]
    datasets_local_path = /home/GithubRepository/FC-DenseNet/datasets/

    [camvid]
    shared_path = /home/GithubRepository/Tensorflow-SegNet/CamVid/

    The CamVid folder contains:

    CamVid:
    --test
    --testannot
    --train
    --trainannot
    --val
    --valannot
    --test.txt
    --train.txt
    --val.txt

    When I run the code, the data in the CamVid folder is copied to datasets, but I get the error: No such file or directory: '/home/GithubRepository/FC-DenseNet/datasets/camvid/train/CamVid'

    opened by 15757170756 0