Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation

Overview

FCN.tensorflow

Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation (FCNs).

The implementation is largely based on the reference code provided by the authors of the paper. The model was applied to the Scene Parsing Challenge dataset provided by MIT (http://sceneparsing.csail.mit.edu/).

  1. Prerequisites
  2. Results
  3. Observations
  4. Useful links

Prerequisites

  • The results were obtained after training for ~6-7 hrs on a 12GB TitanX.
  • The code was originally written and tested with TensorFlow 0.11 and Python 2.7. The tf.summary calls have been updated to work with TensorFlow 0.12. To work with older versions of TensorFlow, use the branch tf.0.11_compatible.
  • Some of the problems encountered with TensorFlow 1.0 and on Windows are discussed in Issue #9.
  • To train the model, simply execute python FCN.py
  • To visualize results for a random batch of images, use the flag --mode=visualize
  • The debug flag can be set during training to add information about activations, gradients, variables, etc.
  • The IPython notebook in the logs folder can be used to view results in color as below.

Results

Results were obtained by training the model in batches of 2 with images resized to 256x256. Although training is done at this image size, nothing prevents the model from working on arbitrarily sized images. No post-processing was done on the predicted images. Training was done for 9 epochs; the short training time explains why certain concepts seem semantically understood by the model while others were not. Results below are from randomly chosen images from the validation dataset.
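
To relate the epoch count to training steps, here is a small arithmetic sketch. The dataset size of 20,000 images is an assumption (it is the rough figure quoted in one of the comments further down this page, not a number stated in this section); the batch size of 2 and the 9 epochs come from the text above.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


def total_steps(num_images, batch_size, epochs):
    """Training steps needed for the given number of epochs."""
    return steps_per_epoch(num_images, batch_size) * epochs


# Assumed (approximate) dataset size; batch size and epochs are from the text.
NUM_IMAGES = 20000
BATCH_SIZE = 2
EPOCHS = 9

print(steps_per_epoch(NUM_IMAGES, BATCH_SIZE))      # 10000
print(total_steps(NUM_IMAGES, BATCH_SIZE, EPOCHS))  # 90000
```

Under that assumption, one epoch is about 10,000 steps, so step counts reported in logs can be converted to epochs by dividing by 10,000.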

The network design is essentially the same as the reference Caffe implementation of the model from the paper. The weights of the newly added layers were initialized with small values, and training used the Adam optimizer (learning rate = 1e-4).
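
For readers unfamiliar with Adam, the following is a minimal pure-Python sketch of the textbook update rule for a single scalar parameter. It is not the repository's code, which relies on TensorFlow's optimizer; the function name `adam_step` and the beta and epsilon defaults are illustrative assumptions using commonly quoted values, while the learning rate of 1e-4 is the one mentioned above.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    m and v are the running first and second moment estimates,
    t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage: minimise f(w) = w**2, whose gradient is 2*w.
w, m, v = 1.0, 0.0, 0.0
for step in range(1, 4):
    w, m, v = adam_step(w, 2 * w, m, v, step)
    print(step, w)
```

Because the update is normalised by the gradient's running magnitude, each step moves the parameter by roughly the learning rate, regardless of how large the raw gradient is.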

Observations

  • The small batch size was necessary to fit the training model in memory, but it explains the slow learning.
  • Concepts that had many examples seem to be correctly identified and segmented; in the example above you can see that cars and persons were identified better. I believe the weaker results on other concepts can be improved by training for more epochs.
  • Resizing the images also causes loss of information; you can see this in the fact that smaller objects are segmented with less accuracy.
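
The last point can be illustrated with a toy label mask. The sketch below uses a simple nearest-neighbour downsampling written in plain Python; this is an assumption for illustration, since the resize routine actually used by the code is not described here, and the mask contents are made up.

```python
def resize_nearest(mask, new_h, new_w):
    """Nearest-neighbour resize of a 2-D list of class labels."""
    old_h, old_w = len(mask), len(mask[0])
    return [
        [mask[(r * old_h) // new_h][(c * old_w) // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def count_label(mask, label):
    """Number of pixels carrying the given label."""
    return sum(row.count(label) for row in mask)


# Toy 8x8 mask: label 1 is a large 4x4 object, label 2 a single-pixel object.
mask = [[0] * 8 for _ in range(8)]
for r in range(4):
    for c in range(4):
        mask[r][c] = 1
mask[5][5] = 2

small = resize_nearest(mask, 4, 4)
print(count_label(mask, 1), count_label(small, 1))  # 16 4
print(count_label(mask, 2), count_label(small, 2))  # 1 0
```

The large object keeps its share of the image after downsampling, while the single-pixel object disappears from the annotation entirely, so the model never sees it during training.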

Now for the gradients:

  • If you watch the gradients closely, you will notice that the initial training happens almost entirely in the newly added layers; only after these layers are reasonably trained do the VGG layers get some gradient flow. This is understandable, as changes to the new layers affect the loss objective much more in the beginning.
  • The earlier layers of the network are initialized with VGG weights and so conceptually require less tuning unless the training data is extremely varied, which in this case it is not.
  • The first layer of a convolutional model captures low-level information, and since this is entirely dataset dependent, you notice the gradients adjusting the first-layer weights to accustom the model to the dataset.
  • The other conv layers from VGG have very small gradients flowing, as the concepts captured there are good enough for our end objective, segmentation.
  • This is the core reason transfer learning works so well. Just thought of pointing this out while here.
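
One way to make this kind of observation concrete is to compare per-layer gradient norms. The sketch below is plain Python with made-up numbers; the layer names and gradient values are hypothetical and are not measurements from this model. It only shows how the share of the total gradient norm per layer could be summarised.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of gradient values."""
    return math.sqrt(sum(v * v for v in values))


def gradient_share(grads):
    """Map each layer name to its fraction of the summed gradient norms."""
    norms = {name: l2_norm(g) for name, g in grads.items()}
    total = sum(norms.values())
    return {name: (n / total if total > 0 else 0.0) for name, n in norms.items()}


# Toy gradients (hypothetical values chosen only for illustration).
toy_grads = {
    "vgg_conv1": [0.03, -0.04],
    "vgg_conv3": [0.003, 0.004],
    "new_layer": [3.0, 4.0],
}

shares = gradient_share(toy_grads)
for name in sorted(shares, key=shares.get, reverse=True):
    print(name, round(shares[name], 4))
```

In a real run the gradient values would come from the training framework, for example from the summaries that the debug flag adds.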

Useful Links

  • Video of the presentation given by the authors on the paper - link
Comments
  • [Solved] Problems with TensorFlow 1.0 and Windows

    Hi there,

    First, I wanted to say thanks for sharing! I'm working through the code to help with my own segmentation project and having something to work from is a big help.

    Second, I came across a few issues (minor really) that I've figured out and wanted to share:

    • TensorFlow 1.0 replaced tf.pack() with tf.stack().
    • In TensorFlow 1.0, variables should be initialised using tf.global_variables_initializer()
    • On Windows, the filename split used with os.path.splitext() should be on "\\" rather than '/'. Otherwise, the program can't find any files to pickle (and the MITSceneParsing.pickle file is empty), which in turn means 0 records are found and the feed dict instruction doesn't work.
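
A separator-independent alternative to the third point is to let the standard library split the path. The sketch below is an illustration, not the repository's code: `record_name` is a hypothetical helper and the file paths are made-up examples. The `ntpath` and `posixpath` modules are the Windows and POSIX flavours of `os.path`, passed explicitly here so the behaviour can be shown on any platform.

```python
import ntpath
import os
import posixpath


def record_name(path, pathmod=os.path):
    """Return the file name without its directory or extension."""
    return pathmod.splitext(pathmod.basename(path))[0]


win_path = r"C:\data\images\training\ADE_train_00000001.jpg"
posix_path = "data/images/training/ADE_train_00000001.jpg"

# The Windows flavour accepts both "\\" and "/" as separators.
print(record_name(win_path, ntpath))       # ADE_train_00000001
print(record_name(posix_path, posixpath))  # ADE_train_00000001

# Splitting on "/" by hand leaves a Windows path in one piece.
print(win_path.split("/")[-1] == win_path)  # True
```

With the default `pathmod=os.path`, the helper follows whatever platform the script runs on, so no separator has to be hardcoded.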

    Like I said, pretty minor stuff, but I wanted to post in case anyone else had any issues.

    Best regards,

    Frazer

    P.S. If you get an out of memory error, it's likely because you're trying to work with 20,000 images, which might be a bit too much. I deleted some of the training images and it worked.

    opened by drfknoble 38
  • How can i solve these problems?

    runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')
    setting up vgg initialized conv layers ...
    Setting up summary op...
    Setting up image reader...
    Found pickle file!
    0
    0
    Setting up dataset reader
    Initializing Batch Dataset Reader...
    {'resize': True, 'resize_size': 224}
    (0,)
    (0,)
    Initializing Batch Dataset Reader...
    {'resize': True, 'resize_size': 224}
    (0,)
    (0,)
    Setting up Saver...
    ****************** Epochs completed: 1******************

    Traceback (most recent call last):
    
      File "<ipython-input-1-6062f5716837>", line 1, in <module>
        runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')
    
      File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
        execfile(filename, namespace)
    
      File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
        exec(compile(f.read(), filename, 'exec'), namespace)
    
      File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 223, in <module>
        tf.app.run()
    
      File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 43, in run
        sys.exit(main(sys.argv[:1] + flags_passthrough))
    
      File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 194, in main
        sess.run(train_op, feed_dict=feed_dict)
    
      File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 766, in run
        run_metadata_ptr)
    
      File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 943, in _run
        % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
    
    ValueError: Cannot feed value of shape (0,) for Tensor 'input_image:0', which has shape '(?, 224, 224, 3)'
    
    opened by eserercanakin 25
  • Getting blank predicted images

    @shekkizh I took the exact code from the git repo and trained it without any changes on a Titan X for 100000 iterations with the default batch size. But when visualizing after training completed, all I could see in the predictions were fully black images.

    What could possibly go wrong?

    opened by NitishMutha 13
  • How to train on Pascal dataset?

    Hello,

    I was wondering what needs to be done to get this working with the Pascal dataset. I see that the output placeholder channel size is hardcoded to 1 and the Pascal annotations are RGB images, so that would need to be changed to 3, along with the number of classes and whatnot. I've tried changing those but it's giving me an error on the loss function line, and I'm having a hard time understanding how logits, which is of shape (?, ?, ?, num_classes), can be compared with y_output, which is of shape (?, width, height, channel).

    Also a separate question, but do you know how to compute intersection over union for the output?

    Thanks

    edit: Spent a little bit of time looking around and it looks like I need to figure out how to map the color mapped segmentation labels they give us in the dataset with a 0-20 integer indexed version
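
The two steps mentioned in this thread, mapping colour-coded labels to integer class indices and computing intersection over union, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `rgb_to_index` and `mean_iou` are hypothetical helper names, and the three-colour palette is a toy example; the real palette should be taken from the dataset's own documentation.

```python
def rgb_to_index(label_rgb, palette):
    """Convert a 2-D grid of (r, g, b) tuples into a grid of class indices.

    A colour that is missing from the palette raises KeyError.
    """
    lookup = {color: idx for idx, color in enumerate(palette)}
    return [[lookup[pixel] for pixel in row] for row in label_rgb]


def mean_iou(pred, truth, num_classes):
    """Mean intersection over union across classes present in pred or truth."""
    ious = []
    for c in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, truth):
            for p, t in zip(p_row, t_row):
                inter += (p == c and t == c)
                union += (p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy palette and 2x2 annotation (illustrative values only).
palette = [(0, 0, 0), (128, 0, 0), (0, 128, 0)]
truth_rgb = [[(0, 0, 0), (128, 0, 0)],
             [(0, 128, 0), (0, 128, 0)]]

truth = rgb_to_index(truth_rgb, palette)
pred = [[0, 1], [1, 2]]

print(truth)                                 # [[0, 1], [2, 2]]
print(round(mean_iou(pred, truth, 3), 4))    # 0.6667
```

Once the annotation is a single-channel grid of class indices, it has the same layout as the single-channel annotations the existing loss expects, with one integer per pixel instead of three colour values.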

    opened by Exuro889 10
  • [SOLVED] Loss won't decrease, predictions are all the same

    Hi! I would like to reproduce your results. Just running the code like python FCN.py doesn't seem to do the job for me. The default parameters are:

    • IMAGE_SIZE = 224 ~~(changing it to 256 does not affect the results)~~
    • learning_rate = 1e-4
    • batch_size = 2

    What I get is that training and validation loss start at about 400, and very quickly (200 iterations) decrease until they settle to about 3.

    Step: 0, Train_loss:415.754
    2016-10-13 12:19:13.407670 ---> Validation_loss: 395.876
    Step: 10, Train_loss:28.7208
    Step: 20, Train_loss:10.2944
    Step: 30, Train_loss:5.06159
    Step: 40, Train_loss:4.51668
    Step: 50, Train_loss:4.17936
    Step: 60, Train_loss:4.55051
    Step: 70, Train_loss:4.98752
    Step: 80, Train_loss:3.63942
    Step: 90, Train_loss:3.56676
    Step: 100, Train_loss:3.96641
    Step: 110, Train_loss:3.72767
    Step: 120, Train_loss:3.26587
    Step: 130, Train_loss:3.89015
    Step: 140, Train_loss:5.48371
    Step: 150, Train_loss:4.27173
    Step: 160, Train_loss:3.81378
    Step: 170, Train_loss:3.58391
    Step: 180, Train_loss:2.79207
    Step: 190, Train_loss:4.10269
    Step: 200, Train_loss:4.57686
    Step: 210, Train_loss:4.00551
    Step: 220, Train_loss:3.1667
    Step: 230, Train_loss:3.7841
    Step: 240, Train_loss:3.74983
    Step: 250, Train_loss:3.03212
    Step: 260, Train_loss:2.85248
    Step: 270, Train_loss:3.64257
    Step: 280, Train_loss:3.765
    Step: 290, Train_loss:4.16679
    Step: 300, Train_loss:4.0291
    Step: 310, Train_loss:3.95092
    Step: 320, Train_loss:3.38709
    Step: 330, Train_loss:2.48646
    Step: 340, Train_loss:2.98015
    Step: 350, Train_loss:3.59501
    Step: 360, Train_loss:3.80755
    Step: 370, Train_loss:3.73314
    Step: 380, Train_loss:3.40185
    Step: 390, Train_loss:3.89394
    Step: 400, Train_loss:3.80676
    Step: 410, Train_loss:2.78324
    Step: 420, Train_loss:3.14695
    Step: 430, Train_loss:3.29019
    Step: 440, Train_loss:3.16163
    Step: 450, Train_loss:3.64598
    Step: 460, Train_loss:2.74009
    Step: 470, Train_loss:3.93917
    Step: 480, Train_loss:3.815
    Step: 490, Train_loss:3.83076
    Step: 500, Train_loss:4.45192
    2016-10-13 12:24:10.606606 ---> Validation_loss: 3.02666
    

    I kept it running up to 35000 iterations, which should be about 3.5 epochs, but the loss won't decrease any further. If I then validate the model at 35000 iterations (Train_loss:2.73392, Validation_loss: 3.51286) with python FCN.py --mode visualize I get always the same prediction, whichever the input image is:

    image

    This is also, by the way, the same prediction I get with an earlier model (200 iterations).

    Is there something I'm getting wrong? Thank you

    opened by MarcoBauzz 9
  •  [[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device=

    [[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

    ----- 7 198 197 196 196 197 196 196 201 197 197 198 190 166 133 130 131 131 132 132 132 132 132 133 133 133 134 129 114 104 96 69 17 10 10 28 124 133 132 132 132 132 132 131 131 131 131 131 130 131 133 132 132 132 133 133 132 132 132 132 132 132 132 132 132 132 132 133 134 133 133 133 133 133 133 133 133 133 133 127 [[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

    This error stopped the training. Is this related to softmax? loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=tf.squeeze(annotation, squeeze_dims=[3]), name="entropy")))

    opened by Yaredoh 8
  • Training loss remains nan

    Hi,

    I want to use the code to train my own image data. And the data image is 512*512 gray image. But when I train with them, the loss remains nan:

    Setting up Saver...
    Step: 0, Train_loss:nan
    2017-03-17 13:55:04.919050 ---> Validation_loss: nan
    Step: 10, Train_loss:nan
    Step: 20, Train_loss:nan
    Step: 30, Train_loss:nan
    Step: 40, Train_loss:nan

    In the py file BatchDatsetReader I changed the way to read images like:

    def _read_images(self):
        self.__channels = False
        self.images = np.array(
            [np.expand_dims(self._transform(filename['image']), axis=3) for filename in self.files])
    

    But it did not work

    How can I solve this problem so I can train on my gray images? And what is the meaning of NUM_OF_CLASSESS?

    opened by jiao0805 7
  • The loss is Nan

    I trained on my own dataset, which has only two classes, so I set NUM_CLASSES to 2, and the loss turned out to be NaN. When I change NUM_CLASSES to 3 or 151 without changing my dataset, it works. I'm very confused by this, please help me.

    I have tried decreasing the learning rate to 1e-7 and 1e-8, but it didn't help.
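
One thing worth ruling out when the loss is NaN is a label value outside the range the loss expects. A sparse softmax cross-entropy over N classes expects integer labels in [0, N), and TensorFlow's documentation for it notes that out-of-range labels raise an error on CPU and produce NaN on GPU. A mask that stores its classes as, say, 0, 1 and 2 would therefore be invalid for N = 2 but valid for N = 3. Whether that is what happened in this report is not known; the check below is a plain-Python sketch with a hypothetical helper name and a made-up mask.

```python
def find_invalid_labels(mask, num_classes):
    """Return the sorted label values in mask that fall outside [0, num_classes)."""
    values = {v for row in mask for v in row}
    return sorted(v for v in values if not 0 <= v < num_classes)


# Toy annotation mask (illustrative values only).
toy_mask = [[0, 1, 1],
            [2, 2, 0]]

print(find_invalid_labels(toy_mask, 2))  # [2]
print(find_invalid_labels(toy_mask, 3))  # []
```

Running a check like this over the annotation files before training makes a label/class-count mismatch visible without waiting for the loss to diverge.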

    opened by abc8350712 6
  • Minor change requests and a Big question.

    Thanks for sharing your implementation with the public.

    I have couple of minor requests to update your code and one real question.

    1. How about using 'tqdm' when loading images and annotations, or a Chainer dataset?
    2. The iteration count restarts from 0 when resuming training. How about using a 'global_step' variable to fix this?
    3. Also, again, you'd better use os.sep instead of '/' for file separators.

    And my main question is.

    I trained with the MIT dataset, but the training does not match what you reported. Train loss and validation loss start around 3.5 and do not decrease; they just fluctuate. In your report, it seems the loss should go below 1.0 for convergence, am I right?

    Step: 860, Train_loss:3.89231
    Step: 870, Train_loss:3.64891
    Step: 880, Train_loss:2.40985
    Step: 890, Train_loss:3.11681
    Step: 900, Train_loss:3.55415
    Step: 910, Train_loss:3.38955
    Step: 920, Train_loss:2.9828
    Step: 930, Train_loss:2.39928
    Step: 940, Train_loss:3.25557
    Step: 950, Train_loss:3.52195
    Step: 960, Train_loss:3.15188
    Step: 970, Train_loss:5.03387
    Step: 980, Train_loss:2.47206
    Step: 990, Train_loss:3.91655
    Step: 1000, Train_loss:2.78551
    2018-03-24 15:37:13.298805 ---> Validation_loss: 3.48481

    Looking forwards,

    opened by ahnHeejune 5
  • Question regarding using custom dataset having 2 classes

    @shekkizh Sorry to bother you but could you please tell me if my number of classes is 2, what all changes need to be done? While training I set NUM_OF_CLASSES as 3. Which one is the annotation file? What changes have to be made in it? Sorry if the question is trivial but I am new to tensorflow. Thanks in advance.

    opened by matvaibhav 5
  • How to visualize the prediction?

    I have run the command "python FCN.py --mode=visualize", but I got result far different from the author's result. Here is my result:

    image

    Does anyone know what should I do, and how to apply the whole model in my own dataset and test data set?

    Thanks.

    opened by Dean-TianZhang 5
  • Logging and Visualizing Training Metrics on Tensorboard

    I ran the training for a few epochs and was only getting an 'entropy' curve on TensorBoard. How can I log and visualise metrics like accuracy, mean accuracy, mean IoU, f.w. IoU, etc.?

    @shekkizh Kindly help please.

    opened by varungupta31 0
  • ValueError: Cannot feed value of shape (2, 227, 227, 1, 4) for Tensor 'annotation:0', which has shape '(?, 227, 227, 1)'

    It seems that my own labeled images are not suitable as the feed value. The batch size is 2, with 227*227 pixel, single channel labeled PNGs, but what does the "4" represent? Hope someone could give me a hint or help me! I would really appreciate it!

    BTW, I used Labelme to annotate my own images, and used the Label.png files, which are single channel images, directly. Is that correct?

    opened by Rundong4026 2
  • Is anyone could share the trained ckpt???thanks!!!!

    Could anyone share the trained ckpt? Thanks!

    opened by huizhilei 0
  • how to train my own dataset?

    I am new to this field. I have my own dataset and have divided it into training and validation sets. How do I use it to train this model? What should I change?

    opened by kunyu17 0
Owner
Sarath Shekkizhar
PhD Student at University of Southern California; Interests: Graphs, Machine Learning