Tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation

Sarath Shekkizhar

Last update: Dec 25, 2022

FCN.tensorflow

TensorFlow implementation of Fully Convolutional Networks for Semantic Segmentation (FCNs).

The implementation is largely based on the reference code provided by the authors of the paper (link). The model was applied to the Scene Parsing Challenge dataset provided by MIT (http://sceneparsing.csail.mit.edu/).

Prerequisites
Results
Observations
Useful links

Prerequisites

The results were obtained after training for ~6-7 hrs on a 12GB TitanX.
The code was originally written and tested with TensorFlow 0.11 and Python 2.7. The tf.summary calls have been updated to work with TensorFlow 0.12. To work with older versions of TensorFlow, use the branch tf.0.11_compatible.
Some of the problems encountered with TensorFlow 1.0 and on Windows are discussed in Issue #9.
To train the model, simply run python FCN.py
To visualize results for a random batch of images, use the flag --mode=visualize
The debug flag can be set during training to add information about activations, gradients, variables, etc.
The IPython notebook in the logs folder can be used to view results in color as below.

Results

Results were obtained by training the model in batches of 2 with images resized to 256x256. Note that although training is done at this image size, nothing prevents the model from working on arbitrarily sized images. No post-processing was done on the predicted images. Training was done for 9 epochs - the short training time explains why certain concepts seem semantically understood by the model while others are not. The results below are from randomly chosen images from the validation dataset.
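The point about arbitrary image sizes can be illustrated with a little arithmetic. The sketch below assumes a VGG-style encoder with five stride-2 pooling layers and SAME padding; that is an assumption about the architecture, not something stated above, and coarsest_feature_size is a hypothetical helper. It only shows that every input side length yields a valid (if coarser or finer) feature map, which is what lets a fully convolutional model run on any size.

```python
import math

def coarsest_feature_size(side, num_pools=5):
    """Spatial size after `num_pools` stride-2 poolings with SAME padding.

    Assumption: each pooling layer halves the side length, rounding up.
    """
    for _ in range(num_pools):
        side = math.ceil(side / 2)
    return side

# Any input side gives a usable feature map; nothing is tied to 256.
for side in (224, 256, 300):
    print(side, "->", coarsest_feature_size(side))
```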

The network design is essentially the same as the reference Caffe implementation of the paper. The weights of the newly added layers were initialized with small values, and training used the Adam optimizer (learning rate = 1e-4).
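To give a feel for what a learning rate of 1e-4 means with Adam, here is a minimal scalar sketch of a single Adam update. Only the learning rate comes from the text; beta1, beta2 and eps are the commonly used defaults and are assumptions here, and adam_step is an illustrative helper rather than code from the repository.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (t is the 1-based step count)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

w, m, v = 0.5, 0.0, 0.0
w_new, m, v = adam_step(w, grad=3.0, m=m, v=v, t=1)
# On the first step the update size is close to lr, whatever the gradient scale.
print(w - w_new)
```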

Observations

The small batch size was necessary to fit the training model in memory, but it explains the slow learning.
Concepts that had many examples seem to be correctly identified and segmented - in the example above you can see that cars and persons were identified better. I believe the remaining concepts can be improved by training for more epochs.
Also, resizing the images causes loss of information - you can notice this in the fact that smaller objects are segmented with less accuracy.
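The effect of resizing on small objects can be seen on a toy annotation map. The sketch below uses plain strided nearest-neighbour downsampling as a stand-in; the interpolation actually used when resizing is not stated above, so treat this as an illustration of the idea only.

```python
import numpy as np

# Toy 8x8 annotation: one large object (class 1) and a one-pixel object (class 2).
annotation = np.zeros((8, 8), dtype=np.uint8)
annotation[2:6, 0:4] = 1
annotation[1, 5] = 2

def downsample_nearest(label_map, factor):
    """Keep every `factor`-th pixel in both directions."""
    return label_map[::factor, ::factor]

small = downsample_nearest(annotation, 2)
print(np.unique(annotation))  # classes present before resizing
print(np.unique(small))       # the one-pixel object is gone after resizing
```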

Now for the gradients,

If you watch the gradients closely, you will notice that the initial training is almost entirely on the newly added layers - it is only after these layers are reasonably trained that the VGG layers get some gradient flow. This is understandable, as changes to the new layers affect the loss objective much more in the beginning.
The earlier layers of the network are initialized with VGG weights and so conceptually require less tuning unless the training data is extremely varied - which in this case it is not.
The first layer of the convolutional model captures low-level information, and since this is entirely dataset dependent, you can see the gradients adjusting the first-layer weights to adapt the model to the dataset.
The other conv layers from VGG have very small gradients flowing, as the concepts captured there are good enough for our end objective - segmentation.
This is the core reason transfer learning works so well. Just thought of pointing this out while here.

Useful Links

Video of the presentation given by the authors on the paper - link

Comments

[Solved] Problems with TensorFlow 1.0 and Windows
Hi there,

First, I wanted to say thanks for sharing! I'm working through the code to help with my own segmentation project and having something to work from is a big help.

Second, I came across a few issues (minor really) that I've figured out and wanted to share:

TensorFlow 1.0 replaced tf.pack() with tf.stack().

In TensorFlow 1.0, variables should be initialised using tf.global_variables_initializer()

On Windows, the filename split used with os.path.splitext() should use "\\" rather than '/'. Otherwise, the program can't find any files to pickle (and the MITSceneParsing.pickle file is empty), which in turn means 0 records are found and the feed dict instruction doesn't work.
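A separator-independent way to get the record name is to let the path module do the splitting. The sketch below is only an illustration: the file paths are hypothetical, record_name is not a function from the repository, and ntpath/posixpath are used so the behaviour of both platforms can be shown on any machine (real code would simply use os.path).

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX path rules

def record_name(path, pathmod):
    """File name without directory or extension, using the given path module."""
    return pathmod.splitext(pathmod.basename(path))[0]

windows_path = r"C:\data\images\training\ADE_train_00000001.jpg"  # hypothetical path
posix_path = "/data/images/training/ADE_train_00000001.jpg"       # hypothetical path

# Splitting on '/' does nothing to a backslash-separated path.
naive = windows_path.split("/")[-1]

print(naive == windows_path)
print(record_name(windows_path, ntpath))
print(record_name(posix_path, posixpath))
```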

Like I said, pretty minor stuff, but I wanted to post in case anyone else had any issues.

Best regards,

Frazer

P.S. If you get an out of memory error, it's likely because you're trying to work with 20,000 images, which might be a bit too much. I deleted some of the training images and it worked.
opened by drfknoble 38

How can i solve these problems?

runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')
setting up vgg initialized conv layers ...
Setting up summary op...
Setting up image reader...
Found pickle file!
0
0
Setting up dataset reader
Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 224}
(0,)
(0,)
Initializing Batch Dataset Reader...
{'resize': True, 'resize_size': 224}
(0,)
(0,)
Setting up Saver...
****************** Epochs completed: 1******************

Traceback (most recent call last):

  File "<ipython-input-1-6062f5716837>", line 1, in <module>
    runfile('C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py', wdir='C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master')

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 223, in <module>
    tf.app.run()

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 43, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))

  File "C:/Users/PROCOMP-9/Desktop/FCN.tensorflow-master/FCN.tensorflow-master/FCN.py", line 194, in main
    sess.run(train_op, feed_dict=feed_dict)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 766, in run
    run_metadata_ptr)

  File "C:\Users\PROCOMP-9\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 943, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))

ValueError: Cannot feed value of shape (0,) for Tensor 'input_image:0', which has shape '(?, 224, 224, 3)'
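The log above reports 0 training and 0 validation records, which matches the empty-pickle situation described in the Windows issue earlier on this page. A small guard like the following sketch would surface that problem before the session run; require_records is a hypothetical helper, not part of the repository.

```python
def require_records(records, split_name):
    """Fail early with a readable message when a dataset split is empty."""
    if len(records) == 0:
        raise ValueError(
            "No %s records found; check the dataset path and how file names "
            "are parsed before building the feed dict" % split_name
        )
    return records

try:
    require_records([], "training")
except ValueError as err:
    message = str(err)
    print(message)
```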

opened by eserercanakin 25

Getting blank predicted images

@shekkizh I took the exact code from the git repo and trained it without any changes on a Titan X for 100000 iterations with the default batch size. But when visualizing after training completed, all I could see were completely black images in the predictions.

What could possibly go wrong?

opened by NitishMutha 13
How to train on Pascal dataset?

Hello,

I was wondering what needs to be done to get this working with the Pascal dataset. I see that the output placeholder channel size is hardcoded to 1 and the Pascal annotations are RGB images, so that would need to be changed to 3, along with the number of classes and whatnot. I've tried changing those but it's giving me an error in the loss function line, and I'm having a hard time understanding how logits, which is of shape (?, ?, ?, num_classes), can be compared with y_output, which is of shape (?, width, height, channel).

Also a separate question, but do you know how to compute intersection over union for the output?

Thanks

edit: Spent a little bit of time looking around and it looks like I need to figure out how to map the color mapped segmentation labels they give us in the dataset with a 0-20 integer indexed version
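One way to do the colour-to-index mapping mentioned in the edit is sketched below. It assumes the annotations use the standard Pascal VOC colour map (generated here by the usual bit-interleaving rule) and marks any unmatched colour, such as the object boundary colour, as 255. Both function names are illustrative and not part of the repository.

```python
import numpy as np

def voc_palette(num_colors=21):
    """Standard Pascal VOC colour map as a list of (r, g, b) tuples."""
    palette = []
    for i in range(num_colors):
        r = g = b = 0
        c = i
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        palette.append((r, g, b))
    return palette

def rgb_to_index(annotation_rgb, palette, ignore_value=255):
    """Convert an (H, W, 3) colour annotation to an (H, W) class-index map."""
    rgb = annotation_rgb.astype(np.int32)
    codes = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]
    out = np.full(codes.shape, ignore_value, dtype=np.uint8)
    for index, (r, g, b) in enumerate(palette):
        out[codes == ((r << 16) | (g << 8) | b)] = index
    return out

palette = voc_palette()
demo = np.zeros((2, 2, 3), dtype=np.uint8)   # all background (class 0)
demo[0, 1] = palette[15]                     # one pixel of class 15
index_map = rgb_to_index(demo, palette)
print(index_map)
```

The resulting single-channel map has the same layout as the annotations the model already expects, so the placeholder channel size can stay at 1.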

opened by Exuro889 10

[SOLVED] Loss won't decrease, predictions are all the same

Hi! I would like to reproduce your results. Just running the code like python FCN.py doesn't seem to do the job for me. The default parameters are:

IMAGE_SIZE = 224 ~~(changing it to 256 does not affect the results)~~
learning_rate = 1e-4
batch_size = 2

What I get is that training and validation loss start at about 400, and very quickly (200 iterations) decrease until they settle to about 3.

Step: 0, Train_loss:415.754
2016-10-13 12:19:13.407670 ---> Validation_loss: 395.876
Step: 10, Train_loss:28.7208
Step: 20, Train_loss:10.2944
Step: 30, Train_loss:5.06159
Step: 40, Train_loss:4.51668
Step: 50, Train_loss:4.17936
Step: 60, Train_loss:4.55051
Step: 70, Train_loss:4.98752
Step: 80, Train_loss:3.63942
Step: 90, Train_loss:3.56676
Step: 100, Train_loss:3.96641
Step: 110, Train_loss:3.72767
Step: 120, Train_loss:3.26587
Step: 130, Train_loss:3.89015
Step: 140, Train_loss:5.48371
Step: 150, Train_loss:4.27173
Step: 160, Train_loss:3.81378
Step: 170, Train_loss:3.58391
Step: 180, Train_loss:2.79207
Step: 190, Train_loss:4.10269
Step: 200, Train_loss:4.57686
Step: 210, Train_loss:4.00551
Step: 220, Train_loss:3.1667
Step: 230, Train_loss:3.7841
Step: 240, Train_loss:3.74983
Step: 250, Train_loss:3.03212
Step: 260, Train_loss:2.85248
Step: 270, Train_loss:3.64257
Step: 280, Train_loss:3.765
Step: 290, Train_loss:4.16679
Step: 300, Train_loss:4.0291
Step: 310, Train_loss:3.95092
Step: 320, Train_loss:3.38709
Step: 330, Train_loss:2.48646
Step: 340, Train_loss:2.98015
Step: 350, Train_loss:3.59501
Step: 360, Train_loss:3.80755
Step: 370, Train_loss:3.73314
Step: 380, Train_loss:3.40185
Step: 390, Train_loss:3.89394
Step: 400, Train_loss:3.80676
Step: 410, Train_loss:2.78324
Step: 420, Train_loss:3.14695
Step: 430, Train_loss:3.29019
Step: 440, Train_loss:3.16163
Step: 450, Train_loss:3.64598
Step: 460, Train_loss:2.74009
Step: 470, Train_loss:3.93917
Step: 480, Train_loss:3.815
Step: 490, Train_loss:3.83076
Step: 500, Train_loss:4.45192
2016-10-13 12:24:10.606606 ---> Validation_loss: 3.02666

I kept it running up to 35000 iterations, which should be about 3.5 epochs, but the loss won't decrease any further. If I then validate the model at 35000 iterations (Train_loss:2.73392, Validation_loss: 3.51286) with python FCN.py --mode visualize I get always the same prediction, whichever the input image is:

This is also, by the way, the same prediction I get with an earlier model (200 iterations).

Is there something I'm getting wrong? Thank you
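For context on the loss values in this report: a classifier that assigns equal probability to every class has a cross-entropy of ln(number of classes). The sketch below assumes 151 classes, a figure that appears in another comment on this page and is an assumption about the configuration used here. A plateau around 3 to 4 is then only modestly better than a uniform guess, which fits with a model that predicts the same thing for every image.

```python
import math

def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over `num_classes`."""
    return math.log(num_classes)

baseline = uniform_cross_entropy(151)  # 151 classes is an assumption
print(round(baseline, 3))
```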

opened by MarcoBauzz 9

[[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

----- 7 198 197 196 196 197 196 196 201 197 197 198 190 166 133 130 131 131 132 132 132 132 132 133 133 133 134 129 114 104 96 69 17 10 10 28 124 133 132 132 132 132 132 131 131 131 131 131 130 131 133 132 132 132 133 133 132 132 132 132 132 132 132 132 132 132 132 133 134 133 133 133 133 133 133 133 133 133 133 127 [[Node: entropy/entropy = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](entropy/Reshape, entropy/Reshape_1)]]

This error stopped the training. Is this related to softmax? loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=tf.squeeze(annotation, squeeze_dims=[3]), name="entropy")))
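Sparse softmax cross-entropy expects every label to lie in [0, num_classes). The dump above contains values such as 198, which would be out of range if the model is configured for 151 classes; that class count is an assumption, since the report does not state it. A check like the following sketch, run on the annotations before training, would confirm or rule this out. check_label_range is a hypothetical helper.

```python
import numpy as np

def check_label_range(annotation, num_classes):
    """Raise if any label falls outside [0, num_classes)."""
    values = np.unique(annotation)
    bad = values[(values < 0) | (values >= num_classes)]
    if bad.size:
        raise ValueError(
            "labels outside [0, %d): %s" % (num_classes, bad.tolist())
        )
    return values

ok = check_label_range(np.array([[0, 5], [150, 17]]), num_classes=151)

try:
    check_label_range(np.array([[7, 198], [133, 10]]), num_classes=151)
except ValueError as err:
    message = str(err)
    print(message)
```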

opened by Yaredoh 8
Training loss remains nan
Hi,

I want to use the code to train on my own image data. The images are 512*512 grayscale. But when I train with them, the loss remains nan:

Setting up Saver...
Step: 0, Train_loss:nan
2017-03-17 13:55:04.919050 ---> Validation_loss: nan
Step: 10, Train_loss:nan
Step: 20, Train_loss:nan
Step: 30, Train_loss:nan
Step: 40, Train_loss:nan

In the py file BatchDatsetReader I changed the way images are read, like this:

def _read_images(self):
    self.__channels = False
    self.images = np.array(
        [np.expand_dims(self._transform(filename['image']), axis=3) for filename in self.files])

But it did not work.

How can I solve this problem and train on my grayscale images? And what is the meaning of NUM_OF_CLASSESS?
opened by jiao0805 7
The loss is Nan

I am training on my own dataset, which has only two classes, so I set NUM_CLASSES to 2, and the loss turned out to be NaN. If I change NUM_CLASSES to 3 or 151 without changing my dataset, it works. I'm very confused by this, please help me.

I have tried decreasing the learning rate to 1e-7 and 1e-8, but it didn't work.
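A NaN that disappears when the class count is raised is consistent with the annotation containing a label value of 2 or higher, for example a two-class mask stored as 1/2 or as 0/255. This is an assumption, since the report does not show the label values. If that is the case, remapping the stored values to 0..N-1 is one possible fix, sketched below with a hypothetical helper and a hypothetical 0/255 mask.

```python
import numpy as np

def remap_labels(mask, value_to_class):
    """Map raw mask values to class indices; fail on unmapped values."""
    out = np.full(mask.shape, -1, dtype=np.int32)
    for value, cls in value_to_class.items():
        out[mask == value] = cls
    if (out < 0).any():
        raise ValueError("mask contains values with no class assigned")
    return out

mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)  # hypothetical binary mask
labels = remap_labels(mask, {0: 0, 255: 1})
print(labels)
```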

opened by abc8350712 6
Minor change requests and a Big question.
Thanks for sharing your implementation with the public.

I have a couple of minor requests to update your code and one real question.

How about using 'tqdm' when loading images and annotations, or a Chainer dataset?

The iteration count restarts from 0 when resuming training. How about using a 'global_step' variable to fix this?

Also, you'd better use os.sep instead of '/' for file separators.
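On the resume point, one lightweight option is to recover the step from the checkpoint name. The sketch below assumes checkpoints are saved with a numeric step suffix such as model.ckpt-35000; the path is hypothetical and step_from_checkpoint is an illustrative helper rather than anything in the repository.

```python
import re

def step_from_checkpoint(path):
    """Return the trailing step number of a checkpoint path, or 0 if absent."""
    match = re.search(r"-(\d+)$", path)
    return int(match.group(1)) if match else 0

start_step = step_from_checkpoint("logs/model.ckpt-35000")  # hypothetical path
print(start_step)

# The training loop could then continue counting from start_step
# instead of starting again at 0.
```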

And my main question is.

I trained with the MIT dataset, but the training does not match what you reported. The train loss and validation loss start around 3.5 and do not decrease; they just fluctuate. From your report, it seems the loss should go below 1.0 at convergence, am I right?
Step: 860, Train_loss:3.89231
Step: 870, Train_loss:3.64891
Step: 880, Train_loss:2.40985
Step: 890, Train_loss:3.11681
Step: 900, Train_loss:3.55415
Step: 910, Train_loss:3.38955
Step: 920, Train_loss:2.9828
Step: 930, Train_loss:2.39928
Step: 940, Train_loss:3.25557
Step: 950, Train_loss:3.52195
Step: 960, Train_loss:3.15188
Step: 970, Train_loss:5.03387
Step: 980, Train_loss:2.47206
Step: 990, Train_loss:3.91655
Step: 1000, Train_loss:2.78551
2018-03-24 15:37:13.298805 ---> Validation_loss: 3.48481

Looking forwards,
opened by ahnHeejune 5
Question regarding using custom dataset having 2 classes

@shekkizh Sorry to bother you but could you please tell me if my number of classes is 2, what all changes need to be done? While training I set NUM_OF_CLASSES as 3. Which one is the annotation file? What changes have to be made in it? Sorry if the question is trivial but I am new to tensorflow. Thanks in advance.

opened by matvaibhav 5
How to visualize the prediction?

I have run the command "python FCN.py --mode=visualize", but I got result far different from the author's result. Here is my result:

Does anyone know what I should do, and how to apply the whole model to my own dataset and test set?

Thanks.

opened by Dean-TianZhang 5
Logging and Visualizing Training Metrics on Tensorboard

I ran the training for a few epochs and was only getting an 'entropy' curve on TensorBoard. How can I log and visualise metrics like accuracy, mean accuracy, mean IoU, f.w. IoU, etc.?
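The four metrics named here can all be derived from a confusion matrix of ground-truth labels against predictions. The sketch below computes them with NumPy from label and prediction arrays; it is a standalone illustration under the usual definitions (pixel accuracy, mean per-class accuracy, mean IoU, frequency-weighted IoU), not code from the repository, and it assumes predictions are already in [0, num_classes). The resulting numbers could then be logged as scalar summaries next to the loss.

```python
import numpy as np

def confusion_matrix(labels, preds, num_classes):
    """hist[i, j] = number of pixels with true class i predicted as class j."""
    labels = np.asarray(labels).ravel().astype(np.int64)
    preds = np.asarray(preds).ravel().astype(np.int64)
    valid = (labels >= 0) & (labels < num_classes)  # drop ignored labels
    idx = num_classes * labels[valid] + preds[valid]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_metrics(hist):
    hist = hist.astype(np.float64)
    tp = np.diag(hist)
    true_total = hist.sum(axis=1)   # pixels per ground-truth class
    pred_total = hist.sum(axis=0)   # pixels per predicted class
    with np.errstate(divide="ignore", invalid="ignore"):
        class_acc = tp / true_total
        iou = tp / (true_total + pred_total - tp)
    freq = true_total / hist.sum()
    return {
        "pixel_accuracy": tp.sum() / hist.sum(),
        "mean_accuracy": np.nanmean(class_acc),  # classes absent from labels are skipped
        "mean_iou": np.nanmean(iou),
        "fw_iou": np.nansum(freq * iou),
    }

hist = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
metrics = segmentation_metrics(hist)
print(metrics)
```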

@shekkizh Kindly help please.

opened by varungupta31 0
ValueError: Cannot feed value of shape (2, 227, 227, 1, 4) for Tensor 'annotation:0', which has shape '(?, 227, 227, 1)'

It seems that my own labeled images are not suitable as the feed value. The batch size is 2, the images are 227*227 pixels, and the labels are single-channel PNGs, so what does the "4" represent? I hope someone can give me a hint or help me! I would really appreciate it!

BTW, I used Labelme to annotate my own images and used the Label.png files, which are single-channel images, directly. Is that correct?
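A trailing dimension of 4 is consistent with the PNG being decoded as a 4-channel (RGBA) image and a channel axis then being added on top. This is an assumption, since the report does not show how the files were read. The sketch below reduces such an array to the (H, W, 1) layout the placeholder expects, but only when the colour channels are identical; colour-coded labels would need a colour-to-index mapping instead. to_single_channel is a hypothetical helper.

```python
import numpy as np

def to_single_channel(label):
    """Return an (H, W, 1) label array from an (H, W) or (H, W, C) input."""
    label = np.asarray(label)
    if label.ndim == 3:
        rgb = label[..., :3]  # ignore an alpha channel if present
        if not (rgb == rgb[..., :1]).all():
            raise ValueError("colour-coded labels need a colour-to-index mapping")
        label = rgb[..., 0]
    return label[..., np.newaxis]

rgba = np.zeros((227, 227, 4), dtype=np.uint8)  # stand-in for a decoded RGBA label
rgba[..., :3] = 1     # class index repeated in R, G and B
rgba[..., 3] = 255    # opaque alpha
single = to_single_channel(rgba)
print(single.shape)
```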

opened by Rundong4026 2
Is anyone could share the trained ckpt???thanks!!!!

Is anyone could share the trained ckpt???thanks!!!!

opened by huizhilei 0
how to train my own dataset?

I am new to this field. I have my own dataset and have divided it into training and validation sets. How do I use it to train this model? What should I change?

opened by kunyu17 0

