Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling

Alex Kendall

Last update: Jan 2, 2023

Overview

Caffe SegNet

This is a modified version of Caffe which supports the SegNet architecture, as described in "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation" by Vijay Badrinarayanan, Alex Kendall and Roberto Cipolla, PAMI 2017 [http://arxiv.org/abs/1511.00561].

Updated Version:

This version supports cudnn v2 acceleration. @TimoSaemann has a branch supporting a more recent version of Caffe (Dec 2016) with cudnn v5.1: https://github.com/TimoSaemann/caffe-segnet-cudnn5

Getting Started with Example Model and Webcam Demo

If you would just like to try out a pretrained example model, then you can find the model used in the SegNet webdemo and a script to run a live webcam demo here: https://github.com/alexgkendall/SegNet-Tutorial

For a more detailed introduction to this software please see the tutorial here: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

Dataset

Prepare a text file of space-separated paths to images (JPEGs or PNGs) and their corresponding label images, alternating image path and label path, e.g.

/path/to/im1.png /another/path/to/lab1.png /path/to/im2.png /path/lab2.png ...

Label images must be single channel, with each integer pixel value, starting from 0, denoting a separate class. The example net uses an image size of 360 by 480.
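As a rough sketch of how such a listing could be generated, the Python snippet below pairs image and label files by filename and writes one space-separated image/label pair per line. The helper name write_dataset_list, the directory layout, and the idea that each label shares its image's filename are all assumptions made for illustration; it also assumes the data layer accepts a newline between pairs as ordinary whitespace.

```python
import os

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def write_dataset_list(image_dir, label_dir, out_path):
    """Write "image_path label_path" pairs to out_path, one pair per line.

    Hypothetical layout: every image in image_dir has a label image with
    the same filename in label_dir. Paths containing spaces are not
    supported, because the listing is whitespace-separated.
    """
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    pairs = []
    for name in names:
        label_path = os.path.join(label_dir, name)
        if not os.path.isfile(label_path):
            raise FileNotFoundError("no label image for " + name)
        image_path = os.path.join(image_dir, name)
        pairs.append((os.path.abspath(image_path), os.path.abspath(label_path)))
    with open(out_path, "w") as listing:
        for image_path, label_path in pairs:
            listing.write(image_path + " " + label_path + "\n")
    return len(pairs)
```

Calling write_dataset_list("images", "labels", "dataset.txt") would then produce the file that the data layers point to.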

Net specification

Example net specification and solver prototxt files are given in examples/segnet. To train a model, change the data path in the data layers of net.prototxt to point to your dataset.txt file (as described above).

In the last convolution layer, change num_output to be the number of classes in your dataset.
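Since each label value selects one of the num_output class scores, it can be worth checking that no label falls outside 0..num_output-1 before training. The sketch below is one way to do that; the function name and the sample data are hypothetical, and it works on a flat sequence of pixel values rather than on any particular image library's types.

```python
def invalid_label_values(pixels, num_classes):
    """Return the sorted label values that fall outside 0..num_classes-1.

    `pixels` is any iterable of integer pixel values taken from a
    single-channel label image.
    """
    return sorted({int(v) for v in pixels if not 0 <= int(v) < num_classes})


# Hypothetical 2x3 label image, flattened row by row.
label_pixels = [0, 1, 1, 2, 0, 255]
print(invalid_label_values(label_pixels, num_classes=3))  # prints [255]
```

With Pillow installed, the pixel sequence for a single-channel PNG could come from Image.open(path).getdata(). A value flagged here may still be intentional if the net is configured to ignore it.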

Training

In solver.prototxt set a path for snapshot_prefix. Then in a terminal run:

./build/tools/caffe train -solver ./examples/segnet/solver.prototxt
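If the solver file is edited by a script rather than by hand, the snapshot_prefix step might look like the sketch below. It treats solver.prototxt as plain text and rewrites (or appends) the snapshot_prefix line; the helper name and the example path are hypothetical, and the text-based approach assumes the field sits on a line of its own.

```python
import re


def set_snapshot_prefix(solver_text, prefix):
    """Return the solver text with its snapshot_prefix value replaced.

    If no snapshot_prefix line is present, one is appended at the end.
    """
    new_line = 'snapshot_prefix: "{}"'.format(prefix)
    pattern = re.compile(r'^[ \t]*snapshot_prefix[ \t]*:.*$', re.MULTILINE)
    if pattern.search(solver_text):
        # A function replacement avoids backslash handling in the prefix.
        return pattern.sub(lambda match: new_line, solver_text, count=1)
    return solver_text.rstrip("\n") + "\n" + new_line + "\n"


solver = 'base_lr: 0.001\nsnapshot_prefix: "/old/path"\nsolver_mode: GPU\n'
print(set_snapshot_prefix(solver, "/data/snapshots/segnet"))
```

The updated text would be written back to solver.prototxt before running the caffe train command.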

Publications

If you use this software in your research, please cite our publications:

http://arxiv.org/abs/1511.02680 Alex Kendall, Vijay Badrinarayanan and Roberto Cipolla "Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding." arXiv preprint arXiv:1511.02680, 2015.

http://arxiv.org/abs/1511.00561 Vijay Badrinarayanan, Alex Kendall and Roberto Cipolla "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation." PAMI, 2017.

License

This extension to the Caffe library is released under a Creative Commons license (CC BY-NC 4.0) which allows for personal and research use only. For a commercial license please contact the authors. You can view a license summary here: http://creativecommons.org/licenses/by-nc/4.0/

Comments

ask HowTo do the training and label prediction for CamVid dataset

Hi, @alexgkendall ,

I have installed caffe-segnet successfully. Now I want to try some examples, such as training on the CamVid dataset discussed in the SegNet paper. Could you list the steps and corresponding commands for training and label prediction on the CamVid dataset?

Thx~ Milton

opened by amiltonwong 18

k-channel image dataset

My dataset has 2 classes, with 1000 training images of shape (5,256,256) and corresponding ground truth of shape (1,256,256), a binary image whose values are either 0 or 1 to represent the 2 classes. Can SegNet be used on a k-channel dataset, or is it limited to RGB?

opened by mtrth 15

Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0) CUBLAS_STATUS_MAPPING_ERROR

Hello! An example described here is excellent and easily trained http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

For my task requires segmentation of two classes: the background and subject. Man and background. But I have a problem with the preparation of annotated images (labels), in its own database. As I understand the network sees only color in the gray scale. I create labels using Adobe photoshop so:

Uploading original JPEG image
Superimposed over the original, without a background image in PNG format. Thus creating selection mask.
The sacrificial layer with the image without background PNG image.
Encodes the image to indexed color (3 colors: void 0 0 0, person 192 128 128, transparency alpha channel). This is necessary in order to while painting flowers no new colors or blur.
I produce fill the selected area with mask the above colors: 192 128 228 subject, background 0 0 0)
Then convert the image to RGB PNG 8 bit / channel mode.

If you submit the following annotated (labels) to the input image network, it displays the following:

F0606 19:20:16.349822 2152 math_functions.cu:123] Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0) CUBLAS_STATUS_MAPPING_ERROR.

Then I proceed as follows: Skipping my images via this script, that should solve my problems:

#!/usr/bin/env python

import os
import numpy as np
from itertools import izip
from argparse import ArgumentParser
from collections import OrderedDict
from skimage.io import ImageCollection, imsave
from skimage.transform import resize

camvid_colors = OrderedDict([
    ("Animal", np.array([64, 128, 64], dtype=np.uint8)),
    ("Archway", np.array([192, 0, 128], dtype=np.uint8)),
    ("Bicyclist", np.array([0, 128, 192], dtype=np.uint8)),
    ("Bridge", np.array([0, 128, 64], dtype=np.uint8)),
    ("Building", np.array([128, 0, 0], dtype=np.uint8)),
    ("Car", np.array([64, 0, 128], dtype=np.uint8)),
    ("CartLuggagePram", np.array([64, 0, 192], dtype=np.uint8)),
    ("Child", np.array([192, 128, 64], dtype=np.uint8)),
    ("Column_Pole", np.array([192, 192, 128], dtype=np.uint8)),
    ("Fence", np.array([64, 64, 128], dtype=np.uint8)),
    ("LaneMkgsDriv", np.array([128, 0, 192], dtype=np.uint8)),
    ("LaneMkgsNonDriv", np.array([192, 0, 64], dtype=np.uint8)),
    ("Misc_Text", np.array([128, 128, 64], dtype=np.uint8)),
    ("MotorcycleScooter", np.array([192, 0, 192], dtype=np.uint8)),
    ("OtherMoving", np.array([128, 64, 64], dtype=np.uint8)),
    ("ParkingBlock", np.array([64, 192, 128], dtype=np.uint8)),
    ("Pedestrian", np.array([64, 64, 0], dtype=np.uint8)),
    ("Road", np.array([128, 64, 128], dtype=np.uint8)),
    ("RoadShoulder", np.array([128, 128, 192], dtype=np.uint8)),
    ("Sidewalk", np.array([0, 0, 192], dtype=np.uint8)),
    ("SignSymbol", np.array([192, 128, 128], dtype=np.uint8)),
    ("Sky", np.array([128, 128, 128], dtype=np.uint8)),
    ("SUVPickupTruck", np.array([64, 128, 192], dtype=np.uint8)),
    ("TrafficCone", np.array([0, 0, 64], dtype=np.uint8)),
    ("TrafficLight", np.array([0, 64, 64], dtype=np.uint8)),
    ("Train", np.array([192, 64, 128], dtype=np.uint8)),
    ("Tree", np.array([128, 128, 0], dtype=np.uint8)),
    ("Truck_Bus", np.array([192, 128, 192], dtype=np.uint8)),
    ("Tunnel", np.array([64, 0, 64], dtype=np.uint8)),
    ("VegetationMisc", np.array([192, 192, 0], dtype=np.uint8)),
    ("Wall", np.array([64, 192, 0], dtype=np.uint8)),
    ("Void", np.array([0, 0, 0], dtype=np.uint8))
])

def convert_label_to_grayscale(im):
    out = (np.ones(im.shape[:2]) * 255).astype(np.uint8)
    for gray_val, (label, rgb) in enumerate(camvid_colors.items()):
        match_pxls = np.where((im == np.asarray(rgb)).sum(-1) == 3)
        out[match_pxls] = gray_val
    assert (out != 255).all(), "rounding errors or missing classes in camvid_colors"
    return out.astype(np.uint8)

def make_parser():
    parser = ArgumentParser()
    parser.add_argument(
        'label_dir',
        help="Directory containing all RGB camvid label images as PNGs"
    )
    parser.add_argument(
        'out_dir',
        help="""Directory to save grayscale label images.
        Output images have same basename as inputs so be careful not to
        overwrite original RGB labels""")
    return parser

if __name__ == '__main__':
    parser = make_parser()
    args = parser.parse_args()
    labs = ImageCollection(os.path.join(args.label_dir, "*"))
    os.makedirs(args.out_dir)
    for i, (inpath, im) in enumerate(izip(labs.files, labs)):
        print i + 1, "of", len(labs)
        # resize to caffe-segnet input size and preserve label values
        resized_im = (resize(im, (360, 480), order=0) * 255).astype(np.uint8)
        out = convert_label_to_grayscale(resized_im)
        outpath = os.path.join(args.out_dir, os.path.basename(inpath))
        imsave(outpath, out)

Then again I try to start training with the converted images. Again a message: Check failed: status == CUBLAS_STATUS_SUCCESS (11 vs. 0).

Remarkable is that the marked image after the tool takes the network and trained them: https://github.com/kyamagu/js-segment-annotator But that's not what I need. I do not want to mark up the image manually, I already have cut without PNG image background. Please help me to understand how to convert colors to segnet took them? Maybe hsv?

Here's a link to my data: https://github.com/Maxfashko/CamVid?files=1

opened by Maxfashko 11

On custom data, loss and weights diverge to nan -- faster with lower LR

Thanks very much for sharing your work on SegNet!

I'm interested in using the segnet model with some custom data (aerial imagery; 1000 images and 3 label classes for now), and I'm seeing something strange in training:

Training with base_lr: 0.001, I was seeing training loss = nan. Surprisingly, when I reduced the LR, the loss exploded to nan even faster. Couple of examples:

With base_lr = 1e-4

#Iters Seconds TrainingLoss LearningRate
0    3.022133     1.25783   0.0001
20   58.911155    1.3897    0.0001
40   114.768299   1.18601   0.0001
60   170.620245   1.02438   0.0001
80   226.502911   0.992765  0.0001
100  282.332283   1.05062   0.0001
120  338.179408   1.44935   0.0001
140  394.035362   4.73935   0.0001
160  449.878457   0.997907  0.0001
180  505.701507   0.98053   0.0001
200  561.558644   1.14603   0.0001
220  617.398724   0.936918  0.0001
240  672.909704   nan       0.0001
260  728.026581   nan       0.0001

With base_lr = 1e-6

#Iters Seconds TrainingLoss LearningRate
0   3.033944    1.4484   1e-06
20  58.966689   1.12416  1e-06
40  114.393818  nan      1e-06
60  169.507581  nan      1e-06
80  224.611446  nan      1e-06

I also set debug_info: true, and discovered that the actual weights seem to be diverging, also:

I0408 13:13:13.229012 23306 net.cpp:594]     [Forward] Layer data, top blob data data: 62.8498
I0408 13:13:13.229233 23306 net.cpp:594]     [Forward] Layer data, top blob label data: 2.95128
I0408 13:13:13.229307 23306 net.cpp:594]     [Forward] Layer label_data_1_split, top blob label_data_1_split_0 data: 2.95128
I0408 13:13:13.229379 23306 net.cpp:594]     [Forward] Layer label_data_1_split, top blob label_data_1_split_1 data: 2.95128
I0408 13:13:13.236066 23306 net.cpp:594]     [Forward] Layer conv1_1, top blob conv1_1 data: nan
I0408 13:13:13.236177 23306 net.cpp:604]     [Forward] Layer conv1_1, param blob 0 data: nan
I0408 13:13:13.236227 23306 net.cpp:604]     [Forward] Layer conv1_1, param blob 1 data: nan
I0408 13:13:13.258630 23306 net.cpp:594]     [Forward] Layer conv1_1_bn, top blob conv1_1 data: nan
I0408 13:13:13.258780 23306 net.cpp:604]     [Forward] Layer conv1_1_bn, param blob 0 data: nan
I0408 13:13:13.258839 23306 net.cpp:604]     [Forward] Layer conv1_1_bn, param blob 1 data: nan
I0408 13:13:13.261641 23306 net.cpp:594]     [Forward] Layer relu1_1, top blob conv1_1 data: 0
I0408 13:13:13.304590 23306 net.cpp:594]     [Forward] Layer conv1_2, top blob conv1_2 data: nan
...

Finally, if I set base_lr to 0, the loss diverges, but the weights do not.

I'm going to keep investigating, and will report back here if I resolve it, but I'm a bit stumped. @alexgkendall Do any insights come to mind?

opened by anandthakker 10

making class weight for custom DB

I am working out a procedure for computing class weights. First, I checked the default DB (CamVid) and the default weights in segnet_train.prototxt. Counting the raw pixel data of the default DB gave the table below.

class  pixels    files
0      10682767  366
1      14750079  365
2      623349    366
3      20076880  367
4      2845085   349
5      6166762   319
6      743859    351
7      714595    173
8      3719877   360
9      405385    317
10     184967    201
11     2503995   367

Following your paper, I calculated the weights:

f(class) = frequency(class) / (image_count(class) * 480*360)
weight(class) = median(f) / f(class)

In the table below, the left column is the default weight and the right column is my result.

class  default  mine
0      0.2595   0.256527308
1      0.1826   0.185282667
2      4.564    4.396287575
3      0.1417   0.136869322
4      0.9051   0.918473131
5      0.3826   0.387319864
6      9.6446   3.533074291
7      1.8418   1.812685267
8      0.6823   0.724619798
9      6.2478   5.855012981
10     7.3614   8.136508447
11              1.097409921

Almost all weight values are similar between the two, but class #6 differs greatly. Could you confirm my raw data?
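For readers who want to reproduce the calculation, the sketch below applies the formula from this post to the pixel and file counts it quotes. The function name and data layout are illustrative only, and the result reproduces the poster's column of weights, not the defaults shipped in segnet_train.prototxt.

```python
from statistics import median

IMAGE_PIXELS = 480 * 360  # pixels per label image, as used in the post

# class id -> (pixel count, number of images containing the class),
# copied from the counts quoted in the post.
CLASS_STATS = {
    0: (10682767, 366),
    1: (14750079, 365),
    2: (623349, 366),
    3: (20076880, 367),
    4: (2845085, 349),
    5: (6166762, 319),
    6: (743859, 351),
    7: (714595, 173),
    8: (3719877, 360),
    9: (405385, 317),
    10: (184967, 201),
    11: (2503995, 367),
}


def median_frequency_weights(stats, image_pixels):
    """weight(c) = median(f) / f(c), with f(c) = pixels(c) / (files(c) * image_pixels)."""
    freq = {
        cls: pixels / (files * image_pixels)
        for cls, (pixels, files) in stats.items()
    }
    med = median(freq.values())
    return {cls: med / f for cls, f in freq.items()}


weights = median_frequency_weights(CLASS_STATS, IMAGE_PIXELS)
for cls in sorted(weights):
    print(cls, round(weights[cls], 4))
```

Run on these counts, class 6 comes out near 3.53, matching the poster's figure and not the default 9.6446.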

Also, I have another question. If I want to ignore class #5, how can I do it? Your reference ignores #11, so it seems that you describe only classes 0-10 in the prototxt file. But if a class in a middle position should be ignored, how should the weights be specified? And can multiple classes be ignored? If it is possible, can you advise on how? BR

opened by suhyung 9

How to create annotated image for Segnet?

I checked Camvid image dataset. I am trying to create annotated image just like Camvid annotated image but failed. Please suggest any tool or algorithm for the same. Please suggest can I use binary image as ground truth?

opened by monjoybme 8

Check failed: error == cudaSuccess (29 vs. 0) driver shutting down

Hi Alex,

syncedmem.cpp:19] Check failed: error == cudaSuccess (29 vs. 0) driver shutting down *** Check failure stack trace: *** Aborted (core dumped)

I get the above error. Any idea what's wrong? I've checked the Caffe discussions and they all say that it is due to a memory issue, but GPU memory usage is very low when I run SegNet. The strange thing is that I do get the SegNet output, but when the Python process is about to exit at the end I get this error. Possibly an error in some clean-up code. I have no clue how to solve it. Any pointers?

opened by codecolony 8

Segmentation fault at Softmax

Hi Alex,

Hope someone can help me with this issue. I'm just reusing the network architecture with only 4 output classes. Image dimensions remain the same. I made the necessary changes in the prototxt files. I get the following error when I start the training.

I0531 09:45:15.546435 2031686400 net.cpp:248] Memory required for data: 1068595224
I0531 09:45:15.546778 2031686400 solver.cpp:42] Solver scaffolding done.
I0531 09:45:15.546954 2031686400 solver.cpp:250] Solving VGG_ILSVRC_16_layer
I0531 09:45:15.546959 2031686400 solver.cpp:251] Learning Rate Policy: step
*** Aborted at 1464668121 (unix time) try "date -d @1464668121" if you are using GNU date ***
PC: @ 0x10b25b442 caffe::SoftmaxWithLossLayer<>::Forward_cpu()
*** SIGSEGV (@0x216caa000) received by PID 55425 (TID 0x7fff79191300) stack trace: ***
@ 0x7fff918dff1a _sigtramp
@ 0x100000000000000 (unknown)
@ 0x10b2209d9 caffe::Layer<>::Forward()
@ 0x10b2760d9 caffe::Net<>::ForwardFromTo()
@ 0x10b276718 caffe::Net<>::Forward()
@ 0x10b285dcf caffe::Solver<>::Step()
@ 0x10b28577c caffe::Solver<>::Solve()
@ 0x10b1b1594 train()
@ 0x10b1b392f main
@ 0x7fff8c7e25c9 start
@ 0x4 (unknown)
Segmentation fault: 11

Any help in this regard. Appreciate your time being spent on this.

opened by codecolony 8

upsample layer problem

Hi @alexgkendall ,

My data have 8 classes, when I change the number of class in the model file, I had this problem. Can you please have a look at my segnet_basic_train.prototxt file (http://pastebin.com/8SFwqtij) and show me what's wrong?

Thanks!

F1125 15:28:05.204437 5212 upsample_layer.cpp:63] Check failed: bottom[0]->height() == bottom[1]->height() (59 vs. 60)
*** Check failure stack trace: ***
@ 0x7f7bf11c3daa (unknown)
@ 0x7f7bf11c3ce4 (unknown)
@ 0x7f7bf11c36e6 (unknown)
@ 0x7f7bf11c6687 (unknown)
@ 0x7f7bf1557af9 caffe::UpsampleLayer<>::Reshape()
@ 0x7f7bf15f3f07 caffe::Net<>::Init()
@ 0x7f7bf15f5cb2 caffe::Net<>::Net()
@ 0x7f7bf14ecb00 caffe::Solver<>::InitTrainNet()
@ 0x7f7bf14edbe3 caffe::Solver<>::Init()
@ 0x7f7bf14eddb6 caffe::Solver<>::Solver()
@ 0x40edf0 caffe::GetSolver<>()
@ 0x408683 train()
@ 0x406c81 main
@ 0x7f7bf06d5ec5 (unknown)
@ 0x40722d (unknown)
@ (nil) (unknown)
Aborted (core dumped)

opened by trminh89 8

upsample top index xxx out of range

Hi, @kezbreen. I tried your code, but got the following error. I followed the steps in README.md, step by step. Here is the list:

1. download the CamVid dataset (701 images) and transform the labels using the script from issue #3
2. resize the original pics to [360 480]
3. generate data.txt
4. change the path in net.prototxt (first 2 layers)
5. change the 'dense_softmax_inner_prod' layer to "num_output: 32" (CamVid has 32 kinds of label)
6. change the solver.prototxt "solver_mode" to CPU (I used a GeForce 970 and it showed error==cudaSuccess(2,0); I thought it was not enough memory, so I changed to CPU)
7. then I start training...

...
I1002 10:58:40.084039 3306 solver.cpp:42] Solver scaffolding done.
I1002 10:58:40.084084 3306 solver.cpp:250] Solving segnet
I1002 10:58:40.084090 3306 solver.cpp:251] Learning Rate Policy: fixed
I1002 10:58:40.085350 3306 solver.cpp:294] Iteration 0, Testing net (#0)
I1002 10:58:46.428558 3306 solver.cpp:343] Test net output #0: accuracy = 0.0286227
I1002 10:58:46.428601 3306 solver.cpp:343] Test net output #1: loss = 3.49524 (* 1 = 3.49524 loss)
F1002 10:58:58.492833 3306 upsample_layer.cpp:127] upsample top index 481 out of range - check scale settings match input pooling layer's downsample setup
*** Check failure stack trace: ***
@ 0x7fc354130ea4 (unknown)
@ 0x7fc354130deb (unknown)
@ 0x7fc3541307bf (unknown)
@ 0x7fc354133a35 (unknown)
@ 0x7fc359fa814d caffe::UpsampleLayer<>::Backward_cpu()
@ 0x7fc359f551e9 caffe::Net<>::BackwardFromTo()
@ 0x7fc359f55551 caffe::Net<>::Backward()
@ 0x7fc359f61135 caffe::Solver<>::Step()
@ 0x7fc359f61da5 caffe::Solver<>::Solve()
@ 0x407c7b train()
@ 0x404ed8 main
@ 0x7fc3509a9a40 (unknown)
@ 0x404a79 _start
@ (nil) (unknown)
Aborted (core dumped)
...

I change the batch size to 5 in net.prototxt, the result is F1002 10:58:58.492833 3306 upsample_layer.cpp:127] upsample top index 0 out of range - check scale settings match input pooling layer's downsample setup

Please help, i'm new in caffe .Thanks

opened by SpecialM2580 8

Build issue

I'm building caffe-segnet on Ubuntu 17.10 in CPU-only mode.

I get this error in the terminal. Can anyone help me? I'm a beginner.

CXX src/caffe/layer_factory.cpp
CXX src/caffe/common.cpp
CXX src/caffe/solver.cpp
CXX src/caffe/util/upgrade_proto.cpp
CXX src/caffe/util/io.cpp
CXX src/caffe/util/im2col.cpp
CXX src/caffe/util/cudnn.cpp
CXX src/caffe/util/insert_splits.cpp
CXX src/caffe/util/benchmark.cpp
CXX src/caffe/util/db.cpp
CXX src/caffe/util/math_functions.cpp
CXX src/caffe/blob.cpp
CXX src/caffe/syncedmem.cpp
CXX src/caffe/data_transformer.cpp
CXX src/caffe/internal_thread.cpp
CXX src/caffe/layers/deconv_layer.cpp
CXX src/caffe/layers/filter_layer.cpp
CXX src/caffe/layers/inner_product_layer.cpp
CXX src/caffe/layers/power_layer.cpp
CXX src/caffe/layers/loss_layer.cpp
CXX src/caffe/layers/silence_layer.cpp
CXX src/caffe/layers/infogain_loss_layer.cpp
CXX src/caffe/layers/relu_layer.cpp
CXX src/caffe/layers/cudnn_relu_layer.cpp
CXX src/caffe/layers/prelu_layer.cpp
CXX src/caffe/layers/concat_layer.cpp
CXX src/caffe/layers/cudnn_sigmoid_layer.cpp
CXX src/caffe/layers/hdf5_data_layer.cpp
CXX src/caffe/layers/hinge_loss_layer.cpp
CXX src/caffe/layers/memory_data_layer.cpp
CXX src/caffe/layers/conv_layer.cpp
CXX src/caffe/layers/sigmoid_layer.cpp
CXX src/caffe/layers/base_data_layer.cpp
CXX src/caffe/layers/dense_image_data_layer.cpp
CXX src/caffe/layers/data_layer.cpp
CXX src/caffe/layers/split_layer.cpp
CXX src/caffe/layers/exp_layer.cpp
CXX src/caffe/layers/cudnn_softmax_layer.cpp
CXX src/caffe/layers/sigmoid_cross_entropy_loss_layer.cpp
CXX src/caffe/layers/hdf5_output_layer.cpp
CXX src/caffe/layers/reshape_layer.cpp
CXX src/caffe/layers/accuracy_layer.cpp
CXX src/caffe/layers/cudnn_conv_layer.cpp
CXX src/caffe/layers/contrastive_loss_layer.cpp
src/caffe/layers/contrastive_loss_layer.cpp: In instantiation of ‘void caffe::ContrastiveLossLayer<Dtype>::Forward_cpu(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&) [with Dtype = float]’:
src/caffe/layers/contrastive_loss_layer.cpp:118:1: required from here
src/caffe/layers/contrastive_loss_layer.cpp:56:30: error: no matching function for call to ‘max(float, double)’
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), 0.0);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7/algorithm:61:0, from src/caffe/layers/contrastive_loss_layer.cpp:1:
/usr/include/c++/7/bits/stl_algobase.h:219:5: note: candidate: template<class _Tp> constexpr const _Tp& std::max(const _Tp&, const _Tp&)
max(const _Tp& __a, const _Tp& __b)
^~~
/usr/include/c++/7/bits/stl_algobase.h:219:5: note: template argument deduction/substitution failed:
src/caffe/layers/contrastive_loss_layer.cpp:56:30: note: deduced conflicting types for parameter ‘const _Tp’ (‘float’ and ‘double’)
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), 0.0);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7/algorithm:61:0, from src/caffe/layers/contrastive_loss_layer.cpp:1:
/usr/include/c++/7/bits/stl_algobase.h:265:5: note: candidate: template<class _Tp, class _Compare> constexpr const _Tp& std::max(const _Tp&, const _Tp&, _Compare)
max(const _Tp& __a, const _Tp& __b, _Compare __comp)
^~~
/usr/include/c++/7/bits/stl_algobase.h:265:5: note: template argument deduction/substitution failed:
src/caffe/layers/contrastive_loss_layer.cpp:56:30: note: deduced conflicting types for parameter ‘const _Tp’ (‘float’ and ‘double’)
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), 0.0);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7/algorithm:62:0, from src/caffe/layers/contrastive_loss_layer.cpp:1:
/usr/include/c++/7/bits/stl_algo.h:3462:5: note: candidate: template<class _Tp> constexpr _Tp std::max(std::initializer_list<_Tp>)
max(initializer_list<_Tp> __l)
^~~
/usr/include/c++/7/bits/stl_algo.h:3462:5: note: template argument deduction/substitution failed:
src/caffe/layers/contrastive_loss_layer.cpp:56:30: note: mismatched types ‘std::initializer_list<_Tp>’ and ‘float’
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), 0.0);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/7/algorithm:62:0, from src/caffe/layers/contrastive_loss_layer.cpp:1:
/usr/include/c++/7/bits/stl_algo.h:3468:5: note: candidate: template<class _Tp, class _Compare> constexpr _Tp std::max(std::initializer_list<_Tp>, _Compare)
max(initializer_list<_Tp> __l, _Compare __comp)
^~~
/usr/include/c++/7/bits/stl_algo.h:3468:5: note: template argument deduction/substitution failed:
src/caffe/layers/contrastive_loss_layer.cpp:56:30: note: mismatched types ‘std::initializer_list<_Tp>’ and ‘float’
Dtype dist = std::max(margin - sqrt(dist_sq_.cpu_data()[i]), 0.0);
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Makefile:526: recipe for target '.build_release/src/caffe/layers/contrastive_loss_layer.o' failed
make: *** [.build_release/src/caffe/layers/contrastive_loss_layer.o] Error 1

opened by matbs24 7

why last prediction layer is 3x3 convolution??

Hello, I wonder why you used 3x3 convolution when you predicted the last output class. SegNet itself is to speed up, isn't it much faster to use 1x1 convolution? I want to know why it's composed of 3x3 convolution.

opened by choco9966 0

Try-demo issues about website http://mi.eng.cam.ac.uk/projects/segnet/demo.php

Thank you for doing such an excellent job and for making a website for newcomers despite your busy schedule. However, when I tried to test it with my own pictures, I could not get any results. Could you please fix this problem? The website address is as follows: http://mi.eng.cam.ac.uk/projects/segnet/demo.php

opened by PhilOkami 0

what is the softmax uncertainty in Bayesian Segnet paper?

Hi Alex, thanks a lot for your work. I read your Bayesian Segnet paper recently and I am wondering what is the softmax uncertainty. Is it the output of the final softmax layer?

opened by Mofafa 0

nvcc fatal: redefinition of argument 'compiler-bindir'

I have spent days trying to compile caffe-segnet with no luck. I am using CUDA 9.0, cuDNN v2 and OpenCV 3.3.0. Here is the error output when running make all:

...
CXX src/caffe/common.cpp
/usr/bin/g++-4.8 src/caffe/common.cpp -pthread -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -Wall -Wno-sign-compare -std=c++11 -MMD -MP -pthread -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -Wall -Wno-sign-compare -c -o .build_release/src/caffe/common.o 2> .build_release/src/caffe/common.o.warnings.txt \
	|| (cat .build_release/src/caffe/common.o.warnings.txt; exit 1)
CXX src/caffe/internal_thread.cpp
/usr/bin/g++-4.8 src/caffe/internal_thread.cpp -pthread -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -Wall -Wno-sign-compare -std=c++11 -MMD -MP -pthread -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -Wall -Wno-sign-compare -c -o .build_release/src/caffe/internal_thread.o 2> .build_release/src/caffe/internal_thread.o.warnings.txt \
	|| (cat .build_release/src/caffe/internal_thread.o.warnings.txt; exit 1)
NVCC src/caffe/layers/upsample_layer.cu
/usr/local/cuda/bin/nvcc -D_FORCE_INLINES -ccbin=/usr/bin/g++-4.8 -Xcompiler -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -std=c++11 -ccbin=/usr/bin/g++-4.8 -Xcompiler -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/local/hdf5 -I/usr/local/hdf5/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include  -M src/caffe/layers/upsample_layer.cu -o .build_release/cuda/src/caffe/layers/upsample_layer.d \
	-odir .build_release/cuda/src/caffe/layers
nvcc fatal   : redefinition of argument 'compiler-bindir'
Makefile:544: recipe for target '.build_release/cuda/src/caffe/layers/upsample_layer.o' failed
make: *** [.build_release/cuda/src/caffe/layers/upsample_layer.o] Error 1

and the line 544 in Makefile which the error come from:

...
	@ cat $@.$(WARNS_EXT)

$(BUILD_DIR)/cuda/%.o: %.cu | $(ALL_BUILD_DIRS)
544>	@ echo NVCC $<
	$(Q)$(CUDA_DIR)/bin/nvcc $(NVCCFLAGS) $(CUDA_ARCH) -M $< -o ${@:.o=.d} \
		-odir $(@D)
...

I also added this CUSTOM_CXX := /usr/bin/g++-4.8 to Makefile.config Anyone encountered such a problem ?

opened by mjrlgue 0

make error

An error occurred when i tried executing the command "make all", and it eventually turned out to be the compilation of src/caffe/layers/sigmoid_cross_entropy_loss_layer.cpp, where in line 5: #include "caffe/layer.hpp", but there was no such file.

opened by ghost 0

make problem

When I run make, the following problems occur:

/usr/lib/x86_64-linux-gnu/libgdcmMSFF.so.2.8: undefined reference to `uuid_generate@UUID_1.0'
//usr/lib/x86_64-linux-gnu/libgdcmMSFF.so.2.8: undefined reference to `uuid_parse@UUID_1.0'
.build_release/lib/libcaffe.so: undefined reference to `leveldb::Status::ToString[abi:cxx11]() const'
.build_release/lib/libcaffe.so: undefined reference to `leveldb::DB::Open(leveldb::Options const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, leveldb::DB**)'
//usr/lib/x86_64-linux-gnu/libgdcmMSFF.so.2.8: undefined reference to `uuid_unparse@UUID_1.0'
collect2: error: ld returned 1 exit status
Makefile:570: recipe for target '.build_release/tools/caffe.bin' failed
make: *** [.build_release/tools/caffe.bin] Error 1

Please give me some advice, thanks!

opened by llbboo 0

Owner

Alex Kendall

GitHub http://mi.eng.cam.ac.uk/projects/segnet/

An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

195 Dec 17, 2022

This is an official implementation of "Polarized Self-Attention: Towards High-quality Pixel-wise Regression"

Polarized Self-Attention: Towards High-quality Pixel-wise Regression This is an official implementation of: Huajun Liu, Fuqiang Liu, Xinyi Fan and Don

212 Jan 8, 2023

[ICCV 2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation Getting Started Our codes are implemented and tested with pyth

176 Dec 15, 2022

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现目录性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Download

31 Nov 25, 2022

This repository contains the data and code for the paper "Diverse Text Generation via Variational Encoder-Decoder Models with Gaussian Process Priors" (SPNLP@ACL2022)

GP-VAE This repository provides datasets and code for preprocessing, training and testing models for the paper: Diverse Text Generation via Variationa

18 Dec 29, 2022

Pixel-wise segmentation on VOC2012 dataset using pytorch.

PiWiSe Pixel-wise segmentation on the VOC2012 dataset using pytorch. FCN SegNet PSPNet UNet RefineNet For a more complete implementation of segmentati

378 Dec 30, 2022

Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Good news! We release a clean version of PVNet: clean-pvnet, including how to train the PVNet on the custom dataset. Use PVNet with a detector. The tr

722 Dec 27, 2022

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.

305 Dec 16, 2022

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022)

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022) Introdu

14 Oct 27, 2022

Official code of Retinal Vessel Segmentation with Pixel-wise Adaptive Filters and Consistency Training

Official code of Retinal Vessel Segmentation with Pixel-wise Adaptive Filters and Consistency Training (ISBI 2022)

7 Feb 10, 2022

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Learning Pixel-level Semantic Affinity with Image-level Supervision This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead. Int

337 Dec 15, 2022

Implementation of U-Net and SegNet for building segmentation

Specialized project Created by Katrine Nguyen and Martin Wangen-Eriksen as a part of our specialized project at Norwegian University of Science and Te

3 Dec 7, 2022

Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

274 Dec 6, 2022

A light-weight image labelling tool for Python designed for creating segmentation data sets.

An image labelling tool for creating segmentation data sets, for Django and Flask.

117 Nov 21, 2022

HyperSeg: Patch-wise Hypernetwork for Real-time Semantic Segmentation Official PyTorch Implementation

We present a novel, real-time, semantic segmentation network in which the encoder both encodes and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the weights at each decoder block vary spatially. For this purpose, we design a new type of hypernetwork, composed of a nested U-Net for drawing higher level context features

182 Dec 14, 2022

Pytorch Implementation for NeurIPS (oral) paper: Pixel Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation

Pixel-Level Cycle Association This is the Pytorch implementation of our NeurIPS 2020 Oral paper Pixel-Level Cycle Association: A New Perspective for D

87 Oct 19, 2022

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

TalkNet 2 [WIP] TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Predictio

69 Dec 17, 2022

Partial code for the implementation of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022

Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Exploring Cross-Image Pixel Contrast for Semantic Segmentation Exploring Cross-Image Pixel Contrast for Semantic Segmentation, Wenguan Wang, Tianfei Z

510 Jan 2, 2023

