Segmentation-Aware Convolutional Networks Using Local Attention Masks

Last update: Jun 29, 2022

Related tags

Deep Learning segaware

Overview

Segmentation-aware convolution filters are invariant to backgrounds. We achieve this in three steps: (i) compute segmentation cues for each pixel (i.e., “embeddings”), (ii) create a foreground mask for each patch, and (iii) combine the masks with convolution, so that the filters only process the local foreground in each image patch.

Installation

For prerequisites, refer to DeepLabV2. Our setup follows theirs almost exactly.

Once you have the prerequisites, run make all -j4 from within caffe/ to compile the code using 4 parallel jobs.

Learning embeddings with dedicated loss

Use Convolution layers to create dense embeddings.
Use Im2dist to compute dense distance comparisons in an embedding map.
Use Im2parity to compute dense label comparisons in a label map.
Use DistLoss (with parameters alpha and beta) to set up a contrastive side loss on the distances.

See scripts/segaware/config/embs for a full example.
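
The roles of the layers above can be sketched for a single image patch as follows. This is a minimal pure-Python illustration, not the Caffe implementation: the L1 distance, the hinge form of the loss, the default values of alpha and beta, and the function names (pairwise_distances, pairwise_parity, dist_loss) are all assumptions made for illustration. Only the parameter names alpha and beta come from the text above.

```python
def pairwise_distances(embeddings, center):
    """L1 distance from the center pixel's embedding to every embedding
    in the patch (stands in for Im2dist; the metric is an assumption)."""
    ref = embeddings[center]
    return [sum(abs(a - b) for a, b in zip(ref, e)) for e in embeddings]


def pairwise_parity(labels, center):
    """1 where a pixel shares the center pixel's label, else 0
    (stands in for Im2parity)."""
    return [1 if lab == labels[center] else 0 for lab in labels]


def dist_loss(distances, parities, alpha=0.5, beta=2.0):
    """Hinge-style contrastive loss (assumed form): same-label pairs are
    penalized when farther apart than alpha, different-label pairs when
    closer than beta."""
    total = 0.0
    for d, same in zip(distances, parities):
        if same:
            total += max(d - alpha, 0.0)
        else:
            total += max(beta - d, 0.0)
    return total / len(distances)


# Toy 3-pixel patch with 2-D embeddings; pixel 0 is the center.
embeddings = [[0.0, 0.0], [0.25, 0.0], [1.0, 0.5]]
labels = [1, 1, 2]
dists = pairwise_distances(embeddings, center=0)
parity = pairwise_parity(labels, center=0)
loss = dist_loss(dists, parity)
```

In this toy patch, only the third pixel contributes to the loss: it has a different label from the center but sits closer than beta in embedding space.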

Setting up a segmentation-aware convolution layer

Use Im2col on the input, to arrange pixel/feature patches into columns.
Use Im2dist on the embeddings, to get their distances into columns.
Use Exp on the distances, with scale: -1, to get them into [0,1].
Tile the exponentiated distances, with a factor equal to the depth (i.e., channels) of the original convolution features.
Use Eltwise to multiply the Tile result with the Im2col result.
Use Convolution with bottom_is_im2col: true to matrix-multiply the convolution weights with the Eltwise output.

See scripts/segaware/config/vgg for an example in which every convolution layer in the VGG16 architecture is made segmentation-aware.
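
The six steps above can be sketched on a 1-D signal with a single output channel. This is a hedged illustration in pure Python rather than the Caffe layers themselves: the L1 distance to the patch's center embedding, the "valid" (unpadded) positions, and the function names (im2col_1d, im2dist_1d, segaware_conv_1d) are assumptions made for the example.

```python
import math


def im2col_1d(features, k):
    """features: list of C channels, each a list of W values.
    Returns one column per valid position; each column holds C rows of k values."""
    width = len(features[0])
    return [[ch[x:x + k] for ch in features] for x in range(width - k + 1)]


def im2dist_1d(embeddings, k):
    """L1 distance between each patch's center embedding and every
    embedding in that patch (the metric is an assumption)."""
    width = len(embeddings[0])
    half = k // 2
    cols = []
    for x in range(width - k + 1):
        center = [ch[x + half] for ch in embeddings]
        cols.append([sum(abs(ch[x + j] - c) for ch, c in zip(embeddings, center))
                     for j in range(k)])
    return cols


def segaware_conv_1d(features, embeddings, weights, k):
    """weights: C rows of k values, i.e. one output channel."""
    out = []
    for patch, dists in zip(im2col_1d(features, k), im2dist_1d(embeddings, k)):
        mask = [math.exp(-d) for d in dists]  # Exp with scale -1
        # Tiling is implicit here: the same mask is reused for every channel row.
        masked = [[v * m for v, m in zip(row, mask)] for row in patch]  # Eltwise product
        out.append(sum(w * v
                       for wrow, mrow in zip(weights, masked)
                       for w, v in zip(wrow, mrow)))  # weights times masked column
    return out


# One feature channel with an edge between the third and fourth samples;
# the embeddings separate the two sides of the edge.
features = [[1.0, 1.0, 1.0, 5.0, 5.0]]
embeddings = [[0.0, 0.0, 0.0, 10.0, 10.0]]
weights = [[1.0, 1.0, 1.0]]
out = segaware_conv_1d(features, embeddings, weights, k=3)
```

With an all-ones filter, a plain convolution would return 3, 7, and 11 here. The masked version returns values close to 3, 2, and 10, because samples on the far side of the edge are suppressed in each patch.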

Using a segmentation-aware CRF

Use the NormConvMeanfield layer. As input, give it two copies of the unary potentials (produced by a Split layer), some embeddings, and a meshgrid-like input (produced by a DummyData layer with data_filler { type: "xy" }).

See scripts/segaware/config/res for an example in which a segmentation-aware CRF is added to a resnet architecture.
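
For readers unfamiliar with the term, a meshgrid-like input is a pair of maps holding each pixel's coordinates. The sketch below shows the idea in pure Python. The channel order, the use of unnormalized integer indices, and the function name xy_meshgrid are assumptions; the exact layout produced by the "xy" filler is not specified here.

```python
def xy_meshgrid(height, width):
    """Return two height-by-width maps: x (column index) and y (row index)."""
    xs = [[x for x in range(width)] for _ in range(height)]
    ys = [[y for _ in range(width)] for y in range(height)]
    return xs, ys


xs, ys = xy_meshgrid(2, 3)
```

Coordinates like these let a CRF weigh pairs of pixels by spatial proximity in addition to embedding similarity.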

Replicating the segmentation results presented in our paper

Download pretrained model weights here, and put that file into scripts/segaware/model/res/.
From scripts, run ./test_res.sh. This will produce .mat files in scripts/segaware/features/res/voc_test/mycrf/.
From scripts, run ./gen_preds.sh. This will produce colorized .png results in scripts/segaware/results/res/voc_test/mycrf/none/results/VOC2012/Segmentation/comp6_test_cls. An example input-output pair is shown below:

If you zip these results, and submit them to the official PASCAL VOC test server, you will get 79.83900% IOU.

If you run this set of steps for the validation set, you can run ./eval.sh to evaluate your results on the PASCAL VOC validation set. If you change the model, you may want to run ./edit_env.sh to update the evaluation instructions.
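
For reference, the IOU figure reported by the PASCAL VOC benchmark is the per-class intersection-over-union averaged over classes. The sketch below computes it on flat lists of labels. It is an illustration only, not the contents of ./eval.sh: the function name mean_iou and the toy data are hypothetical, and label 255 is treated as the "ignore" label, as in the VOC segmentation annotations.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Mean intersection-over-union across the classes that occur."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue  # pixels marked as "ignore" do not count
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for class p
            union[t] += 1  # false negative for class t
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious)


# Toy example: six pixels, two classes, one ignored pixel.
pred = [0, 0, 1, 1, 1, 0]
truth = [0, 1, 1, 1, 255, 0]
miou = mean_iou(pred, truth, num_classes=2)
```

In a real evaluation the intersection and union counts are accumulated over every image in the set before dividing.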

Citation

@inproceedings{harley_segaware,
  title = {Segmentation-Aware Convolutional Networks Using Local Attention Masks},
  author = {Adam W Harley and Konstantinos G. Derpanis and Iasonas Kokkinos},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2017},
}

Help

Feel free to open issues on here! Also, I'm pretty good with email: [email protected]

Comments

Check failed: error == cudaSuccess (2 vs. 0) out of memory

I am running ./test_res.sh on a GTX 1080 Ti with 11 GB of GPU memory, but the script immediately fails with an error:

I0911 12:46:39.811416 18216 caffe.cpp:252] Running for 1449 iterations.
F0911 12:46:40.284883 18216 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory

Since the batch size is already 1, I don't know how much memory is actually required or what I should do.

opened by zhwis 4

Compatibility issue for cudnn

When building Caffe, this error occurs:

CXX .build_release/src/caffe/proto/caffe.pb.cc
CXX src/caffe/syncedmem.cpp
In file included from ./include/caffe/util/device_alternate.hpp:40:0,
                 from ./include/caffe/common.hpp:19,
                 from src/caffe/syncedmem.cpp:1:
./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
./include/caffe/util/cudnn.hpp:127:41: error: too few arguments to function ‘cudnnStatus_t cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t, cudnnPoolingMode_t, cudnnNanPropagation_t, int, int, int, int, int, int)’
         pad_h, pad_w, stride_h, stride_w));
                                         ^
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
     cudnnStatus_t status = condition; \
                            ^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from src/caffe/syncedmem.cpp:1:
/usr/local/cuda/include/cudnn.h:799:27: note: declared here
 cudnnStatus_t CUDNNWINAPI cudnnSetPooling2dDescriptor(

This seems to be a cuDNN compatibility issue. Official Caffe builds fine on my computer. Could you please update the code to solve this problem?

opened by meijieru 3

Caffe error

Hi, I built Caffe from this repo and I am getting this error: Error parsing text-format caffe.NetParameter: 62:15: Message type "caffe.LayerParameter" has no field named "deletebottom".

What might be the issue? Regards, Vijay

opened by vijayg78 3

Using Im2col and bottom_is_im2col needs more memory

I am trying to train VGG16 with my own data. I have cropped the images to 224x224. When I train with VGG16 as provided by Caffe model zoo (https://gist.github.com/ksimonyan/211839e770f7b538e2d8) I can train with batch size 32. After replacing all convolution layers with im2col followed by a convolution layer with bottom_is_im2col I can train with maximum batch size 8 without "Out of memory" error. First of all, I wonder if this is normal behavior given that typical convolution layers use im2col internally. Secondly, is there a way to reduce memory needs? Thanks in advance,
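
A rough back-of-the-envelope estimate shows how large an explicit Im2col output can get. The sketch below assumes float32 values, a 3x3 kernel with stride 1 and "same" padding, and the 64-channel, 224x224 feature map found early in VGG16. The function names are hypothetical, and gradient storage during training is ignored.

```python
def blob_bytes(batch, channels, height, width, bytes_per_value=4):
    """Memory needed to store one dense blob."""
    return batch * channels * height * width * bytes_per_value


def im2col_blob_bytes(batch, channels, height, width, kernel=3, bytes_per_value=4):
    """Size of an explicit Im2col output: every pixel stores its whole
    kernel x kernel neighborhood for every input channel."""
    return blob_bytes(batch, channels * kernel * kernel, height, width, bytes_per_value)


feature = blob_bytes(32, 64, 224, 224)         # about 0.4 GB
columns = im2col_blob_bytes(32, 64, 224, 224)  # about 3.7 GB
ratio = columns // feature
```

Under these assumptions, the column blob for a single layer at batch size 32 is nine times the size of the feature map it was built from, so a smaller maximum batch size is at least consistent with the numbers.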

opened by lazatsoc 0

syncedmem.hpp:31] Check failed: error == cudaSuccess (29 vs. 0) driver shutting down

*** Check failure stack trace: ***
    @ 0x7fa08bf23daa (unknown)
    @ 0x7fa08bf23ce4 (unknown)
    @ 0x7fa08bf236e6 (unknown)
    @ 0x7fa08bf26687 (unknown)
    @ 0x7fa08c5791e1 caffe::SyncedMemory::~SyncedMemory()
    @ 0x7fa08c5c6fb2 boost::detail::sp_counted_impl_p<>::dispose()
    @ 0x40a52e boost::detail::sp_counted_base::release()
    @ 0x7fa08c5df1e5 caffe::Blob<>::~Blob()
    @ 0x7fa08aa8753a (unknown)
    @ 0x7fa08c50ac43 (unknown)
Aborted (core dumped)

I met this issue each time at the end of training or testing. Any idea?

opened by Eleanor-H 0

How to train embedding network

I have found that the code only contains the testing part and no code for training. Is the embedding network trained separately from the DeepLab network?

opened by jinde-liu 1

Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms

LESA Introduction This repository contains the official implementation of Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Cont

20 Dec 31, 2021

Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

187 Jan 6, 2023

PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Shape-aware Convolutional Layer (ShapeConv) PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentatio

82 Dec 29, 2022

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.

305 Dec 16, 2022

ObjectDrawer-ToolBox: a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system

ObjectDrawer-ToolBox is a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system, Object Drawer.

77 Jan 5, 2023

Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

UACANet: Uncertainty Aware Context Attention for Polyp Segmentation Official pytorch implementation of UACANet: Uncertainty Aware Context Attention fo

85 Dec 14, 2022

Code for paper "ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation"

ASAP-Net This project implements ASAP-Net of paper ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation (BMVC2020). Overview We i

26 Aug 25, 2022

U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

1 Jan 6, 2022

Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks (paper) By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software T

199 Jan 8, 2023

Improving Convolutional Networks via Attention Transfer (ICLR 2017)

Attention Transfer PyTorch code for "Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Tran

1.4k Dec 23, 2022

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Se

490 Dec 15, 2022

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab

4 Jul 19, 2022

A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

A PyTorch implementation of V-Net Vnet is a PyTorch implementation of the paper V-Net: Fully Convolutional Neural Networks for Volumetric Medical Imag

606 Dec 21, 2022

CoSMA: Convolutional Semi-Regular Mesh Autoencoder. From Paper "Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes"

Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes Implementation of CoSMA: Convolutional Semi-Regular Mesh Autoencoder arXiv p

10 Oct 11, 2022

Graph neural network message passing reframed as a Transformer with local attention

Adjacent Attention Network An implementation of a simple transformer that is equivalent to graph neural network where the message passing is done with

49 Dec 28, 2022

Local Attention - Flax module for Jax

Local Attention - Flax Autoregressive Local Attention - Flax module for Jax Install $ pip install local-attention-flax Usage from jax import random fr

16 Jun 16, 2022

codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

15 Aug 30, 2022

PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

Image Super-Resolution with Non-Local Sparse Attention This repository is for NLSN introduced in the following paper "Image Super-Resolution with Non-

143 Dec 28, 2022

Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Focal Transformer This is the official implementation of our Focal Transformer -- "Focal Self-attention for Local-Global Interactions in Vision Transf

486 Dec 20, 2022
