Segmentation-Aware Convolutional Networks Using Local Attention Masks

Overview

Segmentation-Aware Convolutional Networks Using Local Attention Masks

[Project Page] [Paper]

Segmentation-aware convolution filters are invariant to backgrounds. We achieve this in three steps: (i) compute segmentation cues for each pixel (i.e., “embeddings”), (ii) create a foreground mask for each patch, and (iii) combine the masks with convolution, so that the filters only process the local foreground in each image patch.

Installation

For prerequisites, refer to DeepLabV2. Our setup follows theirs almost exactly.

Once you have the prequisites, simply run make all -j4 from within caffe/ to compile the code with 4 cores.

Learning embeddings with dedicated loss

  • Use Convolution layers to create dense embeddings.
  • Use Im2dist to compute dense distance comparisons in an embedding map.
  • Use Im2parity to compute dense label comparisons in a label map.
  • Use DistLoss (with parameters alpha and beta) to set up a contrastive side loss on the distances.

See scripts/segaware/config/embs for a full example.

Setting up a segmentation-aware convolution layer

  • Use Im2col on the input, to arrange pixel/feature patches into columns.
  • Use Im2dist on the embeddings, to get their distances into columns.
  • Use Exp on the distances, with scale: -1, to get them into [0,1].
  • Tile the exponentiated distances, with a factor equal to the depth (i.e., channels) of the original convolution features.
  • Use Eltwise to multiply the Tile result with the Im2col result.
  • Use Convolution with bottom_is_im2col: true to matrix-multiply the convolution weights with the Eltwise output.

See scripts/segaware/config/vgg for an example in which every convolution layer in the VGG16 architecture is made segmentation-aware.

Using a segmentation-aware CRF

  • Use the NormConvMeanfield layer. As input, give it two copies of the unary potentials (produced by a Split layer), some embeddings, and a meshgrid-like input (produced by a DummyData layer with data_filler { type: "xy" }).

See scripts/segaware/config/res for an example in which a segmentation-aware CRF is added to a resnet architecture.

Replicating the segmentation results presented in our paper

  • Download pretrained model weights here, and put that file into scripts/segaware/model/res/.
  • From scripts, run ./test_res.sh. This will produce .mat files in scripts/segaware/features/res/voc_test/mycrf/.
  • From scripts, run ./gen_preds.sh. This will produce colorized .png results in scripts/segaware/results/res/voc_test/mycrf/none/results/VOC2012/Segmentation/comp6_test_cls. An example input-ouput pair is shown below:

- If you zip these results, and submit them to the official PASCAL VOC test server, you will get 79.83900% IOU.

If you run this set of steps for the validation set, you can run ./eval.sh to evaluate your results on the PASCAL VOC validation set. If you change the model, you may want to run ./edit_env.sh to update the evaluation instructions.

Citation

@inproceedings{harley_segaware,
  title = {Segmentation-Aware Convolutional Networks Using Local Attention Masks},
  author = {Adam W Harley, Konstantinos G. Derpanis, Iasonas Kokkinos},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2017},
}

Help

Feel free to open issues on here! Also, I'm pretty good with email: [email protected]

Comments
  • Check failed: error == cudaSuccess (2 vs. 0)  out of memory

    Check failed: error == cudaSuccess (2 vs. 0) out of memory

    I am currently running ./test_res.sh on a 11GB of GPU memory,gtx1080ti. But when I run the script, it immediately throws out an error: I0911 12:46:39.811416 18216 caffe.cpp:252] Running for 1449 iterations. F0911 12:46:40.284883 18216 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory and since you are already at batch size = 1 ,I don't know how much real memory , or What should I do?

    opened by zhwis 4
  • Compatibility issue for cudnn

    Compatibility issue for cudnn

    When build caffe, this error occurs

    CXX .build_release/src/caffe/proto/caffe.pb.cc
    CXX src/caffe/syncedmem.cpp
    In file included from ./include/caffe/util/device_alternate.hpp:40:0,
                     from ./include/caffe/common.hpp:19,
                     from src/caffe/syncedmem.cpp:1:
    ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
    ./include/caffe/util/cudnn.hpp:127:41: error: too few arguments to function ‘cudnnStatus_t cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t, cudnnPoolingMode_t, cudnnNanPropagation_t, int, int, int, int, int, int)’
             pad_h, pad_w, stride_h, stride_w));
                                             ^
    ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
         cudnnStatus_t status = condition; \
                                ^
    In file included from ./include/caffe/util/cudnn.hpp:5:0,
                     from ./include/caffe/util/device_alternate.hpp:40,
                     from ./include/caffe/common.hpp:19,
                     from src/caffe/syncedmem.cpp:1:
    /usr/local/cuda/include/cudnn.h:799:27: note: declared here
     cudnnStatus_t CUDNNWINAPI cudnnSetPooling2dDescriptor(
    

    It seems to relate with the compatibility of cudnn. Official caffe could be built on my computer, would you please update and solve this problem?

    opened by meijieru 3
  • Caffe error

    Caffe error

    Hi, I built the caffe from this repo and i am getting an error that Error parsing text-format caffe.NetParameter: 62:15: Message type "caffe.LayerParameter" has no field named "deletebottom".

    What might be the issue? Regards, Vijay

    opened by vijayg78 3
  • Using Im2col and bottom_is_im2col needs more memory

    Using Im2col and bottom_is_im2col needs more memory

    I am trying to train VGG16 with my own data. I have cropped the images to 224x224. When I train with VGG16 as provided by Caffe model zoo (https://gist.github.com/ksimonyan/211839e770f7b538e2d8) I can train with batch size 32. After replacing all convolution layers with im2col followed by a convolution layer with bottom_is_im2col I can train with maximum batch size 8 without "Out of memory" error. First of all, I wonder if this is normal behavior given that typical convolution layers use im2col internally. Secondly, is there a way to reduce memory needs? Thanks in advance,

    opened by lazatsoc 0
  •  syncedmem.hpp:31] Check failed: error == cudaSuccess (29 vs. 0)  driver shutting down

    syncedmem.hpp:31] Check failed: error == cudaSuccess (29 vs. 0) driver shutting down

    *** Check failure stack trace: *** @ 0x7fa08bf23daa (unknown)
    @ 0x7fa08bf23ce4 (unknown) @ 0x7fa08bf236e6 (unknown) @ 0x7fa08bf26687 (unknown) @ 0x7fa08c5791e1 caffe::SyncedMemory::~SyncedMemory() @ 0x7fa08c5c6fb2 boost::detail::sp_counted_impl_p<>::dispose() @ 0x40a52e boost::detail::sp_counted_base::release() @ 0x7fa08c5df1e5 caffe::Blob<>::~Blob() @ 0x7fa08aa8753a (unknown) @ 0x7fa08c50ac43 (unknown) Aborted (core dumped)

    I met this issue each time at the end of training or testing. Any idea?

    opened by Eleanor-H 0
  • How to train embedding network

    How to train embedding network

    I have found that the code only contains the part for test and no code for training. Is the embedding network trained separately from deeplab network?

    opened by jinde-liu 1
Owner
null
Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms

LESA Introduction This repository contains the official implementation of Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Cont

Chenglin Yang 20 Dec 31, 2021
Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

Juanma Coria 187 Jan 6, 2023
PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentation.

Shape-aware Convolutional Layer (ShapeConv) PyTorch implementation of ShapeConv: Shape-aware Convolutional Layer for RGB-D Indoor Semantic Segmentatio

Hanchao Leng 82 Dec 29, 2022
Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera.

Tools to create pixel-wise object masks, bounding box labels (2D and 3D) and 3D object model (PLY triangle mesh) for object sequences filmed with an RGB-D camera. This project prepares training and testing data for various deep learning projects such as 6D object pose estimation projects singleshotpose, as well as object detection and instance segmentation projects.

null 305 Dec 16, 2022
ObjectDrawer-ToolBox: a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system

ObjectDrawer-ToolBox is a graphical image annotation tool to generate ground plane masks for a 3D object reconstruction system, Object Drawer.

null 77 Jan 5, 2023
Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

UACANet: Uncertainty Aware Context Attention for Polyp Segmentation Official pytorch implementation of UACANet: Uncertainty Aware Context Attention fo

Taehun Kim 85 Dec 14, 2022
Code for paper "ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation"

ASAP-Net This project implements ASAP-Net of paper ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation (BMVC2020). Overview We i

Hanwen Cao 26 Aug 25, 2022
U-Net Implementation: Convolutional Networks for Biomedical Image Segmentation" using the Carvana Image Masking Dataset in PyTorch

U-Net Implementation By Christopher Ley This is my interpretation and implementation of the famous paper "U-Net: Convolutional Networks for Biomedical

Christopher Ley 1 Jan 6, 2022
Code for our ICASSP 2021 paper: SA-Net: Shuffle Attention for Deep Convolutional Neural Networks

SA-Net: Shuffle Attention for Deep Convolutional Neural Networks (paper) By Qing-Long Zhang and Yu-Bin Yang [State Key Laboratory for Novel Software T

Qing-Long Zhang 199 Jan 8, 2023
Improving Convolutional Networks via Attention Transfer (ICLR 2017)

Attention Transfer PyTorch code for "Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Tran

Sergey Zagoruyko 1.4k Dec 23, 2022
Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Se

Maha 490 Dec 15, 2022
Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Codes for TIM2021 paper "Anchor-Based Spatio-Temporal Attention 3-D Convolutional Networks for Dynamic 3-D Point Cloud Sequences"

Intelligent Robotics and Machine Vision Lab 4 Jul 19, 2022
A PyTorch implementation for V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

A PyTorch implementation of V-Net Vnet is a PyTorch implementation of the paper V-Net: Fully Convolutional Neural Networks for Volumetric Medical Imag

Matthew Macy 606 Dec 21, 2022
CoSMA: Convolutional Semi-Regular Mesh Autoencoder. From Paper "Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes"

Mesh Convolutional Autoencoder for Semi-Regular Meshes of Different Sizes Implementation of CoSMA: Convolutional Semi-Regular Mesh Autoencoder arXiv p

Fraunhofer SCAI 10 Oct 11, 2022
Graph neural network message passing reframed as a Transformer with local attention

Adjacent Attention Network An implementation of a simple transformer that is equivalent to graph neural network where the message passing is done with

Phil Wang 49 Dec 28, 2022
Local Attention - Flax module for Jax

Local Attention - Flax Autoregressive Local Attention - Flax module for Jax Install $ pip install local-attention-flax Usage from jax import random fr

Phil Wang 16 Jun 16, 2022
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

null 15 Aug 30, 2022
PyTorch code for our paper "Image Super-Resolution with Non-Local Sparse Attention" (CVPR2021).

Image Super-Resolution with Non-Local Sparse Attention This repository is for NLSN introduced in the following paper "Image Super-Resolution with Non-

null 143 Dec 28, 2022
Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"

Focal Transformer This is the official implementation of our Focal Transformer -- "Focal Self-attention for Local-Global Interactions in Vision Transf

Microsoft 486 Dec 20, 2022