2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Overview

Authors: Ge-Peng Ji*, Yu-Cheng Chou*, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, & Ling Shao.

This repository provides code for the paper "Progressively Normalized Self-Attention Network for Video Polyp Segmentation", published at MICCAI 2021 (arXiv Version | Chinese Version). If you have any questions about our paper, feel free to contact me. If you use our PNS-Net or evaluation toolbox in your research, please cite this paper (BibTeX).

Features

  • Hyper Real-time Speed: Our method, named the Progressively Normalized Self-Attention Network (PNS-Net), can efficiently learn representations from polyp videos in real time (~140fps) on a single NVIDIA RTX 2080 GPU without any post-processing (e.g., Dense-CRF).
  • Plug-and-Play Module: The proposed core module, termed Normalized Self-attention (NS), uses channel-split, query-dependent, and normalization rules to reduce computational cost and improve accuracy. This module can be flexibly plugged into any custom framework (a simplified PyTorch sketch follows this list).
  • Cutting-edge Performance: Experiments on three challenging video polyp segmentation (VPS) datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance.
  • One-key Evaluation Toolbox: We release the first one-key evaluation toolbox in the VPS field.
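
For readers who want a feel for the NS design before reading the CUDA kernels in ./lib/PNS, below is a minimal PyTorch sketch of a simplified normalized self-attention block over spatio-temporal features. It only illustrates the channel-split and normalized-affinity ideas; the query-dependent rule and the optimized CUDA implementation are omitted, and the tensor layout, group count, and residual connection here are assumptions for illustration, not the official design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedNSBlock(nn.Module):
    """Simplified normalized self-attention over (B, C, T, H, W) video features."""
    def __init__(self, channels, groups=4):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by the number of groups"
        self.groups = groups
        self.query = nn.Conv3d(channels, channels, kernel_size=1)
        self.key = nn.Conv3d(channels, channels, kernel_size=1)
        self.value = nn.Conv3d(channels, channels, kernel_size=1)
        self.proj = nn.Conv3d(channels, channels, kernel_size=1)

    def forward(self, x):
        b, c, t, h, w = x.shape
        g, cg = self.groups, c // self.groups
        n = t * h * w
        # Channel split: each of the g groups attends independently.
        q = self.query(x).reshape(b * g, cg, n)
        k = self.key(x).reshape(b * g, cg, n)
        v = self.value(x).reshape(b * g, cg, n)
        # Normalized affinity: scaled dot-product followed by a softmax over all T*H*W positions.
        affinity = torch.einsum('bci,bcj->bij', q, k) / (cg ** 0.5)
        affinity = F.softmax(affinity, dim=-1)
        out = torch.einsum('bij,bcj->bci', affinity, v).reshape(b, c, t, h, w)
        return x + self.proj(out)  # residual connection (an assumption in this sketch)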

1.1. 🔥 NEWS 🔥

  • [2021/06/25] 🔥 Our paper has been honored with a MICCAI Student Travel Award.
  • [2021/06/19] 🔥 A short introduction to our paper is available on my YouTube channel (2 min).
  • [2021/06/18] Released the inference code! The whole project will be available around MICCAI-2021.
  • [2021/06/18] The Chinese translation of our paper is available; please enjoy it [pdf].
  • [2021/05/27] Uploaded the training/testing datasets, snapshots, and benchmarking results.
  • [2021/05/14] Our work was provisionally accepted at MICCAI 2021. Many thanks to my collaborator Yu-Cheng Chou and supervisor Prof. Deng-Ping Fan.
  • [2021/03/10] Created the repository.

1.2. Table of Contents

Table of contents generated with markdown-toc

1.3. State-of-the-art Approaches

  1. "PraNet: Parallel Reverse Attention Network for Polyp Segmentation" MICCAI, 2020. doi: https://arxiv.org/pdf/2006.11392.pdf
  2. "Adaptive context selection for polyp segmentation" MICCAI, 2020. doi: https://link.springer.com/chapter/10.1007/978-3-030-59725-2_25
  3. "Resunet++: An advanced architecture for medical image segmentation" IEEE ISM, 2019 doi: https://arxiv.org/pdf/1911.07067.pdf
  4. "Unet++: A nested u-net architecture for medical image segmentation" IEEE TMI, 2019 doi: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329239/
  5. "U-Net: Convolutional networks for biomed- ical image segmentation" MICCAI, 2015. doi: https://arxiv.org/pdf/1505.04597.pdf

2. Overview

2.1. Introduction

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs cannot fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos in real time (~140fps) on a single RTX 2080 GPU without post-processing.

Our PNS-Net is based solely on a basic normalized self-attention block, dispensing with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

2.2. Framework Overview


Figure 1: Overview of the proposed PNS-Net, including the normalized self-attention block (see § 2.1) with a stacked (×R) learning strategy. See § 2 in the paper for details.
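
As a rough illustration of this stacked (×R) idea, the snippet below simply applies R copies of the SimplifiedNSBlock from the sketch in the Features section, reusing that class. Whether the official model shares weights across repeats, adds extra normalization, or interleaves other layers is not assumed here; see § 2 of the paper for the exact progressive learning design.

import torch
import torch.nn as nn

class StackedNS(nn.Module):
    """Apply R independent (simplified) NS blocks one after another."""
    def __init__(self, channels, repeats=2, groups=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            [SimplifiedNSBlock(channels, groups) for _ in range(repeats)])

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

# Example with hypothetical sizes: 5 frames, 32 channels, 32x32 feature resolution.
feats = torch.randn(1, 32, 5, 32, 32)
print(StackedNS(channels=32, repeats=2)(feats).shape)  # torch.Size([1, 32, 5, 32, 32])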

2.3. Qualitative Results


Figure 2: Qualitative Results.

3. Proposed Baseline

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch on a single GeForce RTX 2080 GPU with 8 GB of memory.

  1. Configuring your environment (Prerequisites):

    Note that PNS-Net has only been tested on Ubuntu with the following environment. It may work on other operating systems as well, but we do not guarantee that it will.

    • Creating a virtual environment in terminal:

    conda create -n PNSNet python=3.6

    • Installing the necessary packages (PyTorch 1.1):
    conda create -n PNSNet python=3.6
    conda activate PNSNet
    conda install pytorch=1.1.0 torchvision -c pytorch
    pip install tensorboardX tqdm Pillow==6.2.2
    pip install git+https://github.com/pytorch/tnt.git@master
    • Our core design is built as a CUDA op with torchlib. Please ensure that the system-level CUDA toolkit version is 10.x (not the one inside the conda env), and then build the NS block (a quick import check is sketched after this numbered list):
    cd ./lib/PNS
    python setup.py build develop
  2. Downloading necessary data:

  3. Training Configuration:

    • First, run python MyTrain_Pretrain.py in the terminal for pretraining, and then run python MyTrain_finetune.py for finetuning.

    • Just enjoy it! When training finishes, the snapshots are saved in ./snapshot/PNS-Net/*.

  4. Testing Configuration:

    • After downloading the pre-trained models and the testing datasets, just run MyTest_finetune.py to generate the final prediction maps in ./res.

    • Just enjoy it!

    • The prediction results of all competitors and our PNS-Net can be found at Google Drive (7MB).
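
As mentioned in step 1, it can help to verify that the compiled NS extension is importable before training or testing. The snippet below is only a suggested sanity check, not part of the official repo; the module name self_cuda_backend is taken from the error reports in the issues further down, so adjust it if your build produces a different name.

import torch

# The NS block is a custom CUDA op, so a CUDA-capable GPU and a matching toolkit are required.
assert torch.cuda.is_available(), "A CUDA-capable GPU is required for the NS block."
print("PyTorch:", torch.__version__, "| CUDA:", torch.version.cuda)

try:
    import self_cuda_backend  # assumed to be built by `python setup.py build develop` in ./lib/PNS
    print("NS CUDA extension imported successfully.")
except ImportError as err:
    print("NS CUDA extension not found; rebuild it with a system-level CUDA 10.x toolkit:", err)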

3.2. Evaluating your trained model

The one-key evaluation is written in MATLAB (link); please follow the instructions in ./eval/main_VPS.m and just run it to generate the evaluation results in ./eval-Result/. A rough Python sanity check is sketched below.
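
The sketch below is an assumption-based convenience, not the official toolbox: it only reports Dice and IoU for a single prediction/ground-truth pair at a fixed threshold and does not reproduce the full metric suite (S-measure, E-measure, maxDice, etc.). The file paths in the usage comment are hypothetical.

import numpy as np
from PIL import Image

def dice_iou(pred_path, gt_path, threshold=0.5):
    """Binarize a grayscale prediction map and compare it with a binary ground-truth mask."""
    pred = np.asarray(Image.open(pred_path).convert('L'), dtype=np.float32) / 255.0
    gt = np.asarray(Image.open(gt_path).convert('L'), dtype=np.float32) / 255.0
    pred_bin, gt_bin = pred >= threshold, gt >= 0.5
    inter = np.logical_and(pred_bin, gt_bin).sum()
    dice = 2.0 * inter / (pred_bin.sum() + gt_bin.sum() + 1e-8)
    iou = inter / (np.logical_or(pred_bin, gt_bin).sum() + 1e-8)
    return dice, iou

# Hypothetical paths; point them at one frame from ./res and its ground-truth mask.
# print(dice_iou('./res/CVC-ClinicDB-612-Test/example.png', './path/to/GT/example.png'))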

4. Citation

Please cite our paper if you find the work useful:

@inproceedings{ji2021pnsnet,
  title={Progressively Normalized Self-Attention Network for Video Polyp Segmentation},
  author={Ji, Ge-Peng and Chou, Yu-Cheng and Fan, Deng-Ping and Chen, Geng and Jha, Debesh and Fu, Huazhu and Shao, Ling},
  booktitle={MICCAI},
  year={2021}
}

5. TODO LIST

If you want to improve the usability or have any advice, please feel free to contact me directly (E-mail).

  • Support NVIDIA APEX training.

  • Support different backbones (VGGNet, ResNet, ResNeXt, iResNet, ResNeSt, etc.).

  • Support distributed training.

  • Support lightweight architectures for real-time inference, such as MobileNet and SqueezeNet.

  • Add more comprehensive competitors.

6. FAQ

  1. If the images on this page fail to load (mostly due to network restrictions in mainland China):

    Solution Link


7. Acknowledgements

This code is built on SINetV2 (PyTorch) and PyramidCSA (PyTorch). We thank the authors for sharing the codes.

back to top

Comments
  • Unable to reproduce the paper's accuracy

    Hello, I have read your work on the VPS task, and it is excellent and solid. However, I ran into some problems while reproducing the paper's results with your code: on the three datasets you provide, it is hard to reach the performance reported in the paper (especially on CVC-ColonDB-300, where maxIoU and maxDice are almost 8-9 points lower). I trained with 100 pretraining epochs plus 1 finetuning epoch, used the hyper-parameter settings given in config.py, and replaced the pretrained path when finetuning.

    opened by zzzzzzpc 5
  • Could you provide the training file?

    I encountered a problem while trying to apply your work to my own task (video shadow segmentation). The Normalized Self-attention block does not work for me the way it does in your work.

    opened by Joooooohnny 5
  • ModuleNotFoundError: No module named 'self_cuda_backend'

    Hello, I configured the environment following the commands in the README and ran "python setup.py build develop". No errors were reported during the build, but afterwards the 'self_cuda_backend' module does not seem to have been generated, and the code still raises this error. The image below shows my resulting file structure; I am not sure whether something went wrong on my side. If the build is supposed to succeed, where should the 'self_cuda_backend' file be generated? Thanks for your help! Snipaste_2022-04-29_15-11-50

    opened by GrandMountFoxc 4
  • Some questions about the NL module

    Hello, I would like to ask why the backward pass needs to be redefined for the NL modules, instead of writing the module and relying on PyTorch's built-in autograd. Is it because of the "processing" operation? I am not familiar with custom functions, and I hope you can answer this for me.

    opened by Liqq1 3
  • Thanks for the excellent job!

    Thanks for the excellent work! Here are my questions:

    1. I am confused about the PNS-Finetune.pth file provided in the "Training/Testing" part. Is it the final weights or the pre-trained weights?
    2. Could you please explain the 'max' in the metrics 'maxDice', 'maxSpc', and 'maxIoU'?
    3. I found something that seems wrong in utils/dataloader.py: the variable 'begin' in class VideoDataset should be set to 0, not 1. The file MyTest_Finetune.py is one such case. I would be very grateful if you could answer my questions.
    opened by wener-yung 3
  • Question about training tricks

    Dear authors, I have been trying to reproduce your PNS-Net. I pretrained and finetuned on the datasets you provide, but the test results on CVC-ColonDB-300 do not reach the accuracy reported in the paper. The evaluation results on the three test sets are as follows.

    (Dataset:CVC-ClinicDB-612-Test) meanDic:0.774;meanIoU:0.696;wFm:0.765;Sm:0.864;meanEm:0.857;MAE:0.051;maxEm:0.884;maxDice:0.803;maxIoU:0.729;meanSen:0.757;maxSen:0.992;meanSpe:0.975;maxSpe:0.985.
    (Dataset:CVC-ClinicDB-612-Valid) meanDic:0.843;meanIoU:0.756;wFm:0.824;Sm:0.922;meanEm:0.936;MAE:0.013;maxEm:0.966;maxDice:0.883;maxIoU:0.810;meanSen:0.862;maxSen:0.991;meanSpe:0.968;maxSpe:0.978.
    (Dataset:CVC-ColonDB-300) meanDic:0.654;meanIoU:0.551;wFm:0.617;Sm:0.838;meanEm:0.797;MAE:0.025;maxEm:0.835;maxDice:0.693;maxIoU:0.590;meanSen:0.764;maxSen:1.000;meanSpe:0.978;maxSpe:0.990.

    Could you share any tricks you used when training the model? Also, the evaluation metrics before and after finetuning differ very little; can I increase the number of finetuning epochs, or is there another way to improve the finetuned network's performance? Sorry for the trouble.

    Pretrain [epoch 100]

    (Dataset:CVC-ClinicDB-612-Test) meanDic:0.771;meanIoU:0.693;wFm:0.770;Sm:0.856;meanEm:0.852;MAE:0.050;maxEm:0.888;maxDice:0.811;maxIoU:0.736;meanSen:0.734;maxSen:0.992;meanSpe:0.974;maxSpe:0.983.

    Finetune [epoch 1]

    (Dataset:CVC-ClinicDB-612-Test) meanDic:0.774;meanIoU:0.696;wFm:0.765;Sm:0.864;meanEm:0.857;MAE:0.051;maxEm:0.884;maxDice:0.803;maxIoU:0.729;meanSen:0.757;maxSen:0.992;meanSpe:0.975;maxSpe:0.985.

    opened by Moqixis 2
  • Training tricks and dataset splits

    Hello, I have been following your work recently, and it is excellent! However, I have run into a few problems. First, using your method I could not reach the reported accuracy; did you use any additional tricks, and will the code be updated for a newer PyTorch version? Second, I am still a bit confused about the dataset splits: is the IVPS-TrainSet dataset unused? Are the validation and test sets the same, or how are they handled?
    Looking forward to your reply. Thank you.
    opened by forestlin7 1