Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers (arXiv2021)

Deng-Ping Fan

Last update: Jan 5, 2023


Overview

Polyp-PVT

by Bo Dong, Wenhai Wang, Deng-Ping Fan, Jinpeng Li, Huazhu Fu, & Ling Shao.

This repo is the official implementation of "Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers".

1. Introduction

Polyp-PVT was first described in an arXiv preprint (arXiv:2108.06932).

Most polyp segmentation methods use CNNs as their backbone, which leads to two key issues when exchanging information between the encoder and decoder: 1) accounting for the differences in contribution between features at different levels; and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the influence of image acquisition and the elusive properties of polyps, we introduce three novel modules: a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). The CFM collects the semantic and location information of polyps from high-level features, while the CIM captures polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area with high-level semantic position information to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noise in the features and significantly improves their expressive capabilities.
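To make the "different-level features" concrete, the sketch below computes the shapes of a four-level feature pyramid for a given input size. The strides (4, 8, 16, 32) and channel widths (32, 64, 160, 256) are assumptions read off the `pvt_v2_b0` output quoted in the Comments section of this page, not taken from the Polyp-PVT source; `pyramid_shapes` is a hypothetical helper name, and other backbone variants would use different widths.

```python
from typing import List, Tuple

# Assumed values, inferred from the pvt_v2_b0 output quoted in the Comments section.
STRIDES = (4, 8, 16, 32)
CHANNELS = (32, 64, 160, 256)


def pyramid_shapes(height: int, width: int, batch: int = 1) -> List[Tuple[int, int, int, int]]:
    """Return (batch, channels, H, W) for each pyramid level, shallowest first."""
    largest = max(STRIDES)
    if height % largest or width % largest:
        # A limitation of this sketch only; it says nothing about the real network.
        raise ValueError(f"this sketch only handles sizes divisible by {largest}")
    return [
        (batch, channels, height // stride, width // stride)
        for stride, channels in zip(STRIDES, CHANNELS)
    ]


if __name__ == "__main__":
    for level, shape in enumerate(pyramid_shapes(512, 512), start=1):
        print(f"level {level}: {shape}")
```

In the description above, the deeper, lower-resolution levels correspond to the "high-level features" the CFM draws on, and the shallow, higher-resolution levels to the "low-level features" the CIM inspects. Exactly which levels feed which module is not stated here and should be checked against the code.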

Polyp-PVT achieves strong performance on image-level polyp segmentation (0.808 mean Dice and 0.727 mean IoU on ColonDB) and video polyp segmentation (0.880 mean Dice and 0.802 mean IoU on CVC-300-TV), surpassing previous models by a large margin.

2. Framework Overview

3. Results

3.1 Image-level Polyp Segmentation

3.2 Image-level Polyp Segmentation Compared Results:

We also provide results for the baseline methods. You can download them from Google Drive/Baidu Drive [code:nhhv]; the download includes our results and those of the compared models.

3.3 Video Polyp Segmentation

3.4 Video Polyp Segmentation Compared Results:

We also provide results for the baseline methods. You can download them from Google Drive/Baidu Drive [code:33ie]; the download includes our results and those of the compared models.

4. Usage:

4.1 Recommended environment:

Python 3.8
PyTorch 1.7.1
torchvision 0.8.2
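One way to confirm that an environment matches the versions listed above is a small check script. This is a minimal sketch, not part of the repository: `check_environment` and `strip_local` are hypothetical helper names, and the script assumes the packages are installed under the distribution names `torch` and `torchvision`.

```python
import sys
from importlib import metadata

# Versions recommended in the list above.
RECOMMENDED = {"torch": "1.7.1", "torchvision": "0.8.2"}
RECOMMENDED_PYTHON = (3, 8)


def strip_local(version):
    """Drop a local build suffix such as '+cu110' from a version string."""
    return version.split("+", 1)[0]


def check_environment(installed=None, python_version=None):
    """Return a list of human-readable mismatches (empty if everything matches).

    `installed` maps package name to version string; when omitted, the
    versions are looked up in the running interpreter.
    """
    problems = []
    py = tuple(python_version or sys.version_info[:2])
    if py != RECOMMENDED_PYTHON:
        problems.append(f"Python {py[0]}.{py[1]} found, 3.8 recommended")
    for name, wanted in RECOMMENDED.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None
        if found is None:
            problems.append(f"{name} is not installed ({wanted} recommended)")
        elif strip_local(found) != wanted:
            problems.append(f"{name} {found} found, {wanted} recommended")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the recommendation"]:
        print(line)
```

A mismatch is only a warning: these are the recommended versions, and the page does not say that other versions fail.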

4.2 Data preparation:

Download the training and testing datasets from Google Drive/Baidu Drive [code:dr1h] and move them into ./dataset/.

4.3 Pretrained model:

Download the pretrained model from Google Drive/Baidu Drive [code:w4vk] and put it in the './pretrained_pth' folder for initialization.
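Before starting a run, it can help to confirm that the two folders named in sections 4.2 and 4.3 exist and are not empty. The sketch below checks only that; `missing_or_empty` is a hypothetical helper, and the expected contents of each folder are deliberately not checked because they are not listed here.

```python
from pathlib import Path

# Folders named in sections 4.2 and 4.3, relative to the repository root.
EXPECTED_DIRS = ("dataset", "pretrained_pth")


def missing_or_empty(root="."):
    """Return the expected folders that are absent or contain no entries."""
    root = Path(root)
    problems = []
    for name in EXPECTED_DIRS:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems


if __name__ == "__main__":
    for name in missing_or_empty():
        print(f"./{name}/ is missing or empty; download its contents first")
```

Run it from the repository root, or pass the path to the clone as `root`.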

4.4 Training:

Clone the repository:

git clone https://github.com/DengPingFan/Polyp-PVT.git
cd Polyp-PVT 
bash train.sh

4.5 Testing:

cd Polyp-PVT 
bash test.sh

4.6 Evaluating your trained model:

Matlab: Please refer to the work of MICCAI2020 (link).

Python: Please refer to the work of ACMMM2021 (link).

Please note that we use the Matlab version to evaluate in our paper.
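For a quick sanity check that does not depend on either toolbox, Dice and IoU can be computed directly from binary masks. The sketch below is dependency-free and uses hypothetical helper names; it assumes predictions have already been thresholded to 0/1. The referenced toolboxes may threshold and average differently, so its numbers should not be compared with those reported in the paper.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]


def dice_and_iou(pred: Mask, gt: Mask, eps: float = 1e-8) -> Tuple[float, float]:
    """Dice and IoU for two binary masks given as equal-sized 2-D lists of 0/1."""
    inter = pred_sum = gt_sum = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += p * g      # pixels that are foreground in both masks
            pred_sum += p
            gt_sum += g
    union = pred_sum + gt_sum - inter
    # eps keeps the ratios defined when both masks are empty.
    dice = (2 * inter + eps) / (pred_sum + gt_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_dice_and_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average per-image Dice and IoU over (prediction, ground truth) pairs."""
    scores = [dice_and_iou(pred, gt) for pred, gt in pairs]
    if not scores:
        raise ValueError("no mask pairs given")
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(dice_and_iou(pred, gt))
```

With one overlapping pixel, two predicted pixels and one ground-truth pixel, the example gives a Dice of about 2/3 and an IoU of about 1/2.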

4.7 Well trained model:

You can download the trained model from Google Drive/Baidu Drive [code:9rpy] and put it in the './model_pth' directory.

4.8 Pre-computed maps:

Google Drive/Baidu Drive [code:x3jc]

5. Citation:

@article{dong2021PolypPVT,
  title={Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers},
  author={Dong, Bo and Wang, Wenhai and Fan, Deng-Ping and Li, Jinpeng and Fu, Huazhu and Shao, Ling},
  journal={arXiv preprint arXiv:2108.06932},
  year={2021}
}

6. Acknowledgement

We are very grateful for these excellent works: PraNet, EAGRNet, and MSEG, which provided the basis for our framework.

7. FAQ:

If you would like to help improve usability or have any advice, please feel free to contact me directly ([email protected]).

8. License

The source code is free for research and education use only. Any commercial use requires formal permission first.

Comments

Sincerely request the thesis baseline code

Hi, thank you for your excellent work. The baseline you used for comparison in your paper is from "Pvtv2: Improved baselines with pyramid vision transformer", which I have tried many times without success. I didn't find this part of the code in the project. Could you provide a baseline code of PVTV2 that you use? Thanks!

opened by jue12345 1

PVT V2 implementation

Hi @DengPingFan

Did you check the implementation of PVT V2? Actually, I need a classification head in forward propagation to apply some loss functions in classification HEAD. Unfortunately, you comment out this line. Could you please tell me the solution to this problem?

opened by khawar-islam 1

could u give FPS or FLOPs about Polyp-PVT, i test this backbone, its so slow

    if __name__ == "__main__":
        a = torch.randn(1, 3, 512, 512).cuda()
        backbone = pvt_v2_b0().cuda()
        start = time.time()
        out = backbone(a)
        end = time.time() - start
        print('each image use %5f seconds, and image size is 512' % end, )
        print([i.shape for i in out])

each image use 0.374312 seconds, and image size is 512
[torch.Size([1, 32, 128, 128]), torch.Size([1, 64, 64, 64]), torch.Size([1, 160, 32, 32]), torch.Size([1, 256, 16, 16])]
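A single timed forward pass like the one in this issue can overstate latency: the first call typically includes one-time setup costs, and CUDA kernels run asynchronously, so the clock may stop before the GPU has finished. The sketch below shows one common way to measure more reliably, with warm-up iterations and an optional synchronization hook. `benchmark` is a hypothetical helper, not part of the repository.

```python
import time
from typing import Callable, Optional


def benchmark(fn: Callable[[], object], warmup: int = 10, iters: int = 50,
              sync: Optional[Callable[[], None]] = None) -> float:
    """Return the mean wall-clock seconds per call of fn()."""
    for _ in range(warmup):      # untimed calls absorb one-time setup costs
        fn()
    if sync is not None:         # wait for pending asynchronous work to finish
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:         # make sure the timed work has really completed
        sync()
    return (time.perf_counter() - start) / iters


# With PyTorch on a GPU, it would typically be called like this (not run here):
#   with torch.no_grad():
#       seconds = benchmark(lambda: backbone(a), sync=torch.cuda.synchronize)
#   print(f"{1.0 / seconds:.1f} FPS")

if __name__ == "__main__":
    seconds = benchmark(lambda: sum(range(10_000)))
    print(f"{seconds * 1e6:.1f} microseconds per call")
```

Averaging over many iterations after warm-up usually gives a much lower and more stable figure than timing the first call alone.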

opened by csliuchang 1

About pretrained module

Hello, I saw in the paper that you compared against the ResUNet++ network. Could you share the pretrained model for this network? I have tried for a long time without managing to reproduce it, and I want to run a comparative experiment.

opened by coisino 0
