TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

Overview

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification [NeurIPS 2021]

Abstract

Multiple instance learning (MIL) is a powerful tool for solving weakly supervised classification problems in whole slide image (WSI)-based pathology diagnosis. However, current MIL methods are usually based on the independent and identically distributed (i.i.d.) hypothesis and thus neglect the correlations among different instances. To address this problem, we propose a new framework, called correlated MIL, and provide a proof of convergence. Based on this framework, we devise a Transformer-based MIL (TransMIL), which exploits both morphological and spatial information. The proposed TransMIL can effectively handle unbalanced/balanced and binary/multi-class classification, with good visualization and interpretability. We conducted experiments on three different computational pathology problems and achieved better performance and faster convergence than state-of-the-art methods. The test AUC for binary tumor classification reaches 93.09% on the CAMELYON16 dataset, and the AUCs for cancer-subtype classification reach 96.03% and 98.82% on the TCGA-NSCLC and TCGA-RCC datasets, respectively.
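
In code terms, the pipeline sketched in the abstract pads each bag of patch features up to a square length, alternates Nystrom self-attention (the morphological information) with a convolutional position-encoding module, PPEG (the spatial information), and classifies from a class token. A condensed PyTorch sketch, following the paper's description and the layer sizes used in this repository; it is a sketch, not a verbatim copy of the repository code:

    import numpy as np
    import torch
    import torch.nn as nn
    from nystrom_attention import NystromAttention  # pip install nystrom-attention

    class TransLayer(nn.Module):
        # pre-norm Nystrom self-attention block; cost is linear in bag size
        def __init__(self, dim=512):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = NystromAttention(dim=dim, dim_head=dim // 8, heads=8,
                                         num_landmarks=dim // 2, pinv_iterations=6,
                                         residual=True, dropout=0.1)

        def forward(self, x):
            return x + self.attn(self.norm(x))

    class PPEG(nn.Module):
        # Pyramid Position Encoding Generator: depthwise convs over the patch grid
        def __init__(self, dim=512):
            super().__init__()
            self.proj  = nn.Conv2d(dim, dim, 7, 1, 3, groups=dim)
            self.proj1 = nn.Conv2d(dim, dim, 5, 1, 2, groups=dim)
            self.proj2 = nn.Conv2d(dim, dim, 3, 1, 1, groups=dim)

        def forward(self, x, H, W):
            B, _, C = x.shape
            cls_token, feat = x[:, :1], x[:, 1:]
            f = feat.transpose(1, 2).view(B, C, H, W)
            f = self.proj(f) + f + self.proj1(f) + self.proj2(f)
            return torch.cat([cls_token, f.flatten(2).transpose(1, 2)], dim=1)

    class TransMIL(nn.Module):
        def __init__(self, n_classes, feat_dim=1024, dim=512):
            super().__init__()
            self.fc1 = nn.Sequential(nn.Linear(feat_dim, dim), nn.ReLU())
            self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
            self.layer1, self.layer2 = TransLayer(dim), TransLayer(dim)
            self.pos_layer = PPEG(dim)
            self.norm = nn.LayerNorm(dim)
            self.fc2 = nn.Linear(dim, n_classes)

        def forward(self, h):                    # h: [1, N, feat_dim] bag of patch features
            h = self.fc1(h)
            # "squaring": pad the bag to the next perfect square so the tokens
            # can be laid out on a side x side grid for the PPEG convolutions
            N = h.shape[1]
            side = int(np.ceil(np.sqrt(N)))
            h = torch.cat([h, h[:, :side * side - N]], dim=1)
            h = torch.cat([self.cls_token.expand(h.shape[0], -1, -1), h], dim=1)
            h = self.layer1(h)
            h = self.pos_layer(h, side, side)
            h = self.layer2(h)
            return self.fc2(self.norm(h)[:, 0])  # logits from the class token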

Train

python train.py --stage='train' --config='Camelyon/TransMIL.yaml'  --gpus=0 --fold=0

Test

python train.py --stage='test' --config='Camelyon/TransMIL.yaml'  --gpus=0 --fold=0

Comments
  • Missing TCGA-NSCLC and TCGA-RCC code

    Hi, I was trying to reproduce your results, but I see that the code for the TCGA-NSCLC and TCGA-RCC datasets is missing. Could you add the missing experiment setups?

    opened by apardyl 11
  • Overfitting problem while implementing in Camelyon16

    Hello,

    I randomly selected 90% and 10% of the official training data as the train and validation sets; the test set is the officially provided one. But the gap between training and inference performance is large. Have you encountered this problem?

    [screenshot: training vs. validation metric curves]

    opened by pzSuen 6
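
    One thing worth ruling out first is an unlucky random split: a seeded, label-stratified split keeps the tumor/normal balance identical between train and validation. A minimal sketch with scikit-learn (the slide lists are placeholders; CAMELYON16 training has 160 normal and 110 tumor slides):

        from sklearn.model_selection import train_test_split

        slide_ids = [f"slide_{i:03d}" for i in range(270)]  # placeholder slide names
        labels = [0] * 160 + [1] * 110                      # placeholder normal/tumor labels
        train_ids, val_ids = train_test_split(slide_ids, test_size=0.1,
                                              stratify=labels, random_state=0)
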
  • How the performance of baseline methods are obtained?

    As the dataset splits used in this paper differ from those of the compared baseline methods, how were their performance results obtained? Did you train and evaluate each of the baseline models yourself? Thanks.

    opened by Lafite-Yu 5
  • What is the number of patches of each WSIs?

    Some WSIs have only 2048 patches while others have over 40000. In your paper you use a squaring method to pad the embeddings to a fixed length. What is that fixed length? If the largest WSI has 50000 patches, did you pad all the embeddings to (1, 50000, 1024)?

    opened by Furyboyy 4
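
    For context, the squaring step pads each bag per slide rather than to one global fixed length: a bag of N embeddings is padded to the next perfect square, ceil(sqrt(N))^2, by duplicating its first embeddings (see the sketch under the abstract above). A slide with 2048 patches therefore becomes 46^2 = 2116 tokens, and one with 40001 patches becomes 201^2 = 40401 tokens; nothing is padded out to 50000.
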
  • Unfaithful accuracy numbers on the TCGA-NSCLC dataset

    According to your paper, the total number of TCGA-NSCLC slides is 993 and you used 4-fold cross-validation, so each slide is classified exactly once in a hold-out set. Checking the numbers in Table 1 of your paper [screenshot of Table 1]: none of the reported accuracy values can be obtained by dividing an integer by 993 (i.e., N/993 = acc has no integer solution for the number N of correctly classified slides). Could you give an explanation for this?

    opened by binli123 4
  • pytorch-lightning version

    Hi, thank you for your sharing!

    I ran into some problems when I tried to train with multiple GPUs, which seem to be caused by the pytorch-lightning version:

    pytorch_lightning.utilities.exceptions.MisconfigurationException: You have asked for amp_level='O2' but it's only supported with amp_backend='apex'

    __init__() got an unexpected keyword argument 'dirpath'

    Could you tell me which versions of pytorch-lightning and the other packages you used? Thank you very much!

    opened by RuixiangZhao 4
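
    Until the authors confirm exact versions, both tracebacks look like PyTorch-Lightning API mismatches rather than bugs in this repo: amp_level='O2' is only accepted together with the apex backend in newer releases, and ModelCheckpoint only gained the dirpath argument around v1.0 (earlier releases spelled it filepath). A hedged sketch of two adjustments that avoid both errors on 1.x releases (argument availability varies across versions; paths are placeholders):

        from pytorch_lightning import Trainer
        from pytorch_lightning.callbacks import ModelCheckpoint

        # native mixed precision instead of apex O2, sidestepping amp_level entirely
        trainer = Trainer(gpus=2, accelerator="ddp", precision=16)

        # 'dirpath' requires pytorch-lightning >= 1.0; older releases used 'filepath'
        checkpoint = ModelCheckpoint(dirpath="logs/ckpts", monitor="val_loss", mode="min")
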
  • attention weights for heatmaps

    Hello,

    Really fantastic work, thank you for the excellent repo.

    I want to apply this to my own dataset and would love to produce attention heatmaps like the ones in the paper. I understand that the Nystromformer has a return_attn argument; however, I am confused by the dimensions it returns. I played around with a toy dataset of 1000 instances and it returned a 1 x nheads x 1280 x 1280 tensor. I am unsure how to take that 1280 x 1280 array and tease out the cls_token attention values.

    Any advice?

    Thanks so much!

    opened by gabemarx 3
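
    One reading of those dimensions that is consistent with the lucidrains nystrom-attention package (an assumption worth checking against your installed version): TransMIL first squares 1000 instances up to 32^2 = 1024 tokens, prepends the class token (1025 total), and the attention module then left-pads the sequence to the next multiple of num_landmarks = 256, giving 1280. Under that assumption, the class-token row can be sliced out like this:

        import numpy as np

        # attn: the [1, heads, 1280, 1280] tensor from NystromAttention(..., return_attn=True)
        n_inst = 1000                              # real patches in the toy bag
        side = int(np.ceil(np.sqrt(n_inst)))       # 32 -> squared bag length 1024
        pad = attn.shape[-1] - (side * side + 1)   # 1280 - 1025 = 255 tokens of left padding
        cls_row = attn[0, :, pad, pad + 1:]        # class-token -> patch attention, [heads, 1024]
        scores = cls_row.mean(0)[:n_inst]          # average heads; drop the duplicated pad patches
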
  • No such file or directory: 'Camelyon16/pt_files/normal_024.pt'

    The directory of the camelyon16 dataset I downloaded is as follows. Where should I download the .pt files?

    CAMELYON16
      -training
         -normal
            -normal_xxx.tif
         -tumor
            -tumor_xxx.tif
         -lesion_annotation
            -tumor_xxx.xml
      -testing
         -images
            -test_xxx.tif
         -evaluation
            -evaluation_masks.zip
            -evaluation_matlab.zip
            -evaluation_python.zip
    
    opened by jieruyao49 2
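
    The .pt files are not part of the CAMELYON16 download: they are per-slide bags of patch features that you extract yourself, e.g. with the CLAM preprocessing pipeline (https://github.com/mahmoodlab/CLAM), which writes exactly this pt_files/ layout. A sketch of what each file is expected to hold (extract_features is a hypothetical stand-in for your own ResNet50 patch embedder):

        import torch

        feats = extract_features("CAMELYON16/training/normal/normal_024.tif")  # hypothetical embedder
        assert feats.ndim == 2 and feats.shape[1] == 1024  # one [N_patches, 1024] tensor per slide
        torch.save(feats, "Camelyon16/pt_files/normal_024.pt")
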
  • Questions about the attention visualization?

    Hello,

    It seems that you haven't mentioned in the paper how the attention over the tokens is computed, and I also can't find the code that computes the heatmap and visualization.

    Can you answer my questions? Looking forward to your reply.

    opened by pzSuen 2
  • Cannot reproduce CAMELYON16 ablations

    Hi all,

    Thanks for your work. Everything looks good, except that I have been trying for some time to reproduce your results, without success. I just discovered the thread https://github.com/szc19990412/TransMIL/issues/4, where others have also failed to reproduce them. More specifically, I have been trying to reproduce the CAMELYON16 TransMIL ablations and the ABMIL benchmark using both your repository and my own repository (written with PyTorch Lightning).

    Some things I would like to know for reproducibility of the experiments:

    • The given CAMELYON16 fold is only one fold, based on a 5:1 train-val split. In your paper you report using 10:1 train-val splits. I would therefore like to know what splits you used and whether they were stratified or not. For reproducibility I would like to request all splits to be uploaded to the repository so people are working with the same data.

    • The CAMELYON16 dataset contains slides sourced from two centers whose scanners used different objective magnifications (https://jamanetwork.com/journals/jama/fullarticle/2665774). If 20x magnification was used for the whole dataset, could you please confirm whether the slides from these two centers were tiled at two different physical resolutions (microns per pixel)? “The whole-slide images were acquired at 2 different centers using 2 different scanners. RUMC images were produced with a digital slide scanner (Pannoramic 250 Flash II; 3DHISTECH) with a 20x objective lens (specimen-level pixel size, 0.243 μm × 0.243 μm). UMCU images were produced using a digital slide scanner (NanoZoomer-XR Digital slide scanner C12000-01; Hamamatsu Photonics) with a 40x objective lens (specimen-level pixel size, 0.226 μm × 0.226 μm).”

    • The model selection/checkpointing details. I see from the code https://github.com/szc19990412/TransMIL/blob/3f6bbe868ac39e7d861a111398b848ba3b943ca8/utils/utils.py#L43-L49, that you used EarlyStopping with patience=10 on val_loss, correct?

    • What optimizer and hyperparameters were used for ABMIL? I assume what the authors (https://arxiv.org/pdf/1802.04712.pdf) report in Table 17 for the histopathology datasets?

    I will later report my scores for TransMIL & ABMIL using the repository and given data fold. Thanks in advance.

    opened by DennisHaijma 2
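
    On the checkpointing question, the linked utils.py lines do appear to set up early stopping on the validation loss; in PyTorch-Lightning terms that amounts to a callback along these lines (patience=10 as read from the linked code; the remaining arguments are library defaults):

        from pytorch_lightning.callbacks import EarlyStopping

        # stop once val_loss has not improved for 10 consecutive validation epochs
        early_stopping = EarlyStopping(monitor="val_loss", patience=10, mode="min", verbose=True)
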
  • Handling large bags during inference

    Hi, I was wondering how large slides were handled during training and inference. Was there any limit on the bag size to prevent OOMs? If so, can you clarify how the predictions from multiple bags were aggregated?

    Thanks in advance.

    opened by harshith2794 1
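
    For what it's worth, the Nystrom attention used here scales linearly with bag size, so whole bags usually fit in memory; if a slide still triggers an OOM, one generic workaround (not the authors' method) is to score fixed-size chunks and average the logits:

        import torch

        @torch.no_grad()
        def predict_in_chunks(model, feats, chunk=8192):
            # feats: [1, N, C] bag of patch features; returns logits averaged over chunks
            logits = [model(feats[:, i:i + chunk]) for i in range(0, feats.shape[1], chunk)]
            return torch.stack(logits).mean(dim=0)
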
  • Is it possible to provide the code for visualization?

    The code worked really well on a new dataset.

    Is it possible to provide the code for visualization? I know you used some code from CLAM, but I am still unclear about how you generate the mappings in detail.

    Thank you so much!

    opened by YunanWu2168 0
  • Computational cost of training per epoch

    I would like to ask how long training takes per epoch. I used your model as built, and I modified the PPEG module by adding an FFT to reduce the cost of the convolution operation. The only issue I noticed was that the Trainer took a long time to finish a single epoch. Is that related to the shape of the input (2154, 1024), or did I miss something?

    opened by deep-matter 5
Related projects

Deep Learning Slide Captcha

Slider-captcha recognition with deep learning. This project uses a deep learning YOLOv3 model to detect the gap in slider captchas, modified from https://github.com/eriklindernoren/PyTorch-YOLOv3. Only a few hundred annotated gap images are needed to train a highly accurate recognition model. Sample recognition results: clone the project and run: git cl

Python3WebSpider 55 Jan 2, 2023
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation This paper has been accepted and early accessed

Yun Liu 39 Sep 20, 2022
Makes patches from huge resolution .svs slide files using openslide

openslide_patcher Makes patches from huge resolution .svs slide files using openslide Example collage I made from outputs:

null 2 Dec 23, 2021
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

Phil Wang 272 Dec 23, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

Swin Transformer 1.4k Dec 30, 2022
Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Simple-Image-Classification Simple Image Classification Code (PyTorch) Yechan Kim This repository contains: Python3 / Pytorch code for multi-class ima

Yechan Kim 8 Oct 29, 2022
Image Classification - A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

null 0 Jan 23, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection, CVPR 2021. Installation A Linux pla

Tianning Yuan 269 Dec 21, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

MI-AOD Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection (The PDF is not available tem

Tianning Yuan 269 Dec 21, 2022
[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You 93 Dec 25, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

ImageProcessingTransformer Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

null 61 Jan 1, 2023
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Phil Wang 12.6k Jan 9, 2023
An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" in Pytorch.

GLOM An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" for MNIST Dataset. To understand this

null 50 Oct 19, 2022
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

GLOM TensorFlow This Python package attempts to implement GLOM in TensorFlow, which allows advances made by several different groups transformers, neu

Rishit Dagli 32 Feb 21, 2022
HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation

HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation Official PyTorch implementation of HPRNet. HPRNet: Hierarchical Point Regre

Nermin Samet 53 Dec 4, 2022
minimizer-space de Bruijn graphs (mdBG) for whole genome assembly

rust-mdbg: Minimizer-space de Bruijn graphs (mdBG) for whole-genome assembly rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) impleme

Barış Ekim 148 Dec 1, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022