TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

Overview

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification [NeurIPS 2021]

Abstract

Multiple instance learning (MIL) is a powerful tool for solving weakly supervised classification problems in whole slide image (WSI)-based pathology diagnosis. However, current MIL methods are usually based on the independent and identically distributed (i.i.d.) hypothesis and thus neglect the correlations among different instances. To address this problem, we propose a new framework, called correlated MIL, and provide a proof of convergence. Based on this framework, we devise a Transformer-based MIL (TransMIL), which exploits both morphological and spatial information. The proposed TransMIL can effectively handle unbalanced/balanced and binary/multi-class classification, with good visualization and interpretability. We conducted experiments on three different computational pathology problems and achieved better performance and faster convergence than state-of-the-art methods. The test AUC for binary tumor classification reaches 93.09% on the CAMELYON16 dataset, and the AUCs for cancer-subtype classification reach 96.03% and 98.82% on the TCGA-NSCLC and TCGA-RCC datasets, respectively.
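
In code terms, the pipeline sketched in the abstract pads each bag of patch features up to a square length, alternates Nystrom self-attention (the morphological information) with a convolutional position-encoding module, PPEG (the spatial information), and classifies from a class token. A condensed PyTorch sketch, following the paper's description and the layer sizes used in this repository; it is a sketch, not a verbatim copy of the repository code:

    import numpy as np
    import torch
    import torch.nn as nn
    from nystrom_attention import NystromAttention  # pip install nystrom-attention

    class TransLayer(nn.Module):
        # pre-norm Nystrom self-attention block; cost is linear in bag size
        def __init__(self, dim=512):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = NystromAttention(dim=dim, dim_head=dim // 8, heads=8,
                                         num_landmarks=dim // 2, pinv_iterations=6,
                                         residual=True, dropout=0.1)

        def forward(self, x):
            return x + self.attn(self.norm(x))

    class PPEG(nn.Module):
        # Pyramid Position Encoding Generator: depthwise convs over the patch grid
        def __init__(self, dim=512):
            super().__init__()
            self.proj  = nn.Conv2d(dim, dim, 7, 1, 3, groups=dim)
            self.proj1 = nn.Conv2d(dim, dim, 5, 1, 2, groups=dim)
            self.proj2 = nn.Conv2d(dim, dim, 3, 1, 1, groups=dim)

        def forward(self, x, H, W):
            B, _, C = x.shape
            cls_token, feat = x[:, :1], x[:, 1:]
            f = feat.transpose(1, 2).view(B, C, H, W)
            f = self.proj(f) + f + self.proj1(f) + self.proj2(f)
            return torch.cat([cls_token, f.flatten(2).transpose(1, 2)], dim=1)

    class TransMIL(nn.Module):
        def __init__(self, n_classes, feat_dim=1024, dim=512):
            super().__init__()
            self.fc1 = nn.Sequential(nn.Linear(feat_dim, dim), nn.ReLU())
            self.cls_token = nn.Parameter(torch.randn(1, 1, dim))
            self.layer1, self.layer2 = TransLayer(dim), TransLayer(dim)
            self.pos_layer = PPEG(dim)
            self.norm = nn.LayerNorm(dim)
            self.fc2 = nn.Linear(dim, n_classes)

        def forward(self, h):                    # h: [1, N, feat_dim] bag of patch features
            h = self.fc1(h)
            # "squaring": pad the bag to the next perfect square so the tokens
            # can be laid out on a side x side grid for the PPEG convolutions
            N = h.shape[1]
            side = int(np.ceil(np.sqrt(N)))
            h = torch.cat([h, h[:, :side * side - N]], dim=1)
            h = torch.cat([self.cls_token.expand(h.shape[0], -1, -1), h], dim=1)
            h = self.layer1(h)
            h = self.pos_layer(h, side, side)
            h = self.layer2(h)
            return self.fc2(self.norm(h)[:, 0])  # logits from the class token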

Train

python train.py --stage='train' --config='Camelyon/TransMIL.yaml'  --gpus=0 --fold=0

Test

python train.py --stage='test' --config='Camelyon/TransMIL.yaml'  --gpus=0 --fold=0

Comments
  • Missing TCGA-NSCLC and TCGA-RCC code

    Hi, I was trying to reproduce your results, but I see that the code for the TCGA-NSCLC and TCGA-RCC datasets is missing. Could you add the missing experiment setups?

    opened by apardyl 11
  • Overfitting problem while implementing in Camelyon16

    Hello,

    I randomly selected 90% and 10% of the official training data as the train and validation sets; the test set is the officially provided one. But the gap between training and inference performance is large. Have you encountered this problem?

    [screenshot: training vs. validation metric curves]

    opened by pzSuen 6
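
    One thing worth ruling out first is an unlucky random split: a seeded, label-stratified split keeps the tumor/normal balance identical between train and validation. A minimal sketch with scikit-learn (the slide lists are placeholders; CAMELYON16 training has 160 normal and 110 tumor slides):

        from sklearn.model_selection import train_test_split

        slide_ids = [f"slide_{i:03d}" for i in range(270)]  # placeholder slide names
        labels = [0] * 160 + [1] * 110                      # placeholder normal/tumor labels
        train_ids, val_ids = train_test_split(slide_ids, test_size=0.1,
                                              stratify=labels, random_state=0)
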
  • How the performance of baseline methods are obtained?

    As the dataset splits used in this paper differ from those of the compared baseline methods, how were their performance results obtained? Did you train and evaluate each of the baseline models yourself? Thanks.

    opened by Lafite-Yu 5
  • What is the number of patches of each WSIs?

    Some WSIs have only 2048 patches while others have over 40000. In your paper you use a squaring method to pad the embeddings to a fixed length. What is that fixed length? If the largest WSI has 50000 patches, did you pad all the embeddings to (1, 50000, 1024)?

    opened by Furyboyy 4
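
    For context, the squaring step pads each bag per slide rather than to one global fixed length: a bag of N embeddings is padded to the next perfect square, ceil(sqrt(N))^2, by duplicating its first embeddings (see the sketch under the abstract above). A slide with 2048 patches therefore becomes 46^2 = 2116 tokens, and one with 40001 patches becomes 201^2 = 40401 tokens; nothing is padded out to 50000.
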
  • Unfaithful accuracy numbers on the TCGA-NSCLC dataset

    According to your paper, the total number of TCGA-NSCLC slides is 993 and you used 4-fold cross-validation, so each slide is classified exactly once in a hold-out set. Checking the numbers in Table 1 of your paper [screenshot of Table 1]: none of the reported accuracy values can be obtained by dividing an integer by 993 (i.e., N/993 = acc has no integer solution for the number N of correctly classified slides). Could you give an explanation for this?

    opened by binli123 4
  • pytorch-lightning version

    Hi, thank you for your sharing!

    I ran into some problems when I tried to train with multiple GPUs, which seem to be caused by the pytorch-lightning version:

    pytorch_lightning.utilities.exceptions.MisconfigurationException: You have asked for amp_level='O2' but it's only supported with amp_backend='apex'

    __init__() got an unexpected keyword argument 'dirpath'

    Could you tell me which versions of pytorch-lightning and the other packages you used? Thank you very much!

    opened by RuixiangZhao 4
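
    Until the authors confirm exact versions, both tracebacks look like PyTorch-Lightning API mismatches rather than bugs in this repo: amp_level='O2' is only accepted together with the apex backend in newer releases, and ModelCheckpoint only gained the dirpath argument around v1.0 (earlier releases spelled it filepath). A hedged sketch of two adjustments that avoid both errors on 1.x releases (argument availability varies across versions; paths are placeholders):

        from pytorch_lightning import Trainer
        from pytorch_lightning.callbacks import ModelCheckpoint

        # native mixed precision instead of apex O2, sidestepping amp_level entirely
        trainer = Trainer(gpus=2, accelerator="ddp", precision=16)

        # 'dirpath' requires pytorch-lightning >= 1.0; older releases used 'filepath'
        checkpoint = ModelCheckpoint(dirpath="logs/ckpts", monitor="val_loss", mode="min")
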
  • attention weights for heatmaps

    Hello,

    Really fantastic work, thank you for the excellent repo.

    I want to apply this to my own dataset and would love to produce attention heatmaps like the ones in the paper. I understand that the Nystromformer has a return_attn argument; however, I am confused by the dimensions it returns. I played around with a toy dataset of 1000 instances and it returned a 1 x nheads x 1280 x 1280 tensor. I am unsure how to take that 1280 x 1280 array and tease out the cls_token attention values.

    Any advice?

    Thanks so much!

    opened by gabemarx 3
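
    One reading of those dimensions that is consistent with the lucidrains nystrom-attention package (an assumption worth checking against your installed version): TransMIL first squares 1000 instances up to 32^2 = 1024 tokens, prepends the class token (1025 total), and the attention module then left-pads the sequence to the next multiple of num_landmarks = 256, giving 1280. Under that assumption, the class-token row can be sliced out like this:

        import numpy as np

        # attn: the [1, heads, 1280, 1280] tensor from NystromAttention(..., return_attn=True)
        n_inst = 1000                              # real patches in the toy bag
        side = int(np.ceil(np.sqrt(n_inst)))       # 32 -> squared bag length 1024
        pad = attn.shape[-1] - (side * side + 1)   # 1280 - 1025 = 255 tokens of left padding
        cls_row = attn[0, :, pad, pad + 1:]        # class-token -> patch attention, [heads, 1024]
        scores = cls_row.mean(0)[:n_inst]          # average heads; drop the duplicated pad patches
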
  • No such file or directory: 'Camelyon16/pt_files/normal_024.pt'

    The directory of the camelyon16 dataset I downloaded is as follows. Where should I download the .pt files?

    CAMELYON16
      -training
         -normal
            -normal_xxx.tif
         -tumor
            -tumor_xxx.tif
         -lesion_annotation
            -tumor_xxx.xml
      -testing
         -images
            -test_xxx.tif
         -evaluation
            -evaluation_masks.zip
            -evaluation_matlab.zip
            -evaluation_python.zip
    
    opened by jieruyao49 2
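
    The .pt files are not part of the CAMELYON16 download: they are per-slide bags of patch features that you extract yourself, e.g. with the CLAM preprocessing pipeline (https://github.com/mahmoodlab/CLAM), which writes exactly this pt_files/ layout. A sketch of what each file is expected to hold (extract_features is a hypothetical stand-in for your own ResNet50 patch embedder):

        import torch

        feats = extract_features("CAMELYON16/training/normal/normal_024.tif")  # hypothetical embedder
        assert feats.ndim == 2 and feats.shape[1] == 1024  # one [N_patches, 1024] tensor per slide
        torch.save(feats, "Camelyon16/pt_files/normal_024.pt")
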
  • Questions about the attention visualization?

    Hello,

    It seems that you haven't mentioned in the paper how the attention over the tokens is computed, and I also can't find the code that computes the heatmap and visualization.

    Can you answer my questions? Looking forward to your reply.

    opened by pzSuen 2
  • Cannot reproduce CAMELYON16 ablations

    Hi all,

    Thanks for your work. Everything looks good, except that I have been trying for some time to reproduce your results, without success. I just discovered the thread https://github.com/szc19990412/TransMIL/issues/4, where others have also failed to reproduce them. More specifically, I have been trying to reproduce the CAMELYON16 TransMIL ablations and the ABMIL benchmark using both your repository and my own repository (written with PyTorch Lightning).

    Some things I would like to know for reproducibility of the experiments:

    • The given CAMELYON16 fold is only one fold, based on a 5:1 train-val split. In your paper you report using 10:1 train-val splits. I would therefore like to know what splits you used and whether they were stratified or not. For reproducibility I would like to request all splits to be uploaded to the repository so people are working with the same data.

    • The CAMELYON16 dataset contains slides sourced from two centers whose scanners used different objective magnifications (https://jamanetwork.com/journals/jama/fullarticle/2665774). If 20x magnification was used for the whole dataset, could you please confirm whether the slides from these two centers were tiled at two different physical resolutions (microns per pixel)? “The whole-slide images were acquired at 2 different centers using 2 different scanners. RUMC images were produced with a digital slide scanner (Pannoramic 250 Flash II; 3DHISTECH) with a 20x objective lens (specimen-level pixel size, 0.243 μm × 0.243 μm). UMCU images were produced using a digital slide scanner (NanoZoomer-XR Digital slide scanner C12000-01; Hamamatsu Photonics) with a 40x objective lens (specimen-level pixel size, 0.226 μm × 0.226 μm).”

    • The model selection/checkpointing details. I see from the code https://github.com/szc19990412/TransMIL/blob/3f6bbe868ac39e7d861a111398b848ba3b943ca8/utils/utils.py#L43-L49, that you used EarlyStopping with patience=10 on val_loss, correct?

    • What optimizer and hyperparameters were used for ABMIL? I assume what the authors (https://arxiv.org/pdf/1802.04712.pdf) report in Table 17 for the histopathology datasets?

    I will later report my scores for TransMIL & ABMIL using the repository and given data fold. Thanks in advance.

    opened by DennisHaijma 2
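
    On the checkpointing question, the linked utils.py lines do appear to set up early stopping on the validation loss; in PyTorch-Lightning terms that amounts to a callback along these lines (patience=10 as read from the linked code; the remaining arguments are library defaults):

        from pytorch_lightning.callbacks import EarlyStopping

        # stop once val_loss has not improved for 10 consecutive validation epochs
        early_stopping = EarlyStopping(monitor="val_loss", patience=10, mode="min", verbose=True)
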
  • Handling large bags during inference

    Hi, I was wondering how large slides were handled during training and inference. Was there any limit on the bag size to prevent OOMs? If so, can you clarify how the predictions from multiple bags were aggregated?

    Thanks in advance.

    opened by harshith2794 1
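
    For what it's worth, the Nystrom attention used here scales linearly with bag size, so whole bags usually fit in memory; if a slide still triggers an OOM, one generic workaround (not the authors' method) is to score fixed-size chunks and average the logits:

        import torch

        @torch.no_grad()
        def predict_in_chunks(model, feats, chunk=8192):
            # feats: [1, N, C] bag of patch features; returns logits averaged over chunks
            logits = [model(feats[:, i:i + chunk]) for i in range(0, feats.shape[1], chunk)]
            return torch.stack(logits).mean(dim=0)
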
  • Is it possible to provide the code for visualization?

    The code worked really well on a new dataset.

    Is it possible to provide the code for visualization? I know you used some code from CLAM, but I am still unclear about how you generate the mappings in detail.

    Thank you so much!

    opened by YunanWu2168 0
  • Computational cost of training per epoch

    I would like to ask how long training takes per epoch. I used your model as built, and I modified the PPEG module by adding an FFT to reduce the cost of the convolution operation. The only issue I noticed was that the Trainer took a long time to finish a single epoch. Is that related to the shape of the input (2154, 1024), or did I miss something?

    opened by deep-matter 5
Related projects

Deep Learning Slide Captcha

Slider-captcha recognition with deep learning. This project uses a deep learning YOLOv3 model to detect the gap in slider captchas, modified from https://github.com/eriklindernoren/PyTorch-YOLOv3. Only a few hundred annotated gap images are needed to train a highly accurate recognition model. Sample recognition results: clone the project and run: git cl

Python3WebSpider 55 Jan 2, 2023
Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation

Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation This paper has been accepted and early accessed

Yun Liu 39 Sep 20, 2022
Makes patches from huge resolution .svs slide files using openslide

openslide_patcher Makes patches from huge resolution .svs slide files using openslide Example collage I made from outputs:

null 2 Dec 23, 2021
Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorch

Transformer in Transformer Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image c

Phil Wang 272 Dec 23, 2022
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

Swin Transformer 1.4k Dec 30, 2022
Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Simple-Image-Classification Simple Image Classification Code (PyTorch) Yechan Kim This repository contains: Python3 / Pytorch code for multi-class ima

Yechan Kim 8 Oct 29, 2022
Image Classification - A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

null 0 Jan 23, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection, CVPR 2021. Installation A Linux pla

Tianning Yuan 269 Dec 21, 2022
Code for Multiple Instance Active Learning for Object Detection, CVPR 2021

MI-AOD Language: 简体中文 | English Introduction This is the code for Multiple Instance Active Learning for Object Detection (The PDF is not available tem

Tianning Yuan 269 Dec 21, 2022
[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You 93 Dec 25, 2022
Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal, multi-exposure and multi-focus image fusion.

U2Fusion Code of U2Fusion: a unified unsupervised image fusion network for multiple image fusion tasks, including multi-modal (VIS-IR, medical), multi

Han Xu 129 Dec 11, 2022
Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

ImageProcessingTransformer Third party Pytorch implement of Image Processing Transformer (Pre-Trained Image Processing Transformer arXiv:2012.00364v2)

null 61 Jan 1, 2023
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Phil Wang 12.6k Jan 9, 2023
An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" in Pytorch.

GLOM An implementation of Geoffrey Hinton's paper "How to represent part-whole hierarchies in a neural network" for MNIST Dataset. To understand this

null 50 Oct 19, 2022
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

GLOM TensorFlow This Python package attempts to implement GLOM in TensorFlow, which allows advances made by several different groups transformers, neu

Rishit Dagli 32 Feb 21, 2022
HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation

HPRNet: Hierarchical Point Regression for Whole-Body Human Pose Estimation Official PyTorch implementation of HPRNet. HPRNet: Hierarchical Point Regre

Nermin Samet 53 Dec 4, 2022
minimizer-space de Bruijn graphs (mdBG) for whole genome assembly

rust-mdbg: Minimizer-space de Bruijn graphs (mdBG) for whole-genome assembly rust-mdbg is an ultra-fast minimizer-space de Bruijn graph (mdBG) impleme

Barış Ekim 148 Dec 1, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022