CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

Overview

CvT2DistilGPT2

Improving Chest X-Ray Report Generation by Leveraging Warm-Starting

  • This repository houses the implementation of CvT2DistilGPT2 from [1].
  • CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.
  • Checkpoints for CvT2DistilGPT2 on MIMIC-CXR and IU X-Ray are available.
  • This implementation could be adapted for any image captioning task by modifying the datamodule.

CvT2DistilGPT2 for MIMIC-CXR. Q, K, and V are the queries, keys, and values, respectively, for multi-head attention. * indicates that the linear layers for Q, K, and V are replaced with the convolutional layers depicted below the multi-head attention module. [BOS] is the beginning-of-sentence special token. N_l is the number of layers for each stage, where N_l=1, N_l=4, and N_l=16 for the first, second, and third stage, respectively. The head for DistilGPT2 is the same head that is used for language modelling. Subwords produced by DistilGPT2 are separated by a vertical bar.

Installation

The required packages are located in requirements.txt. It is recommended that these are installed in a virtualenv:

python3 -m venv --system-site-packages venv
source venv/bin/activate
pip install --upgrade pip
pip install --upgrade -r requirements.txt --no-cache-dir

Datasets

For MIMIC-CXR:

  1. Download MIMIC-CXR-JPG from:

    https://physionet.org/content/mimic-cxr-jpg/2.0.0/
    
  2. Place it in dataset/mimic_cxr_jpg such that the following path exists: dataset/mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files

  3. Download the Chen et al. labels for MIMIC-CXR from:

    https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing
    
  4. Place annotations.json in dataset/mimic_cxr_chen

For IU X-Ray:

  1. Download the Chen et al. labels and the chest X-rays in png format for IU X-Ray from:
    https://drive.google.com/file/d/1c0BXEuDy8Cmm2jfN0YYGkQxFZd2ZIoLg/view
    
  2. Place the files into dataset/iu_x-ray_chen such that both dataset/iu_x-ray_chen/annotations.json and dataset/iu_x-ray_chen/images exist.

Note: the dataset directory can be changed for each task with the variable dataset_dir in the task's paths.yaml, e.g. task/mimic_cxr_jpg_chen/paths.yaml.
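The directory layout described above can be checked before a job is started. The following is a minimal sketch, not part of the repository: the function name missing_paths and the EXPECTED table are hypothetical, and the default root "dataset" is assumed from the paths given in the steps above (pass a different root if dataset_dir has been changed).

```python
from pathlib import Path

# Paths that the dataset instructions expect, relative to the dataset directory.
# This table is an illustration assembled from the README text, not repository code.
EXPECTED = {
    "MIMIC-CXR-JPG images": ["mimic_cxr_jpg/physionet.org/files/mimic-cxr-jpg/2.0.0/files"],
    "MIMIC-CXR labels": ["mimic_cxr_chen/annotations.json"],
    "IU X-Ray": ["iu_x-ray_chen/annotations.json", "iu_x-ray_chen/images"],
}


def missing_paths(dataset_dir="dataset"):
    """Return the expected paths that do not exist under dataset_dir."""
    root = Path(dataset_dir)
    return [
        str(root / relative)
        for relatives in EXPECTED.values()
        for relative in relatives
        if not (root / relative).exists()
    ]


if __name__ == "__main__":
    # Print one line per missing path; prints nothing if the layout is complete.
    for path in missing_paths():
        print(f"missing: {path}")
```

Only the entries for the dataset that is actually used need to be present; the others can be ignored.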

Checkpoints

The checkpoints for MIMIC-CXR and IU X-Ray can be found at https://doi.org/10.25919/hbqx-2p71 (the download link is located at the top right of the page). Place the checkpoints in the experiment directory for each version of each task, e.g., experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/epoch=0-val_chen_cider=0.410965.ckpt

Note: the experiment directory can be changed for each task with the variable exp_dir in the task's paths.yaml, e.g. task/mimic_cxr_jpg_chen/paths.yaml.
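The example checkpoint filename above appears to encode the epoch and a validation score. The sketch below shows one way to read those values back; the pattern is inferred from that single example only, so other checkpoints may be named differently, and parse_checkpoint_name is a hypothetical helper rather than a repository function.

```python
import re

# Pattern inferred from one example filename in the README (an assumption):
#   epoch=<int>-<metric name>=<score>.ckpt
CKPT_PATTERN = re.compile(
    r"epoch=(?P<epoch>\d+)-(?P<metric>\w+)=(?P<score>\d+(?:\.\d+)?)\.ckpt$"
)


def parse_checkpoint_name(path):
    """Return (epoch, metric name, score) parsed from a checkpoint path."""
    match = CKPT_PATTERN.search(path)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {path}")
    return int(match["epoch"]), match["metric"], float(match["score"])


if __name__ == "__main__":
    example = (
        "experiment/mimic_cxr_jpg_chen/cvt_21_to_gpt2_scst/"
        "epoch=0-val_chen_cider=0.410965.ckpt"
    )
    print(parse_checkpoint_name(example))
```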

Instructions

  • The model configurations for each task can be found in its config directory, e.g. task/mimic_cxr_jpg_chen/config.

  • A job for a model is described in the task's jobs.yaml file, e.g. task/mimic_cxr_jpg_chen/jobs.yaml.

  • To test the CvT2DistilGPT2 + SCST checkpoint, set task/mimic_cxr_jpg_chen/jobs.yaml to (default):

    cvt_21_to_distilgpt2_scst:
        train: 0
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    
  • To train CvT2DistilGPT2 with teacher forcing and then test, set task/mimic_cxr_jpg_chen/jobs.yaml to:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
    

    or with Slurm:

    cvt_21_to_distilgpt2:
        train: 1
        test: 1
        debug: 0
        num_nodes: 1
        num_gpus: 1
        num_workers: 5
        resumable: 1
        sbatch: 1
        time_limit: 1-00:00:00
    
  • To run the job:

    python3 main.py --task mimic_cxr_jpg_chen

Note: data from the job will be saved in the experiment directory.
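The time_limit value in the Slurm example above, 1-00:00:00, matches Slurm's days-hours:minutes:seconds time format, i.e. one day. A minimal sketch for sanity-checking such a value before submitting is shown below; parse_slurm_time is a hypothetical helper that handles only this one format, not the other formats Slurm accepts.

```python
from datetime import timedelta


def parse_slurm_time(value):
    """Convert a 'days-hours:minutes:seconds' string into a timedelta.

    Only this single format is handled (an assumption made for brevity).
    """
    days, _, clock = value.partition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    limit = parse_slurm_time("1-00:00:00")
    print(limit.total_seconds())
```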

Reference

[1] Aaron Nicolson, Jason Dowling, and Bevan Koopman, Improving Chest X-Ray Report Generation by Leveraging Warm-Starting, Under review (January 2022)

Comments
  • CvT-21-384x384-IN-22k.pth not available from mirco modelzoo.

    @johngrimes @lawley @jimsteel @bevankoopman @BauerLab

    Hi, the following pth is not available now. Can you share this model?
    Thank you very much.

    CvT-21-384x384-IN-22k.pth

    opened by congjianting 3
  • MIMIC-CXR labels not available

    Hi,

    I have access to MIMIC-CXR, but I could not access the labels at the Google Drive link. I contacted Chen, the original author, but did not get a reply. Could you send me a copy of the labels?

    https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing

    opened by 2533245542 8
  • 'GPT2Decoder' object has no attribute 'decoder'

    Hi Aaron, thank you for your great work. I tried to train on the IU X-Ray data, but the program returns this error:

    AttributeError: 'GPT2Decoder' object has no attribute 'decoder'
      File "/media/data/cvt2distilgpt2/transmodal/utils.py", line 234, in _getattr
        return getattr(obj, attr, *args)
      File "/media/data/cvt2distilgpt2/transmodal/utils.py", line 235, in rgetattr
        return functools.reduce(_getattr, [obj] + attr.split('.'))
      File "/media/data/cvt2distilgpt2/transmodal/model.py", line 684, in configure_optimizers
        named_params = rgetattr(self, k).named_parameters()
      File "/media/data/lcvt2distilgpt2/main.py", line 48, in objective
        trainer.fit(transmodal, datamodule=dataset)
      File "/media/data/cvt2distilgpt2/main.py", line 65, in main
        objective(config)
      File "/media/data/cvt2distilgpt2/main.py", line 175, in <module>
        main(clargs)

    While debugging, I found that GPT2Decoder only has a similarly named attribute, 'encoder_decoder'. I suspect the problem is caused by the Hugging Face checkpoint version not matching your code, because the same error also occurs in predict mode.

    I used your latest repository and installed all needed packages with your pip requirements.txt.

    • transformers==4.15.0
    • CvT2DistilGPT2 checkpoint downloaded from your website
    • DistilGPT2 checkpoint downloaded from Hugging Face
    • CvT-21 checkpoint downloaded from Microsoft
    • Bert-base-uncased checkpoint downloaded from Hugging Face
    • CheXbert downloaded from the stanfordmlgroup repository

    opened by jinghaoliu 5
  • 'type' object is not subscriptable

    After I ran "!python3 main.py --task mimic_cxr_jpg_chen", I got the following error:

    warnings.warn(f"Workstation configuration for {socket.gethostname()} does not exist. Using default "

    • CUDA:
      • GPU:
        • A100-SXM4-40GB
      • available: True
      • version: 11.3
    • Packages:
      • numpy: 1.21.6
      • pyTorch_debug: False
      • pyTorch_version: 1.12.0+cu113
      • pytorch-lightning: 1.5.10
      • tqdm: 4.64.0
    • System:
      • OS: Linux
      • architecture:
        • 64bit
      • processor: x86_64
      • python: 3.7.13
      • version: #1 SMP Sun Apr 24 10:03:06 PDT 2022

    Traceback (most recent call last):
      File "main.py", line 214, in <module>
        main(clargs)
      File "main.py", line 58, in main
        config = get_config(clargs)
      File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 54, in get_config
        config = load_config(clargs)
      File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/config.py", line 26, in load_config
        config = getattr(importlib.import_module(module), "config")()
      File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/content/drive/MyDrive/cvt2distilgpt2/task/mimic_cxr_jpg_chen/config/cvt_21_to_distilgpt2_scst.py", line 1, in <module>
        from config.cvt_21_to_distilgpt2_chexbert import config as external_config
      File "/content/drive/MyDrive/cvt2distilgpt2/config/cvt_21_to_distilgpt2_chexbert.py", line 1, in <module>
        from config.cvt_21_to_distilgpt2 import config as external_config
      File "/content/drive/MyDrive/cvt2distilgpt2/config/cvt_21_to_distilgpt2.py", line 1, in <module>
        from transmodal.network.cvt import spatial_position_feature_size
      File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/network/cvt.py", line 27, in <module>
        class CvT(Module):
      File "/content/drive/MyDrive/cvt2distilgpt2/transmodal/network/cvt.py", line 77, in CvT
        def forward(self, images: torch.FloatTensor) -> Union[dict[str, Tensor], dict[str, Union[Tensor, Any]]]:

    TypeError: 'type' object is not subscriptable

    Please guide me through this. Thank you.

    opened by jainnipun11 9
  • CE Metrics

    Thanks for sharing the code.

    I was trying to evaluate your model with the new labeler tool VisualCheXbert. However, I am unable to find your code for evaluating the CE metrics. Could you please provide the code you used to evaluate the CE metrics?

    Thanks.

    opened by NatthananR 2
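Regarding the "'type' object is not subscriptable" report above: the environment listing shows Python 3.7.13, and the failing line subscripts the built-in dict in a return annotation (dict[str, Tensor]). Subscripting built-in collection types is only supported from Python 3.9 onwards, so this is a likely cause. The sketch below illustrates the difference using typing.Dict, which also works on Python 3.7. The Tensor class is a stand-in so that the example runs without PyTorch, and the "last_hidden_state" key is hypothetical; neither is taken from the repository.

```python
import sys
from typing import Any, Dict, Union


class Tensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch."""


# typing.Dict can be subscripted on Python 3.7; the built-in dict cannot.
Output = Union[Dict[str, Tensor], Dict[str, Union[Tensor, Any]]]


def forward(images: Tensor) -> Output:
    # The key name is hypothetical and only serves the illustration.
    return {"last_hidden_state": images}


def builtin_dict_is_subscriptable() -> bool:
    """Report whether dict[...] can be evaluated on the running interpreter."""
    try:
        dict[str, Tensor]
        return True
    except TypeError:
        return False


if __name__ == "__main__":
    print(sys.version_info[:2], builtin_dict_is_subscriptable())
```

Under this assumption, running the repository with Python 3.9 or later should avoid the error without code changes. Adding from __future__ import annotations at the top of the affected module is another possible workaround on Python 3.7, since it defers the evaluation of annotations.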