Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation

Reference code for the paper Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation.

Implemented Models

Supported Radiology Report Datasets

Radiology NLI Dataset

The Radiology NLI dataset (RadNLI) is available at a corresponding PhysioNet project.

Prerequisites

  • A Linux OS (tested on Ubuntu 16.04)
  • Memory over 24GB
  • A GPU with memory over 12GB (tested on NVIDIA Titan X and NVIDIA Titan XP)

Preprocessing

Python Setup

Create a conda environment

$ conda env create -f environment.yml

NOTE: environment.yml is set up for CUDA 10.1 and cuDNN 7.6.3. These versions may need to be changed to match your runtime environment.

Resize MIMIC-CXR-JPG

  1. Download MIMIC-CXR-JPG
  2. Make a resized copy of MIMIC-CXR-JPG using resize_mimic-cxr-jpg.py (MIMIC_CXR_ROOT is a dataset directory containing mimic-cxr)
    • $ python resize_mimic-cxr-jpg.py MIMIC_CXR_ROOT
  3. Create the sections file of MIMIC-CXR (mimic_cxr_sectioned.csv.gz) with create_sections_file.py
  4. Move mimic_cxr_sectioned.csv.gz to MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/
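
Before moving on, it can help to confirm that the sections file ended up where step 4 places it. The following is a minimal sketch and not part of the repository: `sections_path` and `check_layout` are hypothetical helper names, and the only path assumed is the one named in step 4.

```python
from pathlib import Path


def sections_path(mimic_cxr_root):
    """Return the location that step 4 describes for the sections file."""
    return (Path(mimic_cxr_root) / "mimic-cxr-resized" / "2.0.0"
            / "mimic_cxr_sectioned.csv.gz")


def check_layout(mimic_cxr_root):
    """Return a list of problems; an empty list means the file is in place."""
    problems = []
    path = sections_path(mimic_cxr_root)
    if not path.parent.is_dir():
        # The resized copy has not been created (or lives somewhere else).
        problems.append(f"missing directory: {path.parent}")
    elif not path.is_file():
        # The resized copy exists but the sections file was not moved there.
        problems.append(f"missing sections file: {path}")
    return problems
```

Calling `check_layout("MIMIC_CXR_ROOT")` with the real dataset directory should return an empty list once steps 1 to 4 are complete.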

Compute Document Frequencies

Pre-calculate document frequencies that will be used in CIDEr by:

$ python cider-df.py MIMIC_CXR_ROOT mimic-cxr_train-df.bin.gz
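
For readers unfamiliar with the term, a document frequency here is the number of reference reports in which a given n-gram occurs at least once; CIDEr uses these counts for TF-IDF weighting. The sketch below illustrates the idea only. It is not the implementation in cider-df.py: the whitespace tokenization, the function names, and the example reports are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, max_n=4):
    """Yield every n-gram (as a tuple) for n = 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def document_frequencies(reports, max_n=4):
    """Count, for each n-gram, how many reports contain it at least once."""
    df = Counter()
    for report in reports:
        tokens = report.lower().split()  # assumption: simple whitespace tokens
        df.update(set(ngrams(tokens, max_n)))  # set() counts each n-gram once per report
    return df


# Made-up example reports, not taken from MIMIC-CXR.
reports = [
    "no acute cardiopulmonary process",
    "no pleural effusion",
    "small left pleural effusion",
]
df = document_frequencies(reports)
print(df[("no",)])                  # 2
print(df[("pleural", "effusion")])  # 2
```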

Recognize Named Entities

Pre-recognize named entities in MIMIC-CXR by:

$ python ner_reports.py --stanza-download MIMIC_CXR_ROOT mimic-cxr_ner.txt.gz
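
The pre-recognized entities are later passed to training through --entity-match. As a rough illustration of what an exact entity match between a reference report and a generated report could look like, here is a minimal sketch. It is an assumption-laden toy, not the repository's EntityMatchExact metric: `entity_overlap` is a hypothetical name, entities are treated as plain strings, and the example entity lists are made up.

```python
def entity_overlap(ref_entities, gen_entities):
    """Precision, recall and F1 of exact string overlap between two entity lists."""
    ref, gen = set(ref_entities), set(gen_entities)
    matched = ref & gen
    precision = len(matched) / len(gen) if gen else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up entities: one of two generated entities also appears in the reference.
p, r, f1 = entity_overlap(
    ["effusion", "pneumothorax", "cardiomegaly"],
    ["effusion", "edema"],
)
```

In this toy example the precision is 1/2 and the recall is 1/3, since only "effusion" is shared.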

Download Pre-trained Weights

Download pre-trained CheXpert weights, pre-trained radiology NLI weights, and GloVe embeddings

$ cd resources
$ ./download.sh

Training a Report Generation Model

First, train the Meshed-Memory Transformer model with an NLL loss.

# NLL
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --corpus mimic-cxr --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll

Second, further train the model with a joint loss using self-critical RL to achieve better performance.

# RL with NLL + BERTScore + EntityMatchExact
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchExact --rl-weights 0.01,0.495,0.495 --entity-match mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emexact
# RL with NLL + BERTScore + EntityMatchNLI
$ python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --rl-epoch 1 --rl-metrics BERTScore,EntityMatchNLI --rl-weights 0.01,0.495,0.495 --entity-match mimic-cxr_ner.txt.gz --baseline-model out_m2trans_nll/model_31-152173.dict.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --lr 5e-6 --lr-step 32 MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll-bs-emnli
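
To make the weighting concrete, the sketch below shows one way a joint loss of this kind can be combined. It is an illustration under stated assumptions, not the code in train.py: it assumes the first value of --rl-weights scales the NLL term and the remaining values scale one self-critical term per entry in --rl-metrics (an inference from the "NLL + BERTScore + EntityMatch" labels above), and `joint_loss` and all numbers are hypothetical.

```python
def joint_loss(nll, sample_logprob, sample_scores, greedy_scores, weights):
    """Combine an NLL term with one self-critical RL term per metric.

    weights[0] scales the NLL term; weights[1:] scale the RL terms.
    Each RL term is -(r_sample - r_greedy) * log p(sample): the sampled
    report is encouraged only when it scores above the greedy baseline.
    """
    if len(sample_scores) != len(greedy_scores):
        raise ValueError("need one greedy score per sampled score")
    if len(weights) != len(sample_scores) + 1:
        raise ValueError("need one NLL weight plus one weight per metric")
    loss = weights[0] * nll
    for w, r_sample, r_greedy in zip(weights[1:], sample_scores, greedy_scores):
        advantage = r_sample - r_greedy
        loss += w * -advantage * sample_logprob
    return loss


# Hypothetical values for a single report.
loss = joint_loss(
    nll=2.0,
    sample_logprob=-10.0,          # log-probability of the sampled report
    sample_scores=[0.60, 0.50],    # e.g. BERTScore, entity match of the sample
    greedy_scores=[0.55, 0.40],    # same metrics for the greedy-decoded report
    weights=[0.01, 0.495, 0.495],  # values taken from --rl-weights above
)
```

With these weights the NLL term contributes very little, so the metric rewards dominate the update.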

Checking Result with TensorBoard

Training results can be inspected with TensorBoard.

$ tensorboard --logdir out_m2trans_nll-bs-emnli/log
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.0.0 at http://localhost:6006/ (Press CTRL+C to quit)

Evaluation using CheXbert

NOTE: This evaluation assumes that CheXbert is set up in ./CheXbert.

First, extract reference reports to a csv file.

$ python extract_reports.py MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/mimic_cxr_sectioned.csv.gz MIMIC_CXR_ROOT/mimic-cxr-resized/2.0.0/mimic-cxr-2.0.0-split.csv.gz mimic-imp
$ mv mimic-imp CheXbert/src/

Second, convert generated reports to a csv file. (TEST_SAMPLES is the path to the test samples, e.g., out_m2trans_nll-bs-emnli/test_31-152173_samples.txt.gz)

$ python convert_generated.py TEST_SAMPLES gen.csv
$ mv gen.csv CheXbert/src/

Third, run CheXbert against the reference reports.

$ cd CheXbert/src/
$ python label.py -d mimic-imp/reports.csv -o mimic-imp -c chexbert.pth

Fourth, run eval_prf.py to obtain CheXbert scores.

$ cp ../../eval_prf.py . 
$ python eval_prf.py mimic-imp gen.csv gen_chex.csv
2947 references
2347 generated
...
5-micro x.xxx x.xxx x.xxx
5-acc x.xxx
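
Rows such as 5-micro presumably report micro-averaged precision, recall, and F1 over a set of labels. The sketch below shows how micro and macro averages differ for binary multi-label predictions. It is not the logic of eval_prf.py: the function names, the macro convention (mean of per-label scores), and the tiny label matrices are assumptions for illustration.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_macro(refs, hyps):
    """refs, hyps: lists of equal-length 0/1 label rows, one row per report."""
    n_labels = len(refs[0])
    counts = [[0, 0, 0] for _ in range(n_labels)]  # tp, fp, fn per label
    for ref, hyp in zip(refs, hyps):
        for j, (r, h) in enumerate(zip(ref, hyp)):
            if h and r:
                counts[j][0] += 1
            elif h and not r:
                counts[j][1] += 1
            elif r and not h:
                counts[j][2] += 1
    # Micro: pool the counts over all labels, then compute the scores once.
    micro = prf(*[sum(c[k] for c in counts) for k in range(3)])
    # Macro: compute the scores per label, then take the unweighted mean.
    per_label = [prf(*c) for c in counts]
    macro = tuple(sum(s[k] for s in per_label) / n_labels for k in range(3))
    return micro, macro


# Made-up labels for three reports and two findings.
refs = [[1, 0], [1, 0], [0, 1]]
hyps = [[1, 0], [1, 1], [0, 0]]
micro, macro = micro_macro(refs, hyps)
```

In this toy case the frequent first label is predicted perfectly and the rare second label is always wrong, so the micro scores (2/3 each) come out higher than the macro scores (1/2 each).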

Inferring from a Checkpoint

Inference from a checkpoint can be run with infer.py. (CHECKPOINT is the path to the checkpoint)

$ python infer.py --cuda --corpus mimic-cxr --cache-data cache --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --corpus mimic-cxr --lr-scheduler trans MIMIC_CXR_ROOT CHECKPOINT resources/glove_mimic-cxr_train.512.txt.gz out_infer

Pre-trained checkpoints for M2 Transformer can be obtained with a download script.

$ cd checkpoints
$ ./download.sh

Licence

See LICENSE and clinicgen/external/LICENSE_bleu-cider-rouge-spice for details.

Comments
  • inference

    Hello, and thank you for sharing your work! I was wondering how I can run inference with the pretrained model checkpoint you provide. Thank you very much in advance!

    opened by luantunez 5
  • about CheXpert&NegBio

    Hello ysmiura, thanks for releasing your source code; it is very nice work. While running the code, I ran into a few questions, mainly about data preprocessing. I notice that the sections file created by create_sections_file.py contains IMPRESSIONS instead of FINDINGS; is that right? Should I merge the mimic_cxr_number_labeled.csv files into one csv file and compress it to mimic-cxr-2.0.0-chexpert.csv.gz? And is the negbio code (https://github.com/MIT-LCP/mimic-cxr/tree/master/txt/negbio) in the ifcc code necessary?

    My other question: I have applied for the MIMIC-CXR license, but I have not found mimic-cxr-2.0.0-metadata.csv.gz and mimic-cxr-2.0.0-split.csv.gz. Could you please tell me where I can find them?

    Thank you very much, and best wishes.

    opened by newbietuan 2
  • Segmentation fault (core dumped)

    Thanks for your great work. While following the tutorial to reproduce the results, I ran into a problem. I would appreciate it if you could tell me how to fix this issue.

    Step: python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained chexpert_densenet/model_auc14.dict.gz --bert-score distilbert-base-uncased --corpus mimic-cxr --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll

    Return: Segmentation fault (core dumped)

    opened by tron19920125 1
  • about training NLL

    Thanks for your great work. When I run

    python train.py --cuda --corpus mimic-cxr --cache-data cache --epochs 32 --batch-size 24 --cider-df mimic-cxr_train-df.bin.gz --entity-match mimic-cxr_ner.txt.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz --cider-df mimic-cxr_train-df.bin.gz --bert-score distilbert-base-uncased --corpus mimic-cxr --lr-scheduler trans MIMIC_CXR_ROOT resources/glove_mimic-cxr_train.512.txt.gz out_m2trans_nll

    I get the following message:

    Unexpected exception: Traceback (most recent call last):
      File "Train.py", line 165, in main
        save=args.log_models, progress=True)
      File "/home/mayt/ifcc-master/clinicgen/log.py", line 25, in log_datasets
        logger.evaluator.setup()
      File "/home/mayt/ifcc-master/clinicgen/eval.py", line 752, in setup
        model = SimpleNLI.load_model(model)
      File "home/mayt/ifcc-master/clinicgen/nli.py", line 599, in load_model
        bertnli.load_state_dict(states_dict)
      ...
    RuntimeError: Error(s) in loading state_dict for BERTNLI:
      Missing key(s) in state_dict: "bert.embeddings.position_ids".

    Because of network problems, I loaded bert-base-uncased and distilbert-base-uncased locally, so the line numbers above may differ slightly. (I had already run ./download.sh.)

    opened by newbietuan 0
  • Clinical metrics Macro vs. Micro average?

    Hi, in the article, the clinical accuracy metrics (CheXbert labeler F1, P, R) are reported as micro averages. Most papers in the area report the macro average, which makes this method difficult to compare. Could you provide the macro-averaged version of these metrics (Table 2)?

    Thanks

    opened by denisparra 2
  • How to run all the trained models to get metrics for comparison?

    Hi, I am trying to run all the trained models on MIMIC-CXR data to get the BLEU and ROUGE metrics. How do I get them? Do I have to get the predictions of the trained models and compute the metrics separately? Your help is much appreciated.

    opened by aj280192 0
  • error in the inference.py when using it on the open-i dataset

    When trying to use the inference.py with the following parameters:

    !python infer.py --cuda --corpus open-i --stanza-download --splits /content/ifcc/meta.txt --cache-data cache --batch-size 24 --entity-match /content/open-i_ner.txt.gz --cider-df /content/drive/MyDrive/open-i_train-df.bin.gz --img-model densenet --img-pretrained resources/chexpert_auc14.dict.gz /content/open-i /content/ifcc/checkpoints/checkpoint_nll-bs-emnli.dict.gz resources/glove_mimic-cxr_train.512.txt.gz out_infer

    we get the following error:

    Unexpected exception: Traceback (most recent call last):
      File "infer.py", line 88, in main
        EpochLog.log_datasets(logger, pbar_vals, 0, 0, None, val_loader, test_loader, save=False, progress=True)
      File "/content/ifcc/clinicgen/log.py", line 30, in log_datasets
        results[split] = logger.evaluator.generate_and_eval(data_loader, prog_name)
      File "/content/ifcc/clinicgen/eval.py", line 668, in generate_and_eval
        scores, scores_detailed = self.eval(report_ids, refs, hypos, tfidf_vectorizer)
      File "/content/ifcc/clinicgen/eval.py", line 559, in eval
        mse, sde, msn, sdn = self.entity_matcher.score(ref_ids, hypos_l)
      File "/content/ifcc/clinicgen/eval.py", line 156, in score
        _, _, _, stats = self.nli.sentence_scores_bert_score(texts1, texts2, label='all', prf=self.prf)
      File "/content/ifcc/clinicgen/nli.py", line 221, in sentence_scores_bert_score
        _, _, bf = self.bert_score_model.score(bsents1, bsents2)
      File "/content/ifcc/clinicgen/nli.py", line 535, in score
        device=self.device, batch_size=self.batch_size, all_layers=self.all_layers).cpu()
      File "/usr/local/lib/python3.7/dist-packages/bert_score/utils.py", line 520, in bert_cos_score_idf
        sen_batch, model, tokenizer, idf_dict, device=device, all_layers=all_layers
      File "/usr/local/lib/python3.7/dist-packages/bert_score/utils.py", line 399, in get_bert_embedding
        model, padded_sens[i : i + batch_size], attention_mask=mask[i : i + batch_size], all_layers=all_layers,
      File "/usr/local/lib/python3.7/dist-packages/bert_score/utils.py", line 309, in bert_encode
        out = model(x, attention_mask=attention_mask, output_hidden_states=all_layers)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
    TypeError: forward() got an unexpected keyword argument 'output_hidden_states'

    opened by ma5n1sh 1