Behavioral Testing of Clinical NLP Models

Betty van Aken

Last update: Sep 20, 2022

Overview

This repository contains code for testing the behavior of clinical prediction models based on patient letters. For a detailed description of the testing framework, see our paper "What Do You See in this Patient? Behavioral Testing of Clinical NLP Models".
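To make the idea concrete, here is a minimal conceptual sketch of a behavioral test on a patient characteristic: a gender mention in a letter is swapped and the model's predictions before and after the swap are compared. This is not the repository's code. The replacement table, the function names `shift_gender` and `prediction_change`, and the `toy_model` with its made-up scores are all assumptions for illustration only.

```python
import re
from typing import Callable, Dict

# Hypothetical replacement table; the repository's own shift rules
# are not shown in this README.
FEMALE_TO_MALE = {"she": "he", "woman": "man", "female": "male"}


def shift_gender(letter: str, mapping: Dict[str, str]) -> str:
    """Replace whole-word gender mentions, keeping initial capitalisation."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b", re.IGNORECASE
    )

    def swap(match: re.Match) -> str:
        word = match.group(0)
        new = mapping[word.lower()]
        return new.capitalize() if word[0].isupper() else new

    return pattern.sub(swap, letter)


def prediction_change(
    predict: Callable[[str], Dict[str, float]], original: str, shifted: str
) -> Dict[str, float]:
    """Per-label difference between predictions on shifted and original text."""
    before = predict(original)
    after = predict(shifted)
    return {label: after[label] - before[label] for label in before}


def toy_model(letter: str) -> Dict[str, float]:
    """Stand-in model with invented scores that depend on the word 'female'."""
    score = 0.7 if "female" in letter.lower() else 0.4
    return {"osteoporosis": score}


original = "A 67 year old female presents with back pain. She reports no trauma."
shifted = shift_gender(original, FEMALE_TO_MALE)
delta = prediction_change(toy_model, original, shifted)
print(shifted)
print(delta)
```

A non-zero entry in `delta` means the prediction for that label moved when only the gender mention changed, which is the kind of behavior the framework is meant to surface.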

Usage

Install requirements: pip install -r requirements.txt

Run main.py, e.g. for a diagnosis prediction test on gender, age, and ethnicity:

python main.py \
    --test_set_path ./path_to_test_set \
    --model_path bvanaken/CORe-clinical-diagnosis-prediction \
    --task diagnosis \
    --shift_keys gender,age,ethnicity \
    --save_dir ./results \
    --gpu False

Parameter	Description
test_set_path	Path to original test set file
model_path	Path to a local model or a Hugging Face model hub checkpoint
task	Current options: diagnosis, mortality
shift_keys	Which patient characteristics to test. Current options: age, gender, ethnicity, weight, intersectional (gender + ethnicity)
save_dir	Directory to save results, default: "./results"
gpu	Whether to use a GPU during inference, default: False
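As an illustration of how these parameters fit together, the following is a minimal sketch of a command-line parser that accepts the options from the table. It is not taken from main.py: the function names `build_parser`, `parse_shift_keys` and `str_to_bool`, and the choice of which flags are required, are assumptions. The explicit `str_to_bool` helper is used because a value such as `False` arrives as a non-empty string, which plain `bool()` would treat as true.

```python
import argparse
from typing import List

# Options listed in the parameter table above.
ALLOWED_SHIFT_KEYS = {"age", "gender", "ethnicity", "weight", "intersectional"}


def str_to_bool(value: str) -> bool:
    """Parse 'True'/'False'-style strings into a real boolean."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_shift_keys(value: str) -> List[str]:
    """Split a comma-separated list and reject unknown characteristics."""
    keys = [key.strip() for key in value.split(",") if key.strip()]
    unknown = [key for key in keys if key not in ALLOWED_SHIFT_KEYS]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown shift keys: {unknown}")
    return keys


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(description="Behavioral testing sketch")
    parser.add_argument("--test_set_path", required=True)
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--task", choices=["diagnosis", "mortality"], required=True)
    parser.add_argument("--shift_keys", type=parse_shift_keys, required=True)
    parser.add_argument("--save_dir", default="./results")
    parser.add_argument("--gpu", type=str_to_bool, default=False)
    return parser


args = build_parser().parse_args([
    "--test_set_path", "./path_to_test_set",
    "--model_path", "bvanaken/CORe-clinical-diagnosis-prediction",
    "--task", "diagnosis",
    "--shift_keys", "gender,age,ethnicity",
    "--gpu", "False",
])
print(args)
```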

Using Non-Transformer models

The framework currently focuses on testing Transformer-based models. However, it is easy to extend it to any other prediction model. To do so, simply create a new class implementing the Predictor interface and add it to the TASK_MAP in main.py.
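The extension step described above might look roughly like the sketch below. The actual `Predictor` interface and the structure of `TASK_MAP` are not shown in this README, so the abstract class, its `predict_batch` method, the key `"mortality_keyword"`, and the keyword-based model are all hypothetical stand-ins. Check the definitions in the repository before adapting it.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class Predictor(ABC):
    """Hypothetical stand-in for the repository's Predictor interface."""

    @abstractmethod
    def predict_batch(self, texts: List[str]) -> List[Dict[str, float]]:
        """Return one label-to-score mapping per input letter."""


class KeywordMortalityPredictor(Predictor):
    """Toy non-Transformer model: scores letters by counting risk terms."""

    RISK_TERMS = ("sepsis", "cardiac arrest", "intubated")

    def predict_batch(self, texts: List[str]) -> List[Dict[str, float]]:
        results = []
        for text in texts:
            lowered = text.lower()
            hits = sum(term in lowered for term in self.RISK_TERMS)
            results.append({"mortality": hits / len(self.RISK_TERMS)})
        return results


# Assumed shape of the registry: task name mapped to a predictor class.
TASK_MAP: Dict[str, Type[Predictor]] = {
    "mortality_keyword": KeywordMortalityPredictor,
}

predictor = TASK_MAP["mortality_keyword"]()
scores = predictor.predict_batch(
    ["Patient intubated after cardiac arrest.", "Routine follow-up visit."]
)
print(scores)
```

Keeping the model behind a single prediction method means the shifted test sets can be fed to any model type, whether it is rule-based, a classical classifier, or a Transformer.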

Cite

@inproceedings{vanAken2021,
  author    = {Betty van Aken and
               Sebastian Herrmann and
               Alexander Löser},
  title     = {What Do You See in this Patient? Behavioral Testing of Clinical NLP Models},
  booktitle = {Bridging the Gap: From Machine Learning Research to Clinical Practice, 
               Research2Clinics Workshop @ NeurIPS 2021},
  year      = {2021}
}
