Behavioral Testing of Clinical NLP Models
This repository contains code for testing the behavior of clinical prediction models based on patient letters. For a detailed description of the testing framework see our paper What Do You See in this Patient? Behavioral Testing of Clinical NLP Models.
Usage
Install requirements: pip install -r requirements.txt
Run main.py, e.g. for diagnosis prediction test on gender, age and ethnicity:
python main.py
--test_set_path ./path_to_test_set
--model_path bvanaken/CORe-clinical-diagnosis-prediction
--task diagnosis
--shift_keys gender,age,ethnicity
--save_dir ./results
--gpu False
Parameter | Description |
---|---|
test_set_path | Path to original test set file |
model_path | Path to model or Huggingface model hub checkpoint |
task | Current options: diagnosis, mortality |
shift_keys | Which patient characteristics to test. Current options: age, gender, ethnicity, weight, intersectional (gender + ethnicity) |
save_dir | Directory to save results, default: "./results" |
gpu | Whether to use a gpu during inference or not, default: False |
Using Non-Transformer models
The framework currently focuses on testing Transformer-based models. However, it is easy to extend it to any other prediction model. To do so, simply create a new class implementing the Predictor interface and add it to the TASK_MAP in main.py.
Cite
@inproceedings{vanAken2021,
author = {Betty van Aken and
Sebastian Herrmann and
Alexander Löser},
title = {What Do You See in this Patient? Behavioral Testing of Clinical NLP Models},
booktitle = {Bridging the Gap: From Machine Learning Research to Clinical Practice,
Research2Clinics Workshop @ NeurIPS 2021},
year = {2021}
}