Toward Model Interpretability in Medical NLP
LING380: Topics in Computational Linguistics Final Project
James Cross ([email protected]) and Daniel Kim ([email protected]), December 2021
Code Organization
- `data`: contains the medical report data [LINK TO THAT REPO] used in model fine-tuning and analysis, clinical stop words, and accuracy and entropy metrics saved during evaluation
- `models`: checkpoints of the best-performing BERT and BioBERT models after hyperparameter optimization (a loading sketch follows this list)
- `notebooks`:
  - `model_training.ipynb`: code to train and fine-tune BERT and BioBERT
  - `model_evaluation.ipynb`: code to run various model evaluations, visualize word importances, perform post-training clinical stop-word masking, and carry out other analyses
- `scripts`: the same functionality as the notebooks, as executable Python scripts/functions
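Assuming the checkpoints under `models` were saved with Hugging Face's `save_pretrained`, loading them looks roughly like the sketch below. The directory names `models/bert-best` and `models/biobert-best` are illustrative, not the repo's actual paths.

```python
# Minimal loading sketch; the checkpoint directory names are hypothetical.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Fine-tuned BERT checkpoint (assumed saved via save_pretrained)
bert = AutoModelForSequenceClassification.from_pretrained("models/bert-best")
bert_tokenizer = AutoTokenizer.from_pretrained("models/bert-best")

# Fine-tuned BioBERT checkpoint; the BioBERT base weights live on the
# Hugging Face Hub (e.g., dmis-lab/biobert-v1.1)
biobert = AutoModelForSequenceClassification.from_pretrained("models/biobert-best")
biobert_tokenizer = AutoTokenizer.from_pretrained("models/biobert-best")
```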
Dependencies
All packages needed to run the code are available in the default Google Colab environment (see the Colab documentation for the full list), with two exceptions: Hugging Face `transformers`, used for loading the transformer models, and `captum` (captum.ai), which provides a variety of model interpretation tools.
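For a taste of what captum provides, here is a minimal sketch of token-level attribution with `LayerIntegratedGradients` on a BERT classifier. The example sentence, label count, and target class are made up for illustration, and the notebooks' actual attribution setup may differ.

```python
# A minimal sketch, not the notebooks' actual code.
# pip install transformers captum
import torch
from captum.attr import LayerIntegratedGradients
from transformers import BertForSequenceClassification, BertTokenizerFast

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

def forward_func(input_ids, attention_mask):
    # Return class logits so attributions are taken w.r.t. a target class
    return model(input_ids, attention_mask=attention_mask).logits

# Attribute predictions to the embedding layer, then reduce to per-token scores
lig = LayerIntegratedGradients(forward_func, model.bert.embeddings)

text = "Patient presents with acute chest pain."  # made-up example sentence
enc = tokenizer(text, return_tensors="pt")
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)  # all-[PAD] reference

attributions, delta = lig.attribute(
    inputs=enc["input_ids"],
    baselines=baseline,
    additional_forward_args=(enc["attention_mask"],),
    target=1,  # hypothetical "positive" class
    return_convergence_delta=True,
)
scores = attributions.sum(dim=-1).squeeze(0)  # one score per token
scores = scores / torch.norm(scores)          # normalize for readability
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
for token, score in zip(tokens, scores.tolist()):
    print(f"{token:>12}  {score:+.3f}")
```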
How to run the code
There are two options for running the code: on Google Colab or locally on your machine.
Option 1) Google Colab
Model training notebook: [https://colab.research.google.com/drive/1uPIi-OVchs_8A-SNcQtLfwelr0ccsz19?usp=sharing]
Model evaluation/analysis notebook: [https://colab.research.google.com/drive/1Hfy58JvyPbx55lKKhQAzzrhJIbN_Io0j?usp=sharing]
Option 2) Local Machine
Notebooks: you can run the `model_training.ipynb` or `model_evaluation.ipynb` notebooks as is (after installing the dependencies listed above), changing directory paths where needed; a hypothetical path setup is sketched below.
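When running locally, the paths at the top of each notebook need to point at your clone of the repo. The variable names below are illustrative, not the notebooks' actual ones.

```python
# Hypothetical local path configuration; adjust to your clone location.
from pathlib import Path

REPO_ROOT = Path("/path/to/repo")   # edit this
DATA_DIR = REPO_ROOT / "data"       # medical reports, stop words, metrics
MODELS_DIR = REPO_ROOT / "models"   # saved BERT / BioBERT checkpoints
```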