Code for the modular summarization work published at ACL 2021 by Krishna et al.

Overview

This repository contains the code for running the modular summarization pipelines described in the publication:
Krishna K, Khosla K, Bigham J, Lipton ZC. "Generating SOAP Notes from Doctor-Patient Conversations." ACL 2021.

Instructions

Although we cannot release models trained on the confidential medical data, we have released models trained on the publicly available AMI dataset.
To reproduce the results on the AMI dataset, follow the steps listed below. For convenience, we have also created a Google Colab notebook here that runs these steps on Google's servers (free of cost as of June 2021) and produces the summaries and their ROUGE scores.

Step 1: Set up the environment by installing the packages listed in requirements.txt using pip.

Step 2: Download the ami_models folder from this link and put it at the root of the repository.

Step 3: Run the following three commands to prepare the data, run the summary generation pipelines, and show the resulting ROUGE scores.

# command1: downloads and preprocesses the AMI dataset
./prepare_data.sh

# command2: runs the summarization pipelines on the data and computes ROUGE scores
# (before running this command, you need to download the models as shown above)
./predict_ami.sh

# command3: prints the results
python show_results.py
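
The steps above can also be chained from a single Python script that first checks the prerequisites. The following is a minimal sketch and not part of the repository: the names `missing_prerequisites` and `run_pipeline` are hypothetical, and it assumes that the two `.sh` files are bash scripts and that `ami_models` has already been downloaded to the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Files and folders that the steps above expect at the repository root.
REQUIRED_FILES = ["requirements.txt", "prepare_data.sh", "predict_ami.sh", "show_results.py"]
REQUIRED_DIRS = ["ami_models"]


def missing_prerequisites(repo_root):
    """Return the names of expected files and folders that are absent under repo_root."""
    root = Path(repo_root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


def run_pipeline(repo_root):
    """Run Step 1 and the three commands of Step 3 in order, stopping at the first failure."""
    missing = missing_prerequisites(repo_root)
    if missing:
        raise FileNotFoundError("Missing from repository root: " + ", ".join(missing))

    commands = [
        # Step 1: install the required packages with pip.
        [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
        # command1: download and preprocess the AMI dataset (assumes a bash script).
        ["bash", "prepare_data.sh"],
        # command2: run the summarization pipelines and compute ROUGE scores.
        ["bash", "predict_ami.sh"],
        # command3: print the results.
        [sys.executable, "show_results.py"],
    ]
    for command in commands:
        # check=True raises CalledProcessError if a command exits with a non-zero status.
        subprocess.run(command, cwd=repo_root, check=True)
```

Calling `run_pipeline(".")` from the repository root runs everything in sequence. Checking for `ami_models` up front means a forgotten Step 2 is reported before the data download starts, not partway through `predict_ami.sh`.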

Comments

Training script

Hi,

Thanks for the interesting work.

Is there any chance that training scripts can be released to replicate the results presented in the paper, especially for AMI datasets?

thanks!

opened by duyvuleo

Can't test the model on Google Colab

Can't test the model on Google Colab. The following files are missing:

- predicted_entrywise_gapped.jsonl
- predicted_sectionwise_allxmin.json
- predicted_allxmin_test.jsonl
- test_outputs.jsonl.rougescores.csv

Where can I get those files? Thank you

opened by kelvin6666

Owner

Approximately Correct Machine Intelligence (ACMI) Lab
