Does Pretraining for Summarization Require Knowledge Transfer?
This repository is the official implementation of the paper Does Pretraining for Summarization Require Knowledge Transfer?,
which appears in Findings of EMNLP 2021.
You can find the paper on arXiv here: https://arxiv.org/abs/2109.04953
Requirements
This code requires Python 3 (tested with version 3.6).
To install requirements, run:
pip install -r requirements.txt
Preparing finetuning datasets
To prepare a summarization dataset for finetuning, run the corresponding script in the finetuning_datasetgen
folder. For example, to prepare the cnn-dailymail dataset, run:
cd finetuning_datasetgen
python cnndm.py
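If you want to quickly confirm that the preparation step produced its output files, a small sketch like the one below can help. Note that dataset_root is an assumption based on the folder referenced later in this README for the pretraining datasets; if cnndm.py writes its splits elsewhere, point the script at that directory instead.

import os

# List every file under the (assumed) dataset folder, with sizes, so you can
# verify the finetuning splits were written before starting an experiment.
data_dir = "dataset_root"
for root, _, files in os.walk(data_dir):
    for name in sorted(files):
        path = os.path.join(root, name)
        print(f"{path}  ({os.path.getsize(path)} bytes)")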
Running a finetuning experiment
We show here how to run the training, prediction, and evaluation steps for a finetuning experiment. We assume that you have downloaded the pretrained models into the pretrained_models
folder from the provided Google Drive link (see pretrained_models/README.md).
If you want to pretrain models yourself, see the latter part of this README for instructions.
All models in our work are trained using allennlp config files, which are in .jsonnet
format. To run a finetuning experiment, simply run:
# for t5-like models
./pipeline_t5.sh
# for pointer-generator models
./pipeline_pg.sh
For example, to finetune a T5 model on the cnn-dailymail dataset, starting from a model pretrained on the ourtasks-nonsense pretraining dataset, run:
./pipeline_t5.sh finetuning_experiments/cnndm/t5-ourtasks-nonsense
Similarly, to finetune a randomly initialized pointer-generator model, run:
./pipeline_pg.sh finetuning_experiments/cnndm/pg-randominit
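The pipeline scripts run the training, prediction, and evaluation steps end to end. If you only want to launch the training step from Python (for example, to debug a config), allennlp can consume the same .jsonnet files directly. The sketch below is only an illustration: train_model_from_file is allennlp's standard programmatic entry point, but both paths are placeholders rather than files guaranteed to exist under these exact names, and any custom modules the config relies on must be imported first.

from allennlp.commands.train import train_model_from_file

# Hypothetical paths -- substitute the actual .jsonnet config of the experiment
# you picked and any output directory you like. If the config references custom
# dataset readers or models defined in this repo, import those modules before
# calling train_model_from_file so allennlp can resolve their registered names.
train_model_from_file(
    parameter_filename="finetuning_experiments/cnndm/t5-ourtasks-nonsense.jsonnet",
    serialization_dir="scratch/t5_ourtasks_nonsense_cnndm",
)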
The trained model and output files will be available in the folder created by the script:
model.tar.gz contains the trained (finetuned) model.
test_outputs.jsonl contains the outputs of the model on the test split.
test_genmetrics.json contains the ROUGE scores of those outputs.
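To take a quick look at these outputs without assuming anything about their internal field names, here is a minimal sketch (the output folder path is a placeholder; use whichever folder the pipeline script created):

import json

out_dir = "path/to/experiment_output"  # placeholder for the folder created by the script

# Peek at the first test-set prediction (JSON Lines: one JSON object per line).
with open(f"{out_dir}/test_outputs.jsonl") as f:
    first = json.loads(next(f))
print(sorted(first.keys()))
print(first)

# Print the ROUGE scores computed on the test split.
with open(f"{out_dir}/test_genmetrics.json") as f:
    print(json.dumps(json.load(f), indent=2))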
Creating pretraining datasets
We have provided the nonsense pretraining datasets used in our work via Google Drive (see dataset_root/pretraining_datasets/README.md for instructions).
However, if you want to generate your own pretraining corpus, you can run:
cd pretraining_datasetgen
# for generating dataset using pretraining tasks
python ourtasks.py
# for generating dataset using STEP pretraining tasks
python steptasks.py
These commands create pretraining datasets made of nonsense text. If you instead want to create datasets starting from Wikipedia documents, look inside the two scripts: comments there indicate which two blocks of code to comment/uncomment.
Pretraining models
Although we provide the pretrained model checkpoints via Google Drive, you can pretrain your own models using the corresponding pretraining config file. As an example, we have provided a config file that pretrains on the ourtasks-nonsense dataset. Make sure that the pretraining dataset files exist (either created by you or downloaded from Google Drive) before running the pretraining command. Pretraining uses the same shell scripts as the finetuning experiments. For example, to pretrain a model on the ourtasks-nonsense
dataset, simply run:
./pipeline_t5.sh pretraining_experiments/pretraining_t5_ourtasks_nonsense