A simple recipe for training Transformer models for multi-task learning on custom datasets and running inference with them. This repo provides two approaches: a single encoder with multiple output heads, and separate per-task models that share one encoder.

Shahrukh Khan

Last update: Jan 2, 2023

Overview

multitask-learning-transformers

Colab Notebook

Trained Hugging Face Model

HF Model

Install dependencies

pip install -r requirements.txt

Run training

python3 main.py \
        --model_name_or_path='roberta-base' \
        --per_device_train_batch_size=8 \
        --output_dir=output \
        --num_train_epochs=1

Single Encoder Multiple Output Heads

In this approach, a single BERT-style Transformer encoder is shared across all tasks, and each task gets its own output head on top of that encoder.
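The idea can be illustrated with a minimal PyTorch sketch. It is not the code from this repo: the class name MultiHeadTaskModel, the task names "sentiment" and "topic", and all sizes are hypothetical, and a small randomly initialised nn.TransformerEncoder stands in for a pretrained encoder such as roberta-base so that the example runs without downloading weights.

```python
import torch
from torch import nn


class MultiHeadTaskModel(nn.Module):
    """One shared encoder with a separate linear output head per task (illustrative only)."""

    def __init__(self, vocab_size, hidden_size, task_num_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=4, dim_feedforward=64, batch_first=True
        )
        # Stand-in for a pretrained BERT-style encoder (assumption for this sketch).
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # One classification head per task, keyed by a hypothetical task name.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden_size, n) for task, n in task_num_labels.items()}
        )

    def forward(self, input_ids, task):
        hidden = self.encoder(self.embed(input_ids))  # (batch, seq_len, hidden)
        pooled = hidden[:, 0]                         # first-token representation
        return self.heads[task](pooled)               # logits for the requested task


torch.manual_seed(0)
model = MultiHeadTaskModel(
    vocab_size=100, hidden_size=32, task_num_labels={"sentiment": 2, "topic": 5}
)
input_ids = torch.randint(0, 100, (3, 10))  # batch of 3 sequences, 10 token ids each

# The same encoder runs for both tasks; only the head differs.
sentiment_logits = model(input_ids, task="sentiment")
topic_logits = model(input_ids, task="topic")
print(sentiment_logits.shape, topic_logits.shape)
```

Each forward pass selects a head by task name, so a training loop can mix batches from different tasks while every batch updates the shared encoder.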

Shared Encoder

In this approach, each task has its own model, but all of the models share the same encoder, so every task's training signal updates the same encoder weights.
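A small PyTorch sketch of this sharing pattern follows. It is an illustration under assumptions rather than the repo's implementation: TaskModel, "task_a" and "task_b" are hypothetical names, and a tiny feed-forward module stands in for the Transformer encoder. The point it shows is that both task models hold a reference to the very same encoder object.

```python
import torch
from torch import nn


class TaskModel(nn.Module):
    """A complete per-task model: an encoder followed by a task-specific classifier."""

    def __init__(self, encoder, hidden_size, num_labels):
        super().__init__()
        self.encoder = encoder  # stored by reference, not copied
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, features):
        return self.classifier(self.encoder(features))


torch.manual_seed(0)
hidden_size = 16
# Stand-in for a pretrained Transformer encoder (assumption for this sketch).
shared_encoder = nn.Sequential(nn.Linear(8, hidden_size), nn.ReLU())

# Two separate models, both built around the same encoder instance.
task_models = {
    "task_a": TaskModel(shared_encoder, hidden_size, num_labels=2),
    "task_b": TaskModel(shared_encoder, hidden_size, num_labels=3),
}
same_encoder = task_models["task_a"].encoder is task_models["task_b"].encoder

# A loss computed on task_a alone still produces gradients for the encoder
# that task_b uses, while task_b's own classifier is untouched.
features = torch.randn(4, 8)
labels = torch.tensor([0, 1, 0, 1])
loss = nn.functional.cross_entropy(task_models["task_a"](features), labels)
loss.backward()
print(same_encoder)
print(task_models["task_b"].encoder[0].weight.grad is not None)
print(task_models["task_b"].classifier.weight.grad is None)
```

Compared with a single model that owns several heads, this layout keeps each task's model usable on its own, at the cost of having to construct the models so that they really do reference one encoder.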

References: Multi-task Training with Transformers+NLP

Comments

batch[k] = torch.stack([f[k] for f in features])

I tried to run the notebook https://colab.research.google.com/github/zphang/zphang.github.io/blob/master/files/notebooks/Multi_task_Training_with_Transformers_NLP.ipynb#scrollTo=U4YUxdIZz3_i and got the error TypeError: expected Tensor as element 0 in argument 0, but got list. It seems to be a type error in the NLPDataCollator class in multitask_data_collator.py, at the line batch[k] = torch.stack([f[k] for f in features]). The NLPDataCollator class appears to return the batch as a list.

opened by HaithemH

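For readers who hit the same error, the sketch below reproduces it and shows one possible workaround. It is not taken from the repo or the notebook: the helper stack_feature and the sample features are hypothetical, and it assumes the failure comes from features arriving as plain Python lists of equal length (already padded) rather than tensors.

```python
import torch


def stack_feature(features, key):
    """Stack one field across examples, converting plain lists to tensors first."""
    values = [f[key] for f in features]
    tensors = [v if isinstance(v, torch.Tensor) else torch.tensor(v) for v in values]
    return torch.stack(tensors)


# Hypothetical features as a dataset might yield them: lists, not tensors.
features = [
    {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]},
    {"input_ids": [101, 2088, 102], "attention_mask": [1, 1, 1]},
]

# torch.stack only accepts tensors, so stacking the raw lists fails.
try:
    torch.stack([f["input_ids"] for f in features])
    stack_error = None
except TypeError as err:
    stack_error = str(err)

# Converting each value to a tensor before stacking avoids the error.
batch = {k: stack_feature(features, k) for k in features[0]}
print(stack_error)
print(batch["input_ids"].shape)
```

If the features come from a Hugging Face datasets object, configuring it to return PyTorch tensors before it reaches the collator may address the same problem at the source.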
Shared attention

Hi @shahrukhx01,

Thank you so much for sharing a nice repo. How can we combine the attention of all task heads for the shared encoder model and multiple prediction head model? Any lead in this direction will be very helpful.

Thanks

opened by MehwishFatimah

Owner

Shahrukh Khan

CS Grad Student @ Saarland University
