Sequence-tagging using deep learning

Vineet Kumar

Last update: Dec 20, 2022

Related tags

Deep Learning natural-language-processing deep-learning neural-network python3 pytorch artificial-intelligence transfer-learning bert sequence-labeling natural-language-understanding sequence-tagging token-tagging

Overview

Classification using Deep Learning

Requirements

PyTorch version >= 1.9.1+cu111
Python version >= 3.8.10
PyTorch-Lightning version >= 1.4.9
Huggingface Transformers version >= 4.11.3
Tensorboard version >= 2.6.0
Pandas >= 1.3.4
Scikit-learn: numpy>=1.14.6, scipy>=1.1.0, threadpoolctl>=2.0.0, joblib>=0.11

Installation

pip3 install transformers
pip3 install pytorch-lightning
pip3 install tensorboard
pip3 install pandas
pip3 install scikit-learn
git clone https://github.com/vineetk1/clss.git
cd clss

Note that the default directory is clss. Unless otherwise stated, all commands from the Command-Line-Interface must be delivered from the default directory.

Download the dataset

Create a data directory.

mkdir data

Download a dataset in the data directory.

Saving all informtion and results of an experiment

All information about the experiment is stored in a unique directory whose path starts with tensorboard_logs and ends with a unique version-number. Its contents consist of hparams.yaml, hyperperameters_used.yaml, test-results.txt, events.* files, and a checkpoints directory that has one or more checkpoint-files.

Train, validate, and test a model

Following command trains a model, saves the last checkpoint plus checkpoints that have the lowest validation loss, runs the test dataset on the checkpointed model with the lowest validation loss, and outputs the results of the test:

python3 Main.py input_param_files/bert_seq_class

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class. An explanation on the contents of this file is at input_param_files/README.md. A list of all the hyper-parameters is in the PyTorch-Lightning documentation, and any hyper-parameter can be used.
To assist in Training, the two parameters auto_lr_find and auto_scale_batch_size in the file input_param_files/bert_seq_class enable the software to automatically find an initial Learning-Rate and a Batch-Size respectively.
As training progresses, graphs of "training-loss vs. epoch #", "validation-loss vs. epoch #", and "learning-rate vs. batch #" are plotted in real-time on the TensorBoard. Training is stopped by typing, at the Command-Line-Interface, the keystroke ctrl-c. The current training information is checkpointed, and training stops. Training can be resumed, at some future time, from the checkpointed file.
Dueing testing, the results are sent to the standard-output, and also saved in the *test-results.txt" file that include the following: general information about the dataset and the classes, confusion matrix, precision, recall, f1, average f1, and weighted f1.

Resume training, validation, and testing a model with same hyper-parameters

Resume training a checkpoint model with the same model- and training-states by using the following command:

python3 Main.py input_param_files/bert_seq_class-res_from_chkpt

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-res_from_chkpt. An explanation on the contents of this file is at input_param_files/README.md.

Change hyper-parameters and continue training, validation, and testing a model

Continue training a checkpoint model with the same model-state but different hyperparameters for the training-state by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt

The user-settable hyper-parameters are in the file input_param_filesbert_seq_class-ld_chkpt. An explanation on the contents of this file is at input_param_files/README.md.

Further test a checkpoint model with a new dataset

Test a checkpoint model by using the following command:

python3 Main.py input_param_files/bert_seq_class-ld_chkpt_and_test

The user-settable hyper-parameters are in the file input_param_files/bert_seq_class-ld_chkpt_and_test. An explanation on the contents of this file is at input_param_files/README.md.

You might also like...

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Amazon Forest Computer Vision Satellite Image tagging code using PyTorch / Keras Here is a sample of images we had to work with Source: https://www.ka

359 Jan 5, 2023

Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The original code is written in keras.

CasRel-pytorch-reimplement Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The o

170 Dec 1, 2022

This project provides an unsupervised framework for mining and tagging quality phrases on text corpora with pretrained language models (KDD'21).

UCPhrase: Unsupervised Context-aware Quality Phrase Tagging To appear on KDD'21...[pdf] This project provides an unsupervised framework for mining and

146 Dec 22, 2022

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

323 Jan 1, 2023

Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools

Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu

138 Dec 17, 2022

Sequence-tagging using deep learning

Related tags

Overview

Classification using Deep Learning

Requirements

Installation

Download the dataset

Saving all informtion and results of an experiment

Train, validate, and test a model

Resume training, validation, and testing a model with same hyper-parameters

Change hyper-parameters and continue training, validation, and testing a model

Further test a checkpoint model with a new dataset

You might also like...

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks

Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The original code is written in keras.

This project provides an unsupervised framework for mining and tagging quality phrases on text corpora with pretrained language models (KDD'21).

Easy to use Audio Tagging in PyTorch

Code and data form the paper BERT Got a Date: Introducing Transformers to Temporal Tagging

Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".

Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers

Selene is a Python library and command line interface for training deep neural networks from biological sequence data such as genomes.

Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools

Owner

Vineet Kumar

An implementation of a sequence to sequence neural network using an encoder-decoder

Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Sequence to Sequence Models with PyTorch

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Sequence lineage information extracted from RKI sequence data repo

Amazon Forest Computer Vision: Satellite Image tagging code using PyTorch / Keras with lots of PyTorch tricks