# Simple Transformer
I've written a series of articles on the transformer architecture and language models on Medium.
This repository contains an implementation of the Transformer architecture presented in the paper Attention Is All You Need by Ashish Vaswani et al.
My goal is to write an implementation that is easy to understand and to dig into the nitty-gritty details, where the devil is.
## Python environment
You can use any Python virtual environment, such as venv or conda.
For example, with venv:
```bash
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -e .
```
## spaCy Tokenizer Data Preparation
To use spaCy's tokenizers, make sure to download the required language models.
For example, the English and German models can be downloaded as below:
```bash
python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm
```
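For reference, the downloaded models can be used for tokenization roughly like this (a minimal sketch, not the repository's actual tokenization code):

```python
# Minimal sketch: using the downloaded spaCy pipelines for tokenization.
# Illustrative only; the project's own tokenization code may differ.
import spacy

spacy_de = spacy.load("de_core_news_sm")
spacy_en = spacy.load("en_core_web_sm")

def tokenize_de(text):
    """Split German text into a list of token strings."""
    return [tok.text for tok in spacy_de.tokenizer(text)]

def tokenize_en(text):
    """Split English text into a list of token strings."""
    return [tok.text for tok in spacy_en.tokenizer(text)]

print(tokenize_en("A little girl is climbing into a wooden playhouse."))
```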
## Text Data from Torchtext
This project uses text datasets from Torchtext.

```python
from torchtext import datasets
```

The default configuration uses the Multi30k dataset.
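As a rough illustration (assuming the datapipe-style torchtext dataset API; the project's actual data pipeline may differ), Multi30k can be loaded like this:

```python
# Minimal sketch: loading the Multi30k German-English pairs from torchtext.
# Assumes a torchtext version that exposes Multi30k with the
# (split, language_pair) signature; not the repository's actual code.
from torchtext.datasets import Multi30k

train_iter = Multi30k(split="train", language_pair=("de", "en"))

for src, tgt in train_iter:
    print(src)  # German source sentence
    print(tgt)  # English target sentence
    break
```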
## Training
```bash
python train.py config_path
```
The default config path is `config/config.yaml`.
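The config is a plain YAML file, so it can be inspected programmatically like this (a minimal sketch assuming PyYAML; train.py's actual config handling may differ):

```python
# Minimal sketch: reading the training config as a plain dict.
# Assumes PyYAML is installed; not necessarily how train.py loads it.
import yaml

with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

print(config)  # dictionary of training settings
```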
It is possible to resume training from a checkpoint:

```bash
python train.py --checkpoint_path runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt
```
You can run `tensorboard` to see the training progress:

```bash
tensorboard --logdir=runs
```

The logs are created under `runs`.
## Test
```bash
python test.py checkpoint_path
```

For example:

```bash
python test.py runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt
```
`config.yaml` is copied to the model folder when training starts, and `test.py` assumes that this config file exists there.
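In other words, the config is expected to sit in the same run folder as the checkpoint; a minimal sketch of locating it (illustrative only, not the actual test.py code):

```python
# Minimal sketch: locating the copied config.yaml next to a checkpoint.
# Illustrative only; not the actual logic in test.py.
from pathlib import Path

checkpoint_path = Path("runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt")
config_path = checkpoint_path.parent / "config.yaml"

assert config_path.exists(), "expected config.yaml in the same run folder"
```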
## Unit tests

There are some unit tests in the `tests` folder.

```bash
pytest tests
```
## References
- The Annotated Transformer by Harvard NLP
- How to code The Transformer in Pytorch by Samuel Lynn-Evans
- The Illustrated Transformer by Jay Alammar
- Transformer Architecture: The Positional Encoding by Amirhossein Kazemnejad
- Transformers without Tears: Improving the Normalization of Self-Attention by Toan Q. Nguyen & Julian Salazar
- Tensor2Tensor by TensorFlow
- PyTorch Transformer by PyTorch
- Language Modeling with nn.Transformer and Torchtext by PyTorch
- My Medium Articles