# N-Grammer - Pytorch

Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch
## Install

```bash
$ pip install n-grammer-pytorch
```
## Usage

```python
import torch
from n_grammer_pytorch import VQNgrammer

vq_ngram = VQNgrammer(
    num_clusters = 1024,           # number of clusters
    dim_per_head = 32,             # dimension per head
    num_heads = 16,                # number of heads
    ngram_vocab_size = 768 * 256,  # n-gram vocabulary size
    ngram_emb_dim = 16,            # n-gram embedding dimension
    decay = 0.999                  # exponential moving average decay
)

x = torch.randn(1, 1024, 32 * 16)

vq_ngram(x) # (1, 1024, 32 * 16)
```
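In the paper, the layer augments the unigram token embeddings before they enter the attention stack. The sketch below shows that placement, assuming a toy vocabulary of 20k tokens and a stock `nn.TransformerEncoderLayer` standing in for the rest of the network; only the `VQNgrammer` call comes from this library, everything else is illustrative.

```python
import torch
from torch import nn
from n_grammer_pytorch import VQNgrammer

dim_per_head = 32
num_heads = 16
dim = dim_per_head * num_heads        # model dimension, 512

token_emb = nn.Embedding(20000, dim)  # toy unigram embedding table

vq_ngram = VQNgrammer(
    num_clusters = 1024,
    dim_per_head = dim_per_head,
    num_heads = num_heads,
    ngram_vocab_size = 768 * 256,
    ngram_emb_dim = 16,
    decay = 0.999
)

# stand-in for the transformer stack that would normally follow
encoder = nn.TransformerEncoderLayer(d_model = dim, nhead = num_heads, batch_first = True)

tokens = torch.randint(0, 20000, (1, 1024))

x = token_emb(tokens)  # (1, 1024, 512) unigram embeddings
x = vq_ngram(x)        # (1, 1024, 512) embeddings augmented with latent n-gram information
out = encoder(x)       # (1, 1024, 512)
```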
## Learning Rates

Like product key memories, the Ngrammer parameters need a higher learning rate (`1e-2` is recommended in the paper). The repository offers an easy way to generate the parameter groups.

```python
from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_parameters

# this helper function, given your root model, finds all the VQNgrammer modules
# and their embedding parameters, separating them from the rest
ngrammer_parameters, other_parameters = get_ngrammer_parameters(transformer)

optim = Adam([
    {'params': other_parameters},
    {'params': ngrammer_parameters, 'lr': 1e-2}
], lr = 3e-4)
```
Or, even more simply:

```python
from torch.optim import Adam
from n_grammer_pytorch import get_ngrammer_param_groups

param_groups = get_ngrammer_param_groups(model) # automatically builds the parameter groups, with the learning rate set to 1e-2 for the Ngrammer parameters
optim = Adam(param_groups, lr = 3e-4)
```
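For completeness, here is a self-contained sketch of the helper in use end to end. The tiny `nn.Sequential` model, the vocabulary size, and the random data are all placeholders; the only assumption about the library is the behaviour described above (Ngrammer parameters grouped at `1e-2`, everything else at the optimizer's base rate).

```python
import torch
from torch import nn
import torch.nn.functional as F
from n_grammer_pytorch import VQNgrammer, get_ngrammer_param_groups

# minimal toy model wrapping a VQNgrammer layer - all sizes are illustrative
model = nn.Sequential(
    nn.Embedding(20000, 32 * 16),
    VQNgrammer(
        num_clusters = 1024,
        dim_per_head = 32,
        num_heads = 16,
        ngram_vocab_size = 768 * 256,
        ngram_emb_dim = 16,
        decay = 0.999
    ),
    nn.Linear(32 * 16, 20000)
)

param_groups = get_ngrammer_param_groups(model)
optim = torch.optim.Adam(param_groups, lr = 3e-4)

# one hypothetical training step on random data
tokens = torch.randint(0, 20000, (1, 1024))
labels = torch.randint(0, 20000, (1, 1024))

logits = model(tokens)
loss = F.cross_entropy(logits.transpose(1, 2), labels)  # cross_entropy expects (batch, classes, seq)
loss.backward()

optim.step()       # Ngrammer parameters are updated at lr 1e-2, everything else at the base 3e-4
optim.zero_grad()
```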
## Citations

```bibtex
@inproceedings{thai2020using,
    title  = {N-grammer: Augmenting Transformers with latent n-grams},
    author = {Anonymous},
    year   = {2021},
    url    = {https://openreview.net/forum?id=GxjCYmQAody}
}
```