Well-formed Limericks and Haikus with GPT2
GPT-2 Rhyming Limerick and Haiku models using data augmentation
In collaboration with Matthew Korahais & Daniel Korsunsky
Abstract
We explore the capabilities and limits of GPT-2 on well-formed poems, specifically limericks and haikus. We hypothesized that GPT-2 trained without phonetic annotations would be unable to systematically learn and generate syllabic patterns and rhyme schemes, since these features are grounded in real-world acoustic representations. Our model trained with list-of-rhymes annotations outperformed the baselines, generating perfect-scoring limericks 33% of the time. Our best haiku model generated valid haikus in 29% of cases, with an average syllable error rate below 0.4. Our work invites further research into methods of combining text and phonetic data for more convincing text generation.
Evaluation data: https://docs.google.com/spreadsheets/d/1rd1qCbCcTX1zHa0Dvh1q8OJ2iidxxrifTJlYWg3MMes
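The sketch below illustrates the two phonetic ingredients described in the abstract: counting syllables and checking rhyme for evaluation, and building a "list of rhymes" style annotation for training. It uses the `pronouncing` package (a CMU Pronouncing Dictionary wrapper); the helper names, scoring heuristics, and annotation format are our own illustrative assumptions, not the exact code behind the numbers reported here.

```python
# Minimal, illustrative sketch (not the project's actual evaluation code).
# Assumes the `pronouncing` package; the annotation format is hypothetical.
import pronouncing


def _clean(word):
    """Lower-case a word and strip surrounding punctuation."""
    return word.lower().strip(".,!?;:'\"()")


def syllables(line):
    """Count syllables in a line via CMUdict; out-of-vocabulary words contribute 0."""
    total = 0
    for word in line.split():
        phones = pronouncing.phones_for_word(_clean(word))
        if phones:
            total += pronouncing.syllable_count(phones[0])
    return total


def is_valid_haiku(lines):
    """Check the 5-7-5 syllable pattern for a three-line haiku."""
    return len(lines) == 3 and [syllables(l) for l in lines] == [5, 7, 5]


def rhyming_part(line):
    """Return the rhyming part of the line's final word, or None if OOV."""
    phones = pronouncing.phones_for_word(_clean(line.split()[-1]))
    return pronouncing.rhyming_part(phones[0]) if phones else None


def follows_aabba(lines):
    """Check the AABBA rhyme scheme of a five-line limerick."""
    parts = [rhyming_part(l) for l in lines]
    return (len(lines) == 5 and None not in parts
            and parts[0] == parts[1] == parts[4]
            and parts[2] == parts[3]
            and parts[0] != parts[2])


def rhyme_annotation(lines, max_rhymes=5):
    """One possible "list of rhymes" training prefix: a few candidate
    rhymes for each line's final word, joined into a single string."""
    words = [_clean(l.split()[-1]) for l in lines]
    lists = [", ".join(pronouncing.rhymes(w)[:max_rhymes]) for w in words]
    return "RHYMES: " + " | ".join(lists)
```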
Examples (Find more in the repo):
Limericks
To the one grading our research, I'd say,
that a lot of work's been done today.
our paper's been checked,
And our work is all correct.
We're not mired in conjecture today.
The Indians' chief deity, they say,
Was a god of the earth all day.
But the gods he made
Were the ones who would fade
As they were replaced by a new way.
A large, thick, thick, and thickly cut tree
(A weeping cedar) will please me.
It's a tree that's known
As a cedar it's own,
And it's named for a bird that I see.
Haikus
The only thing that
gets me going is you So
let's keep this going
Saw a duck come in
from the woods and now i know
what a duck is lol
the only thing I
wanna say to you is good
bye don't disappoint