mkultra
mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.
Prompt tuning injects a sequence of 20-100 special tokens into the context in order to influence text generation. These tokens are trained on a corpus much like a finetune, but take up a fraction of the space: the Neuromancer example is only 401 KB for 100 tokens.
Read the original paper, "The Power of Scale for Parameter-Efficient Prompt Tuning" (Lester et al., 2021): https://arxiv.org/abs/2104.08691
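To get a feel for the sizes involved: a soft prompt is just a small matrix of learned embeddings, one row per special token. The numbers below are illustrative (GPT-2 small, float32), not mkultra's exact on-disk format:

import torch

n_tokens, embed_dim = 100, 768                        # 100 soft tokens, GPT-2 small hidden size
soft_prompt = torch.nn.Parameter(torch.zeros(n_tokens, embed_dim))
print(soft_prompt.numel() * 4 / 1024)                 # ~300 KB of raw float32 parameters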
Text Generation
from mkultra.tuning import GPT2SoftPromptLM
from mkultra.tokenizers import GPT2SPTokenizerFast
from mkultra.soft_prompt import SoftPrompt
from transformers import pipeline

model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to the context
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"

output = generator(prompt)
SoftPrompts can be concatenated anywhere into your context as if they were strings. When the context is printed, a SoftPrompt shows up as a human-readable tag for debugging, and it tokenizes to its underlying number of tokens for easy budgeting.
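Building on the objects from the snippet above, that behaviour looks roughly like this (the exact tag text is illustrative):

prompt = "It was a dark night. " + sp + " The sky over the port"
print(prompt)                            # the soft prompt renders as a human-readable tag
print(len(tokenizer.encode(prompt)))     # count includes the soft prompt's underlying tokens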
See the text generation notebook for pointers on adding mkultra to your generator.
Training
For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.
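Conceptually, training a soft prompt means freezing the language model and optimizing only the prepended embedding matrix. The sketch below illustrates that mechanism with plain transformers and a single training step; it is not mkultra's internal code, and the notebook remains the supported workflow:

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad_(False)                           # freeze every model weight

n_tokens = 20
wte = model.get_input_embeddings()
soft_prompt = torch.nn.Parameter(                     # the only trainable parameters
    wte.weight[:n_tokens].detach().clone())
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

ids = tokenizer("The sky over the port was the color of television.",
                return_tensors="pt").input_ids
embeds = torch.cat([soft_prompt.unsqueeze(0), wte(ids)], dim=1)
labels = torch.cat([torch.full((1, n_tokens), -100), ids], dim=1)  # no loss on soft tokens
loss = model(inputs_embeds=embeds, labels=labels).loss
loss.backward()
optimizer.step()

Only soft_prompt receives gradient updates, which is why the resulting file is so small.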
For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.
Limitations (for now)
- The Huggingface Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer (see the sketch after this list), but it will still save full model checkpoints.
- mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
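If you do go the Trainer route, the wiring described in the first point might look like the sketch below; train_dataset and data_collator are placeholders for your own corpus, and the saved checkpoints will still contain the full model:

from torch.optim import AdamW
from transformers import Trainer, TrainingArguments

optimizer = AdamW(params=[model.get_soft_params()], lr=1e-4)   # soft prompt params only

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sp_checkpoints", num_train_epochs=1),
    train_dataset=train_dataset,                # placeholder: your tokenized corpus
    data_collator=data_collator,                # placeholder: e.g. a language-modelling collator
    optimizers=(optimizer, None),               # let Trainer create the default scheduler
)
trainer.train()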