Finetune GPT-2 in Google Colab

Overview

gpt-2-colab

Finetune GPT-2 (https://github.com/openai/gpt-2) in Google Colab.
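
A typical session looks like the following minimal sketch (the notebook builds on the nshepperd fork of gpt-2; the dataset path here is a placeholder to adapt):

    !git clone https://github.com/nshepperd/gpt-2.git
    %cd gpt-2
    !pip install -r requirements.txt
    !python download_model.py 117M
    !PYTHONPATH=src ./train.py --dataset /content/gpt-2/your_text.txt --model_name '117M'

By default training runs until interrupted; checkpoints are written under checkpoint/run1 and periodic samples under samples/run1.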

Sample result (117M) from retraining on A Tale of Two Cities by Charles Dickens:

No, Jerry! Jerry! You're a nice man, Jerry!”

That was all too remarkable. It was not merely impressive, but it took me on a turning short cough, and then swelling and stiffening, and rising to be a nice man, and a man, and not at all strivenly, and wicked.

The wonderful corner for echoes, and the echoes not being the echoes of footsteps that had their shameful imparted on the other side alone, that the time and tide waited for Lucie were sufficiently near the room, that to the utmost remaining of the time, even though assisted off other carriages, and were always ready and drove away that they should not hear themselves, Jerry heard no cry, and were as quick as in the morning if they had to pull up their heads and cidled away together as they could.

The farrier had not been in the barrier when he stopped, for the moment, and was as quick as they could make him.

He was to roll up arms, to get the outer coat to and frolic. He could not have laid down his hand to do so without another rain of the summer drops on high, when he was requested to do so. But, the rain of the summer was very strong and few, and the rain of the autumn month afterwards was strong and warm by those intervals. The storm in the west was very rarely beering, and the storm in the light of the summer was very rarely without it. The storm was really falling, and he stood there for a moment with his hand to open the barrier.

He was so far apart, that he could not have looked at him at all then; for, it was already dark when he looked at this figure, and it looked at IV

(an) seemed to fall, and reappeared, as old as Adam, until the morning, of the hour before.

“I fear the best, well,” said Jerry, stopping in his story, and laying his hand on hers, “what are you?”

“The worst.”

Though he had no hope of saying it, he could have looked at him, and then frowned at another figure, whose surface furnished a kind of dark street before him, for a few jewels.

He looked at it, and glanced at it. The Spy and prison-keeper looked at it, and the Spy showed a broken-hearted look.

“I am very much obliged to them for their looks and faces,” said Jerry. “No, Jerry! They are all in animosity and submission. They are in the habitually consently bad. I know what you have to do with my business. Whether I wait for you under the obligation to do so is the assumption yours. It is little to keep them in their places, to keep them in their times too much, is it not? No, Jerry. It is to keep them in their places, to cost and cost as the like. So much would you cost and change to-do exactly? That is to say, without deigning to say anything that is not at all, and no harm is to be expected of, will you not? No. It will cost nothing to save you, if it wos so, refuse. But it is always in the nature of things, and it is the nature of things. What is it? What would you have to say to me at all as, or to that degree?”

“I would ask you, is it not?”

Hah!” said Jerry, as he paused for the moment to ask him questions.

“It is true,” repeated the last question. “Does it cost to show me no harm, me nothing, yet? No. It is without loss,” repeated the Law; in the resting-looked-down sentiment. “Will you be very soon as restored to you?”

At the Judge, again a Judgeyer.

“If it is not restored to you within a minute, who should shut out the proceedings, and then the prisoner must be put back advance, and then must be removed.”

The Judge, whose eyes had gone in the general direction, leaned back in his seat, and stood ready.

Mr. Attorney-General then, following his leader's guidance, examined his manner with great obsequiousness and closeness, and passing on to the bench and tools, and passing on to Mr. Lorry. After looking at

Comments
  • Getting this error ZeroDivisionError: integer division or modulo by zero while training

    I'm getting this error while training on this cell:

    !PYTHONPATH=src ./train.py --dataset /content/gpt-2/goblet_book.txt --model_name '345M'

    Traceback (most recent call last):
      File "./train.py", line 266, in <module>
        main()
      File "./train.py", line 244, in main
        feed_dict={context: sample_batch()})
      File "./train.py", line 220, in sample_batch
        return [data_sampler.sample(1024) for _ in range(args.batch_size)]
      File "./train.py", line 220, in <listcomp>
        return [data_sampler.sample(1024) for _ in range(args.batch_size)]
      File "/content/gpt-2/src/load_dataset.py", line 74, in sample
        self.chunks
    ZeroDivisionError: integer division or modulo by zero
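
    This error usually means the data sampler ended up with zero usable chunks, so a modulo in load_dataset.py divides by zero; the common causes are a wrong dataset path or a file too short to fill the 1024-token sampling window. A quick sanity check, as a sketch using the path from the command above:

        import os
        path = '/content/gpt-2/goblet_book.txt'
        print(os.path.isfile(path))   # False means the path is wrong
        print(os.path.getsize(path))  # a very small file cannot fill a 1024-token window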

    opened by archmord 3
  • Installing toposort is now redundant

    Since toposort has been recently added to the requirements.txt of the nshepperd project, we can safely remove the line used to install it in the Colab notebook.

    opened by gsarti 1
  • No module named 'tensorflow.contrib'

    I am new to TensorFlow, but something seems to have changed. I get this when I try to train:

    Traceback (most recent call last):
      File "./train.py", line 14, in <module>
        import model, sample, encoder
      File "/content/gpt-2/src/model.py", line 3, in <module>
        from tensorflow.contrib.training import HParams
    ModuleNotFoundError: No module named 'tensorflow.contrib'
    

    I even tried %tensorflow_version 1.x
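
    tensorflow.contrib was removed in TensorFlow 2.x, so this import only works on a 1.x runtime, and the %tensorflow_version magic only takes effect in a fresh runtime, before tensorflow has been imported. A minimal sketch of the workaround (note that Colab eventually removed TF 1.x support entirely, in which case pinning an old wheel is the remaining option):

        %tensorflow_version 1.x    # must run before any `import tensorflow`
        import tensorflow as tf
        print(tf.__version__)      # expect something like 1.15.x

        # If the magic is unavailable, pinning directly may work where old wheels are still installable:
        # !pip install tensorflow==1.15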

    opened by mitya12342 2
  • Trained model spits out exact text from trained text

    Hi, I'm quite new to this. I put this on my own server and let it train on my own text data.

    After the average loss got close to 0.0, it started spitting out exact paragraphs from my training data.

    Did I overtrain it, or set some flags wrong?
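
    An average loss near 0.0 on a small corpus means the model has effectively memorized it, so verbatim output is expected rather than a bug; train.py also runs indefinitely unless you stop it. One sketch of a countermeasure, assuming the validation flags in the nshepperd fork's train.py: hold out a validation file and stop once validation loss stops improving.

        !PYTHONPATH=src ./train.py --dataset train.txt \
            --val_dataset valid.txt --val_every 500 --sample_every 500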

    opened by seoguypt 1
  • Required memory?

    Hello, how much memory is required to run this model? I get out-of-memory errors with a graphics card that has 6 GB of memory. Is that not enough memory, or is it a local problem? Thanks.
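
    For rough scale: fine-tuning the 345M model generally does not fit in 6 GB, while 117M with batch size 1 may. A sketch, assuming your copy of train.py exposes the memory-saving flags from the nshepperd fork:

        !PYTHONPATH=src ./train.py --dataset data.txt --model_name '117M' \
            --batch_size 1 --memory_saving_gradients --only_train_transformer_layers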

    opened by Perdu 2
  • Which is better: GPT or RelGAN for text generation?

    Based on my understanding, GPT and GPT-2 are trained with a language-modeling loss to generate text, and do not involve a GAN.

    So which is better: GPT, or RelGAN/LeakGAN/SeqGAN/TextGAN?

    I am so confused about this question. Thank you very much.

    opened by guotong1988 0
  • How to prepare the data for text generation task. Thank you very much.

    First, I'm not sure whether the model includes an encoder during training.

    EOS means end-of-sentence; the encoder and decoder are parts of the transformer network.

    Without an encoder, at training time:

    target: [E, F, G, H, EOS]
    decoder input: [0, E, F, G, H]
    

    Without an encoder, at test time:

    decoder input: [0]
    

    With an encoder, at training time:

    encoder input: [A, B, C, D]
    target: [E, F, G, H, EOS]
    decoder input: [0, E, F, G, H]
    

    With an encoder, at test time:

    encoder input: [A, B, C, D]
    decoder input: [0]
    

    Am I exactly right?

    I know this is beyond the scope of this project, but I hope you can help. Thank you very much.
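
    For what it's worth, GPT-2 is decoder-only, so the without-encoder tables above are the relevant ones. A toy sketch of how the shifted pairs line up (the <BOS>/<EOS> markers here are generic placeholders; the actual GPT-2 tokenizer only defines <|endoftext|>, used as a document separator):

        # Toy illustration: decoder-only training pairs are the token stream shifted by one.
        tokens = ["E", "F", "G", "H", "<EOS>"]   # target sequence
        decoder_input = ["<BOS>"] + tokens[:-1]  # [<BOS>, E, F, G, H]
        for x, y in zip(decoder_input, tokens):
            print(f"input {x} -> predict {y}")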

    opened by guotong1988 3
Related projects
GPT-Code-Clippy (GPT-CC) is an open source version of GitHub Copilot, a language model -- based on GPT-3, called GPT-Codex -- that is fine-tuned on publicly available code from GitHub.

Nathan Cooper 2.3k Jan 1, 2023
wav2vec-toolkit: A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models.

Anton Lozhkov 29 Oct 23, 2022
Google Chat GPT-3: This repo will help you fine-tune GPT-3 with a Google Chat conversation history.

Nate Baer 7 Dec 10, 2022
GooAQ 🥑 : Google Answers to Google Questions!

This repository contains the code/data accompanying our recent work on long-form question answering.

AI2 112 Nov 6, 2022
Google Text-To-Speech Batch Prompt File Maker: Utility for batch-generating audio files with Google Text-To-Speech; ideal for creating prompt files with Google voices for offline IVRs.

Ponchotitlán 1 Aug 19, 2021
gpt3-instruct-sandbox: Interactive Jupyter Notebook environment for using the GPT-3 Instruct API.

null 312 Jan 3, 2023
GPT-NeoX: An implementation of model-parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to be able to train models in the hundreds of billions of parameters or larger.

EleutherAI 3.1k Jan 8, 2023
🛸 spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy.

Explosion 1.2k Jan 8, 2023
gpt-2-simple: A simple Python package that wraps existing model fine-tuning and generation scripts for OpenAI's GPT-2 text-generating model.

Max Woolf 3.1k Jan 7, 2023
Shirt Bot: A Discord bot which uses GPT-3 to generate text.

null 31 Oct 31, 2022
Transformer-related optimization, including BERT, GPT

This repository provides a script and recipe to run the highly optimized transformer-based encoder and decoder component, and it is tested and maintained by NVIDIA.

NVIDIA Corporation 1.7k Jan 4, 2023
japanese-gpt2: Code for producing Japanese GPT-2 models, provided by rinna Co., Ltd.

rinna Co.,Ltd. 491 Jan 7, 2023
TextCortex - HemingwAI: Generate product descriptions, blogs, ads and more using GPT architecture with a single request to the TextCortex API (a.k.a. HemingwAI).

TextCortex AI 27 Nov 28, 2022
GPT2-Pytorch with Text-Generator: Simple text generator with an OpenAI GPT-2 PyTorch implementation.

Tae-Hwan Jung 775 Jan 8, 2023
gpt-j-api 🦜: An API to interact with the GPT-J language model, including a FastAPI backend and a Streamlit frontend.

Víctor Gallego 276 Dec 31, 2022
Ongoing research training transformer language models at scale, including BERT & GPT-2; a detached fork of https://github.com/microsoft/Megatron-DeepSpeed.

BigScience Workshop 316 Jan 3, 2023
Token Shift GPT: Implementation of an autoregressive model that relies solely on shifting along the sequence dimension for mixing.

Phil Wang 32 Oct 14, 2022