glide-finetune

Overview

Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset.

Installation

git clone https://github.com/afiaka87/glide-finetune.git
cd glide-finetune/
python3 -m venv .venv # create a virtual environment to keep the global install clean.
source .venv/bin/activate
(.venv) # optionally install PyTorch manually for your specific environment first...
(.venv) python -m pip install -r requirements.txt

Usage

(.venv) python train_glide.py \
    --data_dir=./data \
    --batch_size=1 \
    --grad_acc=1 \
    --guidance_scale=4.0 \
    --learning_rate=2e-5 \
    --dropout=0.1 \
    --timestep_respacing=1000 \
    --side_x=64 \
    --side_y=64 \
    --resume_ckpt='' \
    --checkpoints_dir='./glide_checkpoints/' \
    --use_fp16 \
    --device='' \
    --freeze_transformer \
    --freeze_diffusion \
    --weight_decay=0.0 \
    --project_name='glide-finetune'
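
For reference, --data_dir should point at your image-text dataset. One common layout pairs each image with a caption file of the same stem; a minimal sketch of reading such pairs (this layout is an assumption for illustration, check this repo's loader for the authoritative format):

    from pathlib import Path

    def iter_pairs(data_dir: str):
        # Yield (image_path, caption) for files laid out as 0001.jpg + 0001.txt.
        for img in sorted(Path(data_dir).glob("*.jpg")):
            txt = img.with_suffix(".txt")
            if txt.exists():
                yield img, txt.read_text().strip()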

Known issues:

  • Batching isn't handled in the dataloader.
  • NaN/Inf errors during training (see the half-precision optimizer fix in the comments below).
  • Resizing doesn't handle non-square aspect ratios properly (see the crop-then-resize sketch after this list).
  • Some of the code is messy and needs refactoring.
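
For the aspect-ratio issue, a common workaround is to center-crop to a square before resizing, so non-square images are not distorted. A minimal sketch (illustrative only, not this repo's current code):

    from PIL import Image

    def resize_square(img: Image.Image, side: int = 64) -> Image.Image:
        # Center-crop to the largest square, then resize to side x side.
        w, h = img.size
        s = min(w, h)
        left, top = (w - s) // 2, (h - s) // 2
        square = img.crop((left, top, left + s, top + s))
        return square.resize((side, side), Image.BICUBIC)
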
Comments
  • Fixed a couple of minor issues

    • Pinned the webdataset version to work with Python 3.7, which is the version used on Colab and Kaggle. A new version of that module was released a few days ago that only works with Python 3.8/3.9.
    • Fixed an issue with the data_dir arg not getting picked up.
    opened by vanga 1
  • Fix NameError when using --data_dir

    Hello and thank you for your great work.

    Right now, using a local data folder with --data_dir results in:

    Traceback (most recent call last):
      File "/content/glide-finetune/train_glide.py", line 292, in <module>
        data_dir=data_dir,
    NameError: name 'data_dir' is not defined
    

    This PR fixes that.
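
    The underlying pattern of the fix, sketched with assumed names (the actual code in train_glide.py may differ), is to bind the parsed argument before use:

        import argparse

        parser = argparse.ArgumentParser()
        parser.add_argument("--data_dir", type=str, default="./data")
        args = parser.parse_args()

        data_dir = args.data_dir  # bind the parsed value so the name is defined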

    opened by tillfalko 0
  • mention mpi4py dependency

    Installing mpi4py will fail unless the user already has MPI installed on the system. Since MPI is not a ubiquitous dependency, it should probably be mentioned. Edit: since torch==1.10.1 is a requirement, and torch versions come with their own CUDA versions (torch 1.10.1 uses CUDA 10.2), I don't see a reason not to just include bitsandbytes-cuda102 in requirements.txt.

    $ py -m venv .venv
    $ source .venv/bin/activate
    $ pip install torch==1.10.1
    Collecting torch==1.10.1
      Downloading torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl (881.9 MB)
         |████████████████████████████████| 881.9 MB 15 kB/s
    Collecting typing-extensions
      Downloading typing_extensions-4.0.1-py3-none-any.whl (22 kB)
    Installing collected packages: typing-extensions, torch
    Successfully installed torch-1.10.1 typing-extensions-4.0.1
    $ py -c "import torch; print(torch.__version__)"
    1.10.1+cu102
    
    opened by tillfalko 0
  • Fixed half precision optimizer bug

    Problem

    In half precision, NaN values start appearing after the first iteration regardless of input data or gradients, since the Adam optimizer breaks in float16. The discussion of that can be viewed at https://discuss.pytorch.org/t/adam-half-precision-nans/1765/4.

    Solution

    This can be fixed by setting the optimizer's eps parameter to 1e-4 instead of the default 1e-8. That is the only change this PR makes.
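
    As a sketch of the change (the tiny model here is a hypothetical stand-in, not this repo's GLIDE model):

        import torch

        model = torch.nn.Linear(8, 8).half().cuda()  # hypothetical fp16 stand-in
        # eps=1e-4 keeps the denominator sqrt(v_hat) + eps representable in
        # float16; the default 1e-8 underflows to zero and produces NaNs.
        optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, eps=1e-4)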

    opened by isamu-isozaki 0
  • Training on half precision leads to nan values

    I was training my model and noticed that after just the first iteration I was running into NaN values. As it turns out, my gradients and input values/images were all normal, but PyTorch's Adam optimizer has some odd behavior at float16 precision where it produces NaNs, probably because of a divide-by-zero error. A discussion can be found below:

    https://discuss.pytorch.org/t/adam-half-precision-nans/1765/4

    I hear that changing the epsilon parameter of the Adam optimizer works at half precision, but I haven't tested it yet. I will open a PR once I have tested it.

    And also let me say thanks for this repo. I wanted to fine tune the glide model and this made it so much easier.

    opened by isamu-isozaki 1
  • Where is the resume_ckpt

    Hi, thanks for your job.

    I noticed that to finetune GLIDE we need a base model, namely resume_ckpt: --resume_ckpt 'ckpt_to_resume_from.pt'
    Where can we get this model? I find that GLIDE didn't provide any checkpoint either. Thanks for your help.
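
    For context: the released base weights ship with OpenAI's glide-text2im package and are downloaded on first use. A sketch following that package's public notebooks (assumes glide-text2im is installed; the API may have changed since):

        import torch
        from glide_text2im.download import load_checkpoint
        from glide_text2im.model_creation import (
            create_model_and_diffusion,
            model_and_diffusion_defaults,
        )

        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        options = model_and_diffusion_defaults()
        model, diffusion = create_model_and_diffusion(**options)
        model.load_state_dict(load_checkpoint("base", device))  # downloads the 64 px base weights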

    opened by zhaobingbingbing 0
Releases
  • v0.0.1(Feb 20, 2022)

    Having some experience finetuning GLIDE on LAION, Alamy, etc., I think this code works great now and hope as many people as possible can use it. Please file bugs - I know there may be a few.

    New additions:

    • dataloader for LAION400M
    • dataloader for alamy
    • train the upsample model instead of just the base model
    • (early) code for training the released noisy CLIP; still a WIP.
Owner
Clay Mullis
Software engineer working with multi-modal deep learning.