Overview

CLIP-Guided-Diffusion

Just playing with getting CLIP Guided Diffusion running locally, rather than having to use colab.

Original colab notebooks by Katherine Crowson (https://github.com/crowsonkb, https://twitter.com/RiversHaveWings):

  • Original 256x256 notebook: Open In Colab

It uses OpenAI's 256x256 unconditional ImageNet diffusion model (https://github.com/openai/guided-diffusion)

  • Original 512x512 notebook: Open In Colab

It uses a 512x512 unconditional ImageNet diffusion model fine-tuned from OpenAI's 512x512 class-conditional ImageNet diffusion model (https://github.com/openai/guided-diffusion)

Together with CLIP (https://github.com/openai/CLIP), they connect text prompts with images.

Either the 256 or 512 model can be used here (by setting --output_size to either 256 or 512)
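At a high level, guidance works by nudging each denoising step towards the text prompt: the current sample is scored with CLIP against the prompt, and the gradient of that similarity with respect to the sample steers the next step. The snippet below is only a rough Python sketch of that idea (the function name cond_fn and the scale value are illustrative; the real logic, including cutouts and CLIP normalisation, lives in generate_diffuse.py):

# Rough sketch of CLIP guidance - illustrative only, not the repo's actual code.
import torch
import torch.nn.functional as F
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()  # keep fp32 so gradients flow cleanly in this sketch
clip_size = clip_model.visual.input_resolution  # 224 for ViT-B/32

with torch.no_grad():
    text = clip.tokenize(["A painting of an apple"]).to(device)
    text_embed = F.normalize(clip_model.encode_text(text), dim=-1)

def cond_fn(x, t, clip_guidance_scale=1000):
    # x: current noisy sample in [-1, 1], shape (batch, 3, H, W); t: timestep (unused here).
    with torch.enable_grad():
        x = x.detach().requires_grad_()
        # Resize to CLIP's input resolution; CLIP normalisation and cutouts are omitted here.
        x_in = F.interpolate((x + 1) / 2, size=clip_size, mode="bilinear")
        image_embed = F.normalize(clip_model.encode_image(x_in), dim=-1)
        sim = (image_embed * text_embed).sum()  # cosine similarity to the prompt
        # The gradient of the similarity w.r.t. the sample steers the next diffusion step.
        return torch.autograd.grad(sim, x)[0] * clip_guidance_scale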

Some example images:

"A woman standing in a park":

"An alien landscape":

"A painting of a man":

*images enhanced with Real-ESRGAN

You may also be interested in VQGAN-CLIP

Environment

  • Ubuntu 20.04 (Windows untested but should work)
  • Anaconda
  • Nvidia RTX 3090

Typical VRAM requirements:

  • 256 defaults: 10 GB
  • 512 defaults: 18 GB

Set up

This example uses Anaconda to manage virtual Python environments.

Create a new virtual Python environment for CLIP-Guided-Diffusion:

conda create --name cgd python=3.9
conda activate cgd

Download and change directory:

git clone https://github.com/nerdyrodent/CLIP-Guided-Diffusion.git
cd CLIP-Guided-Diffusion

Run the setup file:

./setup.sh

Or if you want to run the commands manually:

# Install dependencies

pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
git clone https://github.com/openai/CLIP
git clone https://github.com/crowsonkb/guided-diffusion
pip install -e ./CLIP
pip install -e ./guided-diffusion
pip install lpips matplotlib

# Download the diffusion models

curl -OL --http1.1 'https://the-eye.eu/public/AI/models/512x512_diffusion_unconditional_ImageNet/512x512_diffusion_uncond_finetune_008100.pt'
curl -OL 'https://openaipublic.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt'

Run

The simplest way to run is just to pass in your text prompt. For example:

python generate_diffuse.py -p "A painting of an apple"

Multiple prompts

Text and image prompts can be split using the pipe symbol (|) to allow multiple prompts. You can also append a colon followed by a number to set a weight for that prompt. For example:

python generate_diffuse.py -p "A painting of an apple:1.5|a surreal painting of a weird apple:0.5"

Other options

There are a variety of other options to play with. Use help to display them:

python generate_diffuse.py -h
usage: generate_diffuse.py [-h] [-p PROMPTS] [-ip IMAGE_PROMPTS] [-ii INIT_IMAGE]
[-st SKIP_TIMESTEPS] [-is INIT_SCALE] [-m CLIP_MODEL] [-t TIMESTEPS]
[-ds DIFFUSION_STEPS] [-se SAVE_EVERY] [-bs BATCH_SIZE] [-nb N_BATCHES] [-cuts CUTN]
[-cutb CUTN_BATCHES] [-cutp CUT_POW] [-cgs CLIP_GUIDANCE_SCALE]
[-tvs TV_SCALE] [-rgs RANGE_SCALE] [-os IMAGE_SIZE] [-s SEED] [-o OUTPUT] [-nfp] [-pl]

init_image

  • 'skip_timesteps' needs to be between approx. 200 and 500 when using an init image.
  • 'init_scale' enhances the effect of the init image; a good value is 1000 (see the example below).
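For example, a run seeded from an existing image might look like this (the image filename is just a placeholder; the flag names are taken from the usage output above):

python generate_diffuse.py -p "A painting of an apple" -ii input.png -st 350 -is 1000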

timesteps

The number of timesteps, or one of ddim25, ddim50, ddim150, ddim250, ddim500, ddim1000. Must divide evenly into diffusion_steps.
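For example, both of these satisfy that rule (the numbers are chosen only to illustrate the divisibility requirement):

python generate_diffuse.py -p "A painting of an apple" -t 250 -ds 1000
python generate_diffuse.py -p "A painting of an apple" -t ddim250 -ds 1000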

image guidance

  • 'clip_guidance_scale': controls how much the image should look like the prompt.
  • 'tv_scale': controls the smoothness of the final output.
  • 'range_scale': controls how far out of range RGB values are allowed to be.
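For intuition, guidance losses in this style of notebook are typically combined as a weighted sum of a CLIP distance, a total-variation term and a range penalty, with the three scales above as the weights. A rough sketch (illustrative only; the default values and the helper names are assumptions, and the repo's actual loss is in generate_diffuse.py):

import torch

def tv_loss(x):
    # Total variation: penalises differences between neighbouring pixels (smoother output).
    return ((x[..., 1:, :] - x[..., :-1, :]) ** 2).mean() + ((x[..., :, 1:] - x[..., :, :-1]) ** 2).mean()

def range_loss(x):
    # Penalises RGB values that stray outside the valid [-1, 1] range.
    return (x - x.clamp(-1, 1)).abs().mean()

def guidance_loss(clip_dist, x, clip_guidance_scale=1000, tv_scale=150, range_scale=50):
    # clip_dist: distance between the CLIP embeddings of the image and the prompt.
    return clip_dist * clip_guidance_scale + tv_loss(x) * tv_scale + range_loss(x) * range_scale

x = torch.rand(1, 3, 256, 256) * 2 - 1  # dummy sample in [-1, 1]
print(guidance_loss(clip_dist=0.5, x=x))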

Examples using a number of options:

python generate_diffuse.py -p "An amazing fractal" -os=256 -cgs=1000 -tvs=50 -rgs=50 -cuts=16 -cutb=4 -t=200 -se=200 -m=ViT-B/32 -o=my_fractal.png

python generate_diffuse.py -p "An impressionist painting of a cat:1.75|trending on artstation:0.25" -cgs=500 -tvs=55 -rgs=50 -cuts=16 -cutb=2 -t=100 -ds=2000 -m=ViT-B/32 -pl -o=cat_100.png

(Funny looking cat, but hey!)

Other repos

You may also be interested in https://github.com/afiaka87/clip-guided-diffusion

For upscaling images, try https://github.com/xinntao/Real-ESRGAN

Citations

@misc{unpublished2021clip,
    title  = {CLIP: Connecting Text and Images},
    author = {Alec Radford and Ilya Sutskever and Jong Wook Kim and Gretchen Krueger and Sandhini Agarwal},
    year   = {2021}
}
Comments
  • ERROR: No matching distribution found for blobfile>=1.0.5

    Awesome repo. Thank you. Ran in Google Colab with no issue. Figured I would try running locally on my NVIDIA Jetson Xavier SBC. Got all dependencies to work except guided diffusion. Any ideas? TIA.

    Obtaining file:///media/dennis/64GB_SD_EXT4/AI_Art/guided-diffusion
    Preparing metadata (setup.py) ... done
    ERROR: Could not find a version that satisfies the requirement blobfile>=1.0.5 (from guided-diffusion) (from versions: 0.1, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.3.0, 0.3.1, 0.3.2, 0.3.3, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.5.0, 0.6.1, 0.7.0, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.10.1, 0.10.2, 0.11.0)
    ERROR: No matching distribution found for blobfile>=1.0.5

    opened by DennisFaucher 3
  • Please update the download path of the 512*512 model

    Thank you for your open source code : ), but I failed to download the 512x512_diffusion_uncond_finetune_008100.pt. I noticed that the download path of the model has been updated to curl -OL https://v-diffusion.s3.us-west-2.amazonaws.com/512x512_diffusion_uncond_finetune_008100.pt in the corresponding colab. Do you need to update the README or setup.sh?

    opened by uk9921 1
  • Error when generating image

    I'm getting this error after entering this command: python generate_diffuse.py -p "A painting of an apple"

    (cgd) C:\Users\Computer\CLIP-Guided-Diffusion>python generate_diffuse.py -p "A painting of an apple"
    Traceback (most recent call last):
      File "C:\Users\Computer\CLIP-Guided-Diffusion\generate_diffuse.py", line 40, in <module>
        from IPython import display
    ModuleNotFoundError: No module named 'IPython'

    What should I do to solve this error?

    opened by Redivh 0
  • Data conversion error on Apple silicon (e.g. M1, M2)

    I get the following error in Katherine Crowson's code and in running your code as specified in the README. I started with the following after editing device = "mps" (rather than cuda).

    % python generate_diffuse.py -p "A painting of an apple"
    Device: mps
    Size: 256
    Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]

    ....and then later down at the end......

    TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.

    Do you know of a solution? Note that gaussian_diffusion.py and resample.py contain 'float64' but when trying and editing KC's original notebook, this change to float32 did not solve the data type problem.

    opened by metaphorz 0
  • Fatal Error 'unfolded2d_copy not implemented' When Attempting to Run for the First Time

    Hello, I have finished installing CLIP-Guided-Diffusion, but when I run it, this error happens:

    Device: cpu
    Size: 256
    Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
    Loading model from: C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\lpips\weights\v0.1\vgg.pth
    Seed: 1221123546082200
    Text prompt: A rope tied in a figure-eight knot
    0%| | 0/1000 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "C:\Users\XXXXXX\CLIP-Guided-Diffusion\generate_diffuse.py", line 460, in <module>
        do_run()
      File "C:\Users\XXXXXX\CLIP-Guided-Diffusion\generate_diffuse.py", line 359, in do_run
        for j, sample in enumerate(samples):
      File "c:\users\XXXXXX\guided-diffusion\guided_diffusion\gaussian_diffusion.py", line 637, in p_sample_loop_progressive
        out = sample_fn(
      File "c:\users\XXXXXXX\guided-diffusion\guided_diffusion\gaussian_diffusion.py", line 461, in p_sample
        out = self.p_mean_variance(
      File "c:\users\XXXXXXX\guided-diffusion\guided_diffusion\respace.py", line 91, in p_mean_variance
        return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
      File "c:\users\XXXXXX\guided-diffusion\guided_diffusion\gaussian_diffusion.py", line 260, in p_mean_variance
        model_output = model(x, self._scale_timesteps(t), **model_kwargs)
      File "c:\users\XXXXXX\guided-diffusion\guided_diffusion\respace.py", line 128, in __call__
        return self.model(x, new_ts, **kwargs)
      File "C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "c:\users\XXXXX\guided-diffusion\guided_diffusion\unet.py", line 656, in forward
        h = module(h, emb)
      File "C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "c:\users\XXXXXX\guided-diffusion\guided_diffusion\unet.py", line 77, in forward
        x = layer(x)
      File "C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
        return forward_call(*input, **kwargs)
      File "C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\torch\nn\modules\conv.py", line 443, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "C:\Users\XXXXXX\Anaconda3\envs\cgd\lib\site-packages\torch\nn\modules\conv.py", line 439, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: "unfolded2d_copy" not implemented for 'Half'

    Any help would be appreciated.

    opened by DC-19 1
  • In what is this project different from "Big Sleep"?

    Hello... I was wondering, how is this project different from Big Sleep ? Does it improve it in some way? I am curious to know because I was actually trying to find similar projects to Big Sleep, because unfortunately my GPU's V-RAM is not enough to run it, and so I was looking for some different, kind of more "modest" implementation... :)

    When I say that I'm low on VRAM I mean, desperately low 😅 (2 Gb !) BUT! - because I successfully managed to run Deep Daze which works similarly using CLIP and a SIREN in place of a BigGAN - then, I haven't lost all hope yet...! :) So yeah, if you have any pointers about your implementation and how it differs from Big Sleep, and most importantly if there could be a way to tune it for working on such a low VRAM amount (even at ridiculously low resolutions, doesn't matter), that would be hugely appreciated!

    Also, yes, I know I could use Google Colabs! But witnessing the ML magic happening right within your Machine, makes it for a totally different experience... ;) And yeah I want to get a new, decent GPU as soon as possible! But I'd feel so stupid to pay 3 times its actual cost just because the IT hardware market it's fucked up (and keeps staying like so...) So... Now you know all :)

    opened by illtellyoulater 0
Owner
Nerdy Rodent
Just a nerdy rodent. I do arty stuff with computers.