CLIP + VQGAN / PixelDraw

Overview

clipit

Yet Another VQGAN-CLIP Codebase

This started as a fork of @nerdyrodent's VQGAN-CLIP code, which was in turn based on notebooks by @RiversHaveWings and @advadnoun, but it quickly morphed into a version tuned up with slightly different behavior and features. It runs either at the command line, in a notebook, or (soon) in batch mode.

Basically this is a version of the notebook with opinionated defaults and slightly different internals. You are welcome to use it if you'd like.

For now, check out THE DEMO NOTEBOOKS, especially the super simple "Start Here" Colab.
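
Below is a minimal sketch of the notebook usage, pieced together from the apply_settings() / do_init() / do_run() calls that appear in the issue tracebacks further down; reset_settings() and the prompts keyword are assumptions, so treat this as an illustration rather than the canonical API:

    import clipit

    # reset_settings() and the "prompts"/"iterations" keywords are assumed names, not verified against the repo
    clipit.reset_settings()
    clipit.add_settings(prompts="a sunset over the mountains", iterations=300)
    settings = clipit.apply_settings()
    clipit.do_init(settings)
    clipit.do_run(settings)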

Citations

@misc{unpublished2021clip,
    title  = {CLIP: Connecting Text and Images},
    author = {Alec Radford and Ilya Sutskever and Jong Wook Kim and Gretchen Krueger and Sandhini Agarwal},
    year   = {2021}
}
@misc{esser2020taming,
      title={Taming Transformers for High-Resolution Image Synthesis}, 
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2020},
      eprint={2012.09841},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Katherine Crowson - https://github.com/crowsonkb
Adverb - https://twitter.com/advadnoun

Comments
  • missing requirements

    It's known that the repo is sloppy with requirements; in particular, it needs a pip install of:

    braceexpand perlin_numpy

    This issue replaces dribnet/clipit_old#3 (more context there)
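
    A minimal sketch of the extra installs, in the notebook style used elsewhere in this repo; braceexpand is on PyPI, while perlin_numpy is assumed here to come from its GitHub source (the install source is an assumption, not stated in the issue):

      # extra dependencies the repo does not declare (see above)
      !pip install braceexpand
      # perlin_numpy is assumed to install from its GitHub repository
      !pip install git+https://github.com/pvigier/perlin-numpy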

    help wanted 
    opened by dribnet 8
  • Runtime Error for Pixray

    Swirl Demo


    RuntimeError                              Traceback (most recent call last)
    in ()
         70
         71 settings = pixray.apply_settings()
    ---> 72 pixray.do_init(settings)
         73 pixray.do_run(settings)
         74

    1 frames
    /usr/local/lib/python3.7/dist-packages/torch/jit/_script.py in fail(self, *args, **kwargs)
        912 def _make_fail(name):
        913     def fail(self, *args, **kwargs):
    --> 914         raise RuntimeError(name + " is not supported on ScriptModules")
        915
        916     return fail

    RuntimeError: requires_grad_ is not supported on ScriptModules

    opened by Ghee36 6
  • Best Optimiser for generation

    After some experimentation, I've found that the best optimiser to use (at least, for standard VQGAN, not pixel or clipdraw) is DiffGrad, with a step size somewhere around 1. It gives a perfect balance of structure and detail and very rarely screws up!

    Prompt: "A village inside of a sewer, inhabited by humanoid tardigrades. 8K HD detailed Wallpaper, digital illustration, artstation." Ran for 500 iterations, using the sflckr model.
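
    A minimal sketch of selecting this optimiser from a notebook, following the add_settings() pattern shown elsewhere in these issues; the optimiser and step_size keyword names are assumptions carried over from the parent VQGAN-CLIP repo's command-line flags:

      # keyword names (optimiser, step_size) are assumed, not verified against clipit
      clipit.add_settings(optimiser="DiffGrad", step_size=1.0)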

    opened by varkarrus 6
  • Pixel Size

    What are the recommended parameters for pixel_size? I am trying to achieve smaller pixels; I have tried 1,1, 2,2, 3,3, 8,8, and 16,16.

    Am I doing it incorrectly? Probably. I tried adding pixel_size = [4, 4] directly as well as clipit.add_settings(pixel_size=[16, 16]), without any change in the visual output.
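
    One hedged guess: pixel_size may only be consulted when the pixel drawer is active, so that drawer has to be enabled explicitly; the use_pixeldraw setting name below is an assumption taken from the PixelDraw demo notebook, not a confirmed answer to this issue:

      # use_pixeldraw and pixel_size keyword names are assumptions
      clipit.add_settings(use_pixeldraw=True, pixel_size=[16, 16])
      settings = clipit.apply_settings()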

    opened by Ghee36 5
  • Two new features added.

    Two new features added: palette enforcement and smoothness enforcement. I added several options to "vq_parser" and made minor changes in ascend_txt() (adding two new losses). A new demo notebook is also included.

    opened by altsoph 5
  • Fixed pydiffvg import errors

    "CLIPIT PixelDraw" notebook on commit df1508, which still fails on Colab with the error NameError: name 'PixelDrawer' is not defined.

    The error in more detail is:

    <ipython-input-2-2646d65429bc> in <module>()
         40 
         41 settings = clipit.apply_settings()
    ---> 42 clipit.do_init(settings)
         43 clipit.do_run(settings)
    
    /content/clipit/clipit.py in do_init(args)
        395             drawer = PixelDrawer(args.size[0], args.size[1], args.do_mono, [40, 40], scale=args.pixel_scale)
        396         else:
    --> 397             drawer = PixelDrawer(args.size[0], args.size[1], args.do_mono, scale=args.pixel_scale)
        398     else:
        399         drawer = VqganDrawer(args.vqgan_model)
    
    NameError: name 'PixelDrawer' is not defined
    

    If I try running just from pixeldrawer import PixelDrawer, I get ModuleNotFoundError: No module named 'pydiffvg'

    I can run import diffvg, however that is the wrong module name.

    I changed the install code to:

      !git clone https://github.com/BachiLi/diffvg pydiffvg
      %cd pydiffvg
      # !ls
      !git submodule update --init --recursive
      !python setup.py install
    

    This worked until I got the error:

    ----> 1 pydiffvg.set_print_timing
    
    AttributeError: module 'pydiffvg' has no attribute 'set_print_timing'.
    

    Removing this line fixes the error.

    Additionally, my text editor auto-applied Black formatting to this file to improve the formatting.

    opened by Wheest 3
  • Colab should warn when there are insufficient GPU resources.

    NameError                                 Traceback (most recent call last)
    <ipython-input-6-e07db4d0b3d4> in <module>()
         40 
         41 settings = clipit.apply_settings()
    ---> 42 clipit.do_init(settings)
         43 clipit.do_run(settings)
    
    /content/clipit/clipit.py in do_init(args)
        373             drawer = PixelDrawer(args.size[0], args.size[1], args.do_mono, [40, 40], scale=args.pixel_scale)
        374         else:
    --> 375             drawer = PixelDrawer(args.size[0], args.size[1], args.do_mono, scale=args.pixel_scale)
        376     else:
        377         drawer = VqganDrawer(args.vqgan_model)
    
    NameError: name 'PixelDrawer' is not defined
    
    opened by su5yam 3
  • Parameter Explanations

    Hi, I may have missed it, but are there any docs on the parameters? Some I can puzzle out from the code, but others, like spot prompts or the number of cuts, are a bit more mysterious to me.
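
    Assuming there are no parameter docs yet, one hedged way to at least enumerate the available settings is to lean on the argparse parser (vq_parser / setup_parser appear in the tracebacks in other issues), given that the README says clipit.py can be run from the command line:

      # prints every argparse flag with its default value
      !python clipit.py --help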

    opened by cadowyn 2
  • RuntimeError: Sizes of tensors

    Using device: cuda:0
    Optimising using: AdamP
    Using text prompts: ['leaf:-2', 'area:-3', 'flatness:5', 'text:-2', 'light on silver:3', 'flat Art', 'iridescence:1', 'the light and metallic volume of color luminance displayed']
    Using seed: -1
    Oops: runtime error: Sizes of tensors must match except in dimension 3. Got 224 and 398 (The offending index is 0)
    Try reducing --num-cuts to save memory
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:3613: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.

    RuntimeError                              Traceback (most recent call last)
    in ()
        104 settings = clipit.apply_settings()
        105 clipit.do_init(settings)
    --> 106 clipit.do_run(settings)

    5 frames
    /content/clipit/clipit.py in forward(self, input, spot)
        338         batch2, transforms2 = self.augs_wide(torch.cat(cutouts[self.cutn_zoom:], dim=0))
        339         # print(batch1.shape, batch2.shape)
    --> 340         batch = torch.cat([batch1, batch2])
        341         # print(batch.shape)
        342         self.transforms = torch.cat([transforms1, transforms2])

    RuntimeError: Sizes of tensors must match except in dimension 3. Got 224 and 398 (The offending index is 0)

    opened by Ghee36 2
  • Enforce palette annealing

    Basically, I've added two things:

    1. I've tried to make enforce_smoothness softer. Originally this loss was a linear function of the local contrast, so a higher difference between neighboring pixels meant a bigger loss, which leads to pale results. Now it is logarithmic, so it still penalizes dithering but should be less brutal with sharp edges. The changes are minor, however.
    2. I've added another loss, enforce_saturation. It is based on the old "perceived colourfulness" heuristic from Hasler and Süsstrunk's 2003 paper (https://www.researchgate.net/publication/243135534_Measuring_Colourfulness_in_Natural_Images); a small sketch of that measure follows below.
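
    A minimal sketch of the Hasler and Süsstrunk colourfulness measure referenced above, written as a standalone PyTorch function; the function name and tensor layout are assumptions, and this is not the actual ascend_txt() change from the pull request:

      import torch

      def colourfulness(img: torch.Tensor) -> torch.Tensor:
          # img: (batch, 3, H, W) RGB tensor with values in [0, 1]
          r, g, b = img[:, 0], img[:, 1], img[:, 2]
          rg = r - g                    # red-green opponent channel
          yb = 0.5 * (r + g) - b        # yellow-blue opponent channel
          sigma = torch.sqrt(rg.flatten(1).var(dim=1) + yb.flatten(1).var(dim=1))
          mu = torch.sqrt(rg.flatten(1).mean(dim=1) ** 2 + yb.flatten(1).mean(dim=1) ** 2)
          return sigma + 0.3 * mu       # higher value = more colourful image

      # a saturation loss could then reward colourfulness, e.g.
      # loss = -saturation_weight * colourfulness(out).mean()
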
    opened by altsoph 2
  • pixray account

    Dear @pixray

    I would like to set up a "pixray" account to be associated with this software. You registered the account several years ago but don't seem to be using it. Please let us know if you might be interested in transferring the account to us so that we could set up a GitHub organization with this same name.

    opened by dribnet 0
  • Error while running `clipit.do_init(settings) `

        213         if return_transform is not None:
        214             raise ValueError(
    --> 215                 "`return_transform` is deprecated. Please access the transformation matrix with "
        216                 "`.transform_matrix`. For chained matrices, please use `AugmentationSequential`."
        217             )
    
    ValueError: `return_transform` is deprecated. Please access the transformation matrix with `.transform_matrix`. For chained matrices, please use `AugmentationSequential`.
    
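
    This error comes from kornia rejecting the return_transform argument that clipit still passes. A hedged workaround is to pin an older kornia before installing clipit; the exact version below is a guess, not a tested fix:

      # pin kornia to a release that still accepts return_transform (version is a guess)
      !pip install "kornia<0.6.8"
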
    opened by hritikb27 1
  • Swirl Demo

    KeyError                                  Traceback (most recent call last)
    in ()
         69 # pixray.add_settings(iterations=500, display_every=50)
         70
    ---> 71 settings = pixray.apply_settings()
         72 pixray.do_init(settings)
         73 pixray.do_run(settings)

    /content/pixray/pixray.py in apply_settings()
       1567
       1568     vq_parser = setup_parser(vq_parser)
    -> 1569     class_table[settings_core.drawer].add_settings(vq_parser)
       1570
       1571     if len(global_pixray_settings) > 0:

    KeyError: 'pixel'

    opened by manofiga 1
  • VQGAN+CLIP on linux gives different results

    The pictures from this repo in Google Colab are very beautiful. You've written that this started as a fork of VQGAN+CLIP. I want to use it on Linux. However, VQGAN+CLIP (https://github.com/nerdyrodent/VQGAN-CLIP) with the same seed, pretrained LPIPS, checkpoints (.ckpt), and optimiser (Adam) gives me a bad picture. Why? Both projects use CLIP and taming-transformers. What makes the pictures from your clipit more beautiful? How can I run your clipit on Linux, or what should I change in VQGAN+CLIP to get such nice results? Your clipit (seed 15075320329395737548, 'space ship') versus VQGAN+CLIP on Linux (same seed, same prompt):

    opened by DmitroC 0
  • No such file or directory: 'wget'

    I'm attempting to run this in PyCharm and am getting the error FileNotFoundError: [Errno 2] No such file or directory: 'wget'. It looks to be coming from output = subprocess.check_output(['wget', '-O', out, url]) in the vqgan.py file. I installed wget via pip, but it is still throwing the error. Does anyone know why this error may be showing up?
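
    The wget package installed via pip provides a Python module rather than the command-line wget binary that subprocess is looking for. A minimal sketch of a workaround, assuming the only thing needed here is a plain file download (the download helper below is hypothetical, not part of vqgan.py):

      import urllib.request

      def download(url: str, out: str) -> None:
          # pure-Python replacement for: subprocess.check_output(['wget', '-O', out, url])
          urllib.request.urlretrieve(url, out)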

    opened by mbecker4 1
  • Pixray 'MyRandomPerspective' issue

    Messing with pixray, during the run I get:

    AttributeError                            Traceback (most recent call last)
    in ()
         26 pixray.do_init(settings)
         27
    ---> 28 pixray.do_run(settings)
         29
         30 shutil.copyfile(f"/content/output.png", f"{OUTPUT_DIR}/{text}_{seed}.png")

    11 frames
    /usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
       1129             return modules[name]
       1130         raise AttributeError("'{}' object has no attribute '{}'".format(
    -> 1131             type(self).__name__, name))
       1132
       1133     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

    AttributeError: 'MyRandomPerspective' object has no attribute 'resample'

    Is this due to a simple parameter?

    opened by Ghee36 0
Owner
dribnet
Lecturer at University of Wellington School of Design teaching creative coding and researching neural design.
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Nerdy Rodent 2.3k Jan 4, 2023
Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized

VQGAN-CLIP-Docker About Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized This is a stripped and minimal dependency repository for running loca

Kevin Costa 73 Sep 11, 2022
Streamlit Tutorial (ex: stock price dashboard, cartoon-stylegan, vqgan-clip, stylemixing, styleclip, sefa)

Streamlit Tutorials Install pip install streamlit Run cd [directory] streamlit run app.py --server.address 0.0.0.0 --server.port [your port] # http:/

Jihye Back 30 Jan 6, 2023
Text2Art is an AI art generator powered with VQGAN + CLIP and CLIPDrawer models

Text2Art is an AI art generator powered with VQGAN + CLIP and CLIPDrawer models. You can easily generate all kind of art from drawing, painting, sketch, or even a specific artist style just using a text input. You can also specify the dimensions of the image. The process can take 3-20 mins and the results will be emailed to you.

Muhammad Fathy Rashad 643 Dec 30, 2022
An architecture that makes any doodle realistic, in any specified style, using VQGAN, CLIP and some basic embedding arithmetics.

Sketch Simulator An architecture that makes any doodle realistic, in any specified style, using VQGAN, CLIP and some basic embedding arithmetics. See

null 12 Dec 18, 2022
Making a music video with Wav2CLIP and VQGAN-CLIP

music2video Overview A repo for making a music video with Wav2CLIP and VQGAN-CLIP. The base code was derived from VQGAN-CLIP The CLIP embedding for au

Joel Jang | 장요엘 163 Dec 26, 2022
Traditional deepdream with VQGAN+CLIP and optical flow. Ready to use in Google Colab

VQGAN-CLIP-Video cat.mp4 policeman.mp4 schoolboy.mp4 forsenBOG.mp4

null 23 Oct 26, 2022
FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

XCL 191 Dec 31, 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

CLIP-GEN [Simplified Chinese][English] This project implements the paper "CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP" in PyTorch on the Fire-Flyer 2 cluster. CLIP-GEN is a Language-F

null 75 Dec 29, 2022
Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network)

Deep Daze mist over green hills shattered plates on the grass cosmic love and attention a time traveler in the crowd life during the plague meditative

Phil Wang 4.4k Jan 3, 2023
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is

Federico Galatolo 172 Dec 22, 2022
CLIP+FFT text-to-image

Aphantasia This is a text-to-image tool, part of the artwork of the same name. Based on CLIP model, with FFT parameterizer from Lucent library as a ge

vadim epstein 690 Jan 2, 2023
Navigating StyleGAN2 w latent space using CLIP

Navigating StyleGAN2 w latent space using CLIP an attempt to build sth with the official SG2-ADA Pytorch impl kinda inspired by Generating Images from

Mike K. 55 Dec 6, 2022
RANZCR-CLiP 7th Place Solution

RANZCR-CLiP 7th Place Solution This repository is WIP. (18 Mar 2021) Installation git clone https://github.com/analokmaus/kaggle-ranzcr-clip-public.gi

Hiroshechka Y 21 Oct 22, 2022
A containerized REST API around OpenAI's CLIP model.

OpenAI's CLIP — REST API This is a container wrapping OpenAI's CLIP model in a RESTful interface. Running the container locally First, build the conta

Santiago Valdarrama 48 Nov 6, 2022
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN.

Ryan Murdock has done it again, combining OpenAI's CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.

Phil Wang 2.3k Jan 9, 2023
Simple implementation of OpenAI CLIP model in PyTorch.

It was in January of 2021 that OpenAI announced two new models: DALL-E and CLIP, both multi-modality models connecting texts and images in some way. In this article we are going to implement CLIP model from scratch in PyTorch. OpenAI has open-sourced some of the code relating to CLIP model but I found it intimidating and it was far from something short and simple. I also came across a good tutorial inspired by CLIP model on Keras code examples and I translated some parts of it into PyTorch to build this tutorial totally with our beloved PyTorch!

Moein Shariatnia 226 Jan 5, 2023
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

train-CLIP A PyTorch Lightning solution to training CLIP from scratch. Goal ⚽ Our aim is to create an easy to use Lightning implementation of OpenA

Cade Gordon 396 Dec 30, 2022
CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)

CLIP (Contrastive Language–Image Pre-training) Experiments (Evaluation) Model Dataset Acc (%) ViT-B/32 (Paper) CIFAR100 65.1 ViT-B/32 (Our) CIFAR100 6

Myeongjun Kim 52 Jan 7, 2023