CLIP-GLaSS
Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search
here
An in-browser demo is availableInstallation
Clone this repository
git clone https://github.com/galatolofederico/clip-glass && cd clip-glass
Create a virtual environment and install the requirements
virtualenv --python=python3.6 env && . ./env/bin/activate
pip install -r requirements.txt
Run CLIP-GLaSS
You can run CLIP-GLaSS
with:
python run.py --config --target
Specifying
and
according to the following table:
Config | Meaning | Target Type |
---|---|---|
GPT2 | Use GPT2 to solve the Image-to-Text task | Image |
DeepMindBigGAN512 | Use DeepMind's BigGAN 512x512 to solve the Text-to-Image task | Text |
DeepMindBigGAN256 | Use DeepMind's BigGAN 256x256 to solve the Text-to-Image task | Text |
StyleGAN2_ffhq_d | Use StyleGAN2-ffhq to solve the Text-to-Image task | Text |
StyleGAN2_ffhq_nod | Use StyleGAN2-ffhq without Discriminator to solve the Text-to-Image task | Text |
StyleGAN2_church_d | Use StyleGAN2-church to solve the Text-to-Image task | Text |
StyleGAN2_church_nod | Use StyleGAN2-church without Discriminator to solve the Text-to-Image task | Text |
StyleGAN2_car_d | Use StyleGAN2-car to solve the Text-to-Image task | Text |
StyleGAN2_car_nod | Use StyleGAN2-car without Discriminator to solve the Text-to-Image task | Text |
If you do not have downloaded the models weights you will be prompted to run ./download-weights.sh
You will find the results in the folder ./tmp
, a different output folder can be specified with --tmp-folder
Examples
python run.py --config StyleGAN2_ffhq_d --target "the face of a man with brown eyes and stubble beard"
python run.py --config GPT2 --target gpt2_images/dog.jpeg
Acknowledgments and licensing
This work heavily relies on the following amazing repositories and would have not been possible without them:
- CLIP from openai (included in the folder
clip
) - pytorch-pretrained-BigGAN from huggingface
- stylegan2-pytorch from Adrian Sahlman (included in the folder
stylegan2
) - gpt-2-pytorch from Tae-Hwan Jung (included in the folder
gpt2
)
All their work can be shared under the terms of the respective original licenses.
All my original work (everything except the content of the folders clip
, stylegan2
and gpt2
) is released under the terms of the GNU/GPLv3 license. Coping, adapting e republishing it is not only consent but also encouraged.
Citing
If you want to cite use you can use this BibTeX
@article{galatolo_glass
, author = {Galatolo, Federico A and Cimino, Mario GCA and Vaglini, Gigliola}
, title = {Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search}
, year = {2021}
}
Contacts
For any further question feel free to reach me at [email protected] or on Telegram @galatolo