ai-generated-pokemon-rudalle
Python script to preprocess images of all Pokémon (the "official artwork" of each Pokémon via PokéAPI) into a format suitable for finetuning ruDALL-E using the finetuning example Colab Notebook linked in that repo. This workflow was used to create a model that resulted in AI-Generated Pokémon that went viral (10k+ retweets on Twitter and 30k+ upvotes on Reddit).
My modified Colab Notebook that I used to finetune the model on Pokémon is here; this Notebook's release is purely for demonstration/authentication purposes, and no support will be given on how to use it because it is incredibly messy and embarrassing, but there may be a few ideas in it that are useful for future generation. Some notes on how the process works are included below, with opportunity to reproduce/improve it.
The script outputs two things: an `images` folder with all the preprocessed images, plus a `data_desc.csv` file which contains the image path/Russian caption pairs for finetuning. Some examples of the preprocessed input images are present in the `images` folder, plus the final `data_desc.csv`.
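For clarity, each row of `data_desc.csv` pairs an image path with its Russian caption. A minimal sketch of how such a file could be written; the example row (path and caption) is purely hypothetical:

```python
import csv

# Hypothetical (image path, Russian caption) pairs; the real rows
# are generated by the preprocessing script.
rows = [
    ("images/1.png", "Покемон травяного типа и ядовитого типа"),
]

with open("data_desc.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
```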
The model used is not included in this repo because it's currently too large (~3GB) to distribute (will add the model to Hugging Face at some point).
Preprocessing Script Notes
- The GraphQL interface to PokéAPI is used because it allows retrieving the type information plus the IDs of all Pokémon in a single request (see the query sketch after this list). As a bonus, the returned IDs include the alternate forms of Pokémon (e.g. Mega) which would not otherwise be present just by incrementing IDs.
- ruDALL-E requires 256x256px RGB input images. In this case, the source input images from PokéAPI are conveniently both square and larger than 256x256, so they downsample nicely. Since the images have transparency (RGBA), they are composited onto a white background (see the preprocessing sketch after this list).
- The translation service used is Yandex, which apparently has decent rate limits; plus, as a Russian company, its translations from English to Russian should theoretically be better.
- The captions (which are later translated into Russian) are determined by type. For example, a Grass/Poison type will have the caption `A Grass-type and Poison-type Pokémon`, which is then translated into Russian. In theory, this improves the finetuning process by allowing ruDALL-E to notice trends; also in theory, this can be leveraged at generation time to control the generation (e.g. prompt with `A Grass-type Pokémon` and have ruDALL-E generate only Grass-type Pokémon).
- Due to potential rate limits on translation, translations are cached at runtime by Pokémon type(s) so the API is pinged only once per unique caption (see the translation sketch after this list).
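As a rough illustration, a single GraphQL request against PokéAPI's beta GraphQL endpoint can return every Pokémon ID together with its types. The exact `pokemon_v2_*` field names below reflect PokéAPI's published schema but should be treated as an assumption, not a verified excerpt from the script:

```python
import requests

# PokéAPI's GraphQL endpoint (beta at the time of writing).
GRAPHQL_URL = "https://beta.pokeapi.co/graphql/v1beta"

# One query retrieves the ID and type(s) of every Pokémon,
# including alternate forms such as Megas.
QUERY = """
query {
  pokemon_v2_pokemon {
    id
    pokemon_v2_pokemontypes {
      pokemon_v2_type { name }
    }
  }
}
"""

resp = requests.post(GRAPHQL_URL, json={"query": QUERY})
resp.raise_for_status()

for pokemon in resp.json()["data"]["pokemon_v2_pokemon"]:
    types = [t["pokemon_v2_type"]["name"]
             for t in pokemon["pokemon_v2_pokemontypes"]]
    print(pokemon["id"], types)
```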
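The compositing/downsampling step is straightforward with Pillow. A minimal sketch (the function name and file paths are hypothetical):

```python
from PIL import Image

def preprocess(src_path: str, dest_path: str, size: int = 256) -> None:
    # Load the RGBA source art and composite it onto a white background,
    # since ruDALL-E expects RGB with no alpha channel.
    art = Image.open(src_path).convert("RGBA")
    background = Image.new("RGBA", art.size, (255, 255, 255, 255))
    composited = Image.alpha_composite(background, art).convert("RGB")

    # The source art is square and larger than 256x256,
    # so a plain downsample is all that's needed.
    composited.resize((size, size), Image.LANCZOS).save(dest_path)

preprocess("raw/bulbasaur.png", "images/bulbasaur.png")
```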
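Caption construction plus the per-type translation cache might look like the following sketch. The `translatepy` usage shown (a `YandexTranslate` translator whose `.translate()` returns an object with a `.result` attribute) is my reading of that library's API, not a verified excerpt:

```python
from translatepy.translators.yandex import YandexTranslate

translator = YandexTranslate()
translation_cache: dict[str, str] = {}

def build_caption(types: list[str]) -> str:
    # e.g. ["grass", "poison"] -> "A Grass-type and Poison-type Pokémon"
    type_str = " and ".join(f"{t.title()}-type" for t in types)
    return f"A {type_str} Pokémon"

def translate_caption(types: list[str]) -> str:
    caption = build_caption(types)
    # Only hit the translation API once per unique type combination.
    if caption not in translation_cache:
        translation_cache[caption] = translator.translate(caption, "Russian").result
    return translation_cache[caption]
```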
Finetuning and Generation Notes
- The model used above was trained for 12 epochs (4.5 hours on a P100) at a max learning rate of `1e-5`. The `pct_start` param of the `OneCycleLR` scheduler was set to `0.1` so that learning rate decay happens faster. Despite that, the model converged quickly (see the scheduler sketch after this list).
- Finetuning ruDALL-E is very sensitive to its parameters, making it difficult to get the expected results. Too little training and the output images will be too incoherent; too much training and the model will overfit, outputting the source images and ignoring any text prompts. In the social media posts above, the model is slightly overfit and attempts at using text prompts to control generation failed. But overfitting is not necessarily a bad thing as long as it avoids verbatim output.
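For reference, the scheduler configuration described above maps onto PyTorch's `OneCycleLR` roughly as follows; the model, optimizer, and step counts are placeholder assumptions, since the real finetuning loop lives in the Colab Notebook:

```python
import torch

# Placeholder model/optimizer standing in for ruDALL-E's.
model = torch.nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

steps_per_epoch = 100  # assumption: depends on dataset/batch size
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-5,                     # max learning rate noted above
    epochs=12,                       # 12 epochs of finetuning
    steps_per_epoch=steps_per_epoch,
    pct_start=0.1,                   # warm up for only 10% of training,
)                                    # so decay starts sooner

for _ in range(12 * steps_per_epoch):
    optimizer.step()
    scheduler.step()
```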
Usage
You can install the dependencies via:
`pip3 install Pillow requests translatepy tqdm`
Then run `build_image_dataset.py`.
Getting the images into the ruDALL-E finetuning Colab Notebook is up to the user, but the recommended way to do so is to ZIP the generated `images` folder (~42 MB!), upload it to Colab (or upload it to Google Drive and copy it into the Notebook from there), and unzip the folder in Colab itself via `!unzip`.
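In a Colab cell, that last step might look like this (the archive name `images.zip` is hypothetical):

```python
# Run in a Colab cell after uploading the archive; -q suppresses
# the per-file output for the hundreds of images.
!unzip -q images.zip
```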
Maintainer/Creator
Max Woolf (@minimaxir)
Max's open-source projects are supported by his Patreon and GitHub Sponsors. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.
License
MIT