IC-GAN: Instance-Conditioned GAN

Official Pytorch code of Instance-Conditioned GAN by Arantxa Casanova, Marlene Careil, Jakob Verbeek, Michał Drożdżal, Adriana Romero-Soriano.

Generate images with IC-GAN in a Colab Notebook

We provide a Google Colab notebook to generate images with IC-GAN and its class-conditional counterpart.

The figure below depicts two instances, unseen during training and downloaded from Creative Commons search, and the generated images with IC-GAN and class-conditional IC-GAN when conditioning on the class "castle":

Additionally, and inspired by this Colab, we provide the functionality in the same Colab notebook to guide generations with text captions, using the CLIP model. As an example, the following figure shows three instance conditionings and a text caption (top), followed by the resulting images generated with IC-GAN (bottom), when optimizing the noise vector following CLIP's gradient for 100 iterations.

Credit for the three instance conditionings, from left to right, that were modified with a resize and central crop: 1: "Landscape in Bavaria" by shining.darkness, licensed under CC BY 2.0, 2: "Fantasy Landscape - slolsss" by Douglas Tofoli is marked with CC PDM 1.0, 3: "How to Draw Landscapes Simply" by Kuwagata Keisai is marked with CC0 1.0

Requirements

Python 3.8
Cuda v10.2 / Cudnn v7.6.5
gcc v7.3.0
Pytorch 1.8.0
A conda environment can be created from environment.yaml by entering the command: conda env create -f environment.yml. It contains the aforementioned version of Pytorch and the other required packages.
Faiss: follow the instructions in the original repository.

Overview

This repository consists of four main folders:

data_utils: A common folder to obtain and format the data needed to train and test IC-GAN, agnostic of the specific backbone.
inference: Scripts to test the models both qualitatively and quantitatively.
BigGAN_PyTorch: It provides the training, evaluation and sampling scripts for IC-GAN with a BigGAN backbone. The code base comes from Pytorch BigGAN repository, made available under the MIT License. It has been modified to add additional utilities and it enables IC-GAN training on top of it.
stylegan2_ada_pytorch: It provides the training, evaluation and sampling scripts for IC-GAN with a StyleGAN2 backbone. The code base comes from StyleGAN2 Pytorch, made available under the Nvidia Source Code License. It has been modified to add additional utilities and it enables IC-GAN training on top of it.

(Python script) Generate images with IC-GAN

Alternatively, we can generate images with IC-GAN models directly from a python script, by following the next steps:

Download the desired pretrained models (links below) and the pre-computed 1000 instance features from ImageNet and extract them into a folder pretrained_models_path.

model	backbone	class-conditional?	training dataset	resolution	url
IC-GAN	BigGAN	No	ImageNet	256x256	model
IC-GAN (half capacity)	BigGAN	No	ImageNet	256x256	model
IC-GAN	BigGAN	No	ImageNet	128x128	model
IC-GAN	BigGAN	No	ImageNet	64x64	model
IC-GAN	BigGAN	Yes	ImageNet	256x256	model
IC-GAN (half capacity)	BigGAN	Yes	ImageNet	256x256	model
IC-GAN	BigGAN	Yes	ImageNet	128x128	model
IC-GAN	BigGAN	Yes	ImageNet	64x64	model
IC-GAN	BigGAN	Yes	ImageNet-LT	256x256	model
IC-GAN	BigGAN	Yes	ImageNet-LT	128x128	model
IC-GAN	BigGAN	Yes	ImageNet-LT	64x64	model
IC-GAN	BigGAN	No	COCO-Stuff	256x256	model
IC-GAN	BigGAN	No	COCO-Stuff	128x128	model
IC-GAN	StyleGAN2	No	COCO-Stuff	256x256	model
IC-GAN	StyleGAN2	No	COCO-Stuff	128x128	model

Execute:

python inference/generate_images.py --root_path [pretrained_models_path] --model [model] --model_backbone [backbone] --resolution [res]

model can be chosen from ["icgan", "cc_icgan"] to use the IC-GAN or the class-conditional IC-GAN model respectively.
backbone can be chosen from ["biggan", "stylegan2"].
res indicates the resolution at which the model has been trained. For ImageNet, choose one in [64, 128, 256], and for COCO-Stuff, one in [128, 256].

This script results in a .PNG file where several generated images are shown, given an instance feature (each row), and a sampled noise vector (each grid position).
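The command above can also be assembled from Python. The following is a minimal sketch, assuming only the four flags shown in the command and the allowed values listed above; the helper name build_generate_command and its dataset argument are illustrative and not part of the repository.

```python
import shlex

# Allowed values as listed in this README.
MODELS = ("icgan", "cc_icgan")
BACKBONES = ("biggan", "stylegan2")
RESOLUTIONS = {"imagenet": (64, 128, 256), "coco": (128, 256)}


def build_generate_command(root_path, model, backbone, res, dataset="imagenet"):
    """Return the argument list for inference/generate_images.py.

    `dataset` is a hypothetical argument used only to validate `res`;
    it is not passed on to the script.
    """
    if model not in MODELS:
        raise ValueError(f"model must be one of {MODELS}, got {model!r}")
    if backbone not in BACKBONES:
        raise ValueError(f"backbone must be one of {BACKBONES}, got {backbone!r}")
    if dataset not in RESOLUTIONS:
        raise ValueError(f"dataset must be one of {tuple(RESOLUTIONS)}")
    if res not in RESOLUTIONS[dataset]:
        raise ValueError(f"res must be one of {RESOLUTIONS[dataset]} for {dataset}")
    return [
        "python", "inference/generate_images.py",
        "--root_path", str(root_path),
        "--model", model,
        "--model_backbone", backbone,
        "--resolution", str(res),
    ]


cmd = build_generate_command("pretrained_models_path", "icgan", "biggan", 256)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run from the repository root; checking the options first avoids launching the script with a combination that this README does not list.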

Additional and optional parameters:

index: (None by default), is an integer from 0 to 999 that chooses a specific instance feature vector out of the 1000 instances that have been selected with k-means on the ImageNet dataset and stored in pretrained_models_path/stored_instances.
swap_target: (None by default) is an integer from 0 to 999 indicating an ImageNet class label. This label will be used to condition the class-conditional IC-GAN, regardless of which instance features are being used.
which_dataset: (ImageNet by default) can be chosen from ["imagenet", "coco"] to indicate which dataset (training split) to sample the instances from.
trained_dataset: (ImageNet by default) can be chosen from ["imagenet", "coco"] to indicate the dataset on which the IC-GAN model has been trained.
num_imgs_gen: (5 by default), it changes the number of noise vectors to sample per conditioning. Increasing this number results in a bigger .PNG file to save and load.
num_conditionings_gen: (5 by default), it changes the number of conditionings to sample. Increasing this number results in a bigger .PNG file to save and load.
z_var: (1.0 by default) controls the truncation factor for the generation.
Optionally, the script can be run with the following additional options --visualize_instance_images --dataset_path [dataset_path] to visualize the ground-truth images corresponding to the conditioning instance features, given a path to the dataset's ground-truth images dataset_path. Ground-truth instances will be plotted as the leftmost image for each row.

Data preparation

ImageNet

Download dataset from here .
Download SwAV feature extractor weights from here .
Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_swav by the path where SwAV weights are stored.
Execute ./data_utils/prepare_data.sh imagenet [resolution], where [resolution] can be an integer in {64,128,256}. This script will create several hdf5 files:
- ILSVRC[resolution]_xy.hdf5 and ILSVRC[resolution]_val_xy.hdf5, where images and labels are stored for the training and validation set respectively.
- ILSVRC[resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
- ILSVRC[resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.
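To illustrate what the neighbor file holds, here is a minimal brute-force sketch that returns, for each instance feature, the indices of its k nearest neighbors. It is for illustration only: the use of cosine similarity, the include_self switch and the function names are assumptions of this sketch, and it does not use Faiss, which the repository lists among its requirements.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def knn_indices(features, k, include_self=True):
    """For each feature vector, return the indices of its k most similar vectors.

    Similarity is cosine similarity (an assumption of this sketch).
    """
    unit = [normalize(f) for f in features]
    neighbors = []
    for i, query in enumerate(unit):
        sims = []
        for j, other in enumerate(unit):
            if j == i and not include_self:
                continue
            sims.append((sum(a * b for a, b in zip(query, other)), j))
        # Sort by decreasing similarity; ties are broken by the lower index.
        sims.sort(key=lambda t: (-t[0], t[1]))
        neighbors.append([j for _, j in sims[:k]])
    return neighbors


feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(knn_indices(feats, k=2))
```

The brute-force loop costs O(n^2) comparisons, which is only practical for small feature sets; an index-based library is the usual choice at ImageNet scale.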

ImageNet-LT

Download ImageNet dataset from here . Following ImageNet-LT , the file ImageNet_LT_train.txt can be downloaded from this link and later stored in the folder ./BigGAN_PyTorch/imagenet_lt.
Download the pre-trained weights of the ResNet on ImageNet-LT from this link, provided by the classifier-balancing repository .
Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_classifier_lt by the path where the pre-trained ResNet50 weights are stored.
Execute ./data_utils/prepare_data.sh imagenet_lt [resolution], where [resolution] can be an integer in {64,128,256}. This script will create several hdf5 files:
- ILSVRC[resolution]longtail_xy.hdf5, where images and labels are stored for the training and validation set respectively.
- ILSVRC[resolution]longtail_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
- ILSVRC[resolution]longtail_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.

COCO-Stuff

Download the dataset following the LostGANs' repository instructions .
Download SwAV feature extractor weights from here .
Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored, path_imnet by the path where ImageNet dataset is downloaded, and path_swav by the path where SwAV weights are stored.
Execute ./data_utils/prepare_data.sh coco [resolution], where [resolution] can be an integer in {128,256}. This script will create several hdf5 files:
- COCO[resolution]_xy.hdf5 and COCO[resolution]_val_test_xy.hdf5, where images and labels are stored for the training and evaluation set respectively.
- COCO[resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
- COCO[resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of [k_nn] neighbors for each of the instance features.

Other datasets

Download the corresponding dataset and store in a folder dataset_path.
Download SwAV feature extractor weights from here .
Replace the paths in data_utils/prepare_data.sh: out_path by the path where hdf5 files will be stored and path_swav by the path where SwAV weights are stored.
Execute ./data_utils/prepare_data.sh [dataset_name] [resolution] [dataset_path], where [dataset_name] will be the dataset name, [resolution] can be an integer, for example 128 or 256, and dataset_path contains the dataset images. This script will create several hdf5 files:
- [dataset_name][resolution]_xy.hdf5, where images and labels are stored for the training set.
- [dataset_name][resolution]_feats_[feature_extractor]_resnet50.hdf5 that contains the instance features for each image.
- [dataset_name][resolution]_feats_[feature_extractor]_resnet50_nn_k[k_nn].hdf5 that contains the list of k_nn neighbors for each of the instance features.
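Before moving on to training, it can help to check that the three files listed above exist. The sketch below builds the expected names from the patterns in this section; the function names are hypothetical, and the example values ("selfsupervised" as the feature extractor string and 5 as k_nn) are placeholders chosen for illustration rather than values confirmed by the repository.

```python
from pathlib import Path


def expected_hdf5_files(prefix, resolution, feature_extractor, k_nn):
    """Build the training-split file names following the patterns listed above."""
    base = f"{prefix}{resolution}"
    feats = f"{base}_feats_{feature_extractor}_resnet50"
    return {
        "images_and_labels": f"{base}_xy.hdf5",
        "instance_features": f"{feats}.hdf5",
        "neighbors": f"{feats}_nn_k{k_nn}.hdf5",
    }


def missing_hdf5_files(out_path, prefix, resolution, feature_extractor, k_nn):
    """Return the expected file names that are not present in out_path."""
    names = expected_hdf5_files(prefix, resolution, feature_extractor, k_nn)
    return [n for n in names.values() if not (Path(out_path) / n).is_file()]


files = expected_hdf5_files("COCO", 128, "selfsupervised", 5)
print(files["neighbors"])
```

An empty list from missing_hdf5_files suggests that data preparation produced all three files for that dataset and resolution.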

How to subsample an instance feature dataset with k-means

To downsample the instance feature vector dataset, after we have prepared the data, we can use the k-means algorithm:

 python data_utils/store_kmeans_indexes.py --resolution [resolution] --which_dataset [dataset_name] --data_root [data_path]

Adding --gpu allows the faiss library to compute k-means leveraging GPUs, resulting in faster execution.
Adding the parameter --feature_extractor [feature_extractor] chooses which feature extractor to use, with feature_extractor in ['selfsupervised', 'classification'], if we are using SwAV as feature extractor or the ResNet pretrained on the classification task on ImageNet, respectively.
The number of k-means clusters can be set with --kmeans_subsampled [centers], where centers is an integer.
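As a rough illustration of the subsampling idea, the sketch below runs a plain Lloyd's k-means on a list of feature vectors and keeps, for each cluster, the index of the instance closest to the centroid. This is a pure-Python toy under stated assumptions: the selection rule (nearest instance to each centroid), the function names and the default arguments are choices of this sketch, not a description of store_kmeans_indexes.py, which relies on the faiss library.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def kmeans_subsample(features, k, iters=20, seed=0):
    """Run Lloyd's k-means and return, per cluster, the index of the
    feature vector closest to the centroid (sorted ascending)."""
    rng = random.Random(seed)
    centroids = [list(features[i]) for i in rng.sample(range(len(features)), k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            clusters[nearest].append(f)
        # Keep the previous centroid when a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return sorted(
        min(range(len(features)), key=lambda i: sq_dist(features[i], c))
        for c in centroids
    )


features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
print(kmeans_subsample(features, k=2))
```

The returned indexes point back into the original feature list, so the selected instances can be looked up without storing the centroids themselves.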

How to train the models

BigGAN or StyleGAN2 backbone

Training parameters are stored in JSON files in [backbone_folder]/config_files/[dataset]/*.json, where [backbone_folder] is either BigGAN_PyTorch or stylegan2_ada_pytorch and [dataset] is one of ImageNet, ImageNet-LT or COCO_Stuff.

cd BigGAN_PyTorch
python run.py --json_config config_files/[dataset]/[selected_config].json --data_root [data_root] --base_root [base_root]

cd stylegan2_ada_pytorch
python run.py --json_config config_files/[dataset]/[selected_config].json --data_root [data_root] --base_root [base_root]

where:

data_root: path where the data has been prepared and stored, following the previous section (Data preparation).
base_root: path where to store the model weights and logs.

Note that one can create other JSON files to modify the training parameters.
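One way to create such a file is to copy an existing configuration and override a few entries. The sketch below does this with the standard json module; the helper name derive_config and the keys batch_size and num_epochs are hypothetical and only stand in for whatever keys the chosen configuration file actually contains.

```python
import json
import tempfile
from pathlib import Path


def derive_config(base_path, out_path, overrides):
    """Copy a JSON config, apply `overrides`, and write it to `out_path`."""
    config = json.loads(Path(base_path).read_text())
    unknown = set(overrides) - set(config)
    if unknown:
        # Guard against typos that would silently add unused keys.
        raise KeyError(f"keys not present in base config: {sorted(unknown)}")
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=4))
    return config


# Demonstration on a throwaway file with hypothetical keys.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "base.json"
    base.write_text(json.dumps({"batch_size": 64, "num_epochs": 100}))
    new_config = derive_config(base, Path(tmp) / "custom.json", {"batch_size": 32})
    print(new_config)
```

The resulting file can then be passed to run.py through --json_config in place of the original one.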

Other backbones

To run IC-GAN with other backbones, we provide some guiding steps:

Place the new backbone code in a new folder under ic_gan (ic_gan/new_backbone).
Modify the relevant piece of code in the GAN architecture to allow instance features as conditionings (for both generator and discriminator).
Create a trainer.py file with the training loop to train an IC-GAN with the new backbone. The data_utils folder provides the tools to prepare the dataset, load the data and conditioning sampling to train an IC-GAN. The IC-GAN with BigGAN backbone trainer.py file can be used as an inspiration.

How to test the models

To obtain the FID and IS metrics on ImageNet and ImageNet-LT:

1. Execute:

python inference/test.py --json_config [BigGAN-PyTorch or stylegan-ada-pytorch]/config_files/[dataset]/[selected_config].json --num_inception_images [num_imgs] --sample_num_npz [num_imgs] --eval_reference_set [ref_set] --sample_npz --base_root [base_root] --data_root [data_root] --kmeans_subsampled [kmeans_centers] --model_backbone [backbone]

To obtain the TensorFlow IS and FID metrics, use an environment with Python <3.7 and TensorFlow 1.15. Then:

2. Obtain Inception Scores and pre-computed FID moments:

python ../data_utils/inception_tf13.py --experiment_name [exp_name] --experiment_root [base_root] --kmeans_subsampled [kmeans_centers]

For stratified FIDs on the ImageNet-LT dataset, the following parameters can be added: --which_dataset 'imagenet_lt' --split 'val' --strat_name [stratified_split], where stratified_split can be in [few, low, many].

3. (Only needed once) Pre-compute reference moments with the TensorFlow code:

python ../data_utils/inception_tf13.py --use_ground_truth_data --data_root [data_root] --split [ref_set] --resolution [res] --which_dataset [dataset]

4. (Using this repository) FID can be computed using the pre-computed statistics obtained in 2) and the pre-computed ground-truth statistics obtained in 3). For example, to compute the FID with the ImageNet validation set as reference: python TTUR/fid.py [base_root]/[exp_name]/TF_pool_.npz [data_root]/imagenet_val_res[res]_tf_inception_moments_ground_truth.npz
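For reference, the standard definition of FID compares two Gaussians fitted to Inception activations. Assuming the two .npz files above hold the mean and covariance of the generated and ground-truth activations (the symbols below are notation introduced here, not names from the code), the metric is:

```latex
\mathrm{FID} = \lVert \mu_g - \mu_r \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_g + \Sigma_r - 2\left( \Sigma_g \Sigma_r \right)^{1/2} \right)
```

Here \mu_g, \Sigma_g are the moments of the generated images and \mu_r, \Sigma_r those of the reference set, which is why the reference moments only need to be computed once per dataset, split and resolution.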

To obtain the FID metric on COCO-Stuff:

Obtain ground-truth jpeg images: python data_utils/store_coco_jpeg_images.py --resolution [res] --split [ref_set] --data_root [data_root] --out_path [gt_coco_images] --filter_hd [filter_hd]
Store generated images as jpeg images: python sample.py --json_config ../[BigGAN-PyTorch or stylegan-ada-pytorch]/config_files/[dataset]/[selected_config].json --data_root [data_root] --base_root [base_root] --sample_num_npz [num_imgs] --which_dataset 'coco' --eval_instance_set [ref_set] --eval_reference_set [ref_set] --filter_hd [filter_hd] --model_backbone [backbone]
Using this repository, compute FID on the two folders of ground-truth and generated images.

where:

dataset: option to select the dataset in ['imagenet', 'imagenet_lt', 'coco'].
exp_name: name of the experiment folder.
data_root: path where the data has been prepared and stored, following the previous section "Data preparation".
base_root: path where to find the model (for example, where the pretrained models have been downloaded).
num_imgs: needs to be set to 50000 for ImageNet and ImageNet-LT (with validation set as reference) and set to 11500 for ImageNet-LT (with training set as reference). For COCO-Stuff, set to 75777, 2050, 675, 1375 if using the training, evaluation, evaluation seen or evaluation unseen set as reference.
ref_set: set to 'val' for ImageNet, ImageNet-LT (and COCO) to obtain metrics with the validation (evaluation) set as reference, or set to 'train' for ImageNet-LT or COCO to obtain metrics with the training set as reference.
kmeans_centers: set to 1000 for ImageNet and to -1 for ImageNet-LT.
backbone: model backbone architecture in ['biggan','stylegan2'].
res: integer indicating the resolution of the images (64,128,256).
gt_coco_images: folder to store the ground-truth JPEG images of that specific split.
filter_hd: only valid for ref_set=val. If -1, use the entire evaluation set; if 0, use only conditionings and their ground-truth images with seen class combinations during training (eval seen); if 1, use only conditionings and their ground-truth images with unseen class combinations during training (eval unseen).
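The values of num_imgs and kmeans_centers listed above depend on the dataset and reference set, so a small lookup can reduce mistakes when launching evaluations. This is a sketch with hypothetical names (eval_settings and the table constants); the numbers are transcribed from the list above, and kmeans_centers is returned as None for COCO-Stuff because this README does not specify it.

```python
# ImageNet and ImageNet-LT: keyed by (dataset, ref_set).
IMAGENET_NUM_IMGS = {
    ("imagenet", "val"): 50000,
    ("imagenet_lt", "val"): 50000,
    ("imagenet_lt", "train"): 11500,
}
# COCO-Stuff: keyed by (ref_set, filter_hd); filter_hd only applies to 'val'.
COCO_NUM_IMGS = {
    ("train", None): 75777,
    ("val", -1): 2050,  # entire evaluation set
    ("val", 0): 675,    # eval seen
    ("val", 1): 1375,   # eval unseen
}
KMEANS_CENTERS = {"imagenet": 1000, "imagenet_lt": -1}


def eval_settings(dataset, ref_set, filter_hd=-1):
    """Return num_imgs and kmeans_centers for a dataset / reference set pair."""
    if dataset == "coco":
        key = (ref_set, filter_hd if ref_set == "val" else None)
        table = COCO_NUM_IMGS
    else:
        key = (dataset, ref_set)
        table = IMAGENET_NUM_IMGS
    if key not in table:
        raise ValueError(f"no documented setting for {dataset!r} with {key!r}")
    return {"num_imgs": table[key], "kmeans_centers": KMEANS_CENTERS.get(dataset)}


print(eval_settings("imagenet_lt", "train"))
print(eval_settings("coco", "val", filter_hd=1))
```

Combinations that are not listed above, such as ImageNet with the training set as reference, raise an error instead of returning a guessed value.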

Utilities for GAN backbones

We change and provide extra utilities to facilitate the training, for both BigGAN and StyleGAN2 base repositories.

BigGAN change log

The following changes were made:

BigGAN architecture:
- In train_fns.py: option to either have the optimizers inside the generator and discriminator class, or directly in the G_D wrapper module. Additionally, added an option to augment both generated and real images with augmentations from DiffAugment.
- In BigGAN.py: added a function get_condition_embeddings to handle the conditioning separately.
- Small modifications to layers.py to adapt the batchnorm function calls to the pytorch 1.8 version.
Training utilities:
- Added trainer.py file (replacing train.py):
  - Training now allows the usage of DDP for faster single-node and multi-node training.
  - Training is performed by epochs instead of by iterations.
  - Option to stop the training by using early stopping or when experiments diverge.
- In utils.py:
  - Replaced MultiEpochSampler with CheckpointedSampler to allow experiments to be resumable when using epochs and to fix a bug where MultiEpochSampler would require a long time to fetch data permutations when the number of epochs increased.
  - ImageNet-LT: Added option to use different class distributions when sampling a class label for the generator.
  - ImageNet-LT: Added class balancing (uniform and temperature annealed).
  - Added data augmentations from DiffAugment.
Testing utilities:
- In calculate_inception_moments.py: added option to obtain moments for ImageNet-LT dataset, as well as stratified moments for many, medium and few-shot classes (stratified FID computation).
- In inception_utils.py: added option to compute Precision, Recall, Density, Coverage and stratified FID.
Data utilities:
- In datasets.py, added option to load ImageNet-LT dataset.
- Added ImageNet-LT.txt files with image indexes for training and validation split.
- In utils.py:
  - Separate functions to obtain the data from hdf5 files (get_dataset_hdf5) or from directory (get_dataset_images), as well as a function to obtain only the data loader (get_dataloader).
  - Added the function sample_conditionings to handle possible different conditionings to train G with.
Experiment utilities:
- Added JSON files to launch experiments with the proposed hyper-parameter configuration.
- Script to launch experiments with either the submitit tool or locally in the same machine (run.py).

StyleGAN2 change log

Multi-node DistributedDataParallel training.
Added early stopping based on the training FID metric.
Automatic checkpointing when jobs are automatically rescheduled on a cluster.
Option to load dataset from hdf5 file.
Replaced the usage of Click python package by an `ArgumentParser`.
Only saving best and last model weights.
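Both change logs mention early stopping, and the StyleGAN2 one ties it to the training FID while keeping only the best and last weights. The sketch below shows one common way to implement that idea, a patience rule on the lowest FID seen so far plus a stop on non-finite values. The patience criterion, the class name and its attributes are assumptions of this sketch; the README does not describe the actual stopping rule.

```python
import math


class EarlyStopping:
    """Track the best (lowest) FID and signal when training should stop."""

    def __init__(self, patience):
        self.patience = patience
        self.best_fid = math.inf
        self.best_epoch = None
        self.epochs_without_improvement = 0

    def update(self, epoch, fid):
        """Record this epoch's FID and return (is_best, should_stop)."""
        if not math.isfinite(fid):
            # NaN or infinite FID is treated as a diverged experiment.
            return False, True
        if fid < self.best_fid:
            self.best_fid, self.best_epoch = fid, epoch
            self.epochs_without_improvement = 0
            return True, False
        self.epochs_without_improvement += 1
        return False, self.epochs_without_improvement >= self.patience


stopper = EarlyStopping(patience=3)
for epoch, fid in enumerate([50.0, 40.0, 42.0, 41.0, 43.0, 39.0]):
    is_best, should_stop = stopper.update(epoch, fid)
    # A training loop would save "best" weights when is_best is True
    # and overwrite "last" weights every epoch.
    if should_stop:
        break
print(stopper.best_epoch, stopper.best_fid, epoch)
```

In this toy run the loop stops after three epochs without improvement, so the final value in the list is never reached.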

Acknowledgements

We would like to thank the authors of the Pytorch BigGAN repository and StyleGAN2 Pytorch, as our model requires their repositories to train IC-GAN with a BigGAN or StyleGAN2 backbone, respectively. Moreover, we would like to further thank the authors of generative-evaluation-prdc, data-efficient-gans, faiss and sg2im, as some components were borrowed and modified from their code bases. Finally, we thank the author of WanderCLIP as well as the following repositories, which we use in our Colab notebook: pytorch-pretrained-BigGAN and CLIP.

License

The majority of IC-GAN is licensed under CC-BY-NC, however portions of the project are available under separate license terms: BigGAN and PRDC are licensed under the MIT license; COCO-Stuff loader is licensed under Apache License 2.0; DiffAugment is licensed under BSD 2-Clause Simplified license; StyleGAN2 is licensed under a NVIDIA license, available here: https://github.com/NVlabs/stylegan2-ada-pytorch/blob/main/LICENSE.txt. In the Colab notebook, CLIP and pytorch-pretrained-BigGAN code is used, both licensed under the MIT license.

Disclaimers

THE DIFFAUGMENT SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE CLIP SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

THE PYTORCH-PRETRAINED-BIGGAN SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Cite the paper

If this repository, the paper or any of its content is useful for your research, please cite:

@misc{casanova2021instanceconditioned,
      title={Instance-Conditioned GAN}, 
      author={Arantxa Casanova and Marlène Careil and Jakob Verbeek and Michal Drozdzal and Adriana Romero-Soriano},
      year={2021},
      eprint={2109.05070},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Add Docker environment & web demo

Hi @ArantxaCasanova! 👋

This pull request makes it possible to run your model inside a Docker environment, which makes it easier for other people to run it. We're using an open source tool called Cog to make this process easier.

This also means we can make a web page where other people can try out your model! At the moment we implemented Generating images with IC-GAN with input images (better user interaction), and simplified some specific input. View it here: https://replicate.ai/

Claim your page here so you can edit it, e.g. adding your favourite examples to the example tab, and we'll feature it on our website and tweet about your model too.

In case you're wondering who I am, I'm from Replicate, where we're trying to make machine learning reproducible. We got frustrated that we couldn't run all the really interesting ML work being done. So, we're going round implementing models we like. 😊

opened by chenxwh 7
A way to get instance features closest to given image?

Hi, I am trying to regenerate an image using icgan (stylegan backbone, coco). For projected w vector, I use codes from projector.py and the problem is about instance features. To regenerate using icgan, I need to obtain instance features closest to given image (image to regenerate). I'll probably need to get a feature of that image and find the closest cluster index among pre-trained 1000 k-means. Are there any code segments or functions to do this simply..? I do see some related codes in ILSVRC_HDF5_feats class but is there anyway to do this simply without creating a that big dataset class?

opened by alex4727 5
Create pre-computed features of custom dataset

Hello everyone!

I am very interested in your project, and I was wondering please how did you create the pre-computed features you provided pre-computed 1000 instance features from ImageNet? I want to create something similar for my own dataset that contain only images.

Any help will be appreciated! Thanks

opened by mhbassel 4
Rationale on excluding conditional instance image from the kNN

The _obtain_nns function is coded in such a way that excluding the 0-NN (the conditioned instance itself) as shown here https://github.com/facebookresearch/ic_gan/blob/4428c9188ff9e6658b1062b8c19c13135298f561/data_utils/datasets_common.py#L742

Can I ask the rationale of excluding the image of the conditional instance from the k-NN? Doing this will also exclude the image during the training process for this specific condition right? Even if the image will be included in some other conditions, wouldn't this exclusion encourage the algorithm to learn a conditional distribution with a "hole" at the center?

Any intuition and design thought would be much appreciated. Thanks!

opened by Qianli-ion 4
how to use with a other stylegan2 models.

how to use with a other stylegan2 models.

is it possible to load any third party stylegan2 / stylegan2 ada pretrained models? or should I train from scratch.

opened by molo32 4
Sharing models through the Hugging Face Hub
Hi @ArantxaCasanova and rest of the team!

The work you've done with IC-GAN is quite exciting! Would you be interested in sharing your models in the Hugging Face Hub? The Hub offers free hosting of over 25K models, and it would make your work more accessible and visible to the rest of the ML ecosystem. There's an existing Facebook organization where your pretrained models could be.

Some of the benefits of sharing your models through the Hub would be:

wider reach of your work to the ecosystem

versioning, commit history and diffs

repos provide useful metadata about their tasks, languages, metrics, etc that make them discoverable

potential interactive widgets to demo your work

multiple features from TensorBoard visualizations, PapersWithCode integration, and more

Creating the repos and adding new models should be a relatively straightforward process if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested and if you have any questions.

Happy to hear your thoughts, Omar and the Hugging Face team
opened by osanseviero 3
File Not Found in Colab

Hi, I am trying to run the Generate images with IC-GAN + CLIP! section in Colab, but got error:

FileNotFoundError: [Errno 2] No such file or directory: '/content/icgan_biggan_imagenet_res256_nofeataug/state_dict_best0.pth'

Can only see directory icgan_biggan_imagenet_res256

opened by chenxwh 3
JSON config files for reproducing the results in the paper?

Thanks for releasing the code!

Quick question, is there anyway we can get the training configuration files in JSON for both the BigGAN and StyleGAN2 experiments as part of the --json_config input to the run.py?

opened by Qianli-ion 2
Colab Notebook torch/torchvision versions need update for compatibility

The two selected versions of torch and torchvision (1.8.0 and 0.8.2) that the notebook installs are no longer installable side-by-side. torchvision could be bumped up to 0.9.0, or torch bumped down to 1.7.1. See https://pypi.org/project/torchvision/

opened by xloem 1
model for getting the instance features

Hi,

Do you plan to or can you release the model for getting the 1000 instance features? I want to try to compute the features for my own images. Thank you!

opened by zijin-gu 0
How to train to the precision in the paper

Hi, I'm interested in this work, but I can't reproduce the whole training yet, even if I follow the whole process. Please advise if there are some tips I don't know. Thank you very much for your generosity

opened by xudaopao 6
Generating images with IC-GAN and using my own dataset

Hi,

I followed the instructions to extract the features from my own test images and format the dataset to get the hdf5 files as explained in "Other datasets", and could only generate hdf5 files in the format of [dataset_name][resolution]_feats_[feature_extractor]_resnet50.hdf5, not the other two. My question is how to generate images using these generated instance features (in .hdf5 format)? How should generate_images.py be modified for image generation using my own test data? Can you please clarify this part?

Thanks

opened by BehzadBozorgtabar 1
any image from validate set can be fed into the trained IC-GAN model? how?

Here is my idea: generated feature for k-means instances in training data; given any image, find the closest center; generate images conditioned on this center feature.

Am I right?

opened by fido20160817 1
Difference between 128x128 and 256x256

Thank you for sharing this wonderful work.

In the author's view, which one do you guys think the generated image quality is better between 128x128 and 256x256? If generated correctly, 256x256 may output good quality high-resolution images, but my point is, which one generates informative images more frequently?

opened by sbkim052 2
How much time is required to train the model?

In supplementary materials, I found that what types of GPUs are used to obtain the results, but any information about training times could not be found.

Can you provide approximate times (or days) to train the model for ImageNet 64/128/256 for both unconditional and conditional BigGANs and COCO-stuff 128, 256 for StyleGAN2?

Thank you for sharing the code of this great work!

opened by shim94kr 3
Image generation on previously unseen data

Hi Community, I am trying to solve an image generation problem using IC_GAN, but I am not sure if IC_GAN is the right direction to move forward or not and therefore I am posting on this forum. I have attached a link to google drive. The first slide contains the training data and second slide contains the test data or data we want to generate.

Link : https://docs.google.com/presentation/d/1V3DysdfP4lJQRgAu9m8GDBnH17PKoxQU/edit?usp=sharing&ouid=108571067697385919652&rtpof=true&sd=true

Objective is to generate new images (new product) in different orientation and random backgrounds. Given that the new image (new product) is not directly trained in the model, but similar products are used to train the model. I have tried XingGAN and pose-gan for this problem but they were not successful.

Thank you.

opened by BapnaKhushal 1

Official repository for the paper "Instance-Conditioned GAN"


Owner

Facebook Research
