GANsformer: Generative Adversarial Transformers

Overview


Drew A. Hudson* & C. Lawrence Zitnick

*I wish to thank Christopher D. Manning for the fruitful discussions and constructive feedback while developing the Bipartite Transformer, especially as it was explored within the language representation area, as well as for the kind financial support that made this work possible!

This is an implementation of the GANsformer model, a novel and efficient type of transformer, explored for the task of image generation. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, so it can readily scale to high-resolution synthesis. The model iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network.

Instructions for model training and data preparation, as well as pretrained models, will be available soon.
Note that the code is still going through some refactoring and clean-up. It will be ready to run by March 3. Stay tuned!
(Code clean-up by March 3, all instructions by March 7, pretrained networks by March 20)

Bibtex

@article{hudson2021gansformer,
  title={Generative Adversarial Transformers},
  author={Hudson, Drew A and Zitnick, C. Lawrence},
  journal={arXiv preprint},
  year={2021}
}

Architecture overview

The GANsformer consists of two networks:

  • Generator: produces the images (x) given randomly sampled latents (z). The latent z has shape [batch_size, component_num, latent_dim], where component_num = 1 by default (vanilla GAN, StyleGAN) but is > 1 for the GANsformer model. The latent components z_1,...,z_k are obtained by splitting z along the second dimension. The generator in turn consists of two parts:

    • Mapping network: converts latents sampled from a normal distribution (z) to the intermediate space (w) through a series of feed-forward layers. The k latent components are either mapped independently from the z space to the w space or interact with each other through self-attention (optional flag).
    • Synthesis network: the intermediate latents w are used to guide the generation of new images. Image features start from a small constant/sampled 4x4 grid and then go through multiple layers of convolution and up-sampling until reaching the desired resolution (e.g. 256x256). After each convolution, the image features are modulated (i.e. their variance and bias are controlled) by the intermediate latent vectors w. Whereas in the StyleGAN model a single global w vector controls all the features equally, the GANsformer uses attention so that the k latent components specialize to control different regions of the image and create it cooperatively, and it therefore performs better especially when generating images of multi-object scenes.
    • Attention can be used in several ways (see the sketch after this list):
      • Simplex attention: attention is applied in one direction only, from the latents to the image features (top-down).
      • Duplex attention: attention is applied in both directions: from the latents to the image features (top-down) and then from the image features back to the latents (bottom-up), so that each representation informs the other iteratively.
      • Self-attention between latents: can also be used to enable direct interactions between the latents.
      • Self-attention between image features (the SAGAN model): prior approaches applied attention directly between the image features, but this method does not scale well due to the quadratic number of feature pairs, which becomes very large at high resolutions.
  • Discriminator: receives an image and has to predict whether it is real or fake, i.e. originating from the dataset or from the generator. The model performs multiple layers of convolution and down-sampling on the image, gradually reducing the representation's resolution until making a final prediction. Optionally, attention can be incorporated into the discriminator as well: it then maintains multiple (k) aggregator variables that use attention to adaptively collect information from the image while it is processed. We observe small improvements in model performance when attention is used in the discriminator, although, based on our observations, most of the gain from attention arises in the generator.
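
To make the bipartite attention and the region-based modulation described above more concrete, here is a minimal, hypothetical PyTorch sketch (this is not the repository's actual code; the module name, projections, and the scale/shift modulation below are illustrative assumptions):

import torch
import torch.nn as nn

class BipartiteAttention(nn.Module):
    # Illustrative layer: k latent components modulate the image features through
    # attention. All names and details here are assumptions, not the repo's API.
    def __init__(self, latent_dim, feature_dim, attn_dim = 64, duplex = True):
        super().__init__()
        self.duplex = duplex
        self.scale = attn_dim ** -0.5
        # Top-down (simplex) direction: image features attend to the latents
        self.q_img = nn.Linear(feature_dim, attn_dim)    # queries from pixels
        self.k_lat = nn.Linear(latent_dim, attn_dim)     # keys from latents
        self.v_lat = nn.Linear(latent_dim, feature_dim)  # values from latents
        # Bottom-up direction (duplex only): latents attend to the pixels
        self.q_lat = nn.Linear(latent_dim, attn_dim)
        self.k_img = nn.Linear(feature_dim, attn_dim)
        self.v_img = nn.Linear(feature_dim, latent_dim)
        # Per-pixel scale and shift for multiplicative (style-like) modulation
        self.to_gamma = nn.Linear(feature_dim, feature_dim)
        self.to_beta = nn.Linear(feature_dim, feature_dim)

    def forward(self, x, z):
        # x: [batch_size, H*W, feature_dim] image features
        # z: [batch_size, k, latent_dim] latent components
        if self.duplex:
            # Bottom-up pass: the latents gather information from the image,
            # so the two representations can refine each other iteratively
            attn_up = torch.softmax(
                self.q_lat(z) @ self.k_img(x).transpose(1, 2) * self.scale, dim = -1)
            z = z + attn_up @ self.v_img(x)
        # Top-down pass: every pixel attends over the k latents; since k is
        # small, the cost grows linearly with the number of pixels
        attn_down = torch.softmax(
            self.q_img(x) @ self.k_lat(z).transpose(1, 2) * self.scale, dim = -1)
        update = attn_down @ self.v_lat(z)  # [batch_size, H*W, feature_dim]
        # Region-based modulation: each region is scaled and shifted according
        # to the latent components it attended to
        return x * (1 + self.to_gamma(update)) + self.to_beta(update)

# Example shapes: 8 latent components controlling a 16x16 grid of 256-dim features
x = torch.randn(2, 16 * 16, 256)
z = torch.randn(2, 8, 32)
out = BipartiteAttention(latent_dim = 32, feature_dim = 256)(x, z)

With duplex = False the layer corresponds to the simplex (top-down only) case; with duplex = True the latents are first updated from the image features before modulating them, mirroring the chained top-down and bottom-up passes described above.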

Codebase

This codebase builds on top of and extends the great StyleGAN2 repository by Karras et al.
The GANsformer model can also be seen as a generalization of StyleGAN: while StyleGAN has one global latent vector that controls the style of all image features uniformly, the GANsformer has k latent vectors that cooperate through attention to control different regions within the image, thereby better modeling images of multi-object and compositional scenes.

More documentation and instructions will be coming soon!
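
In the meantime, the commands reported in the issues below give a sense of the expected usage; for example, preparing CLEVR and training with the default GANsformer configuration (the experiment and dataset names here are just placeholders):

# Prepare a dataset and train with the default GANsformer configuration
# (flags taken from the user-reported commands below; adjust to your setup)
python prepare_data.py --clevr --max-images 100000
python run_network.py --train --gpus 0 --ganformer-default --expname clevr-exp --dataset clevr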

Comments
  • Do you have any plans to export a pytorch version?


    Hi, I am not too familiar with tensorflow... If there are no such plans currently, do you have quick pointers to:

    1. the GANsformer model, especially where and how you deal with the latents (based on your paper, you split the latents?)
    2. what kind of optimizers are you using, and how did you implement them? Is it similar to what we do in NLP (warmup, etc.)?
    3. did you ever try using a standard feed-forward layer after your duplex attention instead of the 3x3 convolution? Did it still work?

    Thanks again for your kind attention! Best,

    opened by MultiPath 12
  • Some Errors On Training


    Thank you for your great work. I appreciate it a lot.

    I just tried to train a model with your code; however, there are lots of undefined variables used. For example:

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L795

    It throws an undefined-variable error for 'maps_in'. When I fix that with a constant, I get another error from

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L811

    again gen_mod and gen_cond are not defined. When I fix that with a constant again, I get another error which says:

    gansformer-main/gansformer-main/training/network.py", line 1127, in G_synthesis
        grid_poses = get_positional_embeddings(resolution_log2, pos_dim or dlatent_size, pos_type, pos_directions_num, init = pos_init, **_kwargs)
    TypeError: get_positional_embeddings() got an unexpected keyword argument 'label_size'

    Am I missing something or is there a problem?

    opened by yilmazkorkmz 10
  • CLEVR pretrained model gives FID 22


    Hi, kudos for great work!

    I've just noticed that with the recommended preprocessing and evaluation, the metrics on gdrive:cityscapes work as expected (FID ~5.2), while for CLEVR exactly the same two lines:

    python prepare_data.py --clevr --max-images 100000
    python run_network.py --eval --gpus 0 --expname clevr-exp --dataset clevr --pretrained-pkl gdrive:clevr-snapshot.pkl
    

    give ~22 FID, not 9.2. Can you please double-check if the provided snapshot is correct? Or am I missing smth here?

    Thanks in advance!

    opened by JanRocketMan 8
  • kernel error in generate.py


    With Python 3.7, tensorflow-gpu 1.15.0, CUDA 10.0 and cuDNN 7.5, I get this error in generate.py (which appeared to require cuDNN 7.6.5, which brings a different error; see the second part). Any advice?

    ... Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file

    ........... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FusedBiasAct' used by {{node Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct}} with these attrs: [gain=1, T=DT_FLOAT, axis=1, alpha=0, grad=0, act=1]
    Registered devices: [CPU, XLA_CPU, XLA_GPU]
    Registered kernels:
      device='GPU'; T in [DT_HALF]
      device='GPU'; T in [DT_FLOAT]

         [[Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct]]
    

    CUDNN7.6.5 error .... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
      (0) Internal: cudaErrorNoKernelImageForDevice
          [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]]
          [[Gs/_Run/Gs/maps_out/_3151]]
      (1) Internal: cudaErrorNoKernelImageForDevice
          [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]]
    0 successful operations. 0 derived errors ignored.

    opened by yaseryacoob 8
  • About the Duplex attention


    Hi, Thanks for sharing the code!

    I have a few questions about Section 3.1.2. Duplex attention.

    1. I am confused by the notation in the section. For example, in this section, "Y=(K^{P\times d}, V^{P\times d}), where the values store the content of the Y variables (e.g. the randomly sampled latents for the case of GAN)". Does it mean that V^{P\times d} is sampled from the original variable Y? how to set the number of P in your code?

    2. "keys track the centroids of the attention-based assignments from X to Y, which can be computed as K=a_b(Y, X)", does it mean K is calculated by using the self-attention module but with (Y, X) as input? If so, how to understand “the keys track the centroid of the attention-based assignments from X to Y”? BTW, how to get the centroids?

    3. For the update rule in duplex attention, what does the a() function mean? Does it denote a self-attention module like a_b() in Section 3.1.1, with X as query, K as keys, and V as values? If so, K is calculated from another self-attention module as mentioned in question 2, so the output of a_b(Y, X) is treated as the keys, and the update rule contains two self-attention operations? Is that right? Is that what 'Duplex' attention means?

    4. But finally I find I may be wrong when I read the last paragraph in this section. As mentioned in this section, "to support bidirectional interaction between elements, we can chain two reciprocal simplex attentions from X to Y and from Y to X, obtaining the duplex attention" So, does it mean, first, we calculate the Y by using a simplex attention module u^a(Y, X), and then use this Y as input of u^d(X, Y) to update X? Does it mean the duplex attention module contains three self-attention operations?

    Thanks a lot! :)

    opened by AndrewChiyz 7
  • FID VQ-GAN


    Thank you for open-sourcing your code :)

    I was wondering about the generally very high FID values for the VQGAN. In the VQGAN paper they report, e.g., an FID of 11.4 on FFHQ 256x256, whereas you report 63.1... Any idea why they are so different?

    Thanks!

    opened by xl-sr 7
  • PyTorch implementation generates same image samples


    Hi, I'm getting the same output image samples (see below) when I train the PyTorch implementation on FFHQ from scratch. The only changes I made (due to some memory issues mentioned in #33) were adding --batch-gpu 1 and removing saving attention map functionality (commenting out pytorch_version/training/visualize.py lines 167-206).

    python run_network.py --train --gpus 0 --batch-gpu 1 --ganformer-default --expname ffhq-scratch --dataset ffhq

    opened by kwhuang88228 6
  • Metrics PR Error


    Dear authors,

    Thank you for your wonderful contribution!!!

    When I tried to get precision and recall values during training by adding the option --metric pr, I got the following error:


    \precision_recall.py", line 179, in _evaluate
        feats = self._gen_feats(Gs, inception, minibatch_size, num_gpus, Gs_kwargs)
    NameError: name 'inception' is not defined

    So, I have changed the lines in precision_recall.py. After the modification, it seems to work. I would greatly appreciate it if you kindly review my modification.


    def _evaluate(self, Gs, Gs_kwargs, num_gpus, num_imgs, paths = None, **kwargs):
        if paths is not None:
            # Extract features for local sample image files (paths)
            eval_features = self._paths_to_feats(paths, feat_func, minibatch_size, num_gpus, num_imgs)  # <-- modified
        else:
            # Extract features for newly generated fake imgs
            eval_features = self._gen_feats(Gs, feature_net, minibatch_size, num_imgs, num_gpus, Gs_kwargs)  # <-- modified

        # Compute precision and recall
        state = knn_precision_recall_features(ref_features = ref_features, eval_features = eval_features,
            feature_net = feature_net, nhood_sizes = [self.nhood_size], row_batch_size = self.row_batch_size,
            col_batch_size = self.row_batch_size, num_gpus = num_gpus, num_imgs = num_imgs)  # <-- modified
        self._report_result(state.knn_precision[0], suffix = "_precision")
        self._report_result(state.knn_recall[0], suffix = "_recall")
    
    opened by bwhwang 6
  • Memory issue when training 1024 resolution


    I'm trying to train on a 1024x1024 dataset with a V100 GPU. I tried both the tensorflow version and the pytorch version. Despite setting batch-gpu to 1, the tensorflow version always runs out of system RAM (after the first tick; total system RAM is 51 GB), and the pytorch version always runs out of CUDA memory (before the first tick).

    Here are my training settings:

    python run_network.py --train --metrics 'none' --gpus 0 --batch-gpu 1 --resolution 1024 \
     --ganformer-default --expname art1 --dataset 1024art
    

    Also, I always encounter the warning: tcmalloc: large alloc

    opened by BlueberryGin 5
  • Issues with docker


    Hi,

    I'm trying to dockerize using this image - tensorflow/tensorflow:1.14.0-gpu-py3.

    FROM tensorflow/tensorflow:1.14.0-gpu-py3
    
    ARG USER="test"
    ARG WORK_DIR="/home/$USER"
    
    WORKDIR $WORK_DIR
    
    RUN apt-get update && apt-get install build-essential
    
    RUN apt-get install ffmpeg libsm6 libxext6  -y
    
    RUN pip install --upgrade pip setuptools wheel
    
    COPY . ./
    
    RUN pip install -r requirements.txt
    
    RUN python generate.py --gpus 0 --model gdrive:bedrooms-snapshot.pkl --output-dir images --images-num 4
    

    However, I am getting this error:

    Downloading https://drive.google.com/uc?id=1-2L3iCBpP_cf6T2onf3zEQJFAAzxsQne .... done
    
    2021-04-06 08:32:44 UTC -- Setting up TensorFlow plugin 'upfirdn_2d.cu': Preprocessing... Compiling... Loading... bin_file:  /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so
    
    2021-04-06 08:32:44 UTC -- Failed!
    
    2021-04-06 08:32:44 UTC -- Traceback (most recent call last):
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 49, in <module>
    
    2021-04-06 08:32:44 UTC --     main()
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 46, in main
    
    2021-04-06 08:32:44 UTC --     run(**vars(args))
    
    2021-04-06 08:32:44 UTC --   File "generate.py", line 22, in run
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = load_networks(model)                             # Load pre-trained network
    
    2021-04-06 08:32:44 UTC --   File "/home/test/pretrained_networks.py", line 30, in load_networks
    
    2021-04-06 08:32:44 UTC --     G, D, Gs = pickle.load(stream, encoding = "latin1")[:3]
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 306, in __setstate__
    
    2021-04-06 08:32:44 UTC --     self._init_graph()
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/network.py", line 159, in _init_graph
    
    2021-04-06 08:32:44 UTC --     out_expr = self._build_func(*self.input_templates, **build_kwargs)
    
    2021-04-06 08:32:44 UTC --   File "<string>", line 2371, in G_synthesis_stylegan2
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 229, in downsample_2d
    
    2021-04-06 08:32:44 UTC --     return _simple_upfirdn_2d(x, k, down=factor, pad0=(p+1)//2, pad1=p//2, data_format=data_format, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 358, in _simple_upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     y = upfirdn_2d(y, k, upx=up, upy=up, downx=down, downy=down, padx0=pad0, padx1=pad1, pady0=pad0, pady1=pad1, impl=impl)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 61, in upfirdn_2d
    
    2021-04-06 08:32:44 UTC --     return impl_dict[impl](x=x, k=k, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 139, in _upfirdn_2d_cuda
    
    2021-04-06 08:32:44 UTC --     return func(x)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 162, in decorated
    
    2021-04-06 08:32:44 UTC --     return _graph_mode_decorator(f, *args, **kwargs)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 183, in _graph_mode_decorator
    
    2021-04-06 08:32:44 UTC --     result, grad_fn = f(*args)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 131, in func
    
    2021-04-06 08:32:44 UTC --     y = _get_plugin().up_fir_dn2d(x=x, k=kc, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 14, in _get_plugin
    
    2021-04-06 08:32:44 UTC --     return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
    
    2021-04-06 08:32:44 UTC --   File "/home/test/dnnlib/tflib/custom_ops.py", line 162, in get_plugin
    
    2021-04-06 08:32:44 UTC --     plugin = tf.load_op_library(bin_file)
    
    2021-04-06 08:32:44 UTC --   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
    
    2021-04-06 08:32:44 UTC --     lib_handle = py_tf.TF_LoadLibrary(library_filename)
    
    2021-04-06 08:32:44 UTC -- tensorflow.python.framework.errors_impl.NotFoundError: /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
    
    2021-04-06 08:32:44 UTC -- error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
    

    Please help to check and advise. Thanks!

    opened by arsyad-ah 5
  • Cannot utilize multiple CPU cores


    Hi-

    Thank you for making such a fascinating project available here!

    I'm trying to run ganformer within a conda environment, but am having problems getting ganformer to utilize multiple CPU cores.

    Using Ubuntu 20.04. Here is the setup for the conda environment used:

    conda create --name cuda10 python=3.7
    conda activate cuda10
    conda install tensorflow-gpu=1.14
    conda install pillow h5py requests tqdm termcolor seaborn
    pip install opencv-python lmdb gdown easydict
    

    To run it

    python gansformer/run_network.py --train --pretrained-pkl None --gpus 0,1 --ganformer-default --expname myDS_256 --dataset myDS --data-dir /data/myDS_256_tf --keep-samples --metrics none --result-dir training_runs/256_c1/ --num-threads 24 --minibatch-size 16
    

    Everything seems to be running correctly, there are no errors or crashes. The only problem is slow training initialization and low GPU utilization during training. System Monitor shows that only one CPU core is used at a time, so I'm guessing this is the cause of both issues. Do you have any ideas of what might be causing the restriction to a single CPU core?

    I always try to avoid raising an issue when something obvious might be wrong on my end, but this is my first time using conda so it might be that I'm simply using it incorrectly, or that I'm using your program incorrectly. I appreciate your patience if that is the case.

    Thank you for your attention to this issue!

    opened by abstractdonut 4
  • question on duplex attention (k means) code


    First, thank you for this amazing work!

    I am suspecting that an indentation is missing at the following position of the code:

    https://github.com/dorarad/gansformer/blob/3a9efa4545be25604b70560b7f491ec3633c14a3/pytorch_version/training/networks.py#L784

    The reason why it raises my suspicion is that, if the code is executed as it is, it seems like the actual key values (to_tensor) are never involved in the computation of the attention scores when k-means is enabled. If I am mistaken, would you mind explaining why line 787 replaces the original attention scores with the values computed here (where the embedding "to_centroids" seems to be initialized as a mapping of the queries)?

    opened by nintendops 0
  • Training won't work, needs tensorflow.contrib which was removed in tf version 1.14


    When running: python3 run_network.py --train --ganformer-default --expname test --dataset plant --eval-images-num 10000 The following error appears:

    I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2022-10-11 14:56:30.661744: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
    2022-10-11 14:56:30.690985: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2022-10-11 14:56:31.202500: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202557: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202565: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    Traceback (most recent call last):
      File "/home/ali/gansformer/run_network.py", line 15, in <module>
        import pretrained_networks
      File "/home/ali/gansformer/pretrained_networks.py", line 4, in <module>
        import dnnlib.tflib as tflib
      File "/home/ali/gansformer/dnnlib/tflib/__init__.py", line 1, in <module>
        from . import autosummary
      File "/home/ali/gansformer/dnnlib/tflib/autosummary.py", line 23, in <module>
        from . import tfutil
      File "/home/ali/gansformer/dnnlib/tflib/tfutil.py", line 9, in <module>
        import tensorflow.contrib  # requires TensorFlow 1.x!
    ModuleNotFoundError: No module named 'tensorflow.contrib'

    opened by AliMezher18 0
  • Hosting models on Hugging Face


    Hello! Thank you for open-sourcing this work, this is amazing 😊 I was wondering if you'd be interested in mirroring the pretrained model weights over on the Hugging Face model hub. I'm sure our community would love to see your work, and (among other things) hosting checkpoints on the Hub helps a lot with discoverability. We've got a guide here on how to upload models, but I'm also happy to help out with it if you'd like!

    opened by NimaBoscarino 0
  • Ganformer2


    Thanks for your brilliant work on gansformer and gansformer2! May I ask if there is a rough timeline for when the gansformer2 model will be released? Thanks for your time!

    opened by yangkang98 0
Releases (v1.5.2)
  • v1.5.2 (Feb 2, 2022)

    Official implementation of the Generative Adversarial Transformers paper, in both PyTorch and TensorFlow, for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Updates for version 1.5.2 (Feb 22, 2022): We updated the weight initialization of the PyTorch version to the intended scale, leading to a substantial improvement in the model's learning speed.

    Source code(tar.gz)
    Source code(zip)
  • v1.0 (Mar 17, 2021)

    Official implementation of the Generative Adversarial Transformers paper for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Source code(tar.gz)
    Source code(zip)