π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis

Overview

Project Page | Paper | Data

Eric Ryan Chan*, Marco Monteiro*, Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein
*denotes equal contribution

This is the official implementation of the paper "π-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis".

π-GAN is a novel generative model for high-quality 3D-aware image synthesis.

[Video: results2.mp4 (sample results)]

Training a Model

The main training script is train.py. The majority of the hyperparameters for training and evaluation are set in curriculums.py (see that file for more details). We provide recommended curriculums for CelebA, Cats, and CARLA.
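
For reference, each curriculum is a plain Python dict. The sketch below is modeled on the CARLA excerpt quoted in the comments further down: integer keys appear to mark the training step at which each stage begins, and each stage overrides hyperparameters from that point on. The ray_start/ray_end keys follow the training tips below; all values here are placeholders, not recommendations.

    # Illustrative curriculum entry (placeholder values, not the repo's exact settings).
    EXAMPLE_CURRICULUM = {
        0:          {'batch_size': 28, 'num_steps': 12, 'img_size': 64,  'batch_split': 2, 'gen_lr': 6e-5, 'disc_lr': 2e-4},
        int(100e3): {'batch_size': 14, 'num_steps': 12, 'img_size': 128, 'batch_split': 4, 'gen_lr': 6e-5, 'disc_lr': 2e-4},
        int(200e3): {},
        'ray_start': 0.88,   # near plane (key name assumed)
        'ray_end': 1.12,     # far plane (see Training Tips below)
    }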

Relevant Flags:

Set the output directory: --output_dir=[output directory]

Set the model loading directory: --load_dir=[load directory]

Set the current training curriculum: --curriculum=[curriculum]

Set the port for distributed training: --port=[port]

To start training:

On one GPU for CelebA: CUDA_VISIBLE_DEVICES=0 python3 train.py --curriculum CelebA --output_dir celebAOutputDir

On multiple GPUs, simply list the CUDA visible devices in a comma-separated list: CUDA_VISIBLE_DEVICES=1,3 python3 train.py --curriculum CelebA --output_dir celebAOutputDir

To continue training from another run, specify the --load_dir=path/to/directory flag.

Model Results and Evaluation

Evaluation Metrics

To generate real images for evaluation, run python fid_evaluation.py --dataset CelebA --img_size 128 --num_imgs 8000. To calculate FID/KID/Inception scores, run python eval_metrics.py path/to/generator.pth --real_image_dir path/to/real_images/directory --curriculum CelebA --num_images 8000.

Rendering Images

python render_multiview_images.py path/to/generator.pth --curriculum CelebA --seeds 0 1 2 3

For best visual results, load the EMA parameters, use truncation, increase the resolution (e.g., to 512 x 512), and increase the number of depth samples (e.g., to 24 or 36).
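
Truncation here refers to the usual GAN truncation trick of pulling latent codes toward their mean before rendering. A generic sketch of the idea (the repository may apply it in an intermediate space; the latent size and sample count below are arbitrary):

    import torch

    def truncate_latents(z, z_mean, psi=0.7):
        # psi=1.0 leaves z unchanged; smaller psi trades diversity for fidelity.
        return z_mean + psi * (z - z_mean)

    z_dim = 256                                  # placeholder latent size
    z_mean = torch.randn(10_000, z_dim).mean(0)  # rough estimate of the mean latent
    z = torch.randn(4, z_dim)
    z_truncated = truncate_latents(z, z_mean, psi=0.7)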

Rendering Videos

python render_video.py path/to/generator.pth --curriculum CelebA --seeds 0 1 2 3

You can pass the flag --lock_view_dependence to remove view-dependent effects. This can help mitigate distracting visual artifacts such as shifting eyebrows. However, locking view dependence may lower the visual quality of images (e.g., edges may be blurrier).
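
As a rough mental model of the flag (inferred from its name and observed effect, not a description of the exact implementation), "locking" view dependence amounts to querying the radiance field with one fixed viewing direction instead of the true per-ray directions:

    import torch

    # Per-sample ray directions for a 64 x 64 image with 12 depth samples per ray.
    ray_directions = torch.randn(1, 64 * 64 * 12, 3)

    # Locked view dependence: every sample sees the same canonical direction,
    # so color can no longer change with the viewing angle.
    locked_directions = torch.zeros_like(ray_directions)
    locked_directions[..., 2] = -1.0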

Rendering Videos Interpolating between faces

python render_video_interpolation.py path/to/generator.pth --curriculum CelebA --seeds 0 1 2 3

Extracting 3D Shapes

python extract_shapes.py --seeds 0 --output_dir path/to/output path/to/generator.pth
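
The extracted shape is saved as a voxel grid in .mrc format (see the comments below). If you need a mesh in .obj format, one common approach is to run marching cubes on the volume. The sketch below assumes the file stores a plain density grid; the paths and iso-level are placeholders you will likely need to adjust.

    import mrcfile
    import numpy as np
    from skimage import measure

    with mrcfile.open('path/to/output/extracted_shape.mrc') as f:  # placeholder path
        volume = np.asarray(f.data)

    # Extract an iso-surface; the level depends on the density scale of the checkpoint.
    verts, faces, normals, _ = measure.marching_cubes(volume, level=10.0)

    with open('extracted_shape.obj', 'w') as obj:
        for v in verts:
            obj.write(f'v {v[0]} {v[1]} {v[2]}\n')
        for face in faces + 1:  # .obj vertex indices are 1-based
            obj.write(f'f {face[0]} {face[1]} {face[2]}\n')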

Pretrained Models

We provide pretrained models for CelebA, Cats, and CARLA.

CelebA: https://drive.google.com/file/d/1bRB4-KxQplJryJvqyEa8Ixkf_BVm4Nn6/view?usp=sharing

Cats: https://drive.google.com/file/d/1WBA-WI8DA7FqXn7__0TdBO0eO08C_EhG/view?usp=sharing

CARLA: https://drive.google.com/file/d/1n4eXijbSD48oJVAbAV4hgdcTbT3Yv4xO/view?usp=sharing

Each zipped model file contains generator.pth, ema.pth, and ema2.pth. ema.pth used a decay of 0.999 and ema2.pth used a decay of 0.9999. By default, all evaluation scripts load the EMA from the file named ema.pth in the same directory as the generator.pth file.
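
For context on the two decay values: an exponential moving average keeps shadow copies of the weights that are nudged toward the current weights at every step, so ema2.pth (decay 0.9999) averages over a much longer training window than ema.pth (decay 0.999). A minimal sketch of the update rule (not the repository's exact code):

    import torch

    @torch.no_grad()
    def ema_update(shadow_params, current_params, decay):
        # shadow <- decay * shadow + (1 - decay) * current, applied parameter-wise.
        for s, p in zip(shadow_params, current_params):
            s.mul_(decay).add_(p, alpha=1.0 - decay)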

Training Tips

If you have the resources, increasing the number of samples (steps) per ray will dramatically increase the quality of your 3D shapes. If you're looking for good shapes, e.g. for CelebA, try increasing num_steps and moving the back plane (ray_end) further away, allowing the model to push the background back and capture the full head.

Training has been tested to work well on either two RTX 6000s or one RTX 8000. Training with smaller GPUs and batch sizes generally works fine, but it's also possible you'll encounter instability, especially at higher resolutions. Bubbles and artifacts that suddenly appear, or blurring at tilted angles, are signs that training has destabilized. This can usually be mitigated by training with a larger batch size or by reducing the learning rate.

Since the original implementation, we have added a pose identity component to the loss. Controlled by pos_lambda in the curriculum, the pose identity component helps ensure that generated scenes share the same canonical pose. Empirically, it seems to improve 3D models, but it may introduce a minor decrease in image quality scores. A sketch of this term is given below.
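
A minimal sketch of such a pose-consistency term, assuming the discriminator also predicts the pitch and yaw of each generated image (the function name and weighting below are illustrative, not the repository's exact code):

    import torch.nn.functional as F

    def pose_identity_loss(pred_pitch, pred_yaw, sampled_pitch, sampled_yaw, pos_lambda=15.0):
        # Penalize the discriminator's pose prediction for deviating from the pose the
        # generator was actually asked to render, encouraging a shared canonical pose.
        return pos_lambda * (F.mse_loss(pred_pitch, sampled_pitch) +
                             F.mse_loss(pred_yaw, sampled_yaw))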

Citation

If you find our work useful in your research, please cite:

@inproceedings{piGAN2021,
  title={pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis},
  author={Eric Chan and Marco Monteiro and Petr Kellnhofer and Jiajun Wu and Gordon Wetzstein},
  year={2021},
  booktitle={Proc. CVPR},
}
Comments
  • TypeError: can't pickle weakref objects

    I fixed it by adding .state_dict():

        torch.save(ema.state_dict(), os.path.join(opt.output_dir, now + 'ema.pth'))
        torch.save(ema2.state_dict(), os.path.join(opt.output_dir, now + 'ema2.pth'))

    opened by zzw-zwzhang 6
  • Can you share pretrained model for discriminator?

    Hey, thank you for your splendid work! After some tests and diving into your example code, I am impressed by how quickly your model generates consistent and faithful faces from different views; it amounts to a neural radiance field representing a wide variety of human faces. I therefore want to write an encoder that projects real faces into the w space quickly and faithfully. During training, I found that the discriminator in pi-GAN may help the encoder perform better, since it not only outputs whether faces are real or fake, which can be used as an additional adversarial loss term, but also outputs pitch and yaw, which can be used as rendering parameters. Tilted faces from the real dataset seem to fail to encode well into the w space when h_mean and v_mean are set to pi / 2. Of course, I could train a pi-GAN discriminator myself based on the pretrained generators, and I am currently doing so, but because of the potential instability of initialization and the time cost, I would really appreciate it if you could share the pretrained model for the discriminator. Thank you!

    opened by RaymondJiangkw 6
  • Multi-GPU training error while training with the `CARLA` curriculum

    Hi,

    First of all, thanks for sharing this great work. I am currently trying to train a model on images of full bodies of humans and am using a curriculum based on the CARLA curriculum. While training on multiple GPUs, I encountered this error:

    RuntimeError: Expected to mark a variable ready only once. This error is caused by one of the following reasons: 1) Use of a module parameter outside the `forward` function. Please make sure model parameters are not shared across multiple concurrent forward-backward passes. or try to use _set_static_graph() as a workaround if this module graph does not change during training loop. 2) Reused parameters in multiple reentrant backward passes. For example, if you use multiple `checkpoint` functions to wrap the same part of your model, it would result in the same set of parameters been used by different reentrant backward passes multiple times, and hence marking a variable ready multiple times. DDP does not support such use cases in default. You can try to use _set_static_graph() as a workaround if your module graph does not change over iterations.
    Parameter at index 52 with name fromRGB.5.model.0.weight has been marked as ready twice. This means that multiple autograd engine hooks have fired for this particular parameter during this iteration.
    

    I noticed that fromRGB.5.model.0.weight is a variable in the ProgressiveEncoderDiscriminator. Since I have been able to reproduce results with the CelebA curriculum, and the CARLA curriculum uses a different discriminator class, I think the problem can be narrowed down to the ProgressiveEncoderDiscriminator class, but currently I do not have any other clues. Could you please have a look at this?

    Many thanks and best!

    opened by yzhq97 6
  • Error when Rendering Videos

    I used the command python render_video.py ckpt/CelebA/generator.pth --curriculum CelebA --seeds 0 1 2 3 from the README, and it seems the pipe is broken?

    I tried to re-install scikit-video, but it doesn't work.

    [error screenshot omitted]

    Is there anything I am doing wrong? Please help me out.

    opened by zhywanna 4
  • Loss curve visualization

    Hello, thanks for sharing this interesting work! At lines 360 and 361 of train.py, there are the following lines:

                    torch.save(generator_losses, os.path.join(opt.output_dir, 'generator.losses'))
                    torch.save(discriminator_losses, os.path.join(opt.output_dir, 'discriminator.losses'))
    

    Could you please tell me how to load the saved generator.losses and discriminator.losses files? I have tried torch.load and pickle.load, but neither of them works.

    opened by MayuOshima 4
  • Cuda OOM error during training

    Hi,

    I tried to train the code on the CARLA dataset, but I am getting a CUDA out-of-memory error. These are the things I have tried so far:

    1. I have tried running on a single as well as multiple 2080 Ti GPUs (specified using CUDA_VISIBLE_DEVICES), each with 11 GB of memory, but it still generates an OOM error.
    2. I tried a 3090 GPU, but the code generates errors on the 3090 that are not related to the CUDA OOM error.
    3. I have also tried to reduce the batch size for the CARLA dataset in curriculums.py from 30 to 10, as shown below. But I still get the OOM error when I run on a single or multiple 2080 Ti GPUs.

    CARLA = {
        0:          {'batch_size': 10, 'num_steps': 48, 'img_size': 32,  'batch_split': 1, 'gen_lr': 4e-5, 'disc_lr': 4e-4},
        int(10e3):  {'batch_size': 14, 'num_steps': 48, 'img_size': 64,  'batch_split': 2, 'gen_lr': 2e-5, 'disc_lr': 2e-4},
        int(55e3):  {'batch_size': 10, 'num_steps': 48, 'img_size': 128, 'batch_split': 5, 'gen_lr': 10e-6, 'disc_lr': 10e-5},
        int(200e3): {},

    Is there anything else I can do to fix the OOM error?

    thanks

    opened by athena913 4
  • train.py only works under distributed settings

    Marco, thanks for the work.

    I had to fix a few things to get this to run in a single-GPU setting. If you care, I added a new file, train_local.py, that makes single-GPU training functional on a fork: https://github.com/xvdp/pi-GAN. I did not combine the two to avoid excess if statements. You might want to refactor everything to call the local and distributed functions from a single code path, but I don't know if it is worth it.

    I noticed double definitions, so I linted the whole project; I saw some missing arguments in some functions, and so on. I also noticed that your local dataset_path entries for Cats and CARLA are reversed. If you want, I can open a pull request.

    xvdp

    opened by xvdp 3
  • Puzzled by Head Position

    Hi, @ericryanchan,

    I am really curious about how you solve the head position problem. I see that real images are not paired with ground-truth head positions; thus, the network has to learn the head position in an unsupervised way.

    After checking your code, I find that in every iteration you sample the head position and let the discriminator predict the head position of the rendered faces. The output of the discriminator is supervised by the sampled head position in a self-supervised way.

    I am really puzzled by this mechanism, and fail to figure out why this works. Can you help me?

    opened by RaymondJiangkw 2
  • RuntimeError: [/opt/conda/conda-bld/pytorch_1603729006826/work/third_party/gloo/gloo/transport/tcp/unbound_buffer.cc:84] Timed out waiting 1800000ms for recv operation to complete

    Hi, thanks for sharing your work! I got this problem when training CelebA with pi-GAN, and I don't know how to solve it. The code was run on one V100-32GB GPU with PyTorch 1.7.0 and CUDA 10.1.

    Exception:
    -- Process 1 terminated with the following error:
    Traceback (most recent call last):
      File "/mnt/lustre/gaosicheng/anaconda3/envs/pigan/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
        fn(i, *args)
      File "/mnt/lustre/gaosicheng/codes/pi-GAN-master/train.py", line 249, in train
        scaler.scale(d_loss).backward()
      File "/mnt/lustre/gaosicheng/anaconda3/envs/pigan/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/mnt/lustre/gaosicheng/anaconda3/envs/pigan/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: [/opt/conda/conda-bld/pytorch_1603729006826/work/third_party/gloo/gloo/transport/tcp/unbound_buffer.cc:84] Timed out waiting 1800000ms for recv operation to complete

    opened by Ree1s 2
  • generator identity penalty loss

    Hi,

    Thank you very much for the well-organized repository and inspiring work! I find the codebase very clean and convenient since it has all the necessary visualization code ready.

    I'm confused about one thing, though: at line 278 of train.py, there is an identity penalty loss for the generator. But as far as I understand, the gradient of the identity penalty does not flow through the generator, because the sampling of z and of the pitch/yaw values is not learnable. Is that right?

    Cheers, Yufeng

    opened by zhengyuf 2
  • 3D shape extraction

    Hi, According to the Readme, the command for shape extraction is:

    python3 shape_extraction.py path/to/generator.pth --curriculum CelebA --seed 0

    However, there is no shape_extraction.py file. Instead, please correct this as follows:

    python extract_shapes.py --seeds 0 --output_dir path/to/output path/to/generator.pth

    Also, the generator extracts the shape as a voxel in .mrc format. Could you please provide some guidance on how we can extract the 3D shape as a mesh in .obj format?

    thanks

    opened by athena913 2
  • About the lr for G and D

    Dear friends: I am wondering why the learning rates for G and D are 5e-5 and 4e-4, respectively. Do you have any idea? I came up with this question because I saw that in GRAF and GIRAFFE, the learning rates for G and D are 5e-4 and 1e-4, respectively. Does it mean that in pi-GAN the G is "more powerful" than the D? Looking forward to hearing from you, thanks!

    opened by silence-tang 0
  • Does the pi-GAN discriminator receive the camera pose during training?

    This is a question about the method/paper, not so much the implementation.

    During training, do you provide the corresponding camera pose (denoted as ξ in the paper) to the discriminator? It appears the answer is no. If this is the case, why doesn't the generator just ignore the camera pose altogether, and just learn to generate images from a random angle each time? In my mind, the discriminator wouldn't be able to tell? Perhaps you train on multiple samples with the same z per batch, enforcing that different ξ give reasonable results for the same z?

    opened by nlml 0
  • Use the same z for training G and D in each iteration.

    In other GAN papers, we usually train D for k steps and train G for only one step in each GAN training iteration. In this case, the z used for training G and D are obviously different. But in this paper, we train G and D simultaneously, i.e., both G and D take one step in each training iteration. I'm wondering if I could use the same z to train both networks in each iteration to reduce the computational cost, just as shown in PyTorch's official GAN tutorial (https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html#training):

    # Training Loop
    
    # Lists to keep track of progress
    img_list = []
    G_losses = []
    D_losses = []
    iters = 0
    
    print("Starting Training Loop...")
    # For each epoch
    for epoch in range(num_epochs):
        # For each batch in the dataloader
        for i, data in enumerate(dataloader, 0):
    
            ############################
            # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
            ###########################
            ## Train with all-real batch
            netD.zero_grad()
            # Format batch
            real_cpu = data[0].to(device)
            b_size = real_cpu.size(0)
            label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
            # Forward pass real batch through D
            output = netD(real_cpu).view(-1)
            # Calculate loss on all-real batch
            errD_real = criterion(output, label)
            # Calculate gradients for D in backward pass
            errD_real.backward()
            D_x = output.mean().item()
    
            ## Train with all-fake batch
            # Generate batch of latent vectors
            noise = torch.randn(b_size, nz, 1, 1, device=device)
            # Generate fake image batch with G
            fake = netG(noise)
            label.fill_(fake_label)
            # Classify all fake batch with D
            output = netD(fake.detach()).view(-1)
            # Calculate D's loss on the all-fake batch
            errD_fake = criterion(output, label)
            # Calculate the gradients for this batch, accumulated (summed) with previous gradients
            errD_fake.backward()
            D_G_z1 = output.mean().item()
            # Compute error of D as sum over the fake and the real batches
            errD = errD_real + errD_fake
            # Update D
            optimizerD.step()
    
            ############################
            # (2) Update G network: maximize log(D(G(z)))
            ###########################
            netG.zero_grad()
            label.fill_(real_label)  # fake labels are real for generator cost
            # Since we just updated D, perform another forward pass of all-fake batch through D
            output = netD(fake).view(-1)
            # Calculate G's loss based on this output
            errG = criterion(output, label)
            # Calculate gradients for G
            errG.backward()
            D_G_z2 = output.mean().item()
            # Update G
            optimizerG.step()
    
            # Output training stats
            if i % 50 == 0:
                print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                      % (epoch, num_epochs, i, len(dataloader),
                         errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))
    
            # Save Losses for plotting later
            G_losses.append(errG.item())
            D_losses.append(errD.item())
    
            # Check how the generator is doing by saving G's output on fixed_noise
            if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
                with torch.no_grad():
                    fake = netG(fixed_noise).detach().cpu()
                img_list.append(vutils.make_grid(fake, padding=2, normalize=True))
    
            iters += 1
    

    That way, G.forward only needs to be called once in each iteration. Will it affect the model performance?

    opened by onpix 0
  • Inverse code can't render fine detail as in demo

    Hi,

    I just ran inverse_render.py with Biden's portrait, as shown in your demo, but I failed to get the same reconstructed video as yours, as shown below:

    [reconstruction screenshots omitted]

    Is there anything I can do to refine the result? If anyone knows how to solve this problem, please leave your comment! Thanks.

    opened by zhywanna 0
  • 3d mesh reconstruction quality

    Hi,

    On the project website I found a pretty good 3D shape rendered from the CARLA dataset, so I am trying to extract a similar mesh using the file provided in this repository. However, I noticed that no matter how I change the parameters, I am not able to generate a 3D mesh of similar quality using the checkpoint provided by the authors. Here is an example using a slightly larger cube length and 512 resolution: [screenshot omitted] The car itself seems fine, but I am not sure why the big mass below the car is generated. I believe this is partially because the CARLA dataset does not include images viewed from the bottom or from low angles? Can you provide any hint on how to generate a fine 3D mesh like the one on the project website?

    Thanks

    opened by ZHDphilip 0