Official PyTorch repo for JoJoGAN: One Shot Face Stylization

Overview

JoJoGAN: One Shot Face Stylization

This is the PyTorch implementation of JoJoGAN: One Shot Face Stylization.

Abstract:
While there have been recent advances in few-shot image stylization, these methods fail to capture stylistic details that are obvious to humans. Details such as the shape of the eyes and the boldness of the lines are especially difficult for a model to learn, particularly in a limited-data setting. In this work, we aim to perform one-shot image stylization that gets the details right. Given a reference style image, we approximate paired real data using GAN inversion and finetune a pretrained StyleGAN using that approximate paired data. We then encourage the StyleGAN to generalize so that the learned style can be applied to all other images.

How to use

Everything to get started is in the colab notebook.

Citation

If you use this code or ideas from our paper, please cite our paper:

Acknowledgments

This code borrows from StyleGAN2 by rosinality, e4e and ReStyle.

Comments
  • hitting google drive limits, torch/huggingface hub

    Downloading the models hits Google Drive limits. A solution other than pydrive would be to attach the weights to a project release.

    see torch hub

    https://pytorch.org/docs/stable/hub.html

    "Pretrained weights can either be stored locally in the github repo, or loadable by torch.hub.load_state_dict_from_url(). If less than 2GB, it’s recommended to attach it to a project release and use the url from the release. In the example above torchvision.models.resnet.resnet18 handles pretrained, alternatively you can put the following logic in the entrypoint definition."

    and a similar example from animegan hubconf, although the weights are much smaller in size here

    https://github.com/bryandlee/animegan2-pytorch/blob/main/hubconf.py

    or the models can be hosted on huggingface, see

    https://huggingface.co/models

    opened by AK391 10
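A minimal sketch of the release-hosted approach suggested in the issue above, assuming the weights were attached to a GitHub release. The helper names are hypothetical, as are the tag `v0.1` and file name `stylegan2-ffhq.pt` in the usage note; `torch.hub.load_state_dict_from_url` is the function named in the quoted documentation.

```python
def release_asset_url(owner, repo, tag, filename):
    """Build the download URL GitHub uses for a file attached to a release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def load_release_weights(url, map_location="cpu"):
    """Download (and cache) a checkpoint from `url`, then return its contents."""
    # Imported lazily so the URL helper can be used without PyTorch installed.
    import torch

    return torch.hub.load_state_dict_from_url(
        url, map_location=map_location, progress=True
    )
```

A call might look like `load_release_weights(release_asset_url("mchong6", "JoJoGAN", "v0.1", "stylegan2-ffhq.pt"))`. The download is cached locally after the first call, so repeated notebook runs would not fetch the file again.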
  • The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

    I am confused by this error. I have uploaded my own style images and used the supplied iu.jpeg file to transform. I get this error in the last cell. I have verified that the style images are 3 channel images.

    How many style images are needed? Does the format of the images matter?

    opened by onzie9 5
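One common cause of a 4-versus-3 mismatch at dimension 0 is an image with an alpha channel (RGBA) meeting a 3-channel normalization step. Whether that applies to this issue is an assumption, since the reporter states the style images were checked. The sketch below reads the declared color type from a PNG header using only the standard library; `png_channel_count` is a hypothetical helper and not part of JoJoGAN.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG color type -> number of channels stored per pixel (per the PNG specification).
CHANNELS_BY_COLOR_TYPE = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_channel_count(data):
    """Return the channel count declared in the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR payload starts at byte 16: width, height (4 bytes each),
    # then bit depth and color type (1 byte each).
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return CHANNELS_BY_COLOR_TYPE[color_type]
```

Reading the first 26 bytes of a file is enough, for example `png_channel_count(open("style.png", "rb").read(26))` with a hypothetical path. If the result is 4, converting the image to RGB before the transform (for instance with Pillow's `Image.convert("RGB")`) would drop the alpha channel. Note that palette images (color type 3) can still decode to RGBA in some loaders.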
  • style image pairs num

    Hi, I have some questions about the number of finetune data pairs. In the "Finetune StyleGAN" part of stylize.ipynb, I find that the variable "random_alpha" is not used. If I use only one reference style image, do I then have only one pair with which to finetune the StyleGAN? Could you please tell me what I am doing wrong? Thanks a lot.

    opened by nzhang258 4
  • wandb integration

    for the finetuning stage in colab, wandb is useful to track metrics, see https://github.com/danielroich/PTI#weights-and-biases for example. This would be helpful when running multiple experiments in colab, if you have time, otherwise I can also look into this as a PR. Thanks

    opened by AK391 3
  • alternative to dlib for face alignment

    dlib is very slow to build. Would it be possible to use an alternative such as https://github.com/happy-jihye/FFHQ-Alignment, for example https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/ffhq-align.py

    more example usage here: https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/FFHQ-Alignment.ipynb

    opened by AK391 3
  • Using another pretrained StyleGAN2

    Hi,

    I'm playing with your notebook (awesome work btw!) and I'm trying to give it another pretrained GAN from Awesome Pretrained StyleGAN2.

    I used the anime one (PyTorch implementation from here) but I get:

    TypeError                                 Traceback (most recent call last)
    
    <ipython-input-10-2395bdddca96> in <module>()
         22 
         23 #print(ckpt)
    ---> 24 generator.load_state_dict(ckpt["g"], strict=False)
         25 
         26 #@title Generate results
    
    TypeError: 'Generator' object is not subscriptable
    

    Do I need to do anything further to the model to make it compatible?

    Thank you

    opened by JbIPS 2
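The TypeError above suggests that the loaded checkpoint is a whole pickled `Generator` object rather than a dict with a `"g"` entry. A minimal sketch of a loader that tolerates both layouts follows; `extract_generator_state` is a hypothetical helper, and the set of wrapper keys it tries is an assumption about common checkpoint formats.

```python
def extract_generator_state(ckpt, keys=("g", "g_ema")):
    """Return a state dict from several common checkpoint layouts."""
    if isinstance(ckpt, dict):
        for key in keys:
            if key in ckpt:
                # Wrapped checkpoint, e.g. {"g": ..., "g_ema": ...}.
                return ckpt[key]
        # No wrapper key found: assume the dict already is a state dict.
        return ckpt
    if hasattr(ckpt, "state_dict"):
        # A whole pickled model object, which matches the TypeError above.
        return ckpt.state_dict()
    raise TypeError(f"unsupported checkpoint type: {type(ckpt).__name__}")
```

The notebook line would then become `generator.load_state_dict(extract_generator_state(ckpt), strict=False)`. With `strict=False`, keys whose names do not match are skipped silently, so the parameter names and architecture of the other model still need to line up with this repository's generator for the weights to take effect.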
  • Perceptual loss: LPIPS vs. StyleGAN Discriminator

    Hi!

    Thanks for sharing this awesome work :-)

    I'm wondering about the difference in perceptual image quality when using the LPIPS model (as stated in the paper) vs. the StyleGAN discriminator (as used in the updated Colab notebook) for the perceptual loss. In your experience, what effect does using the StyleGAN discriminator have on image quality compared to using LPIPS?

    opened by matanby 2
  • Add Docker environment & web demo

    Hey @mchong6! 👋

    I really enjoyed playing with JoJoGAN. This pull request makes it possible to run your model inside a Docker environment, which makes it easier for other people to run it. We're using an open source tool called Cog to make this process easier.

    This also means we can make a web page where other people can try out your model! View it here: https://replicate.ai/mchong6/jojogan

    Claim your page here so you can edit it, and we'll feature it on our website and tweet about it too. We have added some examples to the page, but do claim the page so you can own the page, customize the Example gallery as you like, and push any future updates to the web demo.

    In case you're wondering who I am, I'm from Replicate, where we're trying to make machine learning reproducible. We got frustrated that we couldn't run all the really interesting ML work being done. So, we're going round implementing models we like. 😊

    opened by vganapati 2
  • How many paired images of dataset C in the paper

    Hi, thank you for sharing the source code. I can't find how many paired images dataset C contains. In your experiments, what is the minimum number of pairs (wi, y) needed for a good result?

    opened by zhongtao93 2
  • IndexError: list index out of range

    This occurs when the device is set to cpu in Colab and the hardware accelerator is None.

    No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

    IndexError                                Traceback (most recent call last)
    in ()
         24 from tqdm import tqdm
         25 import lpips
    ---> 26 from model import *
         27 from e4e_projection import projection as e4e_projection
         28 from restyle_projection import projection as restyle_projection

    7 frames
    /usr/local/lib/python3.7/dist-packages/torch/utils/cpp_extension.py in _get_cuda_arch_flags(cflags)
       1604                 arch_list.append(arch)
       1605             arch_list = sorted(arch_list)
    -> 1606             arch_list[-1] += '+PTX'
       1607     else:
       1608         # Deal with lists that are ' ' separated (only deal with ';' after)

    IndexError: list index out of range

    opened by AK391 2
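The traceback above fails inside `_get_cuda_arch_flags`, which builds its architecture list from the visible GPUs when the `TORCH_CUDA_ARCH_LIST` environment variable is unset; with no GPU the list is empty. A possible workaround, sketched below, is to set that variable before the import that triggers the build. The value `"7.5"` and the helper name are assumptions for illustration, and this only sidesteps the architecture detection: the custom CUDA ops may still fail to compile or run without a GPU.

```python
import os


def pin_cuda_arch_list(arch="7.5"):
    """Set TORCH_CUDA_ARCH_LIST unless a value was already chosen."""
    # setdefault keeps any value the user exported before starting Python.
    return os.environ.setdefault("TORCH_CUDA_ARCH_LIST", arch)


# Must run before the import that compiles the extension.
pin_cuda_arch_list()
# from model import *  # would follow here in the notebook
```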
  • Different results with automatic alignment and manual crop.

    I took a picture with a face in it and generated an image using the align_face function. Then, using the same picture, I generated another image by manual cropping as mentioned in https://github.com/mchong6/JoJoGAN/issues/18#issuecomment-1008996040. When I pass both of these through the e4e_projection function and view the final image, the results are very different although the images which were passed are very similar. Do you have any idea of why this might be?

    opened by 007prateekd 1
  • Upgrade to Cog version 0.1

    The new version of Cog improves the Python API, along with several other changes. In particular, pydantic is now used for Predictor, and the previous version will be deprecated.

    This PR upgrades the Replicate demo and API to Cog version >= 0.1. I have already pushed this to Replicate, so you don't need to do anything for the demo to keep working :) https://replicate.com/mchong6/jojogan

    opened by chenxwh 0
  • How to train with many references image without running out of memory?

    Currently, I keep getting a CUDA out-of-memory error if I use more than 4 reference images on a 16GB GPU. Is there a way to train with more images? I'd like the model to be more general.

    opened by mfrashad 1
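One generic way to bound memory when there are many references is to process them in small chunks instead of one large batch. This is a general technique and not something the repository documents; `iter_minibatches` is a hypothetical helper.

```python
def iter_minibatches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

In a PyTorch training loop, each chunk would get its own forward pass and `loss.backward()` call, with a single `optimizer.step()` after all chunks (gradient accumulation), so that peak memory depends on the chunk size and not on the total number of references.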
  • Does not working on cuda:1

    Hello. Thank you for providing great code.

    I have an issue running prediction.

    If I predict on cuda:1, inference does not work.

    I traced the code step by step and found that "op/fused_act.py" loads fused_bias_Act.cpp and fused_bias_act_kernel.cu. That C++ code does not seem to work on another GPU.

    How can I predict on another GPU?

    opened by Mombin 0
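A common workaround for code that implicitly assumes device 0 is to hide every other GPU from the process with the `CUDA_VISIBLE_DEVICES` environment variable, so that the chosen physical GPU appears as `cuda:0`. Whether this resolves the fused-op problem in this issue is an assumption; `expose_single_gpu` is a hypothetical helper.

```python
import os


def expose_single_gpu(index):
    """Make only physical GPU `index` visible; it then appears as cuda:0."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return "cuda:0"


# Must run before CUDA is initialized, so place it ahead of `import torch`.
device = expose_single_gpu(1)
```

The rest of the script would then keep using `device` as usual. The same effect can be had from the shell by prefixing the command with `CUDA_VISIBLE_DEVICES=1`.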
  • Regarding style modulation layers, style parameters and controlling low-level features.

    Hey, great work! I had a couple of queries:

    1. In the paper it is mentioned that there are 26 style modulation layers, but in the code it seems to be 18 as n_latent = 18.
    2. What exactly do the style parameters s(w) correspond to in the code?
    3. For the pretrained styles, is there any way to control low-level features like eyes, nose, etc. without training again? I know that while fine-tuning we can use blending (using RIS) and different masks to control them, but is there any way to do this for a model that is already fine-tuned?
    4. I see a change in results when fine-tuning the model using JoJo's photo. Is it because of e4e being used instead of ReStyle in the code?
    opened by 007prateekd 2
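On the first question above, one plausible reading is that 18 counts the W+ latent vectors while 26 counts the modulated convolutions they feed, because each toRGB layer reuses a latent that is also consumed by a neighbouring convolution. The sketch below works through that arithmetic, assuming a rosinality-style StyleGAN2 generator; it is not taken from the paper, and `stylegan2_layer_counts` is a hypothetical helper.

```python
import math


def stylegan2_layer_counts(resolution):
    """Return (n_latent, n_modulated_layers) for a given output resolution."""
    log_size = int(math.log2(resolution))
    # One latent per W+ slot: two per resolution from 4x4 up to the output.
    n_latent = 2 * log_size - 2
    # One conv at 4x4, then two convs for every higher resolution.
    n_conv = 2 * (log_size - 2) + 1
    # One toRGB layer per resolution, 4x4 included.
    n_to_rgb = log_size - 1
    return n_latent, n_conv + n_to_rgb
```

For a 1024x1024 generator this gives 18 latents and 17 + 9 = 26 modulated layers, which matches both numbers quoted in the question.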
Official code for paper Exemplar Based 3D Portrait Stylization.

3D-Portrait-Stylization This is the official code for the paper "Exemplar Based 3D Portrait Stylization". You can check the paper on our project websi

null 60 Dec 7, 2022
A few stylization coreML models that I've trained with CreateML

CoreML-StyleTransfer A few stylization coreML models that I've trained with CreateML You can open and use the .mlmodel files in the "models" folder in

Doron Adler 8 Aug 18, 2022
Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python >= 3.6 , Pytorch >

FuxiVirtualHuman 84 Jan 3, 2023
Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

Clova AI Research 97 Dec 23, 2022
Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

null 31 Dec 7, 2022
A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

S³FD: Single Shot Scale-invariant Face Detector A PyTorch Implementation of Single Shot Scale-invariant Face Detector. Eval python wider_eval_pytorch.

carwin 235 Jan 7, 2023
Pytorch implementation of One-Shot Affordance Detection

One-shot Affordance Detection PyTorch implementation of our one-shot affordance detection models. This repository contains PyTorch evaluation code, tr

null 46 Dec 12, 2022
Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

ZLH 406 Dec 23, 2022
Code repo for EMNLP21 paper "Zero-Shot Information Extraction as a Unified Text-to-Triple Translation"

Zero-Shot Information Extraction as a Unified Text-to-Triple Translation Source code repo for paper Zero-Shot Information Extraction as a Unified Text

cgraywang 88 Dec 31, 2022
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022
Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

Wenjing Wang 77 Dec 8, 2022
DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition, TPAMI 2021

DVG-Face: Dual Variational Generation for HFR This repo is a PyTorch implementation of DVG-Face: Dual Variational Generation for Heterogeneous Face Re

null 52 Dec 30, 2022
[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University 99 Dec 30, 2022
Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

Realtime Face Anti-Spoofing Detection Realtime Face Anti Spoofing Detection with Face Detector to detect real and fake faces Please star this repo

Prem Kumar 86 Aug 3, 2022
Swapping face using Face Mesh with TensorFlow Lite

Swapping face using Face Mesh with TensorFlow Lite

iwatake 17 Apr 26, 2022
Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

The Face Synthetics dataset Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels. It was introduced in ou

Microsoft 608 Jan 2, 2023
Face Library is an open source package for accurate and real-time face detection and recognition

Face Library Face Library is an open source package for accurate and real-time face detection and recognition. The package is built over OpenCV and us

null 52 Nov 9, 2022
VGGFace2-HQ - A high resolution face dataset for face editing purpose

The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose

Naiyuan Liu 232 Dec 29, 2022
A large-scale face dataset for face parsing, recognition, generation and editing.

CelebAMask-HQ [Paper] [Demo] CelebAMask-HQ is a large-scale face image dataset that has 30,000 high-resolution face images selected from the CelebA da

switchnorm 1.7k Dec 26, 2022