Official PyTorch repo for JoJoGAN: One Shot Face Stylization

Last update: Dec 29, 2022

Overview

JoJoGAN: One Shot Face Stylization

This is the PyTorch implementation of JoJoGAN: One Shot Face Stylization.

Abstract:
While there have been recent advances in few-shot image stylization, these methods fail to capture stylistic details that are obvious to humans. Details such as the shape of the eyes, the boldness of the lines, are especially difficult for a model to learn, especially so under a limited data setting. In this work, we aim to perform one-shot image stylization that gets the details right. Given a reference style image, we approximate paired real data using GAN inversion and finetune a pretrained StyleGAN using that approximate paired data. We then encourage the StyleGAN to generalize so that the learned style can be applied to all other images.

How to use

Everything to get started is in the colab notebook.

Citation

If you use this code or ideas from our paper, please cite our paper:

Acknowledgments

This code borrows from StyleGAN2 by rosinality, e4e and ReStyle.

Comments

hitting google drive limits, torch/huggingface hub

Downloading the models hits Google Drive limits. Besides PyDrive, another solution would be to attach the weights to a project release.

see torch hub

https://pytorch.org/docs/stable/hub.html

"Pretrained weights can either be stored locally in the github repo, or loadable by torch.hub.load_state_dict_from_url(). If less than 2GB, it’s recommended to attach it to a project release and use the url from the release. In the example above torchvision.models.resnet.resnet18 handles pretrained, alternatively you can put the following logic in the entrypoint definition."

and a similar example from animegan hubconf, although the weights are much smaller in size here

https://github.com/bryandlee/animegan2-pytorch/blob/main/hubconf.py

or the models can be hosted on huggingface, see

https://huggingface.co/models
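As a rough sketch of the release-based option quoted above: the helper below builds the URL GitHub uses for a release asset and passes it to torch.hub.load_state_dict_from_url. The tag "v0.1", the filename, and both function names are hypothetical; no such release is confirmed to exist for this repo.

```python
def release_asset_url(owner: str, repo: str, tag: str, filename: str) -> str:
    """Build the download URL GitHub uses for a file attached to a release."""
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{filename}"


def load_weights_from_release(url: str, device: str = "cpu"):
    """Download a checkpoint (cached after the first call) and return it."""
    import torch  # imported lazily so the URL helper works without PyTorch

    # The file is cached in torch.hub's checkpoint directory, so later calls
    # in the same environment reuse it instead of downloading again.
    return torch.hub.load_state_dict_from_url(url, map_location=device, progress=True)


# Hypothetical tag and filename, used only to show the URL shape.
url = release_asset_url("mchong6", "JoJoGAN", "v0.1", "stylegan2-ffhq-config-f.pt")
print(url)
```

Calling load_weights_from_release(url) would then replace the Google Drive download step, provided the file really is attached to a release under that name.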

opened by AK391 10
The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

I am confused by this error. I have uploaded my own style images and used the supplied iu.jpeg file to transform. I get this error in the last cell. I have verified that the style images are 3 channel images.

How many style images are needed? Does the format of the images matter?
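A mismatch of 4 versus 3 on dimension 0 often appears when a four-channel image (for example a PNG with an alpha channel) meets a three-channel normalization step. Whether that is the cause in this report is an assumption, but forcing every style image to RGB on load is a cheap way to rule it out. The helper name load_rgb is illustrative and is not part of the notebook.

```python
import io

from PIL import Image


def load_rgb(path_or_file):
    """Open an image and force it to exactly three channels (RGB)."""
    image = Image.open(path_or_file)
    # convert("RGB") drops an alpha channel (RGBA), expands grayscale (L)
    # and resolves palette images (P) into three channels.
    return image.convert("RGB")


# Small in-memory check: a PNG with an alpha channel comes back as RGB.
buffer = io.BytesIO()
Image.new("RGBA", (8, 8)).save(buffer, format="PNG")
buffer.seek(0)
rgb = load_rgb(buffer)
print(rgb.mode, len(rgb.getbands()))
```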

opened by onzie9 5
style image pairs num

Hi, I have some questions about the number of finetuning data pairs. In the "Finetune StyleGAN" part of stylize.ipynb, I find that the variable "random_alpha" is not used. If I use only one reference style image, do I have only one pair to finetune the StyleGAN? Could you please tell me what I am doing wrong? Thanks a lot.
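One way a single reference can still yield varied training inputs is style mixing: some layers of the inverted code are blended with a freshly sampled random code at every step, while the target stays the same style image. The sketch below illustrates the idea with toy sizes and plain lists. The layer indices, the role of alpha, and the name mix_latent are assumptions for illustration, not the notebook's code.

```python
import random


def mix_latent(w_ref, w_rand, id_swap, alpha):
    """Blend selected layers of the reference code with a random code.

    w_ref, w_rand: lists of per-layer vectors (lists of floats).
    id_swap: indices of the layers to blend; other layers keep w_ref.
    alpha: weight kept from the reference on the blended layers.
    """
    mixed = [list(layer) for layer in w_ref]
    for i in id_swap:
        mixed[i] = [alpha * r + (1.0 - alpha) * z for r, z in zip(w_ref[i], w_rand[i])]
    return mixed


n_latent, dim = 18, 4  # toy width; real codes are much wider
rng = random.Random(0)
w_ref = [[1.0] * dim for _ in range(n_latent)]
id_swap = list(range(7, n_latent))  # assumed choice of layers to mix

# Each step draws a new random code, so the same reference produces a
# different generator input every time.
inputs = []
for _ in range(3):
    w_rand = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_latent)]
    inputs.append(mix_latent(w_ref, w_rand, id_swap, alpha=0.0))
```

Under this reading, one reference image gives one target but many distinct inputs, so the number of effective training pairs is set by the number of iterations rather than the number of style images.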

opened by nzhang258 4
wandb integration

For the finetuning stage in Colab, wandb is useful for tracking metrics; see https://github.com/danielroich/PTI#weights-and-biases for an example. This would be helpful when running multiple experiments in Colab. If you have time, great; otherwise I can also look into this as a PR. Thanks

opened by AK391 3
alternative to dlib for face alignment

dlib is very slow to build. Is it possible to use an alternative such as https://github.com/happy-jihye/FFHQ-Alignment, for example https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/ffhq-align.py

more example usage here: https://github.com/happy-jihye/FFHQ-Alignment/blob/master/FFHQ-Alignmnet/FFHQ-Alignment.ipynb

opened by AK391 3

Using another pretrained StyleGAN2

Hi,

I'm playing with your notebook (awesome work btw!) and I try to give it another pretrained GAN from Awesome Pretrained StyleGAN2.

I used the anime one (PyTorch implementation from here) but I get

TypeError                                 Traceback (most recent call last)
<ipython-input-10-2395bdddca96> in <module>()
     22 
     23 #print(ckpt)
---> 24 generator.load_state_dict(ckpt["g"], strict=False)
     25 
     26 #@title Generate results

TypeError: 'Generator' object is not subscriptable

Do I need further operation on the model to make it compatible?

Thank you
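The error message suggests that the loaded file holds a whole pickled Generator object rather than a dictionary with a "g" entry. A minimal sketch of a helper that accepts either layout follows; the name extract_generator_state is hypothetical, and it assumes the other model's layer names match this repo's generator.

```python
def extract_generator_state(ckpt, key="g"):
    """Return a state dict from either common checkpoint layout."""
    if isinstance(ckpt, dict):
        # Dict-style checkpoint such as {"g": ..., "g_ema": ...}.
        if key in ckpt:
            return ckpt[key]
        # Otherwise treat the dict itself as a bare state dict.
        return ckpt
    if hasattr(ckpt, "state_dict"):
        # The whole module was pickled; ask it for its weights.
        return ckpt.state_dict()
    raise TypeError(f"Unsupported checkpoint type: {type(ckpt).__name__}")


# Intended use in the notebook cell (not executed here):
#   ckpt = torch.load(path, map_location="cpu")
#   generator.load_state_dict(extract_generator_state(ckpt), strict=False)
```

Note that with strict=False, missing or unexpected keys are skipped without an error, so a checkpoint from a different architecture can appear to load while leaving the generator's weights unchanged.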

opened by JbIPS 2

Perceptual loss: LPIPS vs. StyleGAN Discriminator

Hi!

Thanks for sharing this awesome work :-)

I'm wondering about the difference in perceptual image quality when using the LPIPS model (as stated in the paper) vs. the StyleGAN discriminator (as used in the updated Colab notebook) for the perceptual loss. In your experience, what difference does using the StyleGAN discriminator make to the image quality, compared to using LPIPS?

opened by matanby 2
Add Docker environment & web demo

Hey @mchong6! 👋

I really enjoyed playing with JoJoGAN. This pull request makes it possible to run your model inside a Docker environment, which makes it easier for other people to run it. We're using an open source tool called Cog to make this process easier.

This also means we can make a web page where other people can try out your model! View it here: https://replicate.ai/mchong6/jojogan

Claim your page here so you can edit it, and we'll feature it on our website and tweet about it too. We have added some examples to the page, but do claim the page so you can own the page, customize the Example gallery as you like, and push any future updates to the web demo.

In case you're wondering who I am, I'm from Replicate, where we're trying to make machine learning reproducible. We got frustrated that we couldn't run all the really interesting ML work being done. So, we're going round implementing models we like. 😊

opened by vganapati 2
How many paired images of dataset C in the paper

Hi, thank you for sharing the source code. I can't find how many paired images dataset C contains. In your experiments, how many pairs (wi, y) are needed at minimum to get a good result?

opened by zhongtao93 2
IndexError: list index out of range

when device is set to cpu in colab and hardware accelerator is none

No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'

IndexError                                Traceback (most recent call last)
in ()
     24 from tqdm import tqdm
     25 import lpips
---> 26 from model import *
     27 from e4e_projection import projection as e4e_projection
     28 from restyle_projection import projection as restyle_projection

7 frames
/usr/local/lib/python3.7/dist-packages/torch/utils/cpp_extension.py in _get_cuda_arch_flags(cflags)
   1604                 arch_list.append(arch)
   1605         arch_list = sorted(arch_list)
-> 1606         arch_list[-1] += '+PTX'
   1607     else:
   1608         # Deal with lists that are ' ' separated (only deal with ';' after)

IndexError: list index out of range
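The two lines visible in the traceback can be reproduced in isolation. The sketch below assumes that with no GPU present the list of detected architectures is empty, which would explain the failure; append_ptx and append_ptx_safe are illustrative names, not PyTorch functions.

```python
def append_ptx(arch_list):
    """Mirror the two traceback lines: sort, then tag the newest arch."""
    arch_list = sorted(arch_list)
    arch_list[-1] += "+PTX"  # raises IndexError when the list is empty
    return arch_list


def append_ptx_safe(arch_list):
    """Same logic, but leave an empty list untouched."""
    arch_list = sorted(arch_list)
    if arch_list:
        arch_list[-1] += "+PTX"
    return arch_list


print(append_ptx(["7.5", "6.1"]))
print(append_ptx_safe([]))
```

If this reading is right, the error comes from building the custom CUDA ops at import time, so setting the device to cpu in the notebook alone would not avoid it.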

opened by AK391 2
Different results with automatic alignment and manual crop.

I took a picture with a face in it and generated an image using the align_face function. Then, using the same picture, I generated another image by manual cropping as mentioned in https://github.com/mchong6/JoJoGAN/issues/18#issuecomment-1008996040. When I pass both of these through the e4e_projection function and view the final image, the results are very different although the images which were passed are very similar. Do you have any idea of why this might be?

opened by 007prateekd 1
Upgrade to Cog version 0.1

The new version of Cog improves the Python API, along with several other changes. Particularly pydantic is now used for Predictor and the previous version will be deprecated.

This PR upgrades the Replicate demo and API to Cog version >= 0.1. I have already pushed this to Replicate, so you don't need to do anything for the demo to keep working :) https://replicate.com/mchong6/jojogan

opened by chenxwh 0
How to train with many reference images without running out of memory?

Currently, I keep getting a CUDA out-of-memory error if I use more than 4 reference images on a 16GB GPU. Is there a way to train with more images? I'd like the model to be more general.

opened by mfrashad 1
Not working on cuda:1

Hello. Thank you for providing great code.

I have an issue when running prediction.

If I predict on cuda:1, inference does not work.

I traced the code step by step and found that "op/fused_act.py" loads fused_bias_act.cpp and fused_bias_act_kernel.cu. That C++ code does not seem to run on another GPU.

How can I predict on another GPU?
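A common workaround for extensions that only behave on the default device is to hide the other GPUs from the process, so that the wanted card becomes cuda:0. The sketch below assumes that the custom op is tied to the default device, which this report does not confirm; pin_gpu is a hypothetical helper name.

```python
import os


def pin_gpu(index: int) -> str:
    """Expose only one physical GPU to this process.

    Must be called before CUDA is initialised (in practice, before the
    first import of torch), otherwise the setting has no effect.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    # The single visible GPU is now addressed as device 0.
    return "cuda:0"


device = pin_gpu(1)  # physical GPU 1, seen by the process as cuda:0
print(device, os.environ["CUDA_VISIBLE_DEVICES"])
```

The same effect can be had from the shell by setting CUDA_VISIBLE_DEVICES=1 when launching Python.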

opened by Mombin 0
Regarding style modulation layers, style parameters and controlling low-level features.
Hey, great work! I had a couple of queries:

In the paper it is mentioned that there are 26 style modulation layers, but in the code it seems to be 18 as n_latent = 18.

What exactly do the style parameters s(w) correspond to in the code?

For the pretrained styles, is there any way to control low-level features like eyes, nose, etc. without training again? I know that while fine-tuning we can use blending (using RIS) and different masks for controlling them, but is there any way to do this for a model which is already fine-tuned?

I see a change in results when fine-tuning the model using JoJo's photo. Is it because of e4e being used instead of ReStyle in the code?
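On the 26 versus 18 question above, the two numbers may simply count different things. The arithmetic below assumes a rosinality-style generator at 1024×1024: one styled convolution plus one to-RGB layer at 4×4, then two styled convolutions plus one to-RGB layer at each later resolution. Whether this is the counting the paper intends is an assumption.

```python
import math


def count_layers(resolution: int):
    """Count W+ latents and modulated layers for a given output size."""
    log_size = int(math.log2(resolution))
    # One latent per styled convolution, plus one for the final to-RGB.
    n_latent = log_size * 2 - 2
    # Styled convolutions: one at 4x4, two at each later resolution.
    n_conv = 1 + 2 * (log_size - 2)
    # To-RGB layers: one per resolution, and they are modulated too.
    n_to_rgb = 1 + (log_size - 2)
    return n_latent, n_conv + n_to_rgb


n_latent, n_modulated = count_layers(1024)
print(n_latent, n_modulated)  # 18 26
```

Under these assumptions, 18 is the number of latent vectors fed to the generator, while 26 counts every modulated layer, including the to-RGB layers that reuse a latent from a neighbouring convolution.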
opened by 007prateekd 2

Official code for paper Exemplar Based 3D Portrait Stylization.

3D-Portrait-Stylization This is the official code for the paper "Exemplar Based 3D Portrait Stylization". You can check the paper on our project websi

60 Dec 7, 2022

A few stylization coreML models that I've trained with CreateML

CoreML-StyleTransfer A few stylization coreML models that I've trained with CreateML You can open and use the .mlmodel files in the "models" folder in

8 Aug 18, 2022

Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python >= 3.6 , Pytorch >

84 Jan 3, 2023

Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

97 Dec 23, 2022

Official implementation of "One-Shot Voice Conversion with Weight Adaptive Instance Normalization".

One-Shot Voice Conversion with Weight Adaptive Instance Normalization By Shengjie Huang, Yanyan Xu*, Dengfeng Ke*, Mingjie Chen, Thomas Hain. This rep

31 Dec 7, 2022

A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

S³FD: Single Shot Scale-invariant Face Detector A PyTorch Implementation of Single Shot Scale-invariant Face Detector. Eval python wider_eval_pytorch.

235 Jan 7, 2023

Pytorch implementation of One-Shot Affordance Detection

One-shot Affordance Detection PyTorch implementation of our one-shot affordance detection models. This repository contains PyTorch evaluation code, tr

46 Dec 12, 2022

Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"

One-Shot Free-View Neural Talking Head Synthesis Unofficial pytorch implementation of paper "One-Shot Free-View Neural Talking-Head Synthesis for Vide

406 Dec 23, 2022

Code repo for EMNLP21 paper "Zero-Shot Information Extraction as a Unified Text-to-Triple Translation"

Zero-Shot Information Extraction as a Unified Text-to-Triple Translation Source code repo for paper Zero-Shot Information Extraction as a Unified Text

88 Dec 31, 2022

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

519 Dec 29, 2022

Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

77 Dec 8, 2022

DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition, TPAMI 2021

DVG-Face: Dual Variational Generation for HFR This repo is a PyTorch implementation of DVG-Face: Dual Variational Generation for Heterogeneous Face Re

52 Dec 30, 2022

[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University

99 Dec 30, 2022

Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

Realtime Face Anti-Spoofing Detection Realtime Face Anti Spoofing Detection with Face Detector to detect real and fake faces Please star this repo

86 Aug 3, 2022

Swapping face using Face Mesh with TensorFlow Lite

17 Apr 26, 2022

Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

The Face Synthetics dataset Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels. It was introduced in ou

608 Jan 2, 2023

Face Library is an open source package for accurate and real-time face detection and recognition

Face Library Face Library is an open source package for accurate and real-time face detection and recognition. The package is built over OpenCV and us

52 Nov 9, 2022

VGGFace2-HQ - A high resolution face dataset for face editing purpose

The first open source high resolution dataset for face swapping!!! A high resolution version of VGGFace2 for academic face editing purpose

232 Dec 29, 2022

A large-scale face dataset for face parsing, recognition, generation and editing.

CelebAMask-HQ [Paper] [Demo] CelebAMask-HQ is a large-scale face image dataset that has 30,000 high-resolution face images selected from the CelebA da

1.7k Dec 26, 2022