Official TensorFlow implementation of the CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”

Last update: Dec 31, 2022

Related tags

Deep Learning White-box-Cartoonization

Overview

[CVPR2020]Learning to Cartoonize Using White-box Cartoon Representations

TensorFlow implementation of the CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”.
An improved method for facial images is now available:
https://github.com/SystemErrorWang/FacialCartoonization

Use cases

Scenery

Food

Indoor Scenes

People

More Images Are Shown In The Supplementary Materials

Online demo

Some kind people have made an online demo for this project
Demo link: https://cartoonize-lkqov62dia-de.a.run.app/cartoonize
Code: https://github.com/experience-ml/cartoonize
Sample Demo: https://www.youtube.com/watch?v=GqduSLcmhto&feature=emb_title

Prerequisites

Training code: Linux or Windows
NVIDIA GPU + CUDA CuDNN for performance
Inference code: Linux, Windows and MacOS

How To Use

Installation

We assume you already have an NVIDIA GPU with CUDA and CuDNN installed
Install tensorflow-gpu; we tested 1.12.0 and 1.13.0rc0
Install scikit-image==0.14.5; other versions may cause problems
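
Several errors reported in the comments further down involve `tensorflow.contrib`, which was removed in TensorFlow 2.x. A minimal sketch of a version check is shown here, assuming that any 1.x release is acceptable (the README only states that 1.12.0 and 1.13.0rc0 were tested). The helper name `is_tf1_version` is hypothetical and not part of this repository.

```python
import re


def is_tf1_version(version: str) -> bool:
    """Return True if a TensorFlow version string looks like a 1.x release.

    Assumption: any 1.x release is treated as compatible; only 1.12.0 and
    1.13.0rc0 are stated as tested in the README.
    """
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)) == 1


if __name__ == "__main__":
    # In practice the string would come from tf.__version__.
    for candidate in ("1.12.0", "1.13.0rc0", "2.4.1"):
        print(candidate, is_tf1_version(candidate))
```

In practice the argument would be `tf.__version__`, checked before running cartoonize.py or train.py.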

Inference with Pre-trained Model

Store test images in /test_code/test_images
Run /test_code/cartoonize.py
Results will be saved in /test_code/cartoonized_images

Train

Place your training data in corresponding folders in /dataset
Run pretrain.py; results will be saved in the /pretrain folder
Run train.py; results will be saved in the /train_cartoon folder
The code was cleaned up from a production environment and has not been tested
There may be minor problems, but they should be easy to resolve
The pretrained VGG_19 model can be found at the following URL: https://drive.google.com/file/d/1j0jDENjdwxCDb36meP6-u5xDBzmKBOjJ/view?usp=sharing

Datasets

Due to copyright issues, we cannot provide the cartoon images used for training
However, these training datasets are easy to prepare
Scenery images are collected from Shinkai Makoto, Miyazaki Hayao and Hosoda Mamoru films
Clip the films into frames, then randomly crop them and resize to 256x256
Portrait images are from Kyoto Animation and P.A. Works
We use this repo (https://github.com/nagadomi/lbpcascade_animeface) to detect facial areas
Manual data cleaning will greatly increase the quality of both datasets
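
The crop step above can be sketched as follows. This is an illustration only: the README does not state the crop size, so the sketch assumes a square crop whose side equals the shorter edge of the frame, and the function name `random_square_crop_box` is hypothetical.

```python
import random


def random_square_crop_box(width, height, rng=None):
    """Pick a random square crop box (left, top, right, bottom) inside a frame.

    Assumption: the crop side equals the shorter edge of the frame; the
    README does not specify the crop size used by the authors.
    """
    if width <= 0 or height <= 0:
        raise ValueError("frame dimensions must be positive")
    rng = rng or random
    side = min(width, height)
    # randint is inclusive on both ends, so the box always fits in the frame.
    left = rng.randint(0, width - side)
    top = rng.randint(0, height - side)
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    # Example: a 1920x1080 film frame.
    print(random_square_crop_box(1920, 1080, random.Random(0)))
```

The returned box can be passed to an image library to cut the region out of a frame, which is then resized to 256x256; with Pillow, for example, `Image.crop(box)` followed by `Image.resize((256, 256))`.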

Acknowledgement

We are grateful for the help from Lvmin Zhang and Style2Paints Research

License

Copyright (C) Xinrui Wang All rights reserved. Licensed under the CC BY-NC-SA 4.0
license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).
Commercial application is prohibited; please retain this license if you clone this repo

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Wang_2020_CVPR,
  author = {Wang, Xinrui and Yu, Jinze},
  title = {Learning to Cartoonize Using White-Box Cartoon Representations},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2020}
}

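Chinese community

We have a universe-class anime-related technical chat group, the “Paper Association” (纸片协会), which is mainly for technical exchange and where we talk about everything except technology. If joining the group fails the first time, you can try again.

Paper Association headquarters (纸片协会总舵): 184467946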

Comments

The training hyper-parameters
Hello, Wang. I'm very interested in your work. I have two questions. Could you help me?

Q1: I tried training with λ1=1, λ2=10, λ3=λ4=2000, λ5=10000 as your paper suggests, but the result is not that good. I'm not sure whether the problem lies in the dataset or in the hyper-parameters.

Q2: I am also unsure about the hyper-parameters for the selective search. I tried seg_num=200, power=1.2, γ1=20, γ2=40 and the output image was very dark, as the following: The result of simple search: After that, I tried seg_num=1500, power=0.35, γ1=20, γ2=40 and the output image was not that bad:

I understand that the pixel values must be too large with power=1.2, so I just want to confirm whether these parameters are suitable.

Thanks very much.
opened by Xinxiang7 6
Issues about surface representation

Hi, an interesting topic and a great model! Thanks for sharing. Here is a question about the surface representation that came up after I read the paper: how do you define the surface representation F_dgf? I didn't find the definition in Section 3.1 or later.

I'm not an expert in TF v1. If I understand correctly, L_total in Section 3.4 of the paper corresponds to "g_loss_total = 1e4*tv_loss + 1e-1*g_loss_blur + g_loss_gray + 2e2*recon_loss" on line 77 of train.py, right? If so, how do the four terms map to Section 3.4?
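
For readers following the question, the quoted line is a weighted sum of four loss terms. A minimal sketch with plain floats standing in for the TensorFlow tensors is shown here; the function name `generator_total_loss` is hypothetical, and the sketch makes no claim about how the terms map to the paper's Section 3.4.

```python
def generator_total_loss(tv_loss, g_loss_blur, g_loss_gray, recon_loss):
    """Weighted sum with the coefficients from the line quoted from train.py.

    Plain floats are used for illustration; in train.py these are tensors.
    """
    return 1e4 * tv_loss + 1e-1 * g_loss_blur + g_loss_gray + 2e2 * recon_loss


if __name__ == "__main__":
    # With every term equal to 1.0 the weights themselves are summed.
    print(generator_total_loss(1.0, 1.0, 1.0, 1.0))
```

Setting one term to 1.0 and the others to 0.0 returns that term's weight, which makes the relative scale of the four coefficients easy to inspect.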

One more general question: I think the main contribution of this model is a combination of loss functions. So why did you design these 5 terms, or 3 parts: structure, texture, and surface? Is there any supporting reference? I am always confused by these terms.

Thanks.

opened by lhanappa 6
texture transfer fluff trees to anime style

Hi, experts. This is nice work. I have finished rewriting your code in PyTorch, and the training result is in the attached file. I found that the network tends to turn trees, especially fluffy foliage, into smoother areas. Could you give me some suggestions on whether to decrease the surface weight or the superpixel weight? I emailed you last week; please check it when you are free. Thanks.

opened by BossunWang 5
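G(I_p) seems to have been replaced by F_dgf(G(I_p)) everywhere?
Hello author. According to the paper's description of the structure loss and the texture loss, the generated image fed into them should not have been passed through the guided filter. In train.py, however, the image produced by the generator is replaced by the filtered image, so every later use of G(I_p) becomes F_dgf(G(I_p)). Is this the intended behavior?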

output = network.unet_generator(input_photo)
output = guided_filter(input_photo, output, r=1)
opened by zhen8838 4
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'RegisterShape'

Hello, I'm trying to run cartoonize.py or pretrain.py and every time I get the same error:

python cartoonize.py
2020-07-30 13:32:57.785610: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
  File "cartoonize.py", line 5, in <module>
    import network
  File "C:\Users\Admin\Desktop\dfl\White-box-Cartoonization-master\White-box-Cartoonization-master\test_code\network.py", line 3, in <module>
    import tensorflow.contrib.slim as slim
  File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\__init__.py", line 40, in <module>
    from tensorflow.contrib import coder
  File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\__init__.py", line 22, in <module>
    from tensorflow.contrib.coder.python.ops.coder_ops import *
  File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\python\ops\coder_ops.py", line 22, in <module>
    from tensorflow.contrib.coder.python.ops import gen_coder_ops
  File "C:\Users\Admin\Anaconda3\lib\site-packages\tensorflow\contrib\coder\python\ops\gen_coder_ops.py", line 99, in <module>
    _ops.RegisterShape("PmfToQuantizedCdf")(None)
AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'RegisterShape'

More info: CUDA is 10.1, scikit-image==0.14.5, tensorflow-gpu==1.12.0

opened by 2dlabharryharry 4
No module named 'util'

Traceback (most recent call last):
  File "train.py", line 11, in <module>
    import utils
  File "E:\White-box-Cartoonization\train_code\utils.py", line 11, in <module>
    from selective_search.util import switch_color_space
  File "E:\White-box-Cartoonization\train_code\selective_search\__init__.py", line 1, in <module>
    from .core import selective_search, box_filter
  File "E:\White-box-Cartoonization\train_code\selective_search\core.py", line 3, in <module>
    from util import oversegmentation, switch_color_space, load_strategy
ModuleNotFoundError: No module named 'util'

Windows 10 2004, i7-9750H + 1660 Ti, Python 3.6.8. Both TensorFlow 1.12.0 and 1.13.0 raise this error, but inference with the pre-trained model works normally.

opened by ferretgeek 4
Datasets

Very interesting work and results. I have a couple of questions. From the paper:

For cartoon images, we collect 10000 images from animations for the human face and 10000 images for landscape. Producers of collected animations include Kyoto animation, P.A.Works, Shinkai Makoto, Hosoda Mamoru, and Miyazaki Hayao.

Can you share more details about how these images were collected? What size are the 10000 images? Is there a particular rule for which frames from the animations are used and which are discarded? Is the dataset balanced in any way (buildings, nature, etc.)?

Another question: in different parts of the paper, the code and the README, VGG19 and VGG16 appear to be used interchangeably, but they are not the same. Which one was used, VGG19 or VGG16? Was it fine-tuned in any way, or was the stock pretrained model used only to extract high-level features?

Lastly, where in the code is the style interpolation used? Or is it only used at inference time to interpolate between models trained with different loss weights?

opened by victorca25 4
Error running training.py

Dear Wang, I really appreciate the work you have done. I ran pretrain.py and got a model in the pretrain folder, but I got an error when I ran train.py. The error is:
Traceback (most recent call last):
  File "train.py", line 206, in <module>
    train(args)
  File "train.py", line 193, in train
    str(total_iter)+'_face_result.jpg', 4)
  File "/content/drive/My Drive/cartoon/wbc_100/White-box-Cartoonization/train_code/utils.py", line 160, in write_batch_image
    image[k] = (image[k]+1) * 127.5
IndexError: index 1 is out of bounds for axis 0 with size 1

opened by Alby-Thomas 3
Pretrained model download url requests

Hi, thanks for your great work! I wanted to use the pretrained model to run the demo, but found that the URL (https://drive.google.com/open?id=1JfJzJbNjAWBIHGm9mc_R9dXv7DAw3tZc) is no longer available. Could you release a new URL for the pretrained model? I hope to hear from you. Thank you very much!

opened by zomkey 3
Getting nan discriminator and generator loss

Hi, Thank you for uploading such a great work.

I am training the model with my custom dataset, for portraits only. I followed the steps mentioned; pretrain.py runs properly and saves the model. But when running train.py, both the discriminator and generator losses are NaN. The reconstruction loss does have a value, so I am not sure why this happens.

Your help would mean a lot.

One more question: how large should the dataset be in order to get decent results? For instance, the paper mentions using 10000 cartoon faces. Is it possible to get great results with a smaller dataset?

opened by Nerdyvedi 3
pytorch porting of the (tensorflow) pretrained network
Hello,

I successfully ported your tensorflow pretrained network to pure pytorch model. Check "pytorch_test" folder

JOB description:

convert tf checkpoint files to a python dictionary like, {"generator.Conv.weight": weight_params(np.ndarray), ...}

write a PyTorch model whose modules have the same names as the keys of the dictionary (e.g. self.Conv = nn.Conv2d and so on ...)

test model loading function & test on images...
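
The key-renaming part of the first step can be sketched as below. This is not the issue author's code: the TensorFlow variable names used here (for example 'generator/Conv/weights:0') are assumptions modeled on the weights/biases naming of tf.contrib.slim layers, and the helper name `tf_name_to_torch_key` is hypothetical. Only the target key format "generator.Conv.weight" comes from the description above.

```python
def tf_name_to_torch_key(tf_name: str) -> str:
    """Convert e.g. 'generator/Conv/weights:0' to 'generator.Conv.weight'.

    Assumption: variables follow slim-style naming, with 'weights' and
    'biases' as the final path component.
    """
    name = tf_name.split(":")[0]  # drop the ':0' output-index suffix if present
    parts = name.split("/")
    suffix_map = {"weights": "weight", "biases": "bias"}
    parts[-1] = suffix_map.get(parts[-1], parts[-1])
    return ".".join(parts)


if __name__ == "__main__":
    for tf_name in ("generator/Conv/weights:0", "generator/Conv/biases:0"):
        print(tf_name, "->", tf_name_to_torch_key(tf_name))
```

Renaming alone is not sufficient for convolution layers: TensorFlow stores 2-D convolution kernels as [height, width, in_channels, out_channels], while PyTorch's nn.Conv2d expects [out_channels, in_channels, height, width], so the arrays also need to be transposed when they are loaded.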
opened by ncianeo 2
no healthy upstream

https://master-white-box-cartoonization-psi1104.endpoint.ainize.ai/predict

API is showing "no healthy upstream"

Please solve it as soon as possible

opened by jainmonica123 0
Questions about the material surface. I want more details when output

I am very impressed with what you do. Let me ask a few questions about the material surface

What if I want to output an image whose texture doesn't look smooth? Its texture has a lot of detail, like in the movie Weathering with you. I'll leave the illustration below. Usually the result after training is complete, the output will look smooth. Is there a way to make it look like in the picture? I pay much attention to the interior of the house, the road, the trees and I don't pay much attention to the face and the pupils.

opened by D-Mad 0
ModuleNotFoundError: No module named 'tensorflow.contrib'

It gives me the error ModuleNotFoundError: No module named 'tensorflow.contrib'. Is it because of the TensorFlow version? Can anything be done for Windows 10 users?

opened by dreamhacking 0
Ported to TFJS and running in Browser using WebGL backend or in NodeJS with CUDA

This is just FYI, feel free to close the issue

Code: https://github.com/vladmandic/anime
Live demo (using WebCam): https://vladmandic.github.io/anime/public

Running in real-time at 10+ fps using 720x720 video with nVidia RTX3060
Model weights are quantized to F16 to reduce size

opened by vladmandic 0
make images more cartoonized

For some reason, I need stronger cartoonization that makes the images less clear compared to the input images. Is there any way to do this without retraining, for example by changing some parameters? I'm a novice in AI and I would appreciate it if you could guide me through doing this.

opened by hosseinm1997 0

Owner

Interested in I2I translation, style transfer and ACG related ML applications
