Chunkmogrify: Real image inversion via Segments

David Futschik

Last update: Jan 4, 2023

Overview

Teaser video with live editing sessions can be found here

This code demonstrates the ideas discussed in the arXiv submission Real Image Inversion via Segments.
http://arxiv.org/abs/2110.06269
(David Futschik, Michal Lukáč, Eli Shechtman, Daniel Sýkora)

Abstract:
We present a simple, yet effective approach to editing real images via generative adversarial networks (GAN). Unlike previous techniques, which treat all editing tasks as an operation that affects pixel values in the entire image, our approach cuts up the image into a set of smaller segments. For those segments, corresponding latent codes of a generative network can be estimated with greater accuracy due to the lower number of constraints. When codes are altered by the user, the content in the image is manipulated locally while the rest of it remains unaffected. Thanks to this property, the final edited image better retains the original structures and thus helps to preserve a natural look.

What do I need?

You will need a local machine with a relatively recent GPU - I wouldn't recommend trying Chunkmogrify with anything older than RTX 2080. It is technically possible to run even on CPU, but the operations become so slow that the user experience is not enjoyable.

Quick startup guide

Requirements:
Python 3.7 or newer

Note: If you are using Anaconda, I recommend creating a new environment to run this project. Packages installed with conda and pip often don't play together very nicely.

Steps to be able to successfully run the project:

Clone or download the repository and open a terminal / PowerShell instance in the directory.
Install the required Python packages by running pip install -r requirements.txt. This might take a while, since it downloads several hundred MB of packages. Some packages (as well as this project itself) might need to compile their extensions, so a C++ compiler needs to be present. On Linux, this is typically not an issue, but running on Windows might require Visual Studio and CUDA installations to successfully set up the project.
Run python app.py. When running for the first time, it will automatically download required resources, which are also several hundred megabytes. Progress of the download can be monitored in the command line window.

To see if everything installed and configured properly, load up a photo and try running a projection step. If there are no errors, you are good to go.

Possible problems:

Torch not compiled with CUDA enabled.
If you see this error, the installed PyTorch build has no CUDA support. Reinstall it by running:

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
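Before reinstalling, it can help to confirm what the installed PyTorch build actually reports. The following is a minimal sketch, not part of the project; the helper name describe_torch_cuda is made up for illustration, and it only uses standard PyTorch calls (torch.cuda.is_available, torch.version.cuda, torch.cuda.get_device_name).

```python
import importlib.util


def describe_torch_cuda():
    """Return a short, human-readable status of the local PyTorch/CUDA setup.

    Hypothetical helper for illustration; it is not part of Chunkmogrify.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch  # imported lazily so the check works without torch present

    if not torch.cuda.is_available():
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        return (
            f"torch {torch.__version__} installed, CUDA unavailable "
            f"(build CUDA version: {torch.version.cuda})"
        )
    return (
        f"torch {torch.__version__} installed, CUDA available on "
        f"{torch.cuda.get_device_name(0)}"
    )


if __name__ == "__main__":
    print(describe_torch_cuda())
```

If the reported build CUDA version is None, the wheel is CPU-only and the reinstall above is the likely fix. If a CUDA version is reported but CUDA is still unavailable, the driver or GPU setup is the more probable cause.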

Explanation of usage

Open an image using File -> Image from File. There is a sample image provided to check functionality.

Mask painting:
Left click paints, right click unpaints. Mouse wheel controls the size of the brush.

Projection:
Input a number of steps (100 or 200 is fine; 500 is currently the maximum before the learning rate drops to 0) and press Projection Steps. Wait until projection finishes; during this process you can observe the global image view by choosing the output mode Projection Only. To fine-tune, you can perform a small number of Pivotal Tuning steps.
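To illustrate why steps beyond 500 have no effect, here is a hypothetical learning-rate schedule that decays linearly to zero. This is only a sketch under assumptions: the actual schedule used by the projector may have a different shape, and the names lr_at, base_lr, and max_steps are invented for this example.

```python
def lr_at(step, base_lr=0.1, max_steps=500):
    """Hypothetical linear decay: the rate reaches 0 at max_steps and stays there.

    Assumption for illustration only; the real projector schedule may differ.
    """
    remaining = max(0.0, 1.0 - step / max_steps)
    return base_lr * remaining


if __name__ == "__main__":
    for step in (0, 100, 200, 500, 600):
        print(step, lr_at(step))
```

With any schedule of this kind, optimizer updates after the final step are multiplied by a zero learning rate, so the projection no longer changes.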

Editing:
To add an edit, click the double arrow down icon in the Attribute Editor on the left side. Choose the type of edit (W, S, StyleCLIP), the direction of the edit, and drag the sliders to change the currently masked region. Usually it's necessary to increase the multiplier before changes made via the direction slider become noticeable.

Multiple different edits can be composed on top of each other at the same time. Their order is largely irrelevant. Currently in the default mode, only one region is being edited, and so all selected edits apply to the same region. If you would like to change the region, you can Freeze the current image, and perform a new projection, but you will lose the ability to change existing edits.
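The two ideas above, composing several edits and restricting them to one region, can be sketched with NumPy as follows. This is a simplified illustration, not the project's implementation: it assumes each edit is an additive offset in latent space (which is what makes the order irrelevant in this sketch) and that the region is a soft mask with values in [0, 1]. The function names and array shapes are assumptions.

```python
import numpy as np


def apply_edits(w, edits):
    """Add each (direction, multiplier, slider) edit to the latent code w.

    Purely additive edits commute, so their order does not matter here.
    """
    w_edited = w.copy()
    for direction, multiplier, slider in edits:
        w_edited = w_edited + multiplier * slider * direction
    return w_edited


def compose_masked(original, edited, mask):
    """Take `edited` where mask is 1 and keep `original` where mask is 0."""
    return mask * edited + (1.0 - mask) * original


if __name__ == "__main__":
    w = np.zeros((18, 512), dtype=np.float32)       # assumed latent shape
    smile = np.ones((1, 512), dtype=np.float32)     # made-up direction
    age = np.full((18, 512), 2.0, dtype=np.float32) # made-up direction
    a = apply_edits(w, [(smile, 2.0, 0.5), (age, 1.0, -1.0)])
    b = apply_edits(w, [(age, 1.0, -1.0), (smile, 2.0, 0.5)])
    print(np.allclose(a, b))

    original = np.zeros((4, 4, 3), dtype=np.float32)
    edited = np.ones((4, 4, 3), dtype=np.float32)
    mask = np.zeros((4, 4, 1), dtype=np.float32)
    mask[:, :2] = 1.0  # only the left half is painted
    print(compose_masked(original, edited, mask)[..., 0])
```

Edits that are not simple offsets may interact, which is one reason to describe the order as largely, rather than completely, irrelevant.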

To save the current image, click the Save Current Image button. If the Unalign checkbox is active, the program will attempt to compose the aligned face back into the original image. Saved images can be found in the SavedImages directory by default. This can be changed in _config.yaml.

Keyboard shortcuts

Current keyboard shortcuts include:

Show/Hide mask :: Alt+M
Toggle mask painting :: Alt+N

W-space editing

Source for some of the basic directions:
(https://twitter.com/robertluxemburg/status/1207087801344372736)

To add your own directions, save them in NumPy pickle format as an array of shape (num_ws, 512) or (1, 512), and specify their path in w_directions.py.
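A minimal sketch of preparing such a file is shown below. It assumes that the expected format is a .npy file written by np.save; the helper name save_direction and the file name my_direction.npy are hypothetical. How the path is registered in w_directions.py is not shown, since that depends on the file's actual contents.

```python
import os
import tempfile

import numpy as np

W_DIM = 512  # latent width stated in the text above


def save_direction(direction, path):
    """Validate a W-space direction and write it with np.save.

    Accepts shape (512,), (1, 512) or (num_ws, 512); a flat vector is
    promoted to (1, 512). Returns the shape that was written.
    """
    direction = np.asarray(direction, dtype=np.float32)
    if direction.ndim == 1:
        direction = direction[None, :]
    if direction.ndim != 2 or direction.shape[1] != W_DIM:
        raise ValueError(f"expected (num_ws, {W_DIM}) or (1, {W_DIM}), got {direction.shape}")
    np.save(path, direction)
    return direction.shape


if __name__ == "__main__":
    # Random placeholder data; a real direction would come from your own analysis.
    vec = np.random.default_rng(0).standard_normal(W_DIM)
    path = os.path.join(tempfile.mkdtemp(), "my_direction.npy")  # hypothetical name
    print(save_direction(vec, path), np.load(path).shape)
```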

Style-space editing (S space edits)

Source:
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
(https://arxiv.org/abs/2011.12799)
(https://github.com/betterze/StyleSpace)

The presets can be found in s_presets.py; some were taken directly from the paper, others I found by manual exploration. You can perform similar exploration by choosing the Custom preset once you have a projection.

StyleCLIP editing

Source:
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
(https://arxiv.org/abs/2103.17249)
(https://github.com/orpatashnik/StyleCLIP)

The pretrained models were taken from (https://github.com/orpatashnik/StyleCLIP/blob/main/utils.py). I manually removed the decoder from the state dict, since it's not used and takes up the majority of the file size.
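Removing unused weights from a checkpoint usually comes down to filtering keys by prefix. The sketch below shows the idea on a plain dictionary; the "decoder." prefix and the key names are assumptions for illustration, and the real checkpoints may be organized differently.

```python
def strip_prefix(state_dict, prefix="decoder."):
    """Return a copy of state_dict without the entries whose key starts with prefix."""
    return {key: value for key, value in state_dict.items() if not key.startswith(prefix)}


if __name__ == "__main__":
    # Stand-in for a loaded checkpoint; the values would normally be tensors.
    checkpoint = {
        "mapper.layer0.weight": [0.1, 0.2],
        "mapper.layer0.bias": [0.0],
        "decoder.conv1.weight": [1.0] * 1000,  # large and unused
    }
    slim = strip_prefix(checkpoint)
    print(sorted(slim))
```

With a real PyTorch checkpoint, the dictionary would come from torch.load and the filtered result would be written back with torch.save.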

PTI Optimization

Source:
Pivotal Tuning for Latent-based Editing of Real Images
(https://arxiv.org/abs/2106.05744)

This method allows you to match the target photo very closely, while retaining editing capacities.

It's often good to run 30-50 iterations of PTI to get very close matching of the source image, which won't cause a very noticeable drop in the editing capabilities.

Attribution

This repository makes use of code provided by the various repositories linked above, plus code from:

stylegan2-ada-pytorch (https://github.com/NVlabs/stylegan2-ada-pytorch)
poisson-image-editing (https://github.com/PPPW/poisson-image-editing) for optional support of idempotent blend (slow implementation of blending that only changes the masked part which can be accessed by uncommenting the option in synthesis.py)

Citation

If you find this code useful for your research, please cite the arXiv submission linked above.

Comments

Mac M1 processor cpp compile error (without CUDA)

Hello, I found an error when running app.py.

My environment is a Mac M1.

Actually, I hit another error when installing the CUDA version of PyTorch, so I installed PyTorch without CUDA.

extensions/canvas_to_masks.cpp:61:31: error: expected expression
    bool output_wrong_array = [output, h, w, num_color]() {
                              ^
extensions/canvas_to_masks.cpp:73:9: warning: 'auto' type specifier is a C++11 extension [-Wc++11-extensions]
        auto descr = PyArray_DescrFromType(NPY_FLOAT);
        ^
extensions/canvas_to_masks.cpp:129:5: warning: 'auto' type specifier is a C++11 extension [-Wc++11-extensions]
    auto x = PyModule_Create(&python_module);
    ^

Do you know what is wrong?

opened by whatisand 3

Error: Pickle data was truncated

I followed every step and everything installed correctly, but as soon as I hit "projection steps," I get the error message:

Pickle data was truncated.

Any help would be appreciated, thanks :)

opened by Wythneth 2

"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!"

Hello, I'm running the software on CPU by modifying _config.yaml as you documented in another issue.

After drawing a mask and processing some projections, I want to edit facial features and in order to do that I create a new attribute with the down-pointing arrows and then I click on "Style Edit" but as soon as I do that I receive the following error message:

File "...\Chunkmogrify-master\styleclip_mapper.py", line 13, in fused_leaky_relu
    input + bias.view(1, *rest_dim, bias.shape[0]), negative_slope=negative_slope
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

opened by illtellyoulater 1

"RuntimeError: "slow_conv_transpose2d_out_cpu" not implemented for 'Half'"

Hello, I'm running the software on CPU by modifying _config.yaml as you documented in another issue.

After drawing a mask and processing some projections, I want to edit facial features and in order to do that I create a new attribute with the down-pointing arrows and then I click on "S Edit" but as soon as I do that I receive the following error message:

File "...\Chunkmogrify-master\torch_utils\ops\conv2d_gradfix.py", line 43, in conv_transpose2d
    return torch.nn.functional.conv_transpose2d(input=input, weight=weight, bias=bias, stride=stride, padding=padding, output_padding=output_padding, groups=groups, dilation=dilation)
RuntimeError: "slow_conv_transpose2d_out_cpu" not implemented for 'Half'

opened by illtellyoulater 1

[CUDA out of memory] - How much VRAM do we need to run this project?

Hello, I'm receiving "CUDA out of memory" errors just a little after pressing the Projection Steps button.

I only have 3 GB of VRAM. Could this be enough? If not, is there a way to lower the requirements?

I've tried using the smallest JPEG, 320 x 200 px with a size of just 5 KB, but it was still crashing...

If everything else fails, how could I run it on CPU?

opened by illtellyoulater 1

AttributeError: 'NoneType' object has no attribute 'device'

Hi!

My card is RTX 3090

(chunkmogrify) PS W:\Chunkmogrify-master> python app.py
Projection mode set to w_projection
Gui init: 0.19s
Loading W:/1.png
Number of faces detected: 1
Detection 0: Left: 319 Top: 320 Right: 587 Bottom: 587
Part 0: (335, 374), Part 1: (338, 411) ...
Device cuda:0 requested, but cuda is not available. Using CPU.
Ran out of input
Initializing StyleganProjector
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Users\Creator\miniconda3\envs\chunkmogrify\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\Creator\miniconda3\envs\chunkmogrify\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "app.py", line 592, in run_impl
    fn(*args)
  File "app.py", line 573, in steps
    step()
  File "app.py", line 533, in step
    self.synthesis.synthesize_with_project_step()
  File "W:\Chunkmogrify-master\synthesis.py", line 259, in synthesize_with_project_step
    self._init_projector()
  File "W:\Chunkmogrify-master\synthesis.py", line 426, in _init_projector
    projector = clasz(gan, self.target_image, self.mask_pull(), w_init=init, **kwargs) # **kwargs
  File "W:\Chunkmogrify-master\stylegan_project.py", line 264, in __init__
    self.device = provider.device
AttributeError: 'NoneType' object has no attribute 'device'

I've tried something like this:

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

torch-1.8.1 to torch-1.10.1

but still the same.

opened by chengkeng 1

Owner

David Futschik

PhD student @ CTU Prague, Czech Republic.
