Chunkmogrify: Real image inversion via Segments

Teaser video with live editing sessions can be found here

This code demonstrates the ideas discussed in the arXiv submission "Real Image Inversion via Segments".
http://arxiv.org/abs/2110.06269
(David Futschik, Michal Lukáč, Eli Shechtman, Daniel Sýkora)

Abstract:
We present a simple yet effective approach to editing real images via generative adversarial networks (GANs). Unlike previous techniques, which treat all editing tasks as an operation that affects pixel values in the entire image, our approach cuts up the image into a set of smaller segments. For those segments, the corresponding latent codes of a generative network can be estimated with greater accuracy due to the lower number of constraints. When the user alters the codes, the content in the image is manipulated locally while the rest of it remains unaffected. Thanks to this property, the final edited image better retains the original structures, which helps preserve a natural look.

What do I need?

You will need a local machine with a relatively recent GPU - I wouldn't recommend trying Chunkmogrify with anything older than an RTX 2080. It is technically possible to run it on a CPU, but the operations become so slow that the user experience is not enjoyable.

Quick startup guide

Requirements:
Python 3.7 or newer

Note: If you are using Anaconda, I recommend creating a new environment to run this project. Packages installed with conda and pip often don't play together very nicely.

Steps to be able to successfully run the project:

  1. Clone or download the repository and open a terminal / Powershell instance in the directory.
  2. Install the required Python packages by running pip install -r requirements.txt. This might take a while, since it downloads several hundred megabytes of packages. Some packages, as well as this project itself, might need to compile their extensions, so a C++ compiler needs to be present. On Linux this is typically not an issue, but on Windows you might need Visual Studio and CUDA installed to set up the project successfully.
  3. Run python app.py. On the first run, it automatically downloads the required resources, which are also several hundred megabytes. You can monitor the download progress in the command line window.

To check that everything was installed and configured properly, load a photo and try running a projection step. If there are no errors, you are good to go.

Possible problems:

Torch not compiled with CUDA enabled.
Run

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
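
Before and after reinstalling, it can help to confirm whether the installed torch build actually sees a GPU. The following is a minimal sketch, not part of the project: the helper name describe_torch_device is made up for illustration, and the import is guarded so the snippet also runs where torch is missing.

```python
def describe_torch_device():
    """Return 'cuda:0' if the installed torch build can use a GPU, else 'cpu'.

    Hypothetical helper for illustration; it is not part of Chunkmogrify.
    """
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return "cpu"

    print("torch version:", torch.__version__)
    # torch.version.cuda is None for CPU-only builds of torch.
    print("built against CUDA:", torch.version.cuda)

    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))
        return "cuda:0"
    return "cpu"


device = describe_torch_device()
print("device that would be used:", device)
```

If the CUDA version prints as None, the installed wheel is most likely a CPU-only build, which is the situation the reinstall commands above are meant to address.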

Explanation of usage

Tutorial video: click below

Open an image using File -> Image from File. There is a sample image provided to check functionality.

Mask painting:
Left click paints, right click unpaints. Mouse wheel controls the size of the brush.

Projection:
Enter a number of steps (100 or 200 is usually enough; 500 is currently the maximum, because the learning rate reaches 0 at that point) and press Projection Steps. Wait until the projection finishes; during this process you can observe the global image view by choosing the output mode Projection Only. To fine-tune, you can perform a small number of Pivotal Tuning steps.
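
To illustrate why steps beyond the maximum have no effect, here is a sketch of a learning-rate schedule that decays to zero. It is modeled on the ramp-up / cosine ramp-down schedule used by the projector in stylegan2-ada-pytorch; whether Chunkmogrify uses exactly this shape is an assumption, and the function name and the default values below are illustrative only.

```python
import math


def projection_lr(step, max_steps=500, initial_lr=0.1, rampup=0.05, rampdown=0.25):
    """Illustrative schedule: linear ramp-up, then cosine ramp-down to zero.

    All parameter values are assumptions chosen for the example.
    """
    # Fraction of the schedule that has elapsed, clamped at 1.
    t = min(step / max_steps, 1.0)
    # Cosine ramp-down over the last `rampdown` fraction of the schedule.
    ramp = min(1.0, (1.0 - t) / rampdown)
    ramp = 0.5 - 0.5 * math.cos(ramp * math.pi)
    # Linear ramp-up over the first `rampup` fraction of the schedule.
    ramp *= min(1.0, t / rampup)
    return initial_lr * ramp


for step in (0, 25, 100, 200, 400, 500, 600):
    print(step, round(projection_lr(step), 4))
```

Under this assumed schedule, the rate is at its plateau around steps 100 to 200, which matches the recommendation above, and any step at or past max_steps receives a learning rate of zero.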

Editing:
To add an edit, click the double arrow down icon in the Attribute Editor on the left side. Choose the type of edit (W, S, StyleCLIP) and the direction of the edit, then drag the sliders to change the currently masked region. It is usually necessary to increase the multiplier before the direction slider produces noticeable changes.

Multiple different edits can be composed on top of each other at the same time. Their order is largely irrelevant. Currently in the default mode, only one region is being edited, and so all selected edits apply to the same region. If you would like to change the region, you can Freeze the current image, and perform a new projection, but you will lose the ability to change existing edits.
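
A small sketch of why the order is largely irrelevant, assuming each edit acts as an additive offset on the latent code (this holds for simple W-space directions; mapper-based edits such as StyleCLIP need not behave exactly this way). The array shapes, the value 18 for num_ws, the direction names, and the helper apply_edits are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical projected latent code with num_ws = 18 layers of 512 values.
w = rng.standard_normal((18, 512))

# Hypothetical directions: one shared across layers, one per layer.
smile = rng.standard_normal((1, 512))
age = rng.standard_normal((18, 512))


def apply_edits(w, edits):
    """Add each (direction, multiplier, slider) edit to a copy of the code."""
    out = w.copy()
    for direction, multiplier, slider in edits:
        # A (1, 512) direction broadcasts over all layers.
        out = out + multiplier * slider * direction
    return out


a = apply_edits(w, [(smile, 2.0, 0.5), (age, 1.0, -0.3)])
b = apply_edits(w, [(age, 1.0, -0.3), (smile, 2.0, 0.5)])
print("same result in either order:", np.allclose(a, b))
```

Because addition commutes, both orderings give the same code up to floating-point rounding.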

To save the current image, click the Save Current Image button. If the Unalign checkbox is active, the program will attempt to compose the aligned face back into the original image. Saved images can be found in the SavedImages directory by default. This can be changed in _config.yaml.

Keyboard shortcuts

Current keyboard shortcuts include:

Show/Hide mask :: Alt+M
Toggle mask painting :: Alt+N

W-space editing

Source for some of the basic directions:
(https://twitter.com/robertluxemburg/status/1207087801344372736)

To add your own directions, save each one in the numpy pickle format as an array of shape (num_ws, 512) or (1, 512), and specify its path in w_directions.py.
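
As a rough sketch of preparing such a file, the snippet below writes a (1, 512) array with np.save and reads it back. It assumes that "numpy pickle format" refers to the .npy format written by np.save; the file name and the placeholder values are made up, and how the path must be registered in w_directions.py is not shown here.

```python
import os
import tempfile

import numpy as np

# Placeholder direction: a real one would come from your own analysis.
direction = np.zeros((1, 512), dtype=np.float32)
direction[0, :8] = 1.0

# Hypothetical file name, written to a temporary directory for the demo.
path = os.path.join(tempfile.gettempdir(), "my_direction.npy")
np.save(path, direction)

# Reload to confirm the shape is one of the two accepted layouts.
loaded = np.load(path)
print(loaded.shape, loaded.dtype)
```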

Style-space editing (S space edits)

Source:
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation
(https://arxiv.org/abs/2011.12799)
(https://github.com/betterze/StyleSpace)

The presets can be found in s_presets.py. Some were taken directly from the paper; others I found by manual exploration. You can perform similar exploration by choosing the Custom preset once you have a projection.

StyleCLIP editing

Source:
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
(https://arxiv.org/abs/2103.17249)
(https://github.com/orpatashnik/StyleCLIP)

The pretrained models were taken from (https://github.com/orpatashnik/StyleCLIP/blob/main/utils.py). I manually removed the decoder from the state dict, since it is not used and takes up the majority of the file size.
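
For readers who want to do the same with other checkpoints, here is a minimal sketch of dropping a sub-module from a state dict. The "decoder." key prefix, the example key names, and the helper strip_prefix are assumptions for illustration; the actual checkpoint layout may differ. Plain lists stand in for tensors so the snippet runs without torch.

```python
def strip_prefix(state_dict, prefix="decoder."):
    """Return a copy of state_dict without keys starting with `prefix`."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefix)}


# Stand-in for a loaded checkpoint; key names are hypothetical.
state_dict = {
    "mapper.coarse.0.weight": [0.0] * 4,
    "mapper.coarse.0.bias": [0.0],
    "decoder.conv1.weight": [0.0] * 1000,
    "decoder.conv1.bias": [0.0] * 10,
}

slim = strip_prefix(state_dict)
print(sorted(slim))

# With real checkpoints, the dict would typically be obtained with
# torch.load(path, map_location="cpu") and written back with torch.save.
```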

PTI Optimization

Source:
Pivotal Tuning for Latent-based Editing of Real Images
(https://arxiv.org/abs/2106.05744)

This method allows you to match the target photo very closely, while retaining editing capacities.

It's often worth running 30-50 iterations of PTI: this matches the source image very closely without a very noticeable drop in editing capabilities.

Attribution

This repository makes use of code provided by the repositories linked above, plus code from:

stylegan2-ada-pytorch (https://github.com/NVlabs/stylegan2-ada-pytorch)
poisson-image-editing (https://github.com/PPPW/poisson-image-editing), for optional support of idempotent blending (a slow blending implementation that changes only the masked part; it can be enabled by uncommenting the option in synthesis.py)

Citation

If you find this code useful for your research, please cite the arXiv submission linked above.

Comments
  • Mac M1 processor cpp compile error (without CUDA)

    Hello, I found an error when running app.py.

    My environment is a Mac M1.

    I actually hit another error when installing PyTorch with CUDA, so I installed PyTorch without CUDA.

    extensions/canvas_to_masks.cpp:61:31: error: expected expression
        bool output_wrong_array = [output, h, w, num_color]() {
                                  ^
    extensions/canvas_to_masks.cpp:73:9: warning: 'auto' type specifier is a C++11 extension [-Wc++11-extensions]
            auto descr = PyArray_DescrFromType(NPY_FLOAT);
            ^
    extensions/canvas_to_masks.cpp:129:5: warning: 'auto' type specifier is a C++11 extension [-Wc++11-extensions]
        auto x = PyModule_Create(&python_module);
        ^
    

    Do you know what is wrong?

    opened by whatisand 3
  • Error: Pickle data was truncated

    I followed every step and everything installed correctly, but as soon as I hit "projection steps," I get the error message:

    Pickle data was truncated.

    Any help would be appreciated, thanks :)

    opened by Wythneth 2
  • "RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!"

    Hello, I'm running the software on CPU by modifying _config.yaml as you documented in another issue.

    After drawing a mask and processing some projections, I want to edit facial features and in order to do that I create a new attribute with the down-pointing arrows and then I click on "Style Edit" but as soon as I do that I receive the following error message:

      File "...\Chunkmogrify-master\styleclip_mapper.py", line 13, in fused_leaky_relu
        input + bias.view(1, *rest_dim, bias.shape[0]), negative_slope=negative_slope
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
    
    opened by illtellyoulater 1
  • "RuntimeError: "slow_conv_transpose2d_out_cpu" not implemented for 'Half'"

    Hello, I'm running the software on CPU by modifying _config.yaml as you documented in another issue.

    After drawing a mask and processing some projections, I want to edit facial features and in order to do that I create a new attribute with the down-pointing arrows and then I click on "S Edit" but as soon as I do that I receive the following error message:

      File "...\Chunkmogrify-master\torch_utils\ops\conv2d_gradfix.py", line 43, in conv_transpose2d
        return torch.nn.functional.conv_transpose2d(input=input, weight=weight, bias=bias, stride=stride, padding=padding, output_padding=output_padding, groups=groups, dilation=dilation)
    RuntimeError: "slow_conv_transpose2d_out_cpu" not implemented for 'Half'
    
    opened by illtellyoulater 1
  • [CUDA out of memory] - How much VRAM do we need to run this project?

    Hello, I'm receiving "CUDA out of memory" errors shortly after pressing the Projection Steps button.

    I only have 3 GB of VRAM. Could this be enough? If not, is there a way to lower the requirements?

    I've tried using a very small JPEG (320 * 200 px, just 5 KB), but it still crashed...

    If everything else fails, how could I run it on CPU?

    opened by illtellyoulater 1
  • AttributeError: 'NoneType' object has no attribute 'device'

    Hi!

    My card is an RTX 3090.

    (chunkmogrify) PS W:\Chunkmogrify-master> python app.py
    Projection mode set to w_projection
    Gui init: 0.19s
    Loading W:/1.png
    Number of faces detected: 1
    Detection 0: Left: 319 Top: 320 Right: 587 Bottom: 587
    Part 0: (335, 374), Part 1: (338, 411) ...
    Device cuda:0 requested, but cuda is not available. Using CPU.
    Ran out of input
    Initializing StyleganProjector
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "C:\Users\Creator\miniconda3\envs\chunkmogrify\lib\threading.py", line 926, in _bootstrap_inner
        self.run()
      File "C:\Users\Creator\miniconda3\envs\chunkmogrify\lib\threading.py", line 870, in run
        self._target(*self._args, **self._kwargs)
      File "app.py", line 592, in run_impl
        fn(*args)
      File "app.py", line 573, in steps
        step()
      File "app.py", line 533, in step
        self.synthesis.synthesize_with_project_step()
      File "W:\Chunkmogrify-master\synthesis.py", line 259, in synthesize_with_project_step
        self._init_projector()
      File "W:\Chunkmogrify-master\synthesis.py", line 426, in _init_projector
        projector = clasz(gan, self.target_image, self.mask_pull(), w_init=init, **kwargs) # **kwargs
      File "W:\Chunkmogrify-master\stylegan_project.py", line 264, in __init__
        self.device = provider.device
    AttributeError: 'NoneType' object has no attribute 'device'

    I've tried something like this:

    pip uninstall torch
    pip cache purge
    pip install torch -f https://download.pytorch.org/whl/torch_stable.html

    torch-1.8.1 to torch-1.10.1

    but still the same.

    opened by chengkeng 1
Owner
David Futschik
PhD student @ CTU Prague, Czech Republic.