Stitch it in Time: GAN-Based Facial Editing of Real Videos

Last update: Jan 4, 2023

Overview

STIT - Stitch it in Time

Stitch it in Time: GAN-Based Facial Editing of Real Videos
Rotem Tzaban, Ron Mokady, Rinon Gal, Amit Bermano, Daniel Cohen-Or

Abstract:
The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However, replicating their success with videos has proven challenging. Sets of high-quality facial videos are lacking, and working with videos introduces a fundamental barrier to overcome - temporal coherency. We propose that this barrier is largely artificial. The source video is already temporally coherent, and deviations from this state arise in part due to careless treatment of individual components in the editing pipeline. We leverage the natural alignment of StyleGAN and the tendency of neural networks to learn low frequency functions, and demonstrate that they provide a strongly consistent prior. We draw on these insights and propose a framework for semantic editing of faces in videos, demonstrating significant improvements over the current state-of-the-art. Our method produces meaningful face manipulations, maintains a higher degree of temporal consistency, and can be applied to challenging, high quality, talking head videos which current methods struggle with.

Requirements

PyTorch (tested with 1.10; should work with 1.8/1.9 as well) + torchvision

For the rest of the requirements, run:

pip install Pillow imageio imageio-ffmpeg dlib face-alignment opencv-python click wandb tqdm scipy matplotlib clip lpips

Pretrained models

To use this project, download the pretrained models from the following link.

Unzip it inside the project's main directory.

You can use the download_models.sh script (requires installing gdown with pip install gdown).

Alternatively, you can unzip the models to a location of your choice and update configs/path_config.py accordingly.

Splitting videos into frames

Our code expects videos in the form of a directory with individual frame images. To produce such a directory from an existing video, we recommend using ffmpeg:

ffmpeg -i "video.mp4" "video_frames/out%04d.png"
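
For scripted pipelines, the same ffmpeg call can be wrapped in Python. This is a minimal sketch and not part of the repository: the helper names build_ffmpeg_command and split_video are hypothetical, and it assumes ffmpeg is available on PATH. It creates the frames directory first, since ffmpeg does not create missing output directories.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, frames_dir, pattern="out%04d.png"):
    """Return the ffmpeg argument list that writes one PNG per frame."""
    return ["ffmpeg", "-i", str(video_path), str(Path(frames_dir) / pattern)]


def split_video(video_path, frames_dir):
    """Create frames_dir if needed, then run ffmpeg (hypothetical helper)."""
    Path(frames_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(video_path, frames_dir), check=True)


# Show the command that would be executed, without running ffmpeg.
print(build_ffmpeg_command("video.mp4", "video_frames"))
```

Calling split_video("video.mp4", "video_frames") runs the command and leaves the numbered PNG frames in video_frames, which can then be passed as --input_folder.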

Example Videos

The videos used to produce our results can be downloaded from the following Link.

Inversion

To invert a video run:

python train.py --input_folder /path/to/images_dir \
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --num_pti_steps NUM_STEPS

This includes aligning, cropping, e4e encoding and PTI.

For example:

python train.py --input_folder /data/obama \
 --output_folder training_results/obama \
 --run_name obama \
 --num_pti_steps 80

Weights & Biases logging is disabled by default. To enable it, add --use_wandb.

Naive Editing

To run edits without stitching tuning:

python edit_video.py --input_folder /path/to/images_dir \
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --edit_name EDIT_NAME \
 --edit_range EDIT_RANGE

edit_range determines the strength of the edits applied. It should be in the format RANGE_START RANGE_END RANGE_STEPS, where RANGE_STEPS is the number of evenly spaced strengths to apply.
For example, --edit_range 1 5 3 applies edits with strengths 1, 3 and 5.
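
The expansion of an edit range into individual strengths can be sketched in a few lines of Python. This is an illustration rather than the repository's code: expand_edit_range is a hypothetical helper, and it assumes RANGE_STEPS counts evenly spaced values between RANGE_START and RANGE_END inclusive. That reading matches the -8 8 2 example further down, which yields one old and one young result.

```python
def expand_edit_range(start, end, steps):
    """Return `steps` evenly spaced strengths from start to end, inclusive.

    Assumption: RANGE_STEPS is a count of values, not a step size.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [float(start)]
    stride = (end - start) / (steps - 1)
    return [start + i * stride for i in range(steps)]


print(expand_edit_range(-8, 8, 2))   # [-8.0, 8.0]
print(expand_edit_range(-8, -8, 1))  # [-8.0]
```

Under this assumption, a single-strength edit is written with equal start and end values and a step count of 1, as in the Obama example below.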

For young Obama use:

python edit_video.py --input_folder /data/obama \
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name age \
 --edit_range -8 -8 1

Editing + Stitching Tuning

To run edits with stitching tuning:

python edit_video_stitching_tuning.py --input_folder /path/to/images_dir \
 --output_folder /path/to/experiment_dir \
 --run_name RUN_NAME \
 --edit_name EDIT_NAME \
 --edit_range EDIT_RANGE \
 --outer_mask_dilation MASK_DILATION

We support breaking out of the stitching tuning process early once the loss reaches a specified threshold.
This enables us to perform more iterations for difficult frames while maintaining a reasonable running time.
To use this feature, add --border_loss_threshold THRESHOLD to the command (shown in the Jim and Kamala Harris examples below).
For videos with a simple background to reconstruct (e.g. Obama, Jim, Emma Watson, Kamala Harris), we use THRESHOLD=0.005.
For videos where a more exact reconstruction of the background is required (e.g. Michael Scott), we use THRESHOLD=0.002.
Early breaking is disabled by default.
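
The early-breaking idea can be pictured as a bounded optimization loop with a loss check. The sketch below is an assumption-laden illustration, not the repository's implementation: tune_with_early_break and step_fn are hypothetical names, and the "less than or equal" comparison is a guess at how "reaches the threshold" is evaluated.

```python
def tune_with_early_break(step_fn, max_steps, border_loss_threshold=None):
    """Run step_fn up to max_steps times, stopping once the loss is low enough.

    step_fn is a hypothetical callable that performs one tuning step and
    returns the current border loss as a float. A threshold of None disables
    early breaking, mirroring the default described above.
    """
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = step_fn()
        if border_loss_threshold is not None and loss <= border_loss_threshold:
            return step, loss  # easy frame: stop early
    return max_steps, loss     # hard frame: used the full budget


# Simulated per-step losses standing in for a real optimizer.
losses = iter([0.02, 0.01, 0.004, 0.001])
print(tune_with_early_break(lambda: next(losses), max_steps=4,
                            border_loss_threshold=0.005))  # (3, 0.004)
```

With this structure, a larger step budget only costs extra time on frames that do not reach the threshold.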

For young Obama use:

python edit_video_stitching_tuning.py --input_folder /data/obama \
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name age \
 --edit_range -8 -8 1 \
 --outer_mask_dilation 50

For gender editing on Obama use:

python edit_video_stitching_tuning.py --input_folder /data/obama \
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_name gender \
 --edit_range -6 -6 1 \
 --outer_mask_dilation 50

For young Emma Watson use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_name age \
 --edit_range -8 -8 1 \
 --outer_mask_dilation 50

For smile removal on Emma Watson use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_name smile \
 --edit_range -3 -3 1 \
 --outer_mask_dilation 50

For Emma Watson lipstick editing (done with a StyleCLIP global direction) use:

python edit_video_stitching_tuning.py --input_folder /data/emma_watson \
 --output_folder edits/emma_watson/ \
 --run_name emma_watson \
 --edit_type styleclip_global \
 --edit_name lipstick \
 --neutral_class "Face" \
 --target_class "Face with lipstick" \
 --beta 0.2 \
 --edit_range 10 10 1 \
 --outer_mask_dilation 50

For Old + Young Jim use (with early breaking):

python edit_video_stitching_tuning.py --input_folder datasets/jim/ \
 --output_folder edits/jim \
 --run_name jim \
 --edit_name age \
 --edit_range -8 8 2 \
 --outer_mask_dilation 50 \
 --border_loss_threshold 0.005

For smiling Kamala Harris:

python edit_video_stitching_tuning.py \
 --input_folder datasets/kamala/ \
 --output_folder edits/kamala \
 --run_name kamala \
 --edit_name smile \
 --edit_range 2 2 1 \
 --outer_mask_dilation 50 \
 --border_loss_threshold 0.005

Example Results

With stitching tuning:

out.mp4

Without stitching tuning:

out.mp4

Gender editing:

out.mp4

Young Emma Watson:

out.mp4

Emma Watson with lipstick:

out.mp4

Emma Watson smile removal:

out.mp4

Old Jim:

out.mp4

Young Jim:

out.mp4

Smiling Kamala Harris:

out.mp4

Out of domain video editing (Animations)

Editing out-of-domain (OOD) videos requires some different parameters during training. First, dlib's face detector doesn't detect all animated faces, so we use a different face detector provided by the face_alignment package. Second, we reduce the smoothing of the alignment parameters with --center_sigma 0.0. Third, OOD videos require more training steps, as they are more difficult to invert.
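
To picture what a smoothing sigma does to per-frame alignment parameters, here is a small pure-Python Gaussian filter over a sequence of values (for example, the x coordinate of the face center in each frame). It is only a sketch: smooth_centers is a hypothetical helper, the repository's actual smoothing may differ in kernel size and edge handling, and a sigma of 0 is treated here as "no smoothing".

```python
import math


def smooth_centers(values, sigma):
    """Gaussian-smooth a per-frame sequence; sigma <= 0 returns it unchanged."""
    if sigma <= 0:
        return list(values)
    radius = max(1, int(3 * sigma + 0.5))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(kernel)
    kernel = [w / total for w in kernel]  # normalize weights to sum to 1
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)  # clamp indices at sequence ends
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed


jittery = [0, 0, 10, 0, 0]
print(smooth_centers(jittery, 0.0))       # unchanged
print(max(smooth_centers(jittery, 1.0)))  # the spike is flattened
```

A larger sigma suppresses frame-to-frame jitter in the crop but also lags behind fast, genuine motion, which is one plausible reason to lower it for animated footage.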

To train, we use:

python train.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder training_results/ood \
 --run_name ood \
 --num_pti_steps 240 \
 --use_fa \
 --center_sigma 0.0

Afterwards, editing is performed the same way:

python edit_video.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder edits/ood --run_name ood \
 --edit_name smile --edit_range 2 2 1

out.mp4

python edit_video.py --input_folder datasets/ood_spiderverse_gwen/ \
 --output_folder edits/ood \
 --run_name ood \
 --edit_type styleclip_global \
 --edit_range 10 10 1 \
 --edit_name lipstick \
 --target_class 'Face with lipstick'

out.mp4

Credits:

StyleGAN2-ada model and implementation:
https://github.com/NVlabs/stylegan2-ada-pytorch Copyright © 2021, NVIDIA Corporation.
Nvidia Source Code License https://nvlabs.github.io/stylegan2-ada-pytorch/license.html

PTI implementation:
https://github.com/danielroich/PTI
Copyright (c) 2021 Daniel Roich
License (MIT) https://github.com/danielroich/PTI/blob/main/LICENSE

LPIPS model and implementation:
https://github.com/richzhang/PerceptualSimilarity
Copyright (c) 2020, Sou Uchida
License (BSD 2-Clause) https://github.com/richzhang/PerceptualSimilarity/blob/master/LICENSE

e4e model and implementation:
https://github.com/omertov/encoder4editing Copyright (c) 2021 omertov
License (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE

StyleCLIP model and implementation:
https://github.com/orpatashnik/StyleCLIP Copyright (c) 2021 orpatashnik
License (MIT) https://github.com/orpatashnik/StyleCLIP/blob/main/LICENSE

StyleGAN2 Distillation for Feed-forward Image Manipulation - for editing directions:
https://github.com/EvgenyKashin/stylegan2-distillation
Copyright (c) 2019, Yandex LLC
License (Creative Commons NonCommercial) https://github.com/EvgenyKashin/stylegan2-distillation/blob/master/LICENSE

face-alignment Library:
https://github.com/1adrianb/face-alignment
Copyright (c) 2017, Adrian Bulat
License (BSD 3-Clause License) https://github.com/1adrianb/face-alignment/blob/master/LICENSE

face-parsing.PyTorch:
https://github.com/zllrunning/face-parsing.PyTorch
Copyright (c) 2019 zll
License (MIT) https://github.com/zllrunning/face-parsing.PyTorch/blob/master/LICENSE

Citation

If you make use of our work, please cite our paper:

@misc{tzaban2022stitch,
      title={Stitch it in Time: GAN-Based Facial Editing of Real Videos},
      author={Rotem Tzaban and Ron Mokady and Rinon Gal and Amit H. Bermano and Daniel Cohen-Or},
      year={2022},
      eprint={2201.08361},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Comments

ImportError: No module named 'upfirdn2d_plugin'

I got this warning...

(stitenv) hari@hari-MS-7C02:/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT$ python train.py --input_folder ./data/obama --output_folder ./training_results/obama --run_name obama --num_pti_steps 80
Number of images: 200
Aligning images
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:01<00:00, 117.42it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:04<00:00, 47.81it/s]
Aligning completed
Loading e4e over the pSp framework from checkpoint: ./pretrained_models/e4e_ffhq_encode.pt
Setting up [LPIPS] perceptual loss: trunk [alex], v[0.1], spatial [off]
Loading model from: /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/lpips/weights/v0.1/alex.pth
Calculating initial inversions
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 200/200 [00:37<00:00,  5.30it/s]
Fine tuning generator
  0%|                                                                                                                                                                                                 | 0/80 [00:00<?, ?it/s]Setting up PyTorch plugin "bias_act_plugin"... Failed!
/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.py:50: UserWarning: Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:

Traceback (most recent call last):
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
    env=env)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.py", line 48, in _init
    _plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
    is_standalone=is_standalone)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'bias_act_plugin': [1/2] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output bias_act.cuda.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.cu -o bias_act.cuda.o 
FAILED: bias_act.cuda.o 
/usr/bin/nvcc --generate-dependencies-with-compile --dependency-output bias_act.cuda.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/bias_act.cu -o bias_act.cuda.o 
nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
ninja: build stopped: subcommand failed.


  warnings.warn('Failed to build CUDA kernels for bias_act. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

Traceback (most recent call last):
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1673, in _run_ninja_build
    env=env)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py", line 32, in _init
    _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
    is_standalone=is_standalone)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1407, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1683, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'upfirdn2d_plugin': [1/2] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output upfirdn2d.cuda.o.d -DTORCH_EXTENSION_NAME=upfirdn2d_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.cu -o upfirdn2d.cuda.o 
FAILED: upfirdn2d.cuda.o 
/usr/bin/nvcc --generate-dependencies-with-compile --dependency-output upfirdn2d.cuda.o.d -DTORCH_EXTENSION_NAME=upfirdn2d_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/TH -isystem /home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/include/THC -isystem /home/hari/anaconda3/envs/stitenv/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' --use_fast_math -std=c++14 -c /mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.cu -o upfirdn2d.cuda.o 
nvcc fatal   : Unknown option '-generate-dependencies-with-compile'
ninja: build stopped: subcommand failed.


  warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

Traceback (most recent call last):
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/ops/upfirdn2d.py", line 32, in _init
    _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "/mnt/95d2aa3d-99e9-4600-91b1-2fcecff0dec5/AI_Tools/STIT/torch_utils/custom_ops.py", line 110, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1699, in _import_module_from_library
    file, path, description = imp.find_module(module_name, [path])
  File "/home/hari/anaconda3/envs/stitenv/lib/python3.7/imp.py", line 296, in find_module
    raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'upfirdn2d_plugin'

opened by harisreedhar 9

TypeError: Got secondary option for non boolean flag.

Traceback (most recent call last):
  File "train.py", line 60, in <module>
    @click.option('--use_wandb/--no_wandb', default=False)
  File "/home/xwh/anaconda3/lib/python3.7/site-packages/click/decorators.py", line 173, in decorator
    _param_memo(f, OptionClass(param_decls, **option_attrs))
  File "/home/xwh/anaconda3/lib/python3.7/site-packages/click/core.py", line 1601, in __init__
    raise TypeError('Got secondary option for non boolean flag.')
TypeError: Got secondary option for non boolean flag.

I got this error. Does anyone know how to fix it?

opened by bdseal 7

ValueError: Image must be a numpy array.

File "edit_video.py", line 137, in _main
    imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=18, output_params=['-vf', 'fps=25'])
  File "/opt/conda/lib/python3.8/site-packages/imageio/core/functions.py", line 338, in mimwrite
    raise ValueError('Image must be a numpy array.')

imageio.mimwrite(uri, ims, format=None, **kwargs) Write multiple images to the specified file. Parameters ... ims [sequence of numpy arrays] The image data. Each array must be NxM, NxMx3 or NxMx4.

So there is something wrong.

opened by onefish51 5

Aligning images

Hi! Your work is amazing!

But there is a problem with image alignment. (https://github.com/rotemtzaban/STIT/blob/ef851d3f0ecc1839cbd307fdfdc7fa6f981f6228/train.py#L81)

The next two rows are the input and output respectively. The result in the second row does not look well aligned (the angle of the face pose is different). Do you have any advice for me?

opened by Carlyx 3

not an issue - question on beta value

I've been playing around with styleclip -

python edit_video_stitching_tuning.py --input_folder data/obama \
 --output_folder edits/obama/ \
 --run_name obama \
 --edit_type styleclip_global \
 --edit_name aids \
 --neutral_class "Face" \
 --target_class "Face with sores" \
 --beta 0.1 \
 --edit_range 10 10 1 \
 --outer_mask_dilation 50 \
 --start_frame 0 \
 --end_frame 100

When I change the target class, I get: ValueError: Beta value 0.15 is too high for mapping from Face to Face with sores, try setting it to a lower value

I have to bump it down to 0.1 for the program to run, but then the results look no different from the original video. Is there an obvious difference between the two prompts, 'Face with sores' vs 'Face with lipstick'?

Also, StyleCLIP has global mappers which take > 10 hrs to build, but then they're fast for inference. https://github.com/orpatashnik/StyleCLIP

"mapper/pretrained/afro.pt": "https://drive.google.com/uc?id=1i5vAqo4z0I-Yon3FNft_YZOq7ClWayQJ",
"mapper/pretrained/angry.pt": "https://drive.google.com/uc?id=1g82HEH0jFDrcbCtn3M22gesWKfzWV_ma",
"mapper/pretrained/beyonce.pt": "https://drive.google.com/uc?id=1KJTc-h02LXs4zqCyo7pzCp0iWeO6T9fz",
"mapper/pretrained/bobcut.pt": "https://drive.google.com/uc?id=1IvyqjZzKS-vNdq_OhwapAcwrxgLAY8UF",
"mapper/pretrained/bowlcut.pt": "https://drive.google.com/uc?id=1xwdxI2YCewSt05dEHgkpmmzoauPjEnnZ",
"mapper/pretrained/curly_hair.pt": "https://drive.google.com/uc?id=1xZ7fFB12Ci6rUbUfaHPpo44xUFzpWQ6M",
"mapper/pretrained/depp.pt": "https://drive.google.com/uc?id=1FPiJkvFPG_y-bFanxLLP91wUKuy-l3IV",
"mapper/pretrained/hilary_clinton.pt": "https://drive.google.com/uc?id=1X7U2zj2lt0KFifIsTfOOzVZXqYyCWVll",
"mapper/pretrained/mohawk.pt": "https://drive.google.com/uc?id=1oMMPc8iQZ7dhyWavZ7VNWLwzf9aX4C09",
"mapper/pretrained/purple_hair.pt": "https://drive.google.com/uc?id=14H0CGXWxePrrKIYmZnDD2Ccs65EEww75",
"mapper/pretrained/surprised.pt": "https://drive.google.com/uc?id=1F-mPrhO-UeWrV1QYMZck63R43aLtPChI",
"mapper/pretrained/taylor_swift.pt": "https://drive.google.com/uc?id=10jHuHsKKJxuf3N0vgQbX_SMEQgFHDrZa",
"mapper/pretrained/trump.pt": "https://drive.google.com/uc?id=14v8D0uzy4tOyfBU3ca9T0AzTt3v-dNyh",
"mapper/pretrained/zuckerberg.pt": "https://drive.google.com/uc?id=1NjDcMUL8G-pO3i_9N6EPpQNXeMc3Ar1r",

"example_celebs.pt": "https://drive.google.com/uc?id=1VL3lP4avRhz75LxSza6jgDe-pHd2veQG"

Did you experiment with these? It may shorten the time to render, or is the time mostly spent on PTI?

opened by johndpope 2

missing import line in styleclip_global_utils.py?

Hi, thank you so much for sharing this amazing work. I just wanted to ask whether there's a missing import clip line in the file editings/styleclip_global_utils.py, because I got the following error when trying to run it with styleclip, and it was fixed by adding that import line.

Thanks!

opened by bycloudai 2

How to use the pre-trained StyleGAN2 for another architecture?

Hi, I want to replace the pre-trained StyleGAN2 model ffhq.pkl with rosinality's StyleGAN2 implementation. It seems that your implementation of StyleGAN2 is currently based on StyleGAN2-ada. These two implementations have different weight parameters and architectures. Besides changing the Generator network, is there any way to convert between them? Thanks for your great work!

opened by mazzzystar 2

fix bug of train & edit_video

1. Fix a type mismatch bug for transform.Resize() in the training part. The original code would result in TypeError: img should be PIL Image. Got <class 'torch.Tensor'>. I replaced the Resize op with F.interpolate while the image is still a Tensor, so there is no need to convert the type twice.

2. Fix a bug in the use of transform.normalize in edit_video.py.

opened by mazzzystar 2

help

  warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
Setting up PyTorch plugin "upfirdn2d_plugin"... Failed!
D:\dongzuoqianyi\STIT-main\torch_utils\ops\upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:

Traceback (most recent call last):
  File "D:\dongzuoqianyi\STIT-main\torch_utils\ops\upfirdn2d.py", line 32, in _init
    _plugin = custom_ops.get_plugin('upfirdn2d_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "D:\dongzuoqianyi\STIT-main\torch_utils\custom_ops.py", line 110, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, verbose=verbose_build, sources=sources, **build_kwargs)
  File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1136, in load
    keep_intermediates=keep_intermediates)
  File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1362, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "C:\ProgramData\Anaconda3\envs\stit\lib\site-packages\torch\utils\cpp_extension.py", line 1752, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
  File "", line 583, in module_from_spec
  File "", line 1043, in create_module
  File "", line 219, in _call_with_frames_removed
ImportError: DLL load failed: The specified module could not be found.

  warnings.warn('Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:\n\n' + traceback.format_exc())
Setting up PyTorch plugin "upfirdn2d_plugin"...

opened by huanxve 2

Duration of out.mp4 is messed up

Hi

The duration of out.mp4 ends up longer than the original video because of this line: imageio.mimwrite(os.path.join(folder_path, 'out.mp4'), frames, fps=18, output_params=['-vf', 'fps=25'])

18 must be changed to 25 to fix this issue.

opened by icookycom 1

Training my own InterFaceGAN boundary

Thanks for your great work, the results were impressive!

I'd like to train my own InterFaceGAN w_direction, and noticed that your w_direction vector (such as 'age.npy') has size (18, 512) rather than (1, 512).

My previous experience was sampling a lot of images with their w counterparts of size (1, 512), then training a model for binary classification. So I wonder whether you're using w+ with size (18, 512) for training InterFaceGAN?

Refer to https://github.com/omertov/encoder4editing/issues/9#issuecomment-806139382

opened by mazzzystar 1

Doing multiple edits

Can I do multiple edits, and if so what is the syntax for that? I'm interested in mixing a few different edits, possibly across the InterFaceGAN and StyleCLIP methods, all at the same time. If this isn't directly supported, can I just modify the front end to collect multiple edits, then sum the vectors for each family of edits together?

opened by skyler14 0

Checkpoint architecture

I am interested in being able to reload a trained checkpoint file and use that to repeat several of the steps in the training file, namely I'd like to try extrapolating out the face reconstruction made from our tuned model to more frames to see the results. However I've noticed that the file we end up saving with:

save_tuned_G(run_id)

is significantly different from the model we initially load when we run load_old_G(). For starters, we initially loaded a pkl and are now saving torch .pt files. Were there any weights or other data left out when the tuned files were saved that must also be saved if we want to reload the tuned run data?

Or could we basically take the generator loaded in edit_video.py and assign it to coach.G, in place of how we ran self.G=load_old_G when train.py called the coach.train() function? Is this all it would take to resume with our training runtime model?

opened by skyler14 0

Improper reconstruction of view by Stitching Tuning Method (edit_video_stitching_tuning.py)

Hi, I ran the code on the following mp4 file: https://user-images.githubusercontent.com/75319437/187859899-f8d52a47-cb55-4cff-903f-24b157916969.mp4
The reconstructed video with stitching tuning was: https://user-images.githubusercontent.com/75319437/187860422-43cb399f-03e2-4192-a983-ef518a26ec7c.mp4
The stitching tuning output has a black mask over the face of the person in the video. Please look into it.

Thanks

opened by Raghav-2002 0

Result video different duration

Hi

I know this problem was fixed (18 -> 25 fps in the code). I did a fresh install of STIT yesterday but still get the wrong result duration.

The difference is noticeable. I use this command to get the exact duration: ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 out.mp4

opened by icookycom 0

Owner

GitHub

Instant Real-Time Example-Based Style Transfer to Facial Videos

FaceBlit: Instant Real-Time Example-Based Style Transfer to Facial Videos The official implementation of FaceBlit: Instant Real-Time Example-Based Sty

131 Dec 19, 2022

[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

21 Dec 22, 2022

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time

DR-GAN: Automatic Radial Distortion Rectification Using Conditional GAN in Real-Time Introduction This is official implementation for DR-GAN (IEEE TCS

18 Dec 23, 2022

Web service for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation based on OpenFace 2.0

OpenGaze: Web Service for OpenFace Facial Behaviour Analysis Toolkit Overview OpenFace is a fantastic tool intended for computer vision and machine le

4 Nov 3, 2022

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

OpenFace 2.2.0: a facial behavior analysis toolkit Over the past few years, there has been an increased interest in automatic facial behavior analysis

5.8k Dec 31, 2022

Automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azure

fwhr-calc-website This project is to automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azur

1 Feb 7, 2022

Code for Talk-to-Edit (ICCV2021). Paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog.

Talk-to-Edit (ICCV2021) This repository contains the implementation of the following paper: Talk-to-Edit: Fine-Grained Facial Editing via Dialog Yumin

221 Jan 7, 2023

Real-Time-Student-Attendence-System - Real Time Student Attendence System

Real-Time-Student-Attendence-System The Student Attendance Management System Pro

1 Feb 15, 2022

Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

371 Dec 30, 2022

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

191 Dec 31, 2022

TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

Keeping it safe - AI Based COVID-19 Tracker using Deep Learning and facial recognition

This dlib-based facial login system

An automated facial recognition based attendance system (desktop application)

Real-world Anomaly Detection in Surveillance Videos- pytorch Re-implementation

Code for PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Relighting and Material Editing

Invert and perturb GAN images for test-time ensembling

An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation