Overview

Neural Style Transfer Transition Video Processing

By Brycen Westgarth and Tristan Jogminas

Description

This code extends the neural style transfer image processing technique to video by generating smooth transitions between a sequence of reference style images across video frames. The generated output video is a highly altered, artistic representation of the input video consisting of constantly changing abstract patterns and colors that emulate the original content of the video. The user's choice of style reference images, style sequence order, and style sequence length allow for infinite user experimentation and the creation of an endless range of artistically interesting videos.

System Requirements

This algorithm is computationally intensive, so I highly recommend optimizing its performance by installing drivers for TensorFlow GPU support if you have access to a CUDA-compatible GPU. Alternatively, you can take advantage of the free GPU resources available through Google Colab notebooks. Even with GPU acceleration, the program may take several minutes to render a video.
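
If you want a quick sanity check that TensorFlow can actually see your GPU before starting a long render, a minimal snippet along these lines will tell you (this is illustrative and not part of the repository):

    # Illustrative check that TensorFlow detects a CUDA GPU (not part of this repo).
    import tensorflow as tf

    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        print('GPU acceleration available:', gpus)
    else:
        print('No GPU found; rendering will fall back to the CPU and be much slower.')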

Colab Notebook Version

Configuration

All video properties and input/output file locations are configured by the user in config.py.

Configurable variables in config.py:

ROOT_PATH: Path to the input/output directory.
FRAME_HEIGHT: Height, in pixels, to which the output video is resized. The video width is calculated automatically to preserve the aspect ratio. Lower values speed up processing but reduce output video quality.
INPUT_FPS: Rate at which frames are captured from the input video.
INPUT_VIDEO_NAME: Filename of the input video.
STYLE_SEQUENCE: List of indices corresponding to the image files in the style_ref folder; defines the reference style image transition sequence. The list can be of arbitrary length, and the rate at which the video transitions between styles is adjusted to fit the video.
OUTPUT_FPS: Frame rate of the output video.
OUTPUT_VIDEO_NAME: Filename of the output video to be created.
GHOST_FRAME_TRANSPARENCY: Proportional feedback constant for frame generation. Should be a value between 0 and 1. Affects the amount of change that can occur between frames and the smoothness of the transitions.
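
For reference, a minimal config.py using these variables might look like the sketch below. The values shown are illustrative placeholders, not recommended settings:

    # Illustrative config.py sketch; tune these placeholder values for your own video.
    ROOT_PATH = '.'                  # input/output directory
    FRAME_HEIGHT = 400               # output height in pixels; width preserves the aspect ratio
    INPUT_FPS = 30                   # rate at which frames are captured from the input video
    INPUT_VIDEO_NAME = 'input.mp4'
    STYLE_SEQUENCE = [0, 1, 2]       # indices of images in the style_ref folder, in transition order
    OUTPUT_FPS = 30
    OUTPUT_VIDEO_NAME = 'output_video.mp4'
    GHOST_FRAME_TRANSPARENCY = 0.1   # feedback constant between 0 and 1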

The user must find and place their own style reference images in the style_ref directory. Style reference images can be of arbitrary size. Three example style reference images are provided.

Minor video time effects can be created by setting INPUT_FPS and OUTPUT_FPS to different relative values:

  • INPUT_FPS > OUTPUT_FPS creates a slowed time effect
  • INPUT_FPS = OUTPUT_FPS creates no time effect
  • INPUT_FPS < OUTPUT_FPS creates a timelapse effect
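
For example, INPUT_FPS = 60 with OUTPUT_FPS = 30 captures twice as many frames per second of input as are played back per second of output, so a 10-second input clip becomes roughly 20 seconds of slowed output.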

Usage

$ python3 -m venv env
$ source env/bin/activate
$ pip3 install -r requirements.txt
$ python3 style_frames.py

Examples

Input Video

[embedded video]

Example 1

Reference Style Image Transition Sequence

[embedded image sequence]

Output Video

[embedded video]

Example 2

Reference Style Image Transition Sequence

[embedded image sequence]

Output Video

[embedded video]

Example Video made using this program

Comments
  • QUESTION: How to make higher resolution properly

    Hi, I'm very thankful to the developer of this project. I was looking for a working video style transfer AI for many months; I was literally dying to find it. Thank you a lot ♥️

    I have a question about resolution. I have the resources and time to render at 1080p, but when I change the frame height, the style_ref patterns come out smaller in the result. Is there a way to keep the pattern size the same as at height 400, but at 1080 resolution? Thank you 😃

    opened by xxxcrow 3
  • Unsupported depth of input image

    I receive this error when PRESERVE_COLORS = True. I tried mucking around with the array types, but it looks like that should already be taken care of by .astype(np.uint8). Otherwise, amazing program; it works like a charm on my M1 MacBook Pro. Thanks in advance.

    (tf_m1) anthony@Anthonys-MacBook-Pro ~ % python /Users/anthony/Documents/style-transfer-video-processor-main/style_frames.py
    Getting input frames
    Getting style info
    Getting output frames
    Output frame: 0%
    Traceback (most recent call last):
      File "/Users/anthony/Documents/style-transfer-video-processor-main/style_frames.py", line 211, in <module>
        StyleFrame().run()
      File "/Users/anthony/Documents/style-transfer-video-processor-main/style_frames.py", line 206, in run
        self.get_output_frames()
      File "/Users/anthony/Documents/style-transfer-video-processor-main/style_frames.py", line 167, in get_output_frames
        temp_ghost_frame = cv2.cvtColor(ghost_frame, cv2.COLOR_RGB2BGR) * self.MAX_CHANNEL_INTENSITY
    cv2.error: OpenCV(4.5.1) ../modules/imgproc/src/color.simd_helpers.hpp:94: error: (-2:Unspecified error) in function 'cv::impl::(anonymous namespace)::CvtHelper<cv::impl::(anonymous namespace)::Set<3, 4, -1>, cv::impl::(anonymous namespace)::Set<3, 4, -1>, cv::impl::(anonymous namespace)::Set<0, 2, 5>, cv::impl::(anonymous namespace)::NONE>::CvtHelper(cv::InputArray, cv::OutputArray, int) [VScn = cv::impl::(anonymous namespace)::Set<3, 4, -1>, VDcn = cv::impl::(anonymous namespace)::Set<3, 4, -1>, VDepth = cv::impl::(anonymous namespace)::Set<0, 2, 5>, sizePolicy = cv::impl::(anonymous namespace)::NONE]'
    Unsupported depth of input image: 'VDepth::contains(depth)' where 'depth' is 6 (CV_64F)

    opened by steeler1 2
  • ZeroDivisionError: division by zero

    Traceback (most recent call last):
      File "style_frames.py", line 112, in <module>
        sf.get_style_info()
      File "style_frames.py", line 57, in get_style_info
        self.t_const = np.ceil(frame_length / (ref_count - 1))
    ZeroDivisionError: division by zero

    I'm a newbie. Can you please make the readme a little easier to read for a newbie? 😅 Great repo btw. I'm using only one style image. Here is the colab; you can replicate this or add it to the repo as well.

    My config.py looks like this:

        INPUT_VIDEO_NAME = 'input.mp4'
        INPUT_VIDEO_PATH = f'{ROOT_PATH}/{INPUT_VIDEO_NAME}'
        INPUT_FRAME_DIRECTORY = f'{ROOT_PATH}/input_frames'
        INPUT_FRAME_FILE = '{:0>4d}_frame.png'
        INPUT_FRAME_PATH = f'{INPUT_FRAME_DIRECTORY}/{INPUT_FRAME_FILE}'
    
        STYLE_REF_DIRECTORY = f'{ROOT_PATH}/style_ref'
        # defines the reference style image transition sequence. Values correspond to indices in STYLE_REF_DIRECTORY
        STYLE_SEQUENCE = [1]
    
        OUTPUT_FPS = 30
        OUTPUT_VIDEO_NAME = 'output_video.mp4'
        OUTPUT_VIDEO_PATH = f'{ROOT_PATH}/{OUTPUT_VIDEO_NAME}'
        OUTPUT_FRAME_DIRECTORY = f'{ROOT_PATH}/output_frames'
        OUTPUT_FRAME_FILE = '{:0>4d}_frame.png'
        OUTPUT_FRAME_PATH = f'{OUTPUT_FRAME_DIRECTORY}/{OUTPUT_FRAME_FILE}'
    
        GHOST_FRAME_TRANSPARENCY = 0.1
    
    I also get this error:
    
    Traceback (most recent call last):
      File "style_frames.py", line 114, in <module>
        sf.get_output_frames()
      File "style_frames.py", line 84, in get_output_frames
        blended_img = prev_style + next_style
    ValueError: operands could not be broadcast together with shapes (1138,1707,3) (1138,1713,3)

    opened by aertist 1
  • it just fails randomly, without any changes.

    error:

    TF Version: 2.4.1
    TF Hub version: 0.11.0
    Eager mode enabled: True
    GPU available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
    Getting input frames
    Getting style info
    WARNING: Resizing style images which may cause distortion. To avoid this, please provide style images with the same dimensions
    Getting output frames
    Output frame: 0%
    Traceback (most recent call last):
      File "style_frames.py", line 216, in <module>
        StyleFrame().run()
      File "style_frames.py", line 211, in run
        self.get_output_frames()
      File "style_frames.py", line 154, in get_output_frames
        stylized_img = self.hub_module(expanded_content_img, expanded_blended_img).pop()
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 668, in _call_attribute
        return instance.call(*args, **kwargs)
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1669, in call
        return self._call_impl(args, kwargs)
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1687, in _call_impl
        return self._call_with_flat_signature(args, kwargs, cancellation_manager)
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1736, in _call_with_flat_signature
        return self._call_flat(args, self.captured_inputs, cancellation_manager)
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 115, in _call_flat
        return super(_WrapperFunction, self)._call_flat(args, captured_inputs,
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1918, in _call_flat
        return self._build_call_outputs(self._inference_function.call(
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 555, in call
        outputs = execute.execute(
      File "/home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
        tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
    tensorflow.python.framework.errors_impl.NotFoundError: No algorithm worked!
      [[node InceptionV3/Conv2d_1a_3x3/Conv2D (defined at /home/chop/Scripts/style-transfer-video-processor/.venv/lib/python3.8/site-packages/tensorflow_hub/module_v2.py:106) ]] [Op:__inference_pruned_3217]

    Function call stack:
    pruned

    How can I avoid this problem? Thanks.

    opened by masterchop 0
Owner
Brycen Westgarth
Computer Engineering Student at UC Santa Barbara