An implementation of 3D Ken Burns Effect from a Single Image using PyTorch

Overview

3d-ken-burns

This is a reference implementation of 3D Ken Burns Effect from a Single Image [1] using PyTorch. Given a single input image, it animates this still image with a virtual camera scan and zoom subject to motion parallax. Should you be making use of our work, please cite our paper [1].


setup

Several functions are implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository. Please also make sure to have the CUDA_HOME environment variable configured.
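
As a quick sanity check (not part of the repository), you can verify both requirements from within Python; a minimal sketch:

import os
import cupy

# make sure CUDA_HOME points at the CUDA toolkit and CuPy can see a GPU
assert os.environ.get('CUDA_HOME') is not None, 'please configure CUDA_HOME'
print(cupy.cuda.runtime.getDeviceCount(), 'CUDA device(s) detected')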

In order to generate the video results, please also make sure to have moviepy installed (pip install moviepy).

usage

To run it on an image and generate the 3D Ken Burns effect fully automatically, use the following command.

python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4

To start the interface that allows you to manually adjust the camera path, use the following command. You can then navigate to http://localhost:8080/ and load an image using the button on the bottom right corner. Please be patient when loading an image and saving the result; there is a bit of background processing going on.

python interface.py

To run the depth estimation to obtain the raw depth estimate, use the following command. Please note that this script does not perform the depth adjustment, see #22 for information on how to add it.

python depthestim.py --in ./images/doublestrike.jpg --out ./depthestim.npy
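
The resulting .npy file holds the raw estimate as a numpy array; a minimal sketch for inspecting it, assuming a single-channel float array:

import numpy
import PIL.Image

# load the raw depth estimate and save a normalized grayscale preview
npyDepth = numpy.load('./depthestim.npy')
npyPreview = (npyDepth - npyDepth.min()) / (npyDepth.max() - npyDepth.min())
PIL.Image.fromarray((255.0 * npyPreview).astype(numpy.uint8)).save('./depthestim.png')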

To benchmark the depth estimation, run python benchmark-ibims.py or python benchmark-nyu.py. You can use them to verify that the provided implementation runs as expected.

colab

If you do not have a suitable environment to run this project, you could give Colab a try. It allows you to run the project in the cloud, free of charge. There are several people who provide Colab notebooks that should get you started. A few that I am aware of include one from Arnaldo Gabriel, one from Vlad Alex, and one from Ahmed Harmouche.

dataset

This dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0) and may only be used for non-commercial purposes. Please see the LICENSE file for more information.

scene        mode     color   depth   normal
asdf         flying   3.7 GB  1.0 GB  2.9 GB
asdf         walking  3.6 GB  0.9 GB  2.7 GB
blank        flying   3.2 GB  1.0 GB  2.8 GB
blank        walking  3.0 GB  0.9 GB  2.7 GB
chill        flying   5.4 GB  1.1 GB  10.8 GB
chill        walking  5.2 GB  1.0 GB  10.5 GB
city         flying   0.8 GB  0.2 GB  0.9 GB
city         walking  0.7 GB  0.2 GB  0.8 GB
environment  flying   1.9 GB  0.5 GB  3.5 GB
environment  walking  1.8 GB  0.5 GB  3.3 GB
fort         flying   5.0 GB  1.1 GB  9.2 GB
fort         walking  4.9 GB  1.1 GB  9.3 GB
grass        flying   1.1 GB  0.2 GB  1.9 GB
grass        walking  1.1 GB  0.2 GB  1.6 GB
ice          flying   1.2 GB  0.2 GB  2.1 GB
ice          walking  1.2 GB  0.2 GB  2.0 GB
knights      flying   0.8 GB  0.2 GB  1.0 GB
knights      walking  0.8 GB  0.2 GB  0.9 GB
outpost      flying   4.8 GB  1.1 GB  7.9 GB
outpost      walking  4.6 GB  1.0 GB  7.4 GB
pirates      flying   0.8 GB  0.2 GB  0.8 GB
pirates      walking  0.7 GB  0.2 GB  0.8 GB
shooter      flying   0.9 GB  0.2 GB  1.1 GB
shooter      walking  0.9 GB  0.2 GB  1.0 GB
shops        flying   0.2 GB  0.1 GB  0.2 GB
shops        walking  0.2 GB  0.1 GB  0.2 GB
slums        flying   0.5 GB  0.1 GB  0.8 GB
slums        walking  0.5 GB  0.1 GB  0.7 GB
subway       flying   0.5 GB  0.1 GB  0.9 GB
subway       walking  0.5 GB  0.1 GB  0.9 GB
temple       flying   1.7 GB  0.4 GB  3.1 GB
temple       walking  1.7 GB  0.3 GB  2.8 GB
titan        flying   6.2 GB  1.1 GB  11.5 GB
titan        walking  6.0 GB  1.1 GB  11.3 GB
town         flying   1.7 GB  0.3 GB  3.0 GB
town         walking  1.8 GB  0.3 GB  3.0 GB
underland    flying   5.4 GB  1.2 GB  12.1 GB
underland    walking  5.1 GB  1.2 GB  11.4 GB
victorian    flying   0.5 GB  0.1 GB  0.8 GB
victorian    walking  0.4 GB  0.1 GB  0.7 GB
village      flying   1.6 GB  0.3 GB  2.8 GB
village      walking  1.6 GB  0.3 GB  2.7 GB
warehouse    flying   0.9 GB  0.2 GB  1.5 GB
warehouse    walking  0.8 GB  0.2 GB  1.4 GB
western      flying   0.8 GB  0.2 GB  0.9 GB
western      walking  0.7 GB  0.2 GB  0.8 GB

Please note that this is an updated version of the dataset that we used in our paper. While it has fewer scenes in total, each sample capture now has a varying focal length, which should help with generalizability. Furthermore, some examples are either over- or under-exposed, and it would be a good idea to remove these outliers. Please see #37, #39, and #40 for supplementary discussions.

video

(embedded video showcasing the results)

license

This is a project by Adobe Research. It is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0) and may only be used for non-commercial purposes. Please see the LICENSE file for more information.

references

[1]  @article{Niklaus_TOG_2019,
         author = {Simon Niklaus and Long Mai and Jimei Yang and Feng Liu},
         title = {3D Ken Burns Effect from a Single Image},
         journal = {ACM Transactions on Graphics},
         volume = {38},
         number = {6},
         pages = {184:1--184:15},
         year = {2019}
     }

acknowledgment

The video above uses materials under a Creative Commons license or with the owner's permission, as detailed at the end.

Issues
  • Evaluate on NYUV2 test set?


    Hi! Could you please provide the script you used to evaluate on the NYUV2 test set? I am trying to reproduce the results reported in the paper but cannot.

    opened by oljike 12
  • Output mp4 does not play if width/height is not divisible by 2


    At least I think this is the problem. It is more of an mp4 limitation than a bug. I usually fix this in ffmpeg using: -vf "pad=ceil(iw/2)*2:ceil(ih/2)*2"
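
    A minimal sketch of the same fix applied in Python before writing the video with moviepy; npyFrames is a hypothetical list of rendered H x W x 3 frames, not a variable from the repository:

    import numpy
    import moviepy.editor

    def pad_to_even(npyFrame):
        # H.264 requires even dimensions; replicate the border row/column if needed
        intPadH = npyFrame.shape[0] % 2
        intPadW = npyFrame.shape[1] % 2
        if intPadH or intPadW:
            npyFrame = numpy.pad(npyFrame, ((0, intPadH), (0, intPadW), (0, 0)), mode='edge')
        return npyFrame

    # hypothetical usage: pad every frame before encoding the mp4
    objClip = moviepy.editor.ImageSequenceClip([pad_to_even(npyFrame) for npyFrame in npyFrames], fps=25)
    objClip.write_videofile('./autozoom.mp4')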

    opened by BurguerJohn 10
  • RuntimeError: view size is not compatible


    When I try to run the Colab of this, I'm getting an error on the final step:

    Traceback (most recent call last):
      File "autozoom.py", line 76, in <module>
        process_load(npyImage, {})
      File "", line 10, in process_load
      File "", line 128, in disparity_refinement
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
        result = self.forward(*input, **kwargs)
      File "", line 94, in forward
    RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

    Do you know what this might be?

    Thank you
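
    For context, .view() only works when the requested shape is compatible with the tensor's memory layout, while .reshape() falls back to a copy when necessary; a minimal sketch of the difference:

    import torch

    tenInput = torch.randn(4, 6).t()  # the transpose makes this tensor non-contiguous

    # tenInput.view(24) would raise the RuntimeError quoted above because
    # .view() cannot span non-contiguous memory; .reshape() copies as needed
    tenOutput = tenInput.reshape(24)
    print(tenOutput.shape)  # torch.Size([24])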

    opened by thesoulharmonic 9
  • Curious About Training Dataset


    Thank you for your impressive work. I am really curious about how the training pairs for color and depth image inpainting are created. I wonder if you would be willing to share a link to the training dataset in the future?

    opened by JasonLSC 6
  • When I run interface.py it hangs


    I have to press Ctrl+C to get out. This is on Windows.

    (screenshot attached in the original issue)

    opened by GantMan 4
  • Getting IndexError: list index out of range while running the default command


    I am using the python autozoom.py --in ./images/doublestrike.jpg --out ./autozoom.mp4 command to run the code, but I am getting IndexError: list index out of range.

    This is the package list I have installed:

    blas=1.0=mkl ca-certificates=2021.7.5=haa95532_1 cached-property=1.5.2=py_0 certifi=2021.5.30=py37haa95532_0 cffi=1.14.6=py37h2bbff1b_0 charset-normalizer=2.0.4=pypi_0 click=8.0.1=pyhd3eb1b0_0 colorama=0.4.4=pypi_0 cudatoolkit=10.1.243=h74a9793_0 cudnn=7.6.5=cuda10.1_0 cupy=8.3.0=py37hd4ca531_0 decorator=4.4.2=pypi_0 fastrlock=0.6=py37hd77b12b_0 flask=1.1.2=pyhd3eb1b0_0 freetype=2.10.4=hd328e21_0 gevent=21.8.0=py37h2bbff1b_1 greenlet=1.1.1=py37hd77b12b_0 h5py=3.2.1=py37h3de5c98_0 hdf5=1.10.6=h7ebc959_0 icc_rt=2019.0.0=h0cc432a_1 idna=3.2=pypi_0 imageio=2.9.0=pypi_0 imageio-ffmpeg=0.4.5=pypi_0 importlib-metadata=3.10.0=py37haa95532_0 intel-openmp=2021.3.0=haa95532_3372 itsdangerous=2.0.1=pyhd3eb1b0_0 jinja2=3.0.1=pyhd3eb1b0_0 jpeg=9b=hb83a4c4_2 libpng=1.6.37=h2a8f88b_0 libtiff=4.2.0=hd0e1b90_0 lz4-c=1.9.3=h2bbff1b_1 markupsafe=2.0.1=py37h2bbff1b_0 mkl=2021.3.0=haa95532_524 mkl-service=2.4.0=py37h2bbff1b_0 mkl_fft=1.3.0=py37h277e83a_2 mkl_random=1.2.2=py37hf11a4ad_0 moviepy=1.0.3=pypi_0 ninja=1.10.2=h6d14046_1 numpy=1.20.3=py37ha4e8547_0 numpy-base=1.20.3=py37hc2deb75_0 olefile=0.46=py37_0 opencv-contrib-python=4.5.3.56=pypi_0 openssl=1.1.1k=h2bbff1b_0 pillow=8.3.1=py37h4fa10fc_0 pip=21.2.4=pypi_0 proglog=0.1.9=pypi_0 pycparser=2.20=py_2 pyreadline=2.1=py37_1 python=3.7.11=h6244533_0 pytorch=1.6.0=py3.7_cuda101_cudnn7_0 requests=2.26.0=pypi_0 scipy=1.6.2=py37h66253e8_1 setuptools=52.0.0=py37haa95532_0 six=1.16.0=pyhd3eb1b0_0 sqlite=3.36.0=h2bbff1b_0 tk=8.6.10=he774522_0 torchvision=0.7.0=py37_cu101 tqdm=4.62.2=pypi_0 typing_extensions=3.10.0.0=pyh06a4308_0 urllib3=1.26.6=pypi_0 vc=14.2=h21ff451_1 vs2015_runtime=14.27.29016=h5e58377_2 werkzeug=1.0.1=pyhd3eb1b0_0 wheel=0.37.0=pyhd3eb1b0_0 wincertstore=0.2=py37_0 xz=5.2.5=h62dcd97_0 zipp=3.5.0=pyhd3eb1b0_0 zlib=1.2.11=h62dcd97_4 zope=1.0=py37_1 zope.event=4.5.0=py37_0 zope.interface=5.4.0=py37h2bbff1b_0 zstd=1.4.9=h19a0ad4_0

    opened by PotatoHate 4
  • Meaning of parameters in the metadata JSON files


    For the synthetic depth/normal dataset, there is a metadata JSON file included with the RGB images. It contains two parameters: intSample and fltFov. Could you explain what these mean? Ideally, I would like to be able to compute the camera intrinsics in the form of a focal length/principal point or a K matrix. Any guidance on doing this from the JSON files would be appreciated.
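
    For reference, if fltFov were the horizontal field of view in degrees (an assumption, not documented behavior of the dataset), the pinhole focal length would follow from f = (W / 2) / tan(FoV / 2); a minimal sketch under that assumption:

    import math
    import numpy

    def intrinsics_from_fov(fltFov, intWidth, intHeight):
        # assumes fltFov is the horizontal field of view in degrees and that
        # the principal point sits at the image center - both are assumptions
        fltFocal = 0.5 * intWidth / math.tan(0.5 * math.radians(fltFov))
        return numpy.array([
            [fltFocal, 0.0, 0.5 * intWidth],
            [0.0, fltFocal, 0.5 * intHeight],
            [0.0, 0.0, 1.0]
        ])

    K = intrinsics_from_fov(60.0, 512, 512)  # hypothetical values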

    opened by waps101 4
  • Two places to calculate the disparity map in the project


    Hi, love your project. After reading your code, I am a little confused about how the disparity map is generated. From the code, there seem to be two different ways to calculate the disparity map:

    1. common.py https://github.com/sniklaus/3d-ken-burns/blob/a75ac8060ff93b316ec702cb24ca733c15899a69/common.py#L11
    2. depthestim.py https://github.com/sniklaus/3d-ken-burns/blob/a75ac8060ff93b316ec702cb24ca733c15899a69/depthestim.py#L71

    Could you kindly explain the reason for the difference? Thanks.
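
    For background, depth and disparity are inversely related via disparity = baseline * focal / depth, so two code paths that use different focal or baseline constants will produce differently scaled disparities; a hedged illustration with placeholder constants, not the repository's actual values:

    import torch

    def depth_to_disparity(tenDepth, fltFocal, fltBaseline=1.0):
        # standard inverse relation; clamping avoids division by zero
        return fltBaseline * fltFocal / tenDepth.clamp(min=1e-6)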

    opened by xgd 4
  • CPU inference?


    Congratulations on your paper, and thanks for open-sourcing this very effective work. I have one question about your Python implementation; I hope you can give some advice.

    • Is it possible to produce the 3D Ken Burns effect without using CUDA (just by doing CPU inference)?

    Thanks,

    Axel

    opened by axelbellec 4
  • Single-pixel artefact in the resulting video


    Hi Team,

    Awesome work!

    There is a single-pixel artefact that moves in a way that appears relative to the acceleration of the viewport. It looks like the pixel at the origin of the viewport may be set to 0 or to a very high value.

    You can see it in most of the videos created with the tool, but I am sure it is there in all of them.

    In this Waxy article: https://waxy.org/2019/11/turning-photos-into-2-5d-parallax-animations-with-machine-learning/

    You can spot it on the bottom-left corner of the dress in the kissing-in-Times-Square one, at the bottom of Nixon's jacket in Elvis + Nixon, and on Gordon Sondland's left hand.

    opened by neossian 0
  • Dolly Zoom effect


    Has anybody tried to implement the dolly zoom effect based on this code? We can already zoom in the video; now we only need to add the dolly effect, which moves the camera toward or away from the scene. I think this effect could be implemented by changing the process_shift function to shift in the y direction instead of x. Any ideas?
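
    For reference, the classic dolly zoom keeps the subject's projected size constant by scaling the focal length linearly with the camera-to-subject distance, f(t) = f0 * d(t) / d0; a minimal sketch of that schedule (all names are hypothetical, not functions from the repository):

    def dolly_zoom_focal(fltFocal0, fltDist0, fltDist):
        # projected size ~ focal / distance, so keeping the ratio fixed
        # counteracts the camera moving toward or away from the subject
        return fltFocal0 * fltDist / fltDist0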

    opened by sepideh-srj 1
  • A way to only zoom in one direction and to control the length/speed?


    Is there a way to control the zoom behavior, for example if I only want to zoom in or zoom out, and how fast the zoom should be? The best handle on the zoom speed I have found so far is fltSteps: increasing the number of steps and modifying the framerate accordingly.

    opened by Roemer 0
  • Feature/interface improvements


    • Added a command line argument to set the port for the Flask web server (-p or --port).
    • Added a debounce for the API calls when updating the rectangles (to avoid stressing the server too much).
    • Added a toggle to switch between the preview stream and a still image.

    I really like this project, as I quite often make slideshows for my family, and this is much nicer than the 2D Ken Burns effect.

    opened by Roemer 0
  • AI-FYI featured project badge


    Only one project is featured per newsletter. This project will be featured on June 16th by AI-FYI.com. Congratulations!

    opened by GantMan 0
  • No Image when using interface.py on Google Colab


    Using https://github.com/wpmed92/3d-ken-burns-colab as the base, I added the following to the end of the notebook to attempt to use interface.py so I can do my own camera paths:

    # Get an internet accessible address to the local server from interface.py.
    # Changed server port in interface.py to 8050.
    from google.colab.output import eval_js
    print(eval_js("google.colab.kernel.proxyPort(8050)"))
    # Will be something like: https://z4spb7cvssd-496ff2e9c6d22116-8050-colab.googleusercontent.com/

    # Run the interface
    !python interface.py

    Everything seems to be set up right: it allows me to load an image from my PC, modify the zooms, etc. However, the finished 3D image never appears. Here is the log from interface.py:

    127.0.0.1 - - [2020-05-24 18:49:04] "GET / HTTP/1.1" 200 9154 0.001346
    127.0.0.1 - - [2020-05-24 18:49:04] "GET /favicon.ico HTTP/1.1" 404 356 0.001295
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000623
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /load_image HTTP/1.1" 400 318 0.001246
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000806
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000689
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000638
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_from HTTP/1.1" 400 318 0.000629
    127.0.0.1 - - [2020-05-24 18:49:15] "POST /update_mode HTTP/1.1" 400 318 0.000582
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_mode HTTP/1.1" 400 318 0.000783
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000632
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_mode HTTP/1.1" 400 318 0.000649
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000658
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_from HTTP/1.1" 400 318 0.000727
    127.0.0.1 - - [2020-05-24 18:49:16] "POST /update_to HTTP/1.1" 400 318 0.001202
    127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_mode HTTP/1.1" 400 318 0.000553
    127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000643
    127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000613
    127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_to HTTP/1.1" 400 318 0.000714
    127.0.0.1 - - [2020-05-24 18:49:17] "POST /update_mode HTTP/1.1" 400 318 0.000655
    127.0.0.1 - - [2020-05-24 18:50:59] "GET /get_live HTTP/1.1" 200 36695563 115.058576

    And here is what the screen shows in the browser tab with the interface (screenshot attached in the original issue).

    Any help to get this to work is much appreciated.

    Doug

    opened by DougShuffield 4
  • Documentation for understanding/following the code


    Hi there,

    Could you please consider adding documentation on how to read the code and which files to follow for what?

    I would also consider adding function-level comments.

    opened by PuneetKohli 6
  • Image resolution seems to take a hit


    Loving the paper and its implementation. The results really do look stunning! I'm just having an issue with the output resolution being severely degraded in comparison to the input image. Everything seems so blurry. Is this just an artifact of the animation process itself?

    opened by cyrilzakka 1