Layered Neural Atlases for Consistent Video Editing

Project Page | Paper

This repository contains an implementation for the SIGGRAPH Asia 2021 paper Layered Neural Atlases for Consistent Video Editing.

The paper introduces the first approach for neural video unwrapping, based on an end-to-end optimized, interpretable, semantic atlas-based representation that facilitates easy and intuitive editing in the atlas domain.

Installation Requirements

The code is compatible with Python 3.7 and PyTorch 1.6.

You can create an anaconda environment called neural_atlases with the required dependencies by running:

conda create --name neural_atlases python=3.7 
conda activate neural_atlases 
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy scikit-image tqdm opencv -c pytorch
pip install imageio-ffmpeg gdown
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html

Data convention

The code expects 3 folders for each video input, e.g. for a video of 50 frames named "blackswan":

  1. data/blackswan: A folder of video frames containing image files in the following convention: blackswan/00000.jpg,blackswan/00001.jpg,...,blackswan/00049.jpg (as in the DAVIS dataset).
  2. data/blackswan_flow: A folder with forward and backward optical flow files in the following convention: blackswan_flow/00000.jpg_00001.jpg.npy,blackswan_flow/00001.jpg_00000.jpg.npy,...,blackswan_flow/00049.jpg_00048.jpg.npy.
  3. data/blackswan_maskrcnn: A folder with rough masks (created by Mask-RCNN or any other way) containing files in the following convention: blackswan_maskrcnn/00000.jpg,blackswan_maskrcnn/00001.jpg,...,blackswan_maskrcnn/00049.jpg.
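
As a quick sanity check of this layout, the following Python sketch counts the files and loads one flow array. The (H, W, 2) flow shape and the (dx, dy) channel meaning are assumptions about the saved RAFT output, not something this README guarantees:

import glob
import os
import numpy as np

vid = "data/blackswan"
frames = sorted(glob.glob(os.path.join(vid, "*.jpg")))
flows = glob.glob(os.path.join(vid + "_flow", "*.npy"))
masks = glob.glob(os.path.join(vid + "_maskrcnn", "*.jpg"))

n = len(frames)
# forward + backward flow for every pair of consecutive frames
assert len(flows) == 2 * (n - 1), "expected 2*(N-1) flow files"
assert len(masks) == n, "expected one mask image per frame"

flow = np.load(os.path.join(vid + "_flow", "00000.jpg_00001.jpg.npy"))
print(flow.shape)  # expected (H, W, 2): per-pixel (dx, dy) displacements (assumption)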

To download a few example DAVIS sequences, run:

gdown https://drive.google.com/uc?id=1WipZR9LaANTNJh764ukznXXAANJ5TChe
unzip data.zip

Masks extraction

Given only the video frames folder data/blackswan, the Mask-RCNN masks (and the required folder data/blackswan_maskrcnn) can be extracted by running:

python preprocess_mask_rcnn.py --vid-path data/blackswan --class_name bird

where --class_name determines the COCO class name of the sought foreground object. It is also possible to take the first instance retrieved by Mask-RCNN by passing --class_name anything. This is useful for cases where Mask-RCNN produces correct masks with wrong class labels, as in the "libby" video:

python preprocess_mask_rcnn.py --vid-path data/libby --class_name anything
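
For reference, this is roughly what such an extraction step can look like with detectron2's model zoo. It is a minimal sketch of the general approach (a standard COCO Mask-RCNN config, keeping the first detected instance), not the exact logic of preprocess_mask_rcnn.py:

import os
import cv2
import numpy as np
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
predictor = DefaultPredictor(cfg)

os.makedirs("data/blackswan_maskrcnn", exist_ok=True)
im = cv2.imread("data/blackswan/00000.jpg")
instances = predictor(im)["instances"]
masks = instances.pred_masks.cpu().numpy()  # (N, H, W) boolean masks, one per detection
# keep the first (highest-scoring) instance, similar in spirit to --class_name anything
cv2.imwrite("data/blackswan_maskrcnn/00000.jpg", (masks[0] * 255).astype(np.uint8))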

Optical flows extraction

The optical flow folders are computed using RAFT, which is linked into the project as a git submodule. To fetch it together with its pretrained models, run:

git submodule update --init
cd thirdparty/RAFT/
./download_models.sh
cd ../..

To extract the optical flows (and create the required folder data/blackswan_flow), run:

python preprocess_optical_flow.py --vid-path data/blackswan --max_long_edge 768
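
Since the method relies on consistent flow between frames, a quick forward-backward check on the saved arrays can reveal problems early. A hedged sketch, again assuming (H, W, 2) arrays holding (dx, dy) displacements:

import numpy as np

fwd = np.load("data/blackswan_flow/00000.jpg_00001.jpg.npy")  # frame 0 -> 1
bwd = np.load("data/blackswan_flow/00001.jpg_00000.jpg.npy")  # frame 1 -> 0
h, w = fwd.shape[:2]
ys, xs = np.mgrid[0:h, 0:w]
# follow the forward flow, then sample the backward flow at the landing pixel
x2 = np.clip(np.round(xs + fwd[..., 0]).astype(int), 0, w - 1)
y2 = np.clip(np.round(ys + fwd[..., 1]).astype(int), 0, h - 1)
err = np.linalg.norm(fwd + bwd[y2, x2], axis=-1)  # ~0 where the two flows agree
print("consistent pixels:", (err < 1.0).mean())  # occluded pixels will show large errors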

Pretrained models

To download a sample set of pretrained models together with sample edits, run:

gdown https://drive.google.com/uc?id=10voSCdMGM5HTIYfT0bPW029W9y6Xij4D
unzip pretrained_models.zip

Training

For training a model on a video, run:

python train.py config/config.json

where the video frames folder is determined by the config parameter "data_folder". To reduce training time, the evaluation frequency can be lowered via the parameter "evaluate_every" (e.g. by changing it to 10000). The remaining configurable parameters are documented inside train.py.
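For orientation, a trimmed-down config could look like the sketch below. Only fields mentioned in this README and the issues further down are shown, with illustrative values; the full field list lives in config/config.json and is documented in train.py:

{
  "data_folder": "data/blackswan",
  "resx": 768,
  "resy": 432,
  "evaluate_every": 10000
}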

Evaluation

The model is evaluated periodically during training. To run evaluation alone on a trained model folder, run:

python only_evaluate.py --trained_model_folder=pretrained_models/checkpoints/blackswan --video_name=blackswan --data_folder=data --output_folder=evaluation_outputs

where trained_model_folder is the path to a folder that contains the config.json and checkpoint files of the trained model.

Editing

To apply editing, run the script only_edit.py. Examples for the supplied pretrained models for "blackswan" and "boat":

python only_edit.py --trained_model_folder=pretrained_models/checkpoints/blackswan --video_name=blackswan --data_folder=data --output_folder=editing_outputs --edit_foreground_path=pretrained_models/edit_inputs/blackswan/edit_blackswan_foreground.png --edit_background_path=pretrained_models/edit_inputs/blackswan/edit_blackswan_background.png
python only_edit.py --trained_model_folder=pretrained_models/checkpoints/boat --video_name=boat --data_folder=data --output_folder=editing_outputs --edit_foreground_path=pretrained_models/edit_inputs/boat/edit_boat_foreground.png --edit_background_path=pretrained_models/edit_inputs/boat/edit_boat_backgound.png

where edit_foreground_path and edit_background_path specify the paths to 1000x1000 RGBA images containing the atlas edits.
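
An edit image can be produced with a few lines of Python. A minimal sketch, assuming that fully transparent pixels mean "leave the atlas unchanged":

import numpy as np
import imageio

# fully transparent 1000x1000 RGBA canvas
edit = np.zeros((1000, 1000, 4), dtype=np.uint8)
# example: an opaque red square that will be composited onto the atlas
edit[100:200, 100:200] = (255, 0, 0, 255)
imageio.imwrite("my_foreground_edit.png", edit)

The resulting file can then be passed via --edit_foreground_path (or --edit_background_path) in the commands above.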

To apply an edit that was made on a single video frame (e.g. for the pretrained "libby" model), run:

python only_edit.py --trained_model_folder=pretrained_models/checkpoints/libby --video_name=libby --data_folder=data --output_folder=editing_outputs  --use_edit_frame --edit_frame_index=7 --edit_frame_path=pretrained_models/edit_inputs/libby/edit_frame_.png

Citation

If you find our work useful in your research, please consider citing:

@article{kasten2021layered,
  title={Layered Neural Atlases for Consistent Video Editing},
  author={Kasten, Yoni and Ofri, Dolev and Wang, Oliver and Dekel, Tali},
  journal={arXiv preprint arXiv:2109.11418},
  year={2021}
}
Comments
  • RuntimeError: No such operator detectron2::nms_rotated

    Thanks for sharing your code. I have a problem with running preprocess_mask_rcnn.py: I get the following error: RuntimeError: No such operator detectron2::nms_rotated. I followed all your instructions for creating the required environment, and it seems that detectron2 was installed successfully, based on this line:

    Successfully installed antlr4-python3-runtime-4.8 detectron2-0.4+cu101 future-0.18.2 fvcore-0.1.3.post20210317 google-auth-1.35.0 iopath-0.1.9 omegaconf-2.1.2 portalocker-2.4.0 pycocotools-2.0.4 pydot-1.4.2 tabulate-0.8.9 termcolor-1.1.0 yacs-0.1.8

    I can import detectron2 without any problem, but after running from detectron2 import model_zoo, I get the error below:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/anaconda3/envs/neural_atlases/lib/python3.7/site-packages/detectron2/model_zoo/__init__.py", line 8, in <module>
        from .model_zoo import get, get_config_file, get_checkpoint_url, get_config
      File "/anaconda3/envs/neural_atlases/lib/python3.7/site-packages/detectron2/model_zoo/model_zoo.py", line 9, in <module>
        from detectron2.modeling import build_model
      File "/anaconda3/envs/neural_atlases/lib/python3.7/site-packages/detectron2/modeling/__init__.py", line 2, in <module>
        from detectron2.layers import ShapeSpec
      File "/anaconda3/envs/neural_atlases/lib/python3.7/site-packages/detectron2/layers/__init__.py", line 5, in <module>
        from .nms import batched_nms, batched_nms_rotated, nms, nms_rotated
      File "/anaconda3/envs/neural_atlases/lib/python3.7/site-packages/detectron2/layers/nms.py", line 16, in <module>
        nms_rotated_func = torch.ops.detectron2.nms_rotated
      File "/.local/lib/python3.7/site-packages/torch/_ops.py", line 61, in __getattr__
        op = torch._C._jit_get_operation(qualified_op_name)
    RuntimeError: No such operator detectron2::nms_rotated
    
    

    I am wondering why this error happens and how I can solve it. Thank you.

    opened by denabazazian 2
  • Trouble with preprocess_optical_flow.py

    Hi there! This project is phenomenal; I'm seriously excited to try this and see what others make of it and how people use it.

    Update: Managed to fix this! I realised that I just hadn't cloned the RAFT project properly.


    I'm trying to run this in Colab. It appears to work up until the point where I try to run preprocess_optical_flow.py, which gives me:

    /content/drive/MyDrive/VideoEdit/layered-neural-atlases
    Traceback (most recent call last):
      File "/content/drive/MyDrive/VideoEdit/layered-neural-atlases/preprocess_optical_flow.py", line 4, in <module>
        from raft_wrapper import RAFTWrapper
      File "/content/drive/MyDrive/VideoEdit/layered-neural-atlases/raft_wrapper.py", line 12, in <module>
        from utils.utils import InputPadder
    ModuleNotFoundError: No module named 'utils'

    I have imported the models under RAFT/models but don't see a utils file or folder anywhere, other than unwrap_utils.py.

    Hoping you might have time to answer my query. Again amazing work!

    opened by corranmac 0
  • How to make new edits on atlas?

    Hello thanks for your amazing work.

    I have run the train and edit based on your pretrained model, everything goes fine.

    However, after I train a model for 400k iterations on blackswan and use your edited atlas, I cannot get correct results. Should I make new atlas images based on the newly trained model? If so, where can I get the raw (unedited) atlas images from the model?

    Thanks in advance.

    opened by sydney0zq 1
  • ValueError: number sections must be larger than 0

    (neural_atlases) D:\layered-neural-atlases-main>python only_edit.py --trained_model_folder=pretrained_models/checkpoints/libby --video_name=libby --data_folder=data/libby --output_folder=editing_outputs --use_edit_frame --edit_frame_index=7 --edit_frame_path=pretrained_models/edit_inputs/libby/edit_frame_.png

    Model has 264706 params
    Model has 133122 params
    Model has 416379 params
    Model has 402945 params
    Traceback (most recent call last):
      File "only_edit.py", line 472, in <module>
        main(training_folder, frame_edit, frames_folder, mask_rcnn_folder, frame_edit_file, edit_tex1_file, edit_tex2_file,
      File "only_edit.py", line 378, in main
        edit_im1, edit_im2 = texture_edit_from_frame_edit(edit_frame, frame_number, model_F_mapping1, model_F_mapping2,
      File "only_edit.py", line 204, in texture_edit_from_frame_edit
        maxx2, minx2, maxy2, miny2, edge_size2 = get_mapping_area(model_F_mapping2, model_alpha, mask_frames > -1, larger_dim,
      File "D:\layered-neural-atlases-main\evaluate.py", line 149, in get_mapping_area
        relisa = np.array_split(relis_i.numpy(), np.ceil(relis_i.shape[0] / 100000))
      File "<__array_function__ internals>", line 5, in array_split
      File "D:\Anaconda3\envs\neural_atlases\lib\site-packages\numpy\lib\shape_base.py", line 778, in array_split
        raise ValueError('number sections must be larger than 0.') from None
    ValueError: number sections must be larger than 0.

    opened by 616099859 1
  • How to implement the Multi Foreground Atlases feature?

    Thanks for sharing this amazing code! I'm trying to implement the Multi Foreground Atlases feature described in Section 4.3 of the arXiv paper, but I can't understand this sentence:

    Unlike the one foreground object case, to support occlusions between different foreground objects, the sparsity loss is applied directly on the atlas, by applying 𝑙1 regularization on randomly sampled UV coordinates from foreground regions in the atlas.

    What does this mean in practice? Do I need to apply the l1 regularization equation error(y, ŷ) + λ · Σ|w|?

    if yes:

    • What is the lambda value used for the "lucia" results in the paper?
    • Is the error(y, ŷ) term the single-layer sparsity loss (Eq. 14)?
    • Is the |w| term the values of the UV coordinates given by the multiple foreground mapping models?

    if not:

    • How is the sparsity loss calculated, in practice, in the multi-foreground-object case?
    • Please explain with an equation for easier understanding.

    Some other questions:

    • Do the loss terms for each mapping model (like the rigidity_loss and the optical_flow_loss) need to be computed per foreground mapping model and summed at the end?

    • What are the coefficient values used for the user-scribble losses in the equations:

      • l_red = -log(alpha_red) (Eq. 20)
      • l_green = -log(alpha_green) (Eq. 21)
    • What is the βtv = 100 variable in Section 3.5?

    I would be grateful if you can answer these questions. Thanks!

    opened by thiagoambiel 2
  • How to set edit_frame_index? 1, 2, 3, ...?

    python only_edit.py --trained_model_folder=pretrained_models/checkpoints/libby --video_name=libby --data_folder=data --output_folder=editing_outputs --use_edit_frame --edit_frame_index=7 --edit_frame_path=pretrained_models/edit_inputs/libby/edit_frame_.png

    How should edit_frame_index be set? 1, 2, 3, ...?

    opened by 565ee 2
  • How to deal with video with height=768 and width=432

    Thanks for your excellent work and source code.

    I saw that in your provided config.json the height is 432 and the width is 768. In my case the height is 768 and the width is 432, and I only changed the corresponding config fields to: "resx": 768, "resy": 432. However, the reconstructed video looks strange and does not work well.

    I wonder if there is other code I should change to handle this portrait video. Thanks a lot!

    opened by CrossLee1 1
Owner

Yoni Kasten