This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

Overview

Omnimatte in PyTorch

Prerequisites

  • Linux
  • Python 3.6+
  • NVIDIA GPU with CUDA and cuDNN

Installation

This code has been tested with PyTorch 1.8 and Python 3.8.

  • Install PyTorch 1.8 and other dependencies.
    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, you can create a new Conda environment using conda env create -f environment.yml.

Demo

To train a model on a video (e.g. "tennis"), run:

python train.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0,1

To view training results and loss plots, visit the URL http://localhost:8097. Intermediate results are also at ./checkpoints/tennis/web/index.html.
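Training expects a visualization server at http://localhost:8097; a traceback in the comments shows the visdom client failing with "Connection refused" on that port. The following is a minimal sketch for checking whether anything is listening there before training starts. It assumes a plain TCP connection test is sufficient, and the function name server_reachable is hypothetical, not part of this repository.

```python
import socket


def server_reachable(host: str = "localhost", port: int = 8097, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if server_reachable():
        print("A server is listening on localhost:8097.")
    else:
        print("Nothing is listening on localhost:8097.")
```

If nothing is listening, starting a visdom server first (for example with python -m visdom.server) is the usual remedy, assuming the repository uses visdom on its default port.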

To save the omnimatte layer outputs of the trained model, run:

python test.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0

The results (RGBA layers, videos) will be saved to ./results/tennis/test_latest/.
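Because the outputs are RGBA layers, they can in principle be recombined with standard alpha compositing. The sketch below shows the Porter-Duff "over" operator on single pixels. It assumes straight (non-premultiplied) alpha with values in [0, 1] and layers ordered back to front; neither convention is stated in this README, and the names over and composite are hypothetical.

```python
from typing import Sequence, Tuple

# One pixel: (red, green, blue, alpha), straight alpha, all values in [0, 1].
RGBA = Tuple[float, float, float, float]


def over(top: RGBA, bottom: RGBA) -> RGBA:
    """Composite the pixel `top` over the pixel `bottom` (Porter-Duff 'over')."""
    top_a, bottom_a = top[3], bottom[3]
    out_a = top_a + bottom_a * (1.0 - top_a)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent
    r, g, b = (
        (t * top_a + c * bottom_a * (1.0 - top_a)) / out_a
        for t, c in zip(top[:3], bottom[:3])
    )
    return (r, g, b, out_a)


def composite(layers_back_to_front: Sequence[RGBA]) -> RGBA:
    """Composite a stack of pixels, starting from a fully transparent canvas."""
    result: RGBA = (0.0, 0.0, 0.0, 0.0)
    for layer in layers_back_to_front:
        result = over(layer, result)
    return result
```

For whole images rather than single pixels, Pillow's Image.alpha_composite applies the same operator to two RGBA images.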

Custom video

To train on your own video, you will have to preprocess the data:

  1. Extract the frames, e.g.
    mkdir ./datasets/my_video && cd ./datasets/my_video 
    mkdir rgb && ffmpeg -i video.mp4 rgb/%04d.png
    
  2. Resize the video to 256x448 and save the frames in my_video/rgb.
  3. Get input object masks (e.g. using Mask R-CNN and STM), and save each object's masks in its own subdirectory, e.g. my_video/mask/01/, my_video/mask/02/, etc.
  4. Compute optical flow (e.g. using RAFT), and save the forward .flo files to my_video/flow and the backward .flo files to my_video/flow_backward.
  5. Compute the confidence maps from the forward/backward flows:
    python datasets/confidence.py --dataroot ./datasets/my_video
  6. Register the video and save the computed homographies in my_video/homographies.txt. See here for details.
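Step 4 expects flow in .flo files, and a user in the comments reports a "magic number incorrect" error after generating flow with their own tool. The sketch below writes and reads the standard Middlebury .flo layout: a float32 magic value of 202021.25, then int32 width and height, then interleaved float32 (u, v) values in row-major order, all little-endian. It assumes this repository reads that standard layout, which the README does not confirm. The function names write_flo and read_flo are hypothetical.

```python
import struct
from typing import List, Tuple

FLO_MAGIC = 202021.25  # float32 sanity value; its little-endian bytes spell "PIEH"

Grid = List[List[float]]  # rows of per-pixel values


def write_flo(path: str, u: Grid, v: Grid) -> None:
    """Write horizontal (u) and vertical (v) flow components to a .flo file."""
    height, width = len(u), len(u[0])
    values: List[float] = []
    for row_u, row_v in zip(u, v):
        for du, dv in zip(row_u, row_v):
            values.extend((du, dv))  # u and v are interleaved per pixel
    with open(path, "wb") as f:
        f.write(struct.pack("<f", FLO_MAGIC))
        f.write(struct.pack("<ii", width, height))
        f.write(struct.pack("<%df" % len(values), *values))


def read_flo(path: str) -> Tuple[Grid, Grid]:
    """Read a .flo file and return (u, v) as lists of rows."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<f", f.read(4))
        if magic != FLO_MAGIC:
            raise ValueError("Magic number incorrect. Invalid .flo file")
        width, height = struct.unpack("<ii", f.read(8))
        count = 2 * width * height
        values = struct.unpack("<%df" % count, f.read(4 * count))
    row = 2 * width  # number of floats per image row
    u = [list(values[y * row:(y + 1) * row:2]) for y in range(height)]
    v = [list(values[y * row + 1:(y + 1) * row:2]) for y in range(height)]
    return u, v
```

In the usual convention, forward flow maps each frame to the next one and backward flow maps each frame to the previous one. Which frame pairs and file names this repository expects should be checked against the tennis example.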

Note: Videos that are suitable for our method have the following attributes:

  • Static camera or limited camera motion that can be represented with a homography.
  • Limited number of omnimatte layers, due to GPU memory limitations. We tested up to 6 layers.
  • Objects that move relative to the background (static objects will be absorbed into the background layer).
  • We tested a video length of up to 200 frames (~7 seconds).
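The limits in the note above, together with the directory layout from the preprocessing steps, can be checked before training. This is a minimal sketch under stated assumptions: frames are .png files in rgb/, each subdirectory of mask/ counts as one object layer, and the thresholds are the tested values quoted above rather than hard limits. The function check_video_dir is hypothetical and not part of this repository.

```python
import os
from typing import List


def check_video_dir(root: str, max_frames: int = 200, max_layers: int = 6) -> List[str]:
    """Return warnings for a prepared video directory; an empty list means none were found."""
    warnings: List[str] = []

    # Directory and file names follow the preprocessing steps in this README.
    for sub in ("rgb", "mask", "flow", "flow_backward"):
        if not os.path.isdir(os.path.join(root, sub)):
            warnings.append(f"missing directory: {sub}/")
    if not os.path.isfile(os.path.join(root, "homographies.txt")):
        warnings.append("missing file: homographies.txt")

    rgb_dir = os.path.join(root, "rgb")
    if os.path.isdir(rgb_dir):
        n_frames = sum(1 for name in os.listdir(rgb_dir) if name.lower().endswith(".png"))
        if n_frames > max_frames:
            warnings.append(f"frame count {n_frames} exceeds the tested maximum of {max_frames}")

    mask_dir = os.path.join(root, "mask")
    if os.path.isdir(mask_dir):
        # Assumption: each mask subdirectory (01/, 02/, ...) is one object layer.
        n_layers = sum(
            1 for name in os.listdir(mask_dir) if os.path.isdir(os.path.join(mask_dir, name))
        )
        if n_layers > max_layers:
            warnings.append(f"layer count {n_layers} exceeds the tested maximum of {max_layers}")

    return warnings
```

For example, check_video_dir("./datasets/my_video") returns an empty list when all expected inputs exist and the frame and layer counts are within the tested range.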

Citation

If you use this code for your research, please cite the following paper:

@inproceedings{lu2021,
  title={Omnimatte: Associating Objects and Their Effects in Video},
  author={Lu, Erika and Cole, Forrester and Dekel, Tali and Zisserman, Andrew and Freeman, William T and Rubinstein, Michael},
  booktitle={CVPR},
  year={2021}
}

Acknowledgments

This code is based on retiming and pytorch-CycleGAN-and-pix2pix.

Comments
  • IndexError: list index out of range

    Hello, when I run the training command, the following error occurs. I'm confused about the H0 and H_1 parameters. What do these two parameters mean?

    Traceback (most recent call last):
      File "train.py", line 88, in <module>
        main()
      File "train.py", line 42, in main
        train(model, dataset, visualizer, opt)
      File "train.py", line 54, in train
        for i, data in enumerate(dataset):  # inner loop within one epoch
      File "/home/notebook/data/personal/80303875/project/omnimatte/omnimatte/third_party/data/__init__.py", line 92, in __iter__
        for i, data in enumerate(self.dataloader):
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
        data = self._next_data()
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1179, in _next_data
        return self._process_data(data)
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
        data.reraise()
      File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
        raise self.exc_type(msg)
    IndexError: Caught IndexError in DataLoader worker process 1.
    Original Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
        data = fetcher.fetch(index)
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/notebook/data/personal/80303875/project/omnimatte/omnimatte/data/omnimatte_dataset.py", line 129, in __getitem__
        data_t2 = self.get_transformed_item(index + 1, transform_params)
      File "/home/notebook/data/personal/80303875/project/omnimatte/omnimatte/data/omnimatte_dataset.py", line 194, in get_transformed_item
        bg_flow = self.get_background_flow(index, mask_w, mask_h)  # 2, H, W
      File "/home/notebook/data/personal/80303875/project/omnimatte/omnimatte/data/omnimatte_dataset.py", line 287, in get_background_flow
        H_1 = self.homographies[index + 1]
    IndexError: list index out of range

    opened by ai1361720220000 8
  • Out of memory during training

    My GPU: RTX 2080, 11 GB.
    RuntimeError: CUDA out of memory. Tried to allocate 118.00 MiB (GPU 0; 10.76 GiB total capacity; 8.79 GiB already allocated; 135.25 MiB free; 9.05 GiB reserved in total by PyTorch)

    Which parameters should I modify to reduce memory consumption?

    opened by Ruinmou 6
  • Duration of the processed video is longer than the source video

    Hi erkalu, I have run both the default video demos and my own custom videos. The problem is that the duration of the processed video does not match the source video, and the object in the processed video moves very slowly. I don't know whether I made a mistake in my steps.

    opened by o98k-ok 3
  • Computing homographies

    Hello,

    I am having trouble with point 6) in the custom video training steps. How do you compute the initial homographies.txt file? In the readme it just says "use OpenCV". Can you please share the script for the initial generation of homographies that you used for the "tennis" dataset?

    opened by AlexSS1001 3
  • Optical flow

    While preparing my dataset, I had a question about step 4: what is the difference between forward optical flow and backward optical flow? Also, after I generate my own optical flow, step 5 fails with "Magic number incorrect. Invalid .flo file".

    opened by Ruinmou 2
  • OpenCV can't be found

    When I install the requirements using pip, I get this error:

    ERROR: Could not find a version that satisfies the requirement opencv-python==4.5.1 (from versions: 3.4.8.29, 3.4.9.31, 3.4.9.33, 3.4.10.35, 3.4.10.37, 3.4.11.39, 3.4.11.41, 3.4.11.43, 3.4.11.45, 3.4.13.47, 3.4.14.51, 3.4.14.53, 3.4.15.55, 4.1.2.30, 4.2.0.32, 4.2.0.34, 4.3.0.36, 4.3.0.38, 4.4.0.40, 4.4.0.42, 4.4.0.44, 4.4.0.46, 4.5.1.48, 4.5.2.52, 4.5.2.54, 4.5.3.56)
    ERROR: No matching distribution found for opencv-python==4.5.1
    
    opened by Lucien950 2
  • Minimum GPU memory

    What is the minimum amount of GPU memory needed to train this model? I have tried a 12 GB GPU, but PyTorch reports that there is not enough GPU memory.

    Exception has occurred: RuntimeError
    CUDA out of memory. Tried to allocate 448.00 MiB (GPU 0; 11.17 GiB total capacity; 9.54 GiB already allocated; 427.25 MiB free; 10.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

    opened by v-owendeng 1
  • Question on homography calculation

    Hello, I have a question about calculating homographies. Do I understand correctly that each line in homographies.txt is the homography between the first frame and the i-th frame (or maybe between a frame and the previous frame)? My guess is based on the fact that the homography for frame 1 in your example is the identity matrix.

    opened by shiron8bit 1
  • [Errno 101] Network is unreachable

    Hello, thanks for sharing this project. After I ran the training command, the following problem occurred. Could you give me some advice?

    The number of training images = 73
    setting up model
    model [OmnimatteModel] was created
    ---------- Networks initialized -------------
    [Network Omnimatte] Total number of parameters : 10.517 M
    [Network Omnimatte] Total number of trainable parameters : 10.517 M

    Setting up a new session...
    Exception in user code:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 160, in _new_conn
        (self._dns_host, self.port), self.timeout, **extra_kw
      File "/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
        raise err
      File "/opt/conda/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection refused

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
        chunked=chunked,
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 392, in _make_request
        conn.request(method, url, **httplib_request_kw)
      File "/opt/conda/lib/python3.7/http/client.py", line 1252, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/opt/conda/lib/python3.7/http/client.py", line 1298, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/opt/conda/lib/python3.7/http/client.py", line 1247, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/opt/conda/lib/python3.7/http/client.py", line 1026, in _send_output
        self.send(msg)
      File "/opt/conda/lib/python3.7/http/client.py", line 966, in send
        self.connect()
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 187, in connect
        conn = self._new_conn()
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connection.py", line 172, in _new_conn
        self, "Failed to establish a new connection: %s" % e
    urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fcad03e2cd0>: Failed to establish a new connection: [Errno 111] Connection refused

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
        timeout=timeout
      File "/opt/conda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 725, in urlopen
        method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
      File "/opt/conda/lib/python3.7/site-packages/urllib3/util/retry.py", line 439, in increment
        raise MaxRetryError(_pool, url, error or ResponseError(cause))
    urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcad03e2cd0>: Failed to establish a new connection: [Errno 111] Connection refused'))

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/visdom/__init__.py", line 711, in _send
        data=json.dumps(msg),
      File "/opt/conda/lib/python3.7/site-packages/visdom/__init__.py", line 677, in _handle_post
        r = self.session.post(url, data=data)
      File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 578, in post
        return self.request('POST', url, data=data, json=json, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
        resp = self.send(prep, **send_kwargs)
      File "/opt/conda/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
        r = adapter.send(request, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
        raise ConnectionError(e, request=request)
    requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fcad03e2cd0>: Failed to establish a new connection: [Errno 111] Connection refused'))
    [Errno 101] Network is unreachable
    [Errno 101] Network is unreachable

    opened by ai1361720220000 1
  • about latest_net_Omnimatte.pth

    Hi, where can I download latest_net_Omnimatte.pth? It is needed by test.py:

    FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/tennis/latest_net_Omnimatte.pth'
    
    opened by 9p15p 1
  • Why does every video need a newly trained model?

    Hello, thanks for sharing such nice work.

    I'm curious why a new model has to be trained for every video. Is it because the training loss is computed only on the input video itself, so the training strategy requires the model to be trained video by video?

    opened by minired2154 0
  • Discrepancy between code and paper

    Hi Erika, thanks for sharing this nice work.

    I'm trying to understand the alpha regularization term, and I've found what I believe to be either a bug or a discrepancy between your paper and the code.

    The alpha regularization is defined in the paper as: [equation image not preserved]

    However, in the code you use the alpha_composite rather than the individual alpha layers. This behaves differently when multiple layers have some alpha activation. Below is the value of the regularization plotted against two 1-pixel alpha layers; green uses alpha_composite, and blue uses the L1 norm as defined in the paper. [plot image not preserved]

    Is there a reason for this change?

    Furthermore, this loss aims to prevent a trivial solution where one layer reconstructs the entire image. I've encountered a situation where one object is reconstructed in multiple object layers. Have you ever encountered this problem? And if so, how would you deal with it?

    Thanks in advance

    opened by GuidoVisser 1
  • Should homographies.txt be computed from the original video or the resized one?

    According to step 2 of processing our own video, we should resize the video to 256x448 and save the frames in my_video/rgb. Then, in step 6, we register the video and save the computed homographies in my_video/homographies.txt. However, the first line of omnimatte/datasets/tennis/homographies.txt is size: 854 480. So I'm a little confused about which video we should use for computing the homographies: the one before resizing or the one after?

    opened by sorata118 1
  • ERROR: Could not find a version that satisfies the requirement opencv-python==4.5.1

    Hello, trying to pip install requirements but getting this error:

    !cd omnimatte && ls && pip3 install -r requirements.txt

    Collecting torch==1.8.0
      Using cached torch-1.8.0-cp37-cp37m-manylinux1_x86_64.whl (735.5 MB)
    Requirement already satisfied: torchvision in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 2)) (0.10.0+cu111)
    Collecting dominate>=2.4.0
      Using cached dominate-2.6.0-py2.py3-none-any.whl (29 kB)
    Collecting visdom>=0.1.8
      Using cached visdom-0.1.8.9.tar.gz (676 kB)
    Requirement already satisfied: matplotlib>=3.2.1 in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 5)) (3.2.2)
    ERROR: Could not find a version that satisfies the requirement opencv-python==4.5.1 (from versions: 3.4.2.17, 3.4.3.18, 3.4.4.19, 3.4.5.20, 3.4.6.27, 3.4.7.28, 3.4.8.29, 3.4.9.31, 3.4.9.33, 3.4.10.35, 3.4.10.37, 3.4.11.39, 3.4.11.41, 3.4.11.43, 3.4.11.45, 3.4.13.47, 3.4.14.51, 3.4.14.53, 3.4.15.55, 3.4.16.57, 4.0.0.21, 4.0.1.23, 4.0.1.24, 4.1.0.25, 4.1.1.26, 4.1.2.30, 4.2.0.32, 4.2.0.34, 4.3.0.36, 4.3.0.38, 4.4.0.40, 4.4.0.42, 4.4.0.44, 4.4.0.46, 4.5.1.48, 4.5.2.52, 4.5.2.54, 4.5.3.56, 4.5.4.58)
    ERROR: No matching distribution found for opencv-python==4.5.1

    opened by jryebread 1
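  • How to use Detectron2 to output a binary mask image

    # Imports follow the Detectron2 Colab tutorial; cv2_imshow is Colab-only.
    import cv2
    from google.colab.patches import cv2_imshow
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.data import MetadataCatalog
    from detectron2.engine import DefaultPredictor
    from detectron2.utils.visualizer import Visualizer

    im = cv2.imread("./1.jpg")
    cv2_imshow(im)

    # Configure a COCO-pretrained Mask R-CNN; keep detections scoring at least 0.5.
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
    predictor = DefaultPredictor(cfg)
    outputs = predictor(im)

    # Draw the predicted instances; [:, :, ::-1] converts between BGR and RGB.
    v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])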
    
    opened by 565ee 1
  • Pretrained Model for Tennis Example

    Could you please upload the trained model for the tennis example (and possibly other examples) with the data so we don't have to retrain the models just to test it out?

    Thank you!

    opened by NOT-HAL9000 0
Owner
Erika Lu