Repository for the CVPR 2021 paper TimeLens: Event-based Video Frame Interpolation

Overview

TimeLens: Event-based Video Frame Interpolation

This repository is about the High Speed Event and RGB (HS-ERGB) dataset, used in the CVPR 2021 paper TimeLens: Event-based Video Frame Interpolation by Stepan Tulyakov*, Daniel Gehrig*, Stamatios Georgoulis, Julius Erbach, Mathias Gehrig, Yuanyou Li, and Davide Scaramuzza.

For more information, visit our project page.

Citation

A PDF of the paper is available here. If you use this dataset, please cite this publication as follows:

@InProceedings{Tulyakov21CVPR,
  author    = {Stepan Tulyakov and Daniel Gehrig and Stamatios Georgoulis and Julius Erbach and Mathias Gehrig and Yuanyou Li and
               Davide Scaramuzza},
  title     = {{TimeLens}: Event-based Video Frame Interpolation},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
}

Google Colab

A Google Colab notebook is now available here. You can upsample your own video and events from your Google Drive.

Gallery

For more examples, visit our project page.

Example sequences: coke, paprika, pouring, water_bomb_floor

Installation

Install the dependencies with

cuda_version=10.2
conda create -y -n timelens python=3.7
conda activate timelens
conda install -y pytorch torchvision cudatoolkit=$cuda_version -c pytorch
conda install -y -c conda-forge opencv scipy tqdm click
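
As an optional sanity check, you can verify that PyTorch was installed with working CUDA support (on a CPU-only machine the second value will be False):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"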

Test TimeLens

First, clone this repository into a new folder

mkdir ~/timelens/
cd ~/timelens
git clone https://github.com/uzh-rpg/rpg_timelens

Then download the checkpoint and example data into the repo

cd rpg_timelens
wget http://rpg.ifi.uzh.ch/timelens/data/checkpoint.bin
wget http://rpg.ifi.uzh.ch/timelens/data/example_github.zip
unzip example_github.zip 
rm -rf example_github.zip

Running TimeLens

To run TimeLens, simply call

skip=0
insert=7
python -m timelens.run_timelens checkpoint.bin example/events example/images example/output $skip $insert

This will generate the output in example/output. The first four arguments are the checkpoint file, the event folder, the image folder, and the output folder, respectively. The variables skip and insert determine the number of skipped vs. inserted frames: to generate a video with an 8× higher frame rate, 7 frames need to be inserted and 0 skipped.
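
For example, to instead only double the frame rate with the same command-line interface, insert a single frame between each pair of consecutive frames:

skip=0
insert=1
python -m timelens.run_timelens checkpoint.bin example/events example/images example/output $skip $insert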

The resulting images can be converted to a video with

ffmpeg -i example/output/%06d.png timelens.mp4

The resulting video is written to timelens.mp4.
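
If you want to control the playback speed, ffmpeg's -framerate input option sets how many of the interpolated frames are shown per second (25 fps here is just an example value):

ffmpeg -framerate 25 -i example/output/%06d.png timelens.mp4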

Dataset

hsergb

Download the dataset from our project page. The dataset structure is as follows

.
├── close
│   └── test
│       ├── baloon_popping
│       │   ├── events_aligned
│       │   └── images_corrected
│       ├── candle
│       │   ├── events_aligned
│       │   └── images_corrected
│       ...
│
└── far
    └── test
        ├── bridge_lake_01
        │   ├── events_aligned
        │   └── images_corrected
        ├── bridge_lake_03
        │   ├── events_aligned
        │   └── images_corrected
        ...

Each events_aligned folder contains event files with the filename template %06d.npz, and each images_corrected folder contains image files with the filename template %06d.png. In events_aligned, the event file with index n contains the events between the images with indices n-1 and n, i.e. event file 000001.npz contains the events between images 000000.png and 000001.png. Moreover, images_corrected also contains a timestamp.txt in which the image timestamps are stored. Note that some folders contain more image files than event files. However, the image timestamps in timestamp.txt match the event files, and the additional images can be ignored.
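
As a minimal loading sketch of this pairing (the array names inside the .npz files and the one-timestamp-per-line format of timestamp.txt are assumptions here; inspect data.files and the text file to confirm what your download actually stores):

import glob
import os

import numpy as np

# Path to any sequence folder; "candle" is one of the sequences listed above.
root = "hsergb/close/test/candle"
image_files = sorted(glob.glob(os.path.join(root, "images_corrected", "*.png")))
event_files = sorted(glob.glob(os.path.join(root, "events_aligned", "*.npz")))

# Assumed format: one image timestamp per line in timestamp.txt.
timestamps = np.loadtxt(os.path.join(root, "images_corrected", "timestamp.txt"))

# Event file with index n covers the events between images n-1 and n.
n = 1
data = np.load(event_files[n])
print(data.files)  # inspect the actual array names stored in the archive
print(event_files[n], "spans", image_files[n - 1], "->", image_files[n])
print("time interval:", timestamps[n - 1], "to", timestamps[n])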

For a quick test, download the dataset to a folder using the link sent by email.

wget download_link.zip -O /tmp/dataset.zip
unzip /tmp/dataset.zip -d hsergb/

And run the test

python test_loader.py --dataset_root hsergb/ \
                      --dataset_type close \
                      --sequence spinning_umbrella \
                      --sample_index 400

This should open a window visualizing aligned events with a single image.

Comments
  • no response after clicking on the data download button in your project website

    Hi there,

    There is no response after clicking on this button: http://rpg.ifi.uzh.ch/timelensdownload.html on your project website.

    Could you please show us a proper way to download the data?

    Best,

    Yukin

    opened by JiangtianPan 12
  • About simulation events

    Hi ~

It's really nice work, but I have several questions. I found that in the paper you used the Middlebury and Vimeo90K datasets. However, unlike the Adobe240 or GoPro datasets, Middlebury and Vimeo90K were captured at 30 fps. So how did you simulate the events for Middlebury and Vimeo90K? Did you directly use the 30 fps original videos?

    Thanks a lot!

    opened by hewh16 6
  • Index Error: List index out of range while reading input images and event frames.

I am trying to run the TimeLens.ipynb file, where the input directories are described as follows:

1. FRAME_INPUT_DIR: a folder containing input images (.png) indexed from 0 to 595 and a file called timestamp.txt with the timestamp of each image.

2. EVENTS_INPUT_DIR: a folder containing event frames (in .npy format) indexed from 0 to 594. The event information between images 0.png and 1.png is combined in frame 0.npy, and so on.

But when I run TimeLens from the given TimeLens.ipynb file, I get the error IndexError: list index out of range, as shown below:

[screenshot: Run_timelens_Error]

Perhaps I made a mistake while building the event frame directory and passing it as input. Can you please help me figure out how to prepare the event/image frame directories, or what modifications are needed, to fix this error?

    opened by Rishabh-Samra 5
  • Events files

You wrote: "A Google Colab notebook is now available here. You can upsample your own video and events from your Google Drive." How can I get the event files and timestamp.txt for my own video?

    opened by qo4on 5
  • Just-in-Time Reading of images & events

I find it a bit annoying to have to wait for the data to load in order to inspect a particular video. This PR proposes to load images & events just-in-time as you go. The sample "viz_all_data" visualizes the whole dataset.

    opened by etienne87 4
• several "not found" errors

    several "not found" errors

I saw the paper on this AI and was really impressed, but the Google Colab is broken. Several things that it needs to download simply aren't available.

http://rpg.ifi.uzh.ch/timelens/data/checkpoint.bin returns 404
http://rpg.ifi.uzh.ch/timelens/data/example_github.zip also returns 404

    opened by KCGD 3
  • Model is not downloading correctly

    Description of Problem ❗️

    I get a 403 message when I try to access the model

    >>> !wget http://rpg.ifi.uzh.ch/timelens/data2/checkpoint.bin
    HTTP request sent, awaiting response... 403 Forbidden
    

    Expected Behavior ✅

The model downloads from the web.


    Environment 🛠

    Google Colab


    Additional Comments 🌳

    I was using the Google Colab notebook included in the repository, but I was not able to download the model. I'm apparently forbidden from downloading or accessing it.

    opened by mathemusician 2
  • Help With Evaluation

    Sorry for the beginner question.

    I've been trying to evaluate this code. It works perfectly with the provided video and events but I am facing issues with my own videos.

    1. I extracted the video to frames.
    2. I created a timestamp.txt with the correct time stamps.
    3. I converted the pngs to npz using a script I found online.

But the evaluation still fails to go through: it actually runs without errors but does not generate any new content. Am I missing something? Or is there a particular way to generate events?

    opened by blessedcoolant 2
  • 'torch.jit' has no attribute 'unused'

    I followed all the steps written in the README. But I get the following error when I try to run the code on the example sequence: AttributeError: 'torch.jit' has no attribute 'unused'

    opened by AmoghTiwari 2
  • suggested train/val/test split

    Sorry if this is mentioned in the paper,

Is there a suggested train/val/test split between the "far" and "close" sections for training on this dataset?

    opened by etienne87 1
  • random access logic

Hi, thanks for this great dataset! I would like to write a random-access function getitem that, given an index i and num_skip s, returns the data necessary for training. Can I assume that:

    • input is image_png[i], image_png[i+s], event_npz[i], timestamps[i], timestamps[i+s]
    • output is, given chosen intermediate time tau in [i, i+s]: image_png[tau].

When I try this, I sometimes get events that begin before timestamps[i].

    opened by etienne87 1
  • Request to add a license file

    Great project, thanks!

If possible, please add an open-source license file to the project to indicate your intended terms of use, e.g., MIT, BSD, Apache 2.0, etc.

    opened by robgon-art 0
  • Different results between paper and code test

Hey, this is nice work. I have also conducted some comparison experiments on video frame interpolation, but I have several questions about some details. I have been trying to run this code and output results with TimeLens. However, in terms of PSNR and SSIM, the metric values from the TimeLens code differ from those in your paper 'Time Lens'. I don't know what's wrong.

Some details are provided as follows.
Dataset: HSERGB, BS-ERGB
Test code: uzh-rpg/rpg_timelens
Evaluation code: rpg_event_based_frame_interpolation_evaluation

As in your paper 'Time Lens', I report PSNR and SSIM for all sequences by skipping 1, 5, and 7 frames respectively and reconstructing the missing frames.

My results (mean±std):

    1. skip 5 frames:

|PSNR/SSIM| HSERGB(far)| HSERGB(close)|
|:-------:|:-------:|:-------:|
|code(timelens)| 31.33±2.55/0.883±0.069| 31.81±4.20/0.822±0.108|
|paper(timelens)| 33.13±2.10/0.877±0.092| 32.19±4.19/0.839±0.090|

2. skip 7 frames:

|PSNR/SSIM| HSERGB(far)| HSERGB(close)|
|:-------:|:-------:|:-------:|
|code(timelens)| 30.05±2.24/0.864±0.065| 31.54±6.05/0.844±0.120|
|paper(timelens)| 32.31±2.27/0.869±0.110| 31.68±4.18/0.835±0.091|

3. skip 1 frame:

|PSNR/SSIM| BS-ERGB|
|:-------:|:-------:|
|code(timelens)| 24.03±4.30/0.741±0.153|
|paper(timelens)| 28.56/-|

In addition, I noticed and addressed the issue you mentioned before for the HSERGB dataset: "In events_aligned each event file with index n contains events between images with index n-1 and n, i.e. event file 000001.npz contains events between images 000000.png and 000001.png." So I deleted the event file 000000.npz of each sequence to make sure that all steps are correct for the HSERGB dataset. The BS-ERGB dataset is unchanged, and nothing else was modified. If I have made any mistakes, please correct me. What confuses me is that the results obtained using the code differ from the results in the paper. Looking forward to your reply.

    opened by EMJian 0
  • The event visualization of the hsergb dataset

Thanks for open-sourcing this amazing work! I have two questions about the hsergb dataset.

    python test_loader.py --dataset_root data/hsergb --dataset_type close --sequence spinning_umbrella --sample_index 400
    

    python test_loader.py --dataset_root data/hsergb --dataset_type close --sequence fountain_schaffhauserplatz_02 --sample_index 420
    

From the above two examples, I found that some events are not aligned with the image, and some grid locations are missing events. Is this normal? And why are the coordinates (x, y) in the event data decimal values rather than integers? Is this the raw data collected by the event camera, or is it due to interpolation during the dual-camera calibration?

    opened by danqu130 5
  • ValueError: zero-dimensional arrays cannot be concatenated

    (timelens) C:\Users\kurs\Documents\rpg_timelens>python -m timelens.run_timelens checkpoint.bin C:\Users\kurs\Desktop\rpg_events C:\Users\kurs\Desktop\rpg_upsampled_1 C:\Users\kurs\Desktop\output
    Processing .
    100%|█████████████████████████████████████████████████████████████████████████████| 1392/1392 [00:02<00:00, 492.17it/s]
    Traceback (most recent call last):
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 176, in <module>
        main()
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1137, in __call__
        return self.main(*args, **kwargs)
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1062, in main
        rv = self.invoke(ctx)
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "C:\Users\kurs\anaconda3\envs\timelens\lib\site-packages\click\core.py", line 763, in invoke
        return __callback(*args, **kwargs)
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 170, in main
        number_of_frames_to_insert,
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\run_timelens.py", line 120, in run_recursively
        leaf_event_folder, leaf_image_folder, "*.npz", "*.png"
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\hybrid_storage.py", line 81, in from_folders
        event_file_template=event_file_template
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 424, in from_folder
        return cls.from_npz_files(filenames, image_height, image_width)
      File "C:\Users\kurs\Documents\rpg_timelens\timelens\common\event.py", line 440, in from_npz_files
        features = np.concatenate(features_list)
      File "<__array_function__ internals>", line 6, in concatenate
    ValueError: zero-dimensional arrays cannot be concatenated 
    

How can I fix this error?

    opened by c6s0 0
Owner
Robotics and Perception Group