Official PyTorch implementation of NeRV (Neural Representations for Videos)

Overview

NeRV: Neural Representations for Videos (NeurIPS 2021)

Project Page | Paper | UVG Data

Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava
This is the official implementation of the paper "NeRV: Neural Representations for Videos".

Get started

We run with Python 3.8; you can set up a conda environment with all dependencies like so:

pip install -r requirements.txt 
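
If you prefer to create the environment explicitly first, a minimal conda setup might look like this (the environment name nerv is our choice, not mandated by the repo):

conda create -n nerv python=3.8
conda activate nerv
pip install -r requirements.txt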

High-level structure

The code is organized as follows:

  • train_nerv.py includes a generic training routine.
  • model_nerv.py contains the dataloader and neural network architecture.
  • data/ contains the video/image dataset; we provide Big Buck Bunny here.
  • checkpoint/ contains pre-trained models for the Big Buck Bunny dataset.
  • log files (tensorboard, txt, state_dict, etc.) are saved in the output directory (specified by --outf).

Reproducing experiments

Training experiments

The NeRV-S experiment on 'Big Buck Bunny' can be reproduced with:

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none --act swish 

Evaluation experiments

To evaluate a pre-trained model, just add --eval_only and specify the model path with --weight. You can specify model quantization with --quant_bit [bit_length] and test decoding speed with --eval_fps. Below we provide a sample command for NeRV-S on the bunny dataset:

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none  --act swish \
    --weight checkpoints/nerv_S.pth --eval_only 
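
For example, to additionally evaluate a quantized model and measure decoding FPS, append the flags described above (the 8-bit length here is only an illustrative choice):

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none  --act swish \
    --weight checkpoints/nerv_S.pth --eval_only --quant_bit 8 --eval_fps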

Dump predictions with pre-trained model

To dump predictions with a pre-trained model, add --dump_images in addition to --eval_only and --weight:

python train_nerv.py -e 300 --cycles 1  --lower-width 96 --num-blocks 1 --dataset bunny --frame_gap 1 \
    --outf bunny_ab --embed 1.25_40 --stem_dim_num 512_1  --reduction 2  --fc_hw_dim 9_16_26 --expansion 1  \
    --single_res --loss Fusion6   --warmup 0.2 --lr_type cosine  --strides 5 2 2 2 2  --conv_type conv \
    -b 1  --lr 0.0005 --norm none  --act swish \
    --weight checkpoints/nerv_S.pth --eval_only  --dump_images

Citation

If you find our work useful in your research, please cite:

@inproceedings{hao2021nerv,
    author = {Chen, Hao and He, Bo and Wang, Hanyu and Ren, Yixuan and Lim, Ser-Nam and Shrivastava, Abhinav},
    title = {NeRV: Neural Representations for Videos},
    booktitle = {NeurIPS},
    year = {2021}
}

Contact

If you have any questions, please feel free to email the authors.

Comments
  • UVG dataset experiment options

    Thanks for sharing your research.

    I'm trying to reproduce the Figure 7 graph of your paper (PSNR vs. BPP on the UVG dataset), but I couldn't find the appropriate experiment options. Could you tell me the (C1, C2) values for that result (among Appendix A.1's values)?

    Other options I've tried so far are as follows.

    • Learning rate: op1. 5e-4 (paper 4.1); op2. 5e-4 × 6 (linear scaling rule w/ batch size 6)
    • Up-scale factor: 5, 3, 2, 2, 2 (paper 4.1)
    • Train epochs: 1500 epochs (paper 4.1)
    • Warmup epochs: op1. 300 epochs (train code's default, train epochs × 0.2); op2. 30 epochs (paper 4.1)
    opened by applezoos 5
  • Missing UVG video identifiers

    Hello, thanks again for sharing the results of your research!

    I would like to use the results listed in psnr_bpp_results.csv for comparison, but I can't figure out which video each line in the CSV file corresponds to. The UVG dataset itself contains more videos than the number of results (7) listed in the first half.

    Could you please add the video names in the first column? Thanks.

    opened by aegroto 3
  • About weight pruning and entropy coding

    https://github.com/haochen-rye/NeRV/blob/adf61b81fc192c64d2de7b93745b28ff1cf33a39/train_nerv.py#L442

    When compressing the network's weights with Huffman codes, I confirmed that zero values are excluded. In that case, we cannot know the positions of the pruned weights when reconstructing the model weights. I think additional information (such as the indices of the pruned weights) is required.
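
    To illustrate the point, a toy sketch (my own illustration, not the repo's code) of why a mask or index list has to accompany the Huffman-coded nonzero values:

    import numpy as np

    # Dense weights after pruning; zeros mark pruned positions.
    weights = np.array([0.0, 0.5, 0.0, -0.25, 0.125])
    mask = weights != 0        # positions of surviving weights
    nonzero = weights[mask]    # only these values are entropy-coded

    # Decoder side: the nonzero values alone are ambiguous; the mask
    # (or an equivalent index list) is needed to place them back.
    recon = np.zeros_like(weights)
    recon[mask] = nonzero
    assert np.array_equal(recon, weights)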

    Can you explain the details?

    Thank you for sharing the brilliant work!

    opened by maincold2 3
  • UVG dataset reproduce

    Hello, I have one question about reproducing the results, especially UVG.

    To train NeRV on the UVG dataset, I set the command as follows:

    python train_nerv.py -e 150 --lower-width 96 --num-blocks 1 --dataset PATH --frame_gap 1 --outf bunny_ab --embed 1.25_80 --stem_dim_num 512_1 --reduction 2 --fc_hw_dim 9_16_112 --expansion 1 --single_res --loss Fusion6 --warmup 0.2 --lr_type cosine --strides 5 3 2 2 2 --conv_type conv -b 1 --lr 0.0005 --norm none --act gelu

    Is there any suggestion for accurately reproducing the results?

    Thank you :)

    opened by subin-kim-cv 3
  • Distortion-Compression result

    Hello, I really appreciate your impressive work.

    I have one question about calculating bits per pixel.

    You mentioned that bpp is calculated as follows: Model_Parameters * (1 - Prune_Ratio) * Quant_Bit / Pixel_Num

    Here, what does Pixel_Num mean? For example, suppose we have a video with 100 frames at 720x1280 resolution. Is it:

    1. number of frames * width * height (100x720x1280)
    2. width * height (720x1280)
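
    For concreteness, a quick sanity check under interpretation 1 (the parameter count below is made up purely for illustration):

    # Hypothetical numbers, just to make the formula concrete.
    model_params = 3_200_000                  # illustrative parameter count
    prune_ratio = 0.4                         # fraction of weights pruned away
    quant_bit = 8
    num_frames, height, width = 100, 720, 1280

    pixel_num = num_frames * height * width   # interpretation 1
    bpp = model_params * (1 - prune_ratio) * quant_bit / pixel_num
    print(f"bpp = {bpp:.4f}")                 # ~0.1667 bits per pixel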

    Thanks.

    opened by subin-kim-cv 2
  • RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

    I was trying to run the training script & I faced this error.

    
    Use GPU: None for training
    waiting
    => No resume checkpoint found at 'output/bunny_ab/bunny/embed1.25_40_512_1_fc_9_16_26__exp1.0_reduce2_low96_blk1_cycle1_gap1_e300_warm60_b1_conv_lr0.0005_cosine_Fusion6_Strd5,2,2,2,2_SinRes_actswish_/model_latest.pth'
    Traceback (most recent call last):
      File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 532, in <module>
        main()
      File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 141, in main
        train(None, args)
      File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 342, in train
        loss_sum.backward()
      File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/_tensor.py", line 307, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/autograd/__init__.py", line 154, in backward
        Variable._execution_engine.run_backward(
    RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
    You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.
    
    import torch
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.benchmark = True
    torch.backends.cudnn.deterministic = False
    torch.backends.cudnn.allow_tf32 = True
    data = torch.randn([1, 96, 360, 640], dtype=torch.float, device='cuda', requires_grad=True)
    net = torch.nn.Conv2d(96, 384, kernel_size=[3, 3], padding=[1, 1], stride=[1, 1], dilation=[1, 1], groups=1)
    net = net.cuda().float()
    out = net(data)
    out.backward(torch.randn_like(out))
    torch.cuda.synchronize()
    
    ConvolutionParams 
        data_type = CUDNN_DATA_FLOAT
        padding = [1, 1, 0]
        stride = [1, 1, 0]
        dilation = [1, 1, 0]
        groups = 1
        deterministic = false
        allow_tf32 = true
    input: TensorDescriptor 0x7fa204013f00
        type = CUDNN_DATA_FLOAT
        nbDims = 4
        dimA = 1, 96, 360, 640, 
        strideA = 22118400, 230400, 640, 1, 
    output: TensorDescriptor 0x7fa204014420
        type = CUDNN_DATA_FLOAT
        nbDims = 4
        dimA = 1, 384, 360, 640, 
        strideA = 88473600, 230400, 640, 1, 
    weight: FilterDescriptor 0x7fa204007fa0
        type = CUDNN_DATA_FLOAT
        tensor_format = CUDNN_TENSOR_NCHW
        nbDims = 4
        dimA = 384, 96, 3, 3, 
    Pointer addresses: 
        input: 0x585de6000
        output: 0x58b246000
        weight: 0x50f18c000
    

    As suggested, I ran the code snippet & the error is reproduced.

    P.S: I had to make some changes to the conda env. I am using a machine with RTX3060. Before making the changes, it gave me the following error.

    NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
    The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
    If you want to use the NVIDIA GeForce RTX 3060 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
    

    So, I installed the conda pytorch package from the official website. My current env looks like this

    pytorch                   1.10.2          py3.9_cuda11.3_cudnn8.2.0_0    pytorch
    pytorch-msssim            0.2.1                    pypi_0    pypi
    pytorch-mutex             1.0                        cuda    pytorch
    torchvision               0.11.3               py39_cu113    pytorch
    cudatoolkit               11.3.1               h2bc3f7f_2
    

    However, the GPU is still not being used for training (as shown in the first line of the first code snippet in this post).
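
    For reference, a quick check (standard PyTorch calls, nothing repo-specific) to verify whether the install actually sees the GPU:

    import torch

    print(torch.cuda.is_available())   # should print True on a working install
    print(torch.version.cuda)          # CUDA version PyTorch was built against
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # e.g. the RTX 3060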

    P.P.S: Might be irrelevant, but the model is loading into the GPU memory (confirmed by nvidia-smi).

    Thanks for the help!

    opened by sparsh-b 2
  • Lack of instructions for decoding

    Hello,

    I would like to thank you for sharing your work, it's a very interesting concept and I can see a lot of promising research on the subject to be done in the future.

    I have a doubt about the decoding part: is there already a way to convert the resulting neural network back into a sequence of frames?
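
    In principle, decoding should just mean evaluating the network at each normalized frame index and saving the outputs; a hypothetical sketch (model and pe stand for the trained NeRV network and positional encoder built by train_nerv.py; names are illustrative, not the repo's exact API):

    import torch
    from torchvision.utils import save_image

    num_frames = 132                      # e.g. the Big Buck Bunny length
    model.eval()
    with torch.no_grad():
        for t in range(num_frames):
            norm_idx = torch.tensor([t / num_frames])  # normalized frame index
            frame = model(pe(norm_idx))                # embedding -> frame tensor
            save_image(frame, f"frame_{t:04d}.png")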

    opened by aegroto 2
  • Some problems when processing with UVG dataset

    Hi, thanks for your impressive work. I'm trying to reproduce it, but I failed to convert the 7 UVG videos into PNG files and put them into one folder. If possible, can you share the command for merging multiple y4m files into one file?

    opened by maoqingyu1996 1
  • Possible mistake in ReadMe

    Hi,

    I think I may have found a mistake in the README (or in the paper, but I believe it is just in the README).

    From the paper, I see that you choose to prune away 40% of the weights.

    From the code, the prune_ratio parameter seems to mean the proportion of weights that are kept, as 1 - prune_ratio is passed to the PyTorch prune function as the amount parameter.
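
    In other words, my reading of the code corresponds to something like the following sketch (module stands in for any layer with a weight; this is my interpretation, not confirmed by the authors):

    import torch.nn as nn
    import torch.nn.utils.prune as prune

    module = nn.Conv2d(96, 384, kernel_size=3, padding=1)
    prune_ratio = 0.6   # reading it as the fraction KEPT (paper prunes away 40%)
    # then the fraction actually removed is 1 - prune_ratio, which is
    # what the prune function expects as its `amount` argument:
    prune.l1_unstructured(module, name="weight", amount=1 - prune_ratio)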

    In the final section of the README, however, you calculate bpp using 1 - prune_ratio. Should this not be just prune_ratio, or 1 - model_sparsity?

    Furthermore, am I correct in thinking the value of the prune_ratio parameter to replicate the results in the paper should be 0.6, rather than the 0.4 in the README?

    Thanks for any clarifications.

    opened by CarlosGomes98 1
  • Interpolating between two frames

    Hi,

    I'm interested in interpolating between frames. In Appendix A.4 of the paper there is a figure where you interpolate between two seen frames. I tried to reproduce that with your pre-trained model, but there are strong artifacts in the result:

    [image: pred_5]

    The only thing I changed was embed_input = pe(norm_idx + (1/132) * 0.5) in train_nerv.py, l. 484. Is there something I missed for reproducing the interpolation? I would be glad for any hint.
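
    For context, this amounts to querying the model halfway between two training indices; schematically (pe and model as in train_nerv.py, the 132-frame count follows the bunny setup, and t is an illustrative index):

    import torch

    t = 5                                   # interpolate between frames 5 and 6
    norm_idx = torch.tensor([t / 132.0])    # normalized index of frame t
    embed_input = pe(norm_idx + (1 / 132.0) * 0.5)  # shift by half a frame step
    with torch.no_grad():
        pred = model(embed_input)           # predicted in-between frame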

    It's great work, and I saw there is a follow-up paper in review that seems to address this problem too, right? Are there any plans for an approximate release date?

    opened by Alpe6825 1
  • Some questions about the experimental details.

    This is impressive work. I am trying to reproduce the results in the paper. I found that there are two parameters that control the compression ratio: quantization (bit length) and pruning ratio. I was wondering how you obtained the curves for the BDBR? Looking forward to your reply.

    opened by JXH-SHU 1