Official code repository for the work: "The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement"

Last update: Dec 14, 2022

Handheld Multi-Frame Neural Depth Refinement

This is the official code repository for the work: "The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement".

If you use parts of this work, or otherwise take inspiration from it, please consider citing our paper:

@article{chugunov2021implicit,
  title={The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement},
  author={Chugunov, Ilya and Zhang, Yuxuan and Xia, Zhihao and Zhang, Cecilia and Chen, Jiawen and Heide, Felix},
  journal={arXiv preprint arXiv:2111.13738},
  year={2021}
}

Requirements:

Developed using PyTorch 1.10.0 on a Linux x64 machine.
Condensed package requirements are in \requirements.txt. Note that this file lists the package versions at the time of publishing; if you update to, for example, a newer version of PyTorch, you will need to watch out for changes in class and function calls.

Data:

Download data from this Google Drive link and unpack into the \data folder
Each folder corresponds to a scene [castle, eagle, elephant, frog, ganesha, gourd, rocks, thinker] and contains four files.
- model.pt is the frozen, trained MLP corresponding to the scene
- frame_bundle.npz is the recorded bundle data (images, depth, and poses)
- reprojected_lidar.npy is the merged LiDAR depth baseline as described in the paper
- snapshot.mp4 is a video of the recorded snapshot for visualization purposes

The format and contents of the frame bundles (frame_bundle.npz) are explained interactively in \0_data_format.ipynb. We recommend going through this Jupyter notebook before you record your own bundles or otherwise manipulate the data.
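For a quick look at a bundle before opening the notebook, a minimal sketch like the one below lists every array in the archive with its shape and dtype. It relies only on NumPy's standard .npz loading; the helper name `summarize_bundle` is hypothetical and not part of this repository, and the key names inside the bundle are deliberately not assumed. If the bundle stores pickled Python objects, which is an assumption to check against the notebook, `allow_pickle=True` would be needed.

```python
import numpy as np

def summarize_bundle(path, allow_pickle=False):
    """Return {key: (shape, dtype)} for every array stored in an .npz archive.

    Hypothetical helper for inspection only; not part of the repository.
    """
    summary = {}
    # np.load on an .npz returns a lazy NpzFile; arrays are read on access.
    with np.load(path, allow_pickle=allow_pickle) as bundle:
        for key in bundle.files:
            array = bundle[key]
            summary[key] = (array.shape, array.dtype)
    return summary

# Example call (path assumes the gourd scene was unpacked into the data folder):
# for key, (shape, dtype) in summarize_bundle("data/gourd/frame_bundle.npz").items():
#     print(f"{key}: shape={shape}, dtype={dtype}")
```

The notebook remains the authoritative description of what each entry means; this sketch only reports what is stored.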

Project Structure:

HNDR
  ├── checkpoints  
  │   └── // folder for network checkpoints
  ├── data  
  │   └── // folder for recorded bundle data
  ├── utils  
  │   ├── dataloader.py  // dataloader class for bundle data
  │   ├── neural_blocks.py  // MLP blocks and positional encoding
  │   └── utils.py  // miscellaneous helper functions (e.g. grid/patch sample)
  ├── 0_data_format.ipynb  // interactive tutorial for understanding bundle data
  ├── 1_reconstruction.ipynb  // interactive tutorial for depth reconstruction
  ├── model.py  // the learned implicit depth model
  │             // -> reproject points, query MLP for offsets, visualization
  ├── README.md  // a README in the README, how meta
  ├── requirements.txt  // frozen package requirements
  ├── train.py  // wrapper class for arg parsing and setting up training loop
  └── train.sh  // example script to run training

Reconstruction:

The Jupyter notebook \1_reconstruction.ipynb contains an interactive tutorial for depth reconstruction: loading a model, loading a bundle, and generating depth.

Training:

The script \train.sh demonstrates a basic call of \train.py to train a model on the gourd scene data. It sets the following arguments:

- checkpoint_path is the path to save model and tensorboard checkpoints
- device is the device for training [cpu, cuda]
- bundle_path is the path to the bundle data

For other training arguments, see the argument parser section of \train.py.
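As a rough illustration of how the three documented arguments might be declared, here is a minimal argparse sketch. It is not copied from \train.py: the `--name` flag spelling, the defaults, and which argument is required are all assumptions, so treat the parser in \train.py as the reference.

```python
import argparse

def build_parser():
    """Sketch of a parser for the three documented training arguments.

    Flag spelling, defaults, and the required flag are assumptions.
    """
    parser = argparse.ArgumentParser(description="Training argument sketch")
    parser.add_argument("--checkpoint_path", type=str, default="checkpoints/",
                        help="path to save model and tensorboard checkpoints")
    parser.add_argument("--device", type=str, default="cuda", choices=["cpu", "cuda"],
                        help="device for training")
    parser.add_argument("--bundle_path", type=str, required=True,
                        help="path to the bundle data")
    return parser

# Example: parse a command line that trains on the gourd scene using the CPU.
# args = build_parser().parse_args(
#     ["--bundle_path", "data/gourd/frame_bundle.npz", "--device", "cpu"])
```

Restricting `device` with `choices` makes a typo fail at parse time rather than partway through setup.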

Best of luck,
Ilya

Comments

Normal maps collapsed

Hi, why are the normal maps in the paper so homogeneous (collapsed to a single color, corresponding to a planar and frontal depth map)?

[Images: the normal maps I get after backprojecting depth, compared with the ones in the paper]

I have had similar issues when computing normals from a depth map or point cloud whose depth values are numerically very small. The cause in my case (which produced similarly bad normal maps) was that, when normalizing vectors before the cross product, I added an epsilon to the norm to avoid dividing by zero. This additive epsilon dominates the norm when the depth values are small, so the epsilon has to be adjusted to the scale of the data.
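The effect described above can be reproduced with a short NumPy sketch. It is an illustration of the additive-epsilon problem only, not the code used for the paper's figures or by the commenter; the function names and the finite-difference scheme are assumptions.

```python
import numpy as np

def normalize(v, eps):
    """Scale vectors along the last axis toward unit length via v / (|v| + eps)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / (norm + eps)

def normals_from_points(points, eps=1e-12):
    """Estimate normals for an (H, W, 3) point map; returns (H-1, W-1, 3)."""
    # Finite-difference tangents along image columns and rows,
    # cropped so both cover the same (H-1, W-1) pixel grid.
    du = (points[:, 1:] - points[:, :-1])[:-1]
    dv = (points[1:, :] - points[:-1, :])[:, :-1]
    # Normalize the tangents before the cross product, as in the comment above.
    n = np.cross(normalize(du, eps), normalize(dv, eps))
    return normalize(n, eps)

# A tilted plane whose coordinates are numerically tiny (scale 1e-6).
ys, xs = np.mgrid[0:4, 0:5].astype(np.float64)
points = 1e-6 * np.stack([xs, ys, 0.5 * xs], axis=-1)

ok = normals_from_points(points, eps=1e-12)        # eps far below the data scale
collapsed = normals_from_points(points, eps=1e-3)  # eps far above the data scale
```

With `eps=1e-12` the normals come out at unit length, while with `eps=1e-3` their length shrinks to roughly 1e-3, so every pixel maps to nearly the same color when visualized. Choosing the epsilon relative to the magnitude of the tangent vectors avoids this.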

opened by mbaradad
