ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Overview

ViSER reconstructs articulated 3D shapes (e.g., dancing humans, walking elephants) from monocular videos by optimizing video-specific surface embeddings that register video pixels to a deforming mesh.

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive https://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release; make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir -p log tmp

Download the pre-processed data (RGB, mask, flow) following the link here and unzip it under ./database/DAVIS/. The dataset is organized as follows:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally sequence-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg

To run preprocessing scripts on other videos, see install.md.
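Before optimizing, it can help to verify that the per-frame files for a sequence line up. The following is a hypothetical sanity check (not part of the ViSER codebase) that counts images and masks under the layout above; the sequence name is just an example:

```python
import glob
import os

def count_files(root, subdir, seq, pattern):
    """Count per-frame files for one sequence under the DAVIS-style layout."""
    return len(glob.glob(os.path.join(root, subdir, "Full-Resolution", seq, pattern)))

def check_sequence(seq, root="database/DAVIS"):
    """Return True if the sequence has a nonzero, matching number of frames and masks."""
    n_img = count_files(root, "JPEGImages", seq, "*.jpg")
    n_msk = count_files(root, "Annotations", seq, "*.png")
    print(f"{seq}: {n_img} frames, {n_msk} masks")
    return n_img == n_msk and n_img > 0

check_sequence("breakdance-flare")
```

A mismatch here usually means the unzip step was incomplete or the masks were generated for a different frame range.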

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render the optimized breakdance-flare, run

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36

Example outputs:

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render the optimized elephants, run

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node, multi-GPU training with PyTorch DistributedDataParallel. Modify dev and ngpu in scripts/template.sh to select devices.
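For reference, here is a sketch of how a dev/ngpu pair typically expands into a single-node distributed launch. The flag names follow torch.distributed.launch; the exact flags used by scripts/template.sh are an assumption, so treat this as illustrative only:

```python
# Hypothetical expansion of dev/ngpu into a DDP launch command.
dev = "0,1"  # GPU ids, exposed to the workers via CUDA_VISIBLE_DEVICES
ngpu = 2     # one worker process per visible GPU

cmd = (
    f"CUDA_VISIBLE_DEVICES={dev} "
    f"python -m torch.distributed.launch --nproc_per_node={ngpu} "
    f"optimize.py --ngpu {ngpu}"
)
print(cmd)
```

Keeping the number of ids in dev equal to ngpu avoids idle or oversubscribed devices.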

Potential bugs
  • When setting batch_size to 3, the rendered flow may collapse to constant values.

Acknowledgement

The code borrows the skeleton of CMR.

External repos:

Citation

To cite our paper:
@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}  
@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}  

TODO

  • data pre-processing scripts
  • evaluation data and scripts
  • code clean up
Comments
  • About symmetric option

    About symmetric option

    Hi, thanks for publishing this wonderful work.

    I want to ask whether part of the symmetric code is missing. I checked the code but don't see the symmetry initialization that exists in the LASR source code. Also, the results are different from what I expected.


    opened by phamtrongthang123 11
  • Videos without static camera

    Videos without static camera

    Hi, I'm trying to use ViSER or LASR on videos where the camera may rotate around the object, but the results assume the camera is fixed. Have you ever tried videos where the camera has large motion?

    opened by Iven-Wu 8
  • Problems of matching loss with other dataset

    Problems of matching loss with other dataset

    I'm currently training ViSER on our synthetic dataset, but when the matching loss is computed it always drops into pdb in the first iteration. Our input video size is 1024×1024; could this cause the problem? Also, our dataset has the camera moving between frames. The main difference between our videos and the demo video seems to be the resolution.

    The line where pdb triggers is shown below. I tested many videos from our dataset, with various initial camera locations and animal actions. Most of them stopped at this line within 1 or 2 iterations; the others stopped several iterations later. https://github.com/gengshan-y/viser-release/blob/a3943ad80d391f1b60379524de3c5d07f924c6bd/nnutils/mesh_net.py#L806-L808

    opened by Iven-Wu 7
  • Failure during optimization

    Failure during optimization

    Hello, I'm trying to run ViSER on some of my own datasets. Out of my 5 datasets, 2 succeed and 3 fail, all with the same failure case:

    > /HPS/articulated_nerf/work/viser/nnutils/mesh_net.py(809)forward()
    -> self.match_loss = (csm_pred - csm_gt).norm(2,1)[mask].mean() * 0.1
    (Pdb) 
    Traceback (most recent call last):
      File "optimize.py", line 59, in <module>
        app.run(main)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/absl/app.py", line 312, in run
        _run_main(main, args)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
        sys.exit(main(argv))
      File "optimize.py", line 56, in main
        trainer.train()
      File "/HPS/articulated_nerf/work/viser/nnutils/train_utils.py", line 339, in train
        total_loss,aux_output = self.model(input_batch)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
        output = self.module(*inputs[0], **kwargs[0])
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/HPS/articulated_nerf/work/viser/nnutils/mesh_net.py", line 809, in forward
        self.match_loss = (csm_pred - csm_gt).norm(2,1)[mask].mean() * 0.1
      File "/HPS/articulated_nerf/work/viser/nnutils/mesh_net.py", line 809, in forward
        self.match_loss = (csm_pred - csm_gt).norm(2,1)[mask].mean() * 0.1
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/bdb.py", line 88, in trace_dispatch
        return self.dispatch_line(frame)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/bdb.py", line 113, in dispatch_line
        if self.quitting: raise BdbQuit
    bdb.BdbQuit
    Traceback (most recent call last):
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in <module>
        main()
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main
        sigkill_handler(signal.SIGTERM, None)  # not coming back
      File "/HPS/articulated_nerf/work/miniconda3/envs/viser/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
        raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
    subprocess.CalledProcessError: Command '['/HPS/articulated_nerf/work/miniconda3/envs/viser/bin/python', '-u', 'optimize.py', '--local_rank=0', '--name=cactus_full-1003-0', '--checkpoint_dir', 'log', '--n_bones', '21', '--num_epochs', '20', '--dataname', 'cactus_full-init', '--ngpu', '1', '--batch_size', '4', '--seed', '1003']' returned non-zero exit status 1.
    Killing subprocess 6097
    

    Full error log 1 Full error log 2 Full error log 3

    I would tend to assume this is a division by zero in the identified line. Have you encountered this issue before?

    I have tried multiple values of init_frame and end_frame for initially optimizing on a subset (where the failure occurs). I have also tried different seed values. I haven't found any choice of these parameters that lets these datasets avoid this failure case.

    Any help or insight you can provide would be appreciated.

    opened by ecmjohnson 4
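A note on the failures above: the pdb stop at the match-loss line is consistent with the masked selection being empty, since the mean of an empty tensor is NaN rather than a division-by-zero error. The symptom can be reproduced in miniature with NumPy standing in for the torch tensors (this is an illustration of the suspected failure mode, not the repo's code):

```python
import warnings
import numpy as np

# Stand-in for (csm_pred - csm_gt).norm(2,1): one residual value per pixel.
residual = np.random.rand(10)
mask = np.zeros(10, dtype=bool)  # no pixel survives the mask

with warnings.catch_warnings():
    warnings.simplefilter("ignore")   # suppress the "Mean of empty slice" warning
    loss = residual[mask].mean()      # mean over an empty array yields NaN

print(np.isnan(loss))  # True: a NaN loss is what would trip a NaN-check breakpoint
```

If this is the cause, checking that the silhouette mask is non-empty for every frame (e.g., after resizing 1024×1024 inputs) would be a reasonable first debugging step.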
  • Results for dance-twirl

    Results for dance-twirl

    Hi, Thanks for open-sourcing this awesome work. I tried to run on the dance-twirl video and got the following results:

    https://user-images.githubusercontent.com/96632902/152095343-c2fc9026-7835-468a-be92-a2114da35bf1.mp4

    It seems the rendered frames do not match the input frames. I organized the data as suggested in the README, ran sh scripts/dance-twirl.sh, and tested with bash scripts/render_result.sh dance-twirl log/dance-twirl-1003-ft2/pred_net_latest.pth --catemodel --cnnpp. Could you please let me know whether this result is expected or something went wrong? Thanks!

    opened by iris112358 4
  • Location to the Implementation of Feature Consistency Loss

    Location to the Implementation of Feature Consistency Loss

    Hello,

    I would like to ask about the implementation of the feature consistency loss in your source code. I was looking through nnutils/mesh_net.py but could only find the match loss, the cycle (reprojection) loss, and an imatch loss (perhaps an inverse match loss), which does not seem to be reported in the paper.

    Would you please help me locate the implementation of the feature consistency loss?

    Thank you in advance.

    opened by vhvkhoa 3
  • About the evaluation.

    About the evaluation.

    Hi, thanks for publishing this wonderful work.

    I want to ask whether it is possible to use the evaluation code from LASR to evaluate ViSER's outputs. I think it is, but I don't see a published evaluation script, so perhaps I have missed something. Thank you very much.

    opened by phamtrongthang123 1
  • Err result with ama-female dataset

    Err result with ama-female dataset

    I have a problem reproducing the ama-female dataset with viser-release. I used the initial cameras provided by PoseNet, your updated ama-female.sh script, and the banmo format in vid.py, and I changed the smoothness loss back to the original 0.005. In the end, training on ama-female produced the image shown here. What could have gone wrong in this process?


    opened by bravotty 2
  • warning: loading empty camera

    warning: loading empty camera

    I followed the code (https://github.com/gengshan-y/viser-release/blob/main/preprocess/README.md) and ran bash scripts/breakdance-flare.sh, but it stopped at the breakpoint.

    opened by kyosocan 5