Code for ECCV 2020 paper "Contact and Human Dynamics from Monocular Video".

Overview

Contact and Human Dynamics from Monocular Video

This is the official implementation for the ECCV 2020 spotlight paper by Davis Rempe, Leonidas J. Guibas, Aaron Hertzmann, Bryan Russell, Ruben Villegas, and Jimei Yang. For more information, see the project webpage.

[Teaser figure]

Environment Setup

Note: the code in this repo has only been tested on Ubuntu 16.04.

First create and activate a virtual environment to install dependencies for the code in this repo. For example with conda:

  • conda create -n contact_dynamics_env python=3.6
  • conda activate contact_dynamics_env
  • pip install -r requirements.txt

Note: the package versions in the requirements file are the exact versions tested, but they may need to be modified for your system. The code also uses ffmpeg.

This codebase requires the installation of a number of external dependencies that have their own installation instructions/environments, e.g., you will likely want to create a different environment just to run Monocular Total Capture below. The following external dependencies are only necessary to run the full pipeline (both contact detection and physical optimization). If you're only interested in detecting foot contacts, it is only necessary to install OpenPose.

To get started, from the root of this repo run mkdir external.
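After following the sections below, the external directory should look roughly like this (openpose is only needed for the contact-detection-only path described later):

external/
├── MonocularTotalCapture
├── towr
└── openpose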

Monocular Total Capture (MTC)

The full pipeline runs on the output from Monocular Total Capture (MTC). To run MTC, you must clone this fork, which contains a number of important modifications:

  • cd external
  • git clone https://github.com/davrempe/MonocularTotalCapture.git
  • Follow installation instructions in that repo to set up the MTC environment.

TOWR

The physics-based optimization takes advantage of the TOWR library. Specifically, this fork must be used:

  • cd external
  • git clone https://github.com/davrempe/towr.git
  • Follow the installation instructions to build and install the library using CMake.

Building Physics-Based Optimization

Important Note: if you did not use the HSL routines when building Ipopt as suggested, you will need to change the linear solver set in towr_phys_optim/phys_optim.cpp before building our physics-based optimization, as shown below. This falls back to the slower MUMPS solver and should be avoided if possible.
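Concretely, the change in towr_phys_optim/phys_optim.cpp is:

// Before (requires Ipopt built with the HSL MA57 solver):
solver->SetOption("linear_solver", "MA57");
// After (falls back to the slower MUMPS solver):
solver->SetOption("linear_solver", "mumps");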

After building and installing TOWR, we must build the physics-based optimization part of the pipeline. To do this from the repo root:

cd towr_phys_optim
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make

Downloads

Synthetic Dataset

The synthetic dataset used to train our foot contact detection network contains motion sequences performed by various Mixamo characters. For each sequence, the dataset contains rendered videos from 2 different camera viewpoints, camera parameters, annotated foot contacts, detected 2D poses (from OpenPose), and the 3D motion as a bvh file. Note: this dataset is only needed if you want to retrain the contact detection network.

To download the dataset:

  • cd data
  • bash download_full.sh to download the full (52 GB) dataset or bash download_sample.sh for a sample version (715 MB) with limited motions from 2 characters.
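If you want to inspect the downloaded data, each viewpoint's camera parameters are stored in an npz file (an issue below refers to view*_camera_params.npz files with "K", "RT", and "P" entries). Here is a minimal loading sketch under that assumption; the exact file name is illustrative:

import numpy as np

# Illustrative path; check the downloaded dataset for the actual file names.
params = np.load('view1_camera_params.npz')
print(params.files)   # expected to include 'K', 'RT', and 'P'
K = params['K']       # camera intrinsics
RT = params['RT']     # camera extrinsics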

Pretrained Weights

To download pretrained weights for the foot contact detection network, run:

  • cd pretrained_weights
  • bash download.sh

Running the Pipeline on Real Videos

Next we'll walk through running each part of the pipeline on a set of real-world videos. A small example dataset with 2 videos is provided in data/example_data. Data should always be structured as shown in example_data: each video is placed in its own directory with the same name as the video file to be processed; inputs and outputs for each part of the pipeline will be saved in these directories. There is a helper script to create this structure from a directory of videos.
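For example, after running the full pipeline, each video directory in the example data will look roughly like this (the video file extension here is an assumption):

data/example_data/
└── dance1/                    # one directory per video, named after the video file
    ├── dance1.mp4             # the input video (extension assumed)
    ├── openpose_results/      # 2D pose detections (run_totalcap.py or run_openpose.py)
    ├── tracked_results.json   # MTC 3D pose output (run_totalcap.py)
    └── foot_contacts.npy      # detected contact labels (run_detect_contacts.py)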

The first two steps in the pipeline are running MTC/OpenPose on the video to get 3D/2D pose inputs, followed by foot contact detection using the 2D poses.

Running MTC

The first step is to run MTC and OpenPose. This will create the necessary data (2D and 3D poses) to run both foot contact detection and physical optimization.

The script scripts/run_totalcap.py is used to run MTC. It is invoked on a directory containing any number of videos, each in its own directory, and will run MTC on all contained videos. The script runs MTC, post-processes the results for use in the rest of the pipeline, and saves videos visualizing the final output. It also copies all the needed outputs (in particular tracked_results.json and the OpenPose detections in openpose_results) directly to the given data directory. To run MTC for the example data, first cd scripts, then:

python run_totalcap.py --data ../data/example_data --out ../output/mtc_viz_out --totalcap ../external/MonocularTotalCapture

Alternatively, if you only want to do foot contact detection (and don't care about the physical optimization), you can instead run OpenPose by itself without MTC. There is a helper script to do this in scripts:

python run_openpose.py --data ../data/example_data --out ../data/example_data --openpose ../external/openpose --hands --face --save-video

This runs OpenPose and saves the outputs directly to the same data directory for later use in contact detection.

Foot Contact Detection

The next step is using the learned neural network to detect foot contacts from the 2D pose sequence.

To run this, first download the pretrained network weights as detailed above. Then, to run on the example data, cd scripts and run:

python run_detect_contacts.py --data ../data/example_data --weights ../pretrained_weights/contact_detection_weights.pth

This will detect and save foot contacts for each video in the data directory to a file called foot_contacts.npy. This is simply an Fx4 array where F is the number of frames; for each frame there is a binary contact label for the left heel, left toe, right heel, and right toe, in that order.
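For example, a minimal sketch of reading these labels (the path is illustrative):

import numpy as np

# Load the (F, 4) binary contact labels saved by run_detect_contacts.py.
contacts = np.load('../data/example_data/dance1/foot_contacts.npy')
left_heel, left_toe, right_heel, right_toe = contacts.T
print('left heel contact on %d of %d frames' % (left_heel.sum(), len(contacts)))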

You may also optionally add the --viz flag to additionally save a video with overlaid detections (this currently requires a lot of memory for videos longer than a few seconds).

Trajectory Optimization

Finally, we are able to run the kinematic optimization, retargeting, and physics-based optimization steps.

There is a single script to run all of these; simply make sure you are in the scripts directory, then run:

python run_phys_mocap.py --data ../data/example_data --character ybot

This command will do the optimization directly on the YBot Mixamo character (ty and skeletonzombie are also available). To perform the optimization on the skeleton estimated from the video (i.e., to not use the retargeting step), give the argument --character combined.

Each of the steps in this pipeline can be run individually if desired; see run_phys_mocap.py for how to do this.

Visualize Results with Blender

We can visualize results on a character using Blender. Before doing this, ensure Blender v2.79b is installed.

You will first need to download the Blender scene we use for rendering. From the repo root, cd data and run bash download_viz.sh, which will place viz_scene.blend in the data directory. Additionally, you need to download the character T-pose FBX file from the Mixamo website; in this example we are using the YBot character.

To visualize the result for a sequence, make sure you are in the src directory and use something like:

blender -b -P viz/viz_blender.py -- --results ../data/example_data/dance1 --fbx ../data/fbx/ybot.fbx --scene ../data/viz_scene.blend --character ybot --out ../output/rendered_res --fps 24 --draw-com --draw-forces

Note that there are many options to customize this rendering; please see the script for all of them. Also, the side view is set up heuristically; you may need to manually tune setup_camera depending on your video.

Training and Testing Contact Detection Network on Synthetic Data

To re-train the contact detection network on the synthetic dataset and run inference on the test set, use the following:

>> cd src
# Train the contact detection network
>> python contact_learning/train.py --data ../data/synthetic_dataset --out ../output/contact_learning_results
# Run detection on the test set
>> python contact_learning/test.py --data ../data/synthetic_dataset --out ../output/contact_learning_results --weights-path ../output/contact_learning_results/op_only_weights_BEST.pth --full-video

Citation

If you found this code or paper useful, please consider citing:

@inproceedings{RempeContactDynamics2020,
    author={Rempe, Davis and Guibas, Leonidas J. and Hertzmann, Aaron and Russell, Bryan and Villegas, Ruben and Yang, Jimei},
    title={Contact and Human Dynamics from Monocular Video},
    booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
    year={2020}
}

Questions?

If you run into any problems or have questions, please create an issue or contact Davis ([email protected]).

Comments
  • Results of example_data/dance1 were not correct, did I do something wrong?

    Amazing work! Thank you for sharing the code. But when I ran the demo on example_data/dance1 (73 frames), the .log in the phys_optim_out_ybot folder gave the result "dynamics 0 durations 0", which means the optimization did not converge. The dance_compare_all_skel.mp4 in the phys_optim_out_ybot folder showed these strange results: https://user-images.githubusercontent.com/30141934/119248285-be8ea100-bbc2-11eb-959a-2b34972fbee9.mp4

    So I want to ask: was there anything wrong with what I did during the process? Thank you again.

    opened by zhouyan09 5
  • Monocular problem (site down)

    Hello. I was eager to test this solution, but right from the beginning I had problems with another repository. I know it's not your fault, but I think it would be nice to let you know, because otherwise nobody else will be able to build those other tools to test this solution, sadly.

    Here is an issue I opened, just for other people, in case they have the same problem.

    https://github.com/CMU-Perceptual-Computing-Lab/MonocularTotalCapture/issues/59

    opened by carlosedubarreto 5
  • Files missing in “data” folder

    Hello, It seems that some python files in "data" folder are missing.

    In "src/contact_learning/test.py", the following files are imported, but I can't find them in the repo. import data.openpose_dataset as openpose_dataset from data.openpose_dataset import OpenPoseDataset from data.real_video_dataset import RealVideoDataset, TRAIN_DIM from data.contact_data_utils import get_frame_paths

    Could you please provide the missing files? Thanks

    opened by zhaopy10 4
  • Error building physics-based optimization part

    Hello, Thanks for the great work. when I try to build the physics-based optimization I get the following error:

    contact-human-dynamics/towr_phys_optim/src/variables/nodes_variables_dynamic_phase_based.cpp:51:3: error:
    'std::vector<towr::NodesVariablesPhaseBased::PolyInfo> towr::NodesVariablesPhaseBased::polynomial_info_'
    is private within this context

    Could you please help me to solve this error?

    opened by vadeli 3
  • apply foot contact on my own video

    I have some videos taken in the wild; however, there might be some problem with the rendered image and I can't get the result. This has run for more than one hour, and I still can't get the result.

    opened by EEWenbinWu 3
  • is there a way to obtain the joints position?

    Hi @davrempe, thanks for your great work; I have gotten some good results based on your code. I would also like to know how to get and save the joint positions, especially the root position. Hope you can give some advice, thanks~

    opened by visonpon 2
  • About the cam parameters in foot contact data.

    Hi, thanks for the wonderful work. I downloaded the foot contact data and tried to project the 3D keypoints obtained from the bvh file onto the image plane, but they failed to align with the pose. I used the "K" and "RT" in the view*_camera_params.npz for perspective projection but got wrong results. What is the "P" in the npz file used for? Or could you please provide a script for projection? Thank you very much!

    opened by xjwxjw 1
  • How to get the joint mapping between mixamo and smpl?

    https://github.com/davrempe/contact-human-dynamics/blob/ddde77df56de2afe2edfe8ef3a14f22ca7f1bbfc/src/utils/character_info_utils.py#L390 How did you get these mappings? Thanks!

    opened by tszhang97 1
  • Speed of kinematic optimizer

    Hi, thanks for the great work! I tried to run the physical mocap part (kinematic optimization + retargeting + physics-based optimization). In the paper you mentioned "for a 2 second (60 frame) video clip, the physical optimization usually takes from 30 minutes to 1 hour", and I found the kinematic optimization is slow too; should it be very fast? I'm not sure if this is normal or not, looking forward to your response, thanks!!

    opened by jionie 1
  • some question about root position

    Hi, based on your scripts, when running totalcap, run_fitting.cpp saves the predicted parameters in txt files for tracked or untracked results. I tried to read the root position from these txt files (the first row), but when I use these root positions on my own T-pose model the result does not look normal. I cannot figure out why, since the default rendered results are all normal; hope you can help, thanks~

    opened by visonpon 0
  • Files missing in data folder

    Hello, It seems that some files in "data" folder are missing.

    import data.openpose_dataset as openpose_dataset
    from data.openpose_dataset import OpenPoseDataset
    from data.real_video_dataset import RealVideoDataset, TRAIN_DIM
    from data.contact_data_utils import get_frame_paths

    opened by zhaopy10 0
  • Trajectory Optimization failed; the no-dynamics, dynamics, and durations bvh results are totally the same.

    This is the video of the Trajectory Optimization https://user-images.githubusercontent.com/50345299/198824411-49a1db7d-9fc4-4d35-934f-f6a0f9335f5e.mp4

    I have checked the bvh files; they are the same. And the success_log displays 0 0.

    opened by jihg88 0
  • use viz flag on my own video (only a few seconds)

    I turned on the viz flag and set it to true, but I still can't save the video; I get the error "FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpbibs1kw9/full_video_results/sample_pos.mp4'"

    opened by Timkeeper2018 0
  • How to run this technique with VIBE?

    Hi, I have 3D poses predicted from VIBE; how do I apply this technique to my model? I don't know how to use the contact information to reduce the foot skating problem. Can you give me a hand?

    opened by luohao123 0
  • Problems on optimization with my own bvh file

    Hi @davrempe, I'm running the towr optimization code on my own bvh file. I have removed the T-pose at the first frame and changed the joint indices and the body segment mapping dict. Spline fitting in stage 1.1 converges at 1822 steps; however, it cannot converge from stage 1.2. I don't know why; could you please give me some advice?

    opened by NewCoderQ 0
  • Could you release the code for making the Contact Synthetic Dataset?

    Thanks for your great work! Could you release the code and method used to make the Contact Synthetic Dataset? I want to create richer motions, because I found the pretrained model you released does not work well on some motions.

    opened by liuyuemaicha 1