Code for Paper: Self-supervised Learning of Motion Capture

Overview


This is the code for the paper: Hsiao-Yu Fish Tung, Hsiao-Wei Tung, Ersin Yumer, Katerina Fragkiadaki, "Self-supervised Learning of Motion Capture", NIPS 2017 (Spotlight).

Check the project page for more results.

Content

  • Environment setup and Dataset
  • Data preprocessing
  • Pretrained model and small tfrecords
  • Training
  • Citation
  • License

1. Environment setup and Dataset

  • Python: We use Python 2.7.13 from Anaconda and TensorFlow 1.1.

  • SMPL model: We need the rest body template from the SMPL model.

You can download it from here.

  • SURREAL Dataset: Needed if you plan to pretrain or test on the SURREAL dataset.

Please download SURREAL from here.

  • H36M Dataset: Needed if you plan to test on real videos with some ground truth (for evaluation).

Please download the H3.6M dataset from here.

2. Data preprocessing

  • Parse the SURREAL dataset into binary files

To speed up reading and writing of the tfrecords, we first parse the SURREAL dataset into binary files (see the sketch after this step). Open the file

data/preparsed/main_parse_surreal

and change the data path and output path.
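
The parsing step is roughly a loop of the following form. This is only a minimal sketch, assuming each SURREAL clip ships a *_info.mat annotation file with 'joints2D', 'joints3D', 'pose', and 'shape' fields; the exact fields and layout used in main_parse_surreal may differ.

    # Minimal sketch of parsing SURREAL annotations into flat binary files.
    # The field names ('joints2D', 'joints3D', 'pose', 'shape') follow the
    # SURREAL .mat annotations; the layout in main_parse_surreal may differ.
    import os
    import glob
    import numpy as np
    import scipy.io as sio

    DATA_PATH = '/path/to/surreal'      # change to your SURREAL data path
    OUTPUT_PATH = '/path/to/preparsed'  # change to your output path

    for info_file in glob.glob(os.path.join(DATA_PATH, '*', '*_info.mat')):
        info = sio.loadmat(info_file)
        clip_name = os.path.basename(info_file).replace('_info.mat', '')
        # Flatten the per-clip annotations into one float32 blob; reading
        # raw binary back is much faster than re-parsing .mat files.
        blob = np.concatenate([info['joints2D'].ravel(),
                               info['joints3D'].ravel(),
                               info['pose'].ravel(),
                               info['shape'].ravel()]).astype(np.float32)
        blob.tofile(os.path.join(OUTPUT_PATH, clip_name + '.bin'))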

  • Build the tfrecords

Change the data path to the path you built in the previous step in

pack_data/pack_data_bin.py

and run it. You can specify how many examples to store in each tfrecords file by changing the value of num_samples. If "is_test" is False, we use sequences generated from actors 1, 5, 6, 7, and 8 as training samples. If "is_test" is True, we use only sequence "" from actor 9 for validation. You can change this split by modifying the "get_file_list" function in tfrecords_utils.py; a rough sketch of the packing step follows.
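
For concreteness, the packing loop looks roughly like this. It is a sketch only: the feature key, the shard naming, and the way get_file_list matches actors are stand-ins for the actual logic in pack_data_bin.py and tfrecords_utils.py.

    # Minimal sketch of sharding binary clips into tfrecords (TensorFlow 1.1).
    import os
    import glob
    import numpy as np
    import tensorflow as tf

    num_samples = 100  # examples per tfrecords shard
    is_test = False

    def get_file_list(bin_dir, is_test):
        # Stand-in for the split logic in tfrecords_utils.py: actors 1, 5,
        # 6, 7, 8 for training, actor 9 for validation. The 'actorN' file
        # naming here is hypothetical.
        actors = ['9'] if is_test else ['1', '5', '6', '7', '8']
        return [f for f in glob.glob(os.path.join(bin_dir, '*.bin'))
                if any('actor' + a in os.path.basename(f) for a in actors)]

    files = get_file_list('/path/to/preparsed', is_test)
    writer, shard = None, 0
    for i, fname in enumerate(files):
        if i % num_samples == 0:  # start a new shard every num_samples clips
            if writer is not None:
                writer.close()
            writer = tf.python_io.TFRecordWriter('surreal_%d.tfrecords' % shard)
            shard += 1
        data = np.fromfile(fname, dtype=np.float32)
        example = tf.train.Example(features=tf.train.Features(feature={
            'data': tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[data.tobytes()]))}))
        writer.write(example.SerializeToString())
    if writer is not None:
        writer.close()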

3. Pretrained model and small tfrecords

You can download a model pretrained with supervision from here. surreal_quo0.tfrecords is a small set of training data, and surreal2_100_test_quo1.tfrecords is a small set of test data.

Note: to keep this code self-contained, we compute the 2D flow directly from the 3D ground truth during testing, as sketched below. You should replace this with your own predicted flow and keypoints.
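
Concretely, the substitute flow can be obtained by projecting the 3D ground-truth joints of two consecutive frames and taking the pixel displacement. The sketch below illustrates the idea; the pinhole intrinsics (fx, fy, cx, cy) are hypothetical placeholders, not the values used in this repo.

    # Minimal sketch: 2D keypoint flow from 3D ground truth via projection.
    # The intrinsics below are placeholders, not the ones used in this repo.
    import numpy as np

    def project(joints3d, fx=600.0, fy=600.0, cx=160.0, cy=120.0):
        # Pinhole projection of (N, 3) camera-space joints to (N, 2) pixels.
        x = fx * joints3d[:, 0] / joints3d[:, 2] + cx
        y = fy * joints3d[:, 1] / joints3d[:, 2] + cy
        return np.stack([x, y], axis=1)

    def flow_from_3d_gt(joints3d_t, joints3d_t1):
        # Flow at each keypoint = displacement of its projection across frames.
        return project(joints3d_t1) - project(joints3d_t)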

4. Training

Open pretrained.sh; it contains one command for pretraining with supervision and one command for finetuning on the test data. Comment out the line that you do not need.

Citation

If you use this code, please cite:

@incollection{NIPS2017_7108,
  title     = {Self-supervised Learning of Motion Capture},
  author    = {Tung, Hsiao-Yu and Tung, Hsiao-Wei and Yumer, Ersin and Fragkiadaki, Katerina},
  booktitle = {Advances in Neural Information Processing Systems 30},
  editor    = {I. Guyon and U. V. Luxburg and S. Bengio and H. Wallach and R. Fergus and S. Vishwanathan and R. Garnett},
  pages     = {5236--5246},
  year      = {2017},
  publisher = {Curran Associates, Inc.},
  url       = {http://papers.nips.cc/paper/7108-self-supervised-learning-of-motion-capture.pdf}
}
