MT-VAE for Multimodal Human Motion Synthesis
This is the code for the ECCV 2018 paper MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics by Xinchen Yan, Akash Rastogi, Ruben Villegas, Kalyan Sunkavalli, Eli Shechtman, Sunil Hadap, Ersin Yumer, and Honglak Lee.
Please follow the instructions below to run the code.
Requirements
MT-VAE requires or works with
- Mac OS X or Linux
- NVIDIA GPU
Installing Dependencies
- Install TensorFlow
- Note: this implementation has been tested with TensorFlow 1.3.
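- For example, TensorFlow 1.3 with GPU support can be installed via pip (pip install tensorflow-gpu==1.3.0). The snippet below is a minimal sketch, not part of this repository, to confirm the installed version and the devices TensorFlow can see:
import tensorflow as tf
from tensorflow.python.client import device_lib

# Confirm the TensorFlow version matches the one this code was tested with.
print(tf.__version__)  # expected: 1.3.x
# List the devices TensorFlow can use; a GPU entry should appear here.
print([d.name for d in device_lib.list_local_devices()])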
Data Preprocessing
- For the Human3.6M dataset, please download the pre-processed data by running the following script.
bash prep_human36m_joints.sh
- Disclaimer: Please check the license of the Human3.6M dataset if you download this pre-processed version.
Training (MT-VAE)
- If you want to train the MT-VAE human motion generator, please run the following script (training usually takes about one day on a single Titan GPU); a schematic of the underlying objective appears at the end of this section.
bash demo_human36m_trainMTVAE.sh
- Alternatively, you can download the pre-trained MT-VAE model by running the following script.
bash prep_human36m_model.sh
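- For intuition, training follows the conditional VAE recipe: a reconstruction term on the generated future motion plus a KL term that pulls the inferred latent transformation toward the prior. Below is a minimal TensorFlow 1.x sketch of such a loss with illustrative names and weights, not the repository's actual code (the paper's full objective includes additional terms):
import tensorflow as tf

def vae_loss(future_gt, future_pred, z_mu, z_logvar, kl_weight=1.0):
    # Reconstruction: the decoded future motion should match the ground truth.
    recon_loss = tf.reduce_mean(
        tf.reduce_sum(tf.square(future_gt - future_pred), axis=-1))
    # KL divergence between the inferred posterior N(mu, sigma^2)
    # and the standard normal prior N(0, I).
    kl_loss = -0.5 * tf.reduce_mean(
        tf.reduce_sum(1.0 + z_logvar - tf.square(z_mu) - tf.exp(z_logvar), axis=-1))
    return recon_loss + kl_weight * kl_loss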
Motion Synthesis Using Pre-trained MT-VAE Model
- Please run the following command to generate multiple diverse human motions given an initial motion (see the sampling sketch after the command).
bash demo_human36m_inferMTVAE.sh
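- Diversity comes from the latent transformation: conditioned on the same initial motion, each draw of z from the prior decodes to a different plausible future. Below is a minimal sketch of this sampling loop, where decoder and latent_dim are hypothetical placeholders for the trained MT-VAE decoder and its latent size:
import numpy as np

def sample_diverse_futures(initial_motion, decoder, num_samples=5, latent_dim=512):
    # The conditioning on the observed initial motion is shared across samples;
    # only the latent transformation z changes between draws.
    futures = []
    for _ in range(num_samples):
        z = np.random.randn(latent_dim).astype(np.float32)  # z ~ N(0, I)
        futures.append(decoder(initial_motion, z))
    return futures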
Motion Analogy-making Using Pre-trained MT-VAE Model
- Please run the following command to perform motion analogy-making (see the sketch after the command).
bash demo_human36m_analogyMTVAE.sh
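- Analogy-making transfers a motion transformation across sequences: the latent z inferred from a pair (A, B) is re-applied to a new sequence C, producing D such that A is to B as C is to D. A minimal sketch, where encoder and decoder are hypothetical stand-ins for the trained MT-VAE modules:
def motion_analogy(seq_a, seq_b, seq_c, encoder, decoder):
    # Infer the latent transformation that turns A into B ...
    z_ab = encoder(seq_a, seq_b)
    # ... then apply that same transformation to C to synthesize D.
    return decoder(seq_c, z_ab)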
Hierarchical Video Synthesis Using Pre-trained Image Generation Model
- Please download full Human3.6M videos into the workspace/Human3.6M/ folder.
- We use a pre-trained model from the ICML 2017 HierchVid repository. Please run the following command to synthesize images given a generated motion sequence (a conceptual sketch follows at the end of this section).
CUDA_VISIBLE_DEVICES=0 python h36m_hierach_gensample.py
- Disclaimer: Please double-check the license in that repository and cite the HierchVid paper when using it.
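- Conceptually, the HierchVid model renders each output frame from an observed reference frame plus the pose generated for that time step. Below is a minimal sketch of this rendering loop, where image_generator is a hypothetical stand-in for the pre-trained image generation network:
def render_video(reference_frame, pose_sequence, image_generator):
    # Each frame is synthesized independently from the reference appearance
    # and the per-timestep pose produced by the motion model.
    return [image_generator(reference_frame, pose) for pose in pose_sequence]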
Citation
If you find this useful, please cite our work as follows:
@inproceedings{yan2018mt,
  title={MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics},
  author={Yan, Xinchen and Rastogi, Akash and Villegas, Ruben and Sunkavalli, Kalyan and Shechtman, Eli and Hadap, Sunil and Yumer, Ersin and Lee, Honglak},
  booktitle={European Conference on Computer Vision},
  pages={276--293},
  year={2018},
  organization={Springer}
}
Acknowledgements
We would like to thank the amazing developers and the open-source community. Our implementation has especially benefited from the following excellent repositories:
- Attribute2Image: https://github.com/xcyan/eccv16_attr2img
- TensorFlow-PTN: https://github.com/tensorflow/models/tree/master/research/ptn
- VideoGAN: https://github.com/cvondrick/videogan
- MoCoGAN: https://github.com/sergeytulyakov/mocogan
- HierchVid: https://github.com/rubenvillegas/icml2017hierchvid
- Sketch-RNN: https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn
- VRNN: https://github.com/jych/nips2015_vrnn
- SVG: https://github.com/edenton/svg