Deep motion transfer

Last update: Nov 1, 2022

Related tags

Deep Learning animation-with-keypoint-mask

Overview

animation-with-keypoint-mask

Paper

The right most square is the final result. Softmax mask (circles):

\

Heatmap mask:

\

conda env create -f environment.yml
conda activate venv11
We use pytorch 1.7.1 with python 3.8.
Please obtain pretrained keypoint module. You can do so by
git checkout fomm-new-torch
Then, follow the instructions from the README of that branch, or obtain a pre-trained checkpoint from
https://github.com/AliaksandrSiarohin/first-order-model

training

to train a model on specific dataset run:

CUDA_VISIBLE_DEVICES=0,1,2,3 python run.py --config config/dataset_name.yaml --device_ids 0,1,2,3 --checkpoint_with_kp path/to/checkpoint/with/pretrained/kp

E.g. taichi-256-q.yaml for the keypoint heatmap mask model, and taichi-256-softmax-q.yaml for drawn circular keypoints instead.

the code will create a folder in the log directory (each run will create a time-stamped new directory). checkpoints will be saved to this folder. to check the loss values during training see log.txt. you can also check training data reconstructions in the train-vis sub-folder. by default the batch size is tuned to run on 4 titan-x gpu (apart from speed it does not make much difference). You can change the batch size in the train_params in corresponding .yaml file.

evaluation on video reconstruction

To evaluate the reconstruction of the driving video from its first frame, run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint path/to/checkpoint --checkpoint_with_kp path/to/checkpoint/with/pretrained/kp

you will need to specify the path to the checkpoint, the reconstruction sub-folder will be created in the checkpoint folder. the generated video will be stored to this folder, also generated videos will be stored in png subfolder in loss-less '.png' format for evaluation. instructions for computing metrics from the paper can be found: https://github.com/aliaksandrsiarohin/pose-evaluation.

image animation

In order to animate a source image with motion from driving, run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode animate --checkpoint path/to/checkpoint --checkpoint_with_kp path/to/checkpoint/with/pretrained/kp

you will need to specify the path to the checkpoint, the animation sub-folder will be created in the same folder as the checkpoint. you can find the generated video there and its loss-less version in the png sub-folder. by default video from test set will be randomly paired, but you can specify the "source,driving" pairs in the corresponding .csv files. the path to this file should be specified in corresponding .yaml file in pairs_list setting.

datasets

taichi. follow the instructions in data/taichi-loading or instructions from https://github.com/aliaksandrsiarohin/video-preprocessing.

training on your own dataset

resize all the videos to the same size e.g 256x256, the videos can be in '.gif', '.mp4' or folder with images. we recommend the later, for each video make a separate folder with all the frames in '.png' format. this format is loss-less, and it has better i/o performance.
create a folder data/dataset_name with 2 sub-folders train and test, put training videos in the train and testing in the test.
create a config config/dataset_name.yaml, in dataset_params specify the root dir the root_dir: data/dataset_name. also adjust the number of epoch in train_params.

additional notes

citation:

@misc{toledano2021,
  author = {Or Toledano and Yanir Marmor and Dov Gertz},
  title = {Image Animation with Keypoint Mask},
  year = {2021},
  eprint={2112.10457},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Old format (before paper):

@misc{toledano2021,
  author = {Or Toledano and Yanir Marmor and Dov Gertz},
  title = {Image Animation with Keypoint Mask},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/or-toledano/animation-with-keypoint-mask}},
  commit = {015b1f2d466658141c41ea67d7356790b5cded40}
}

Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th place solution

Lyft Motion Prediction for Autonomous Vehicles Code for the 4th place solution of Lyft Motion Prediction for Autonomous Vehicles on Kaggle. Discussion

44 Jun 27, 2022

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

简体中文 | English PaddleRobotics paddleRobotics是基于paddle的机器人开源算法库集，包括人机交互、复杂运动控制、环境感知、slam定位导航等开源算法部分。人机交互主动多模交互技术TFVT-HRI 主动多模交互技术是通过视觉、语音、触摸传感器等输入机器人

185 Dec 26, 2022

EasyMocap is an open-source toolbox for markerless human motion capture from RGB videos.

Comments

Face Reenactment

Can I use this code to train on face videos data, basically I want to create a real-time application where I can generate smiling and blink videos in real-time based on driving videos? @Dov-Gertz @or-toledano

opened by RAJA-PARIKSHAT 1

Deep motion transfer

Related tags

Overview

animation-with-keypoint-mask

training

evaluation on video reconstruction

image animation

datasets

training on your own dataset

additional notes

You might also like...

Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th place solution

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

EasyMocap is an open-source toolbox for markerless human motion capture from RGB videos.

Object Depth via Motion and Detection Dataset

Character Controllers using Motion VAEs

dataset for ECCV 2020 "Motion Capture from Internet Videos"

[arXiv] What-If Motion Prediction for Autonomous Driving ❓🚗💨

Learning to Estimate Hidden Motions with Global Motion Aggregation

Synthesizing Long-Term 3D Human Motion and Interaction in 3D in CVPR2021

Comments

Face Reenactment

Owner

data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

code for our paper "Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer"

Transfer-Learn is an open-source and well-documented library for Transfer Learning.

Transfer style api - An API to use with Tranfer Style App, where you can use two image and transfer the style

PyTorch implementation DRO: Deep Recurrent Optimizer for Structure-from-Motion

Deep Two-View Structure-from-Motion Revisited

Computer Vision Script to recognize first person motion, developed as final project for the course "Machine Learning and Deep Learning"

Transfer Learning library for Deep Neural Networks.

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

Style transfer, deep learning, feature transform