PyTorch implementation of the cross-modality generative model that synthesizes dance from music.

Overview

Python 2.7 Python 3.6

Dancing to Music

PyTorch implementation of the cross-modality generative model that synthesizes dance from music.

Paper

Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz
Dancing to Music Neural Information Processing Systems (NeurIPS) 2019
[Paper] [YouTube] [Project] [Blog] [Supp]

Example Videos

  • Beat-Matching
    1st row: generated dance sequences, 2nd row: music beats, 3rd row: kinematics beats

  • Multimodality
    Generate various dance sequences with the same music and the same initial pose.

  • Long-Term Generation
    Seamlessly generate a dance sequence with arbitrary length.

  • Photo-Realisitc Videos
    Map generated dance sequences to photo-realistic videos.

Train Decomposition

python train_decomp.py --name Decomp

Train Composition

python train_comp.py --name Decomp --decomp_snapshot DECOMP_SNAPSHOT

Demo

python demo.py --decomp_snapshot DECOMP_SNAPSHOT --comp_snapshot COMP_SNAPSHOT --aud_path AUD_PATH --out_file OUT_FILE --out_dir OUT_DIR --thr THR
  • Flags

    • aud_path: input .wav file
    • out_file: location of output .mp4 file
    • out_dir: directory of output frames
    • thr: threshold based on motion magnitude
    • modulate: whether to do beat warping
  • Example

python demo.py -decomp_snapshot snapshot/Stage1.ckpt --comp_snapshot snapshot/Stage2.ckpt --aud_path demo/demo.wav --out_file demo/out.mp4 --out_dir demo/out_frame

Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{lee2019dancing2music,
  title={Dancing to Music},
  author={Lee, Hsin-Ying and Yang, Xiaodong and Liu, Ming-Yu and Wang, Ting-Chun and Lu, Yu-Ding and Yang, Ming-Hsuan and Kautz, Jan},
  booktitle={NeurIPS},
  year={2019}
}

License

Copyright (C) 2020 NVIDIA Corporation. All rights reserved. This work is made available under NVIDIA Source Code License (1-Way Commercial). To view a copy of this license, visit https://nvlabs.github.io/Dancing2Music/LICENSE.txt.

Comments
  • What is the description of the data structure?

    What is the description of the data structure?

    I have extracted data.zip but I don't understand the data structure.
    What is the difference between the ballet_unit, ballet_unitseq3 and ballet_unitseq4 lists? Are you able to share the original video sources rather than the numpy arrays?

    opened by stephmather 3
  • Details about FID evaluation

    Details about FID evaluation

    Hi, your work is very interesting. I'd like to know how you evaluate FID on generated dances. In table 1, you evaluated FID on real dances and generated dances. For real dances, do you evaluate FID on test dance data and training dance data? For generated dances, do you also evaluate FID on generated dances and training dance data?

    opened by guofeisun 1
  • FileNotFoundError: [WinError 3] The system cannot find the path specified: './logs\\Decomp'  &  demo.py: error: argument --thr: invalid int value: 'THR'

    FileNotFoundError: [WinError 3] The system cannot find the path specified: './logs\\Decomp' & demo.py: error: argument --thr: invalid int value: 'THR'

    When Run any of the following codes , I encounter errors . Help fix the error ...

    python train_decomp.py --name Decomp
    

    FileNotFoundError: [WinError 3] The system cannot find the path specified: './logs\Decomp'

    image

    &

    python train_comp.py --name Decomp --decomp_snapshot DECOMP_SNAPSHOT
    

    FileNotFoundError: [WinError 3] The system cannot find the path specified: './logs\Decomp'

    image

    &

    python demo.py --decomp_snapshot DECOMP_SNAPSHOT --comp_snapshot COMP_SNAPSHOT --aud_path AUD_PATH --out_file OUT_FILE --out_dir OUT_DIR --thr THR
    

    demo.py: error: argument --thr: invalid int value: 'THR'

    image

    opened by farzadmohseni-ir 0
  • aud_3cls.ckpt is needed!!

    aud_3cls.ckpt is needed!!

    I got IOError: [Errno 2] No such file or directory: './data/stats/aud_3cls.ckpt', while @HsinYingLee said that, The ckpt is under the downloaded "stats" folder. Is there somebody can share the aud_3cls.ckpt? Thanks.

    opened by Ai-is-light 0
  • FileNotFoundError in Dancing2Application

    FileNotFoundError in Dancing2Application

    Hi,

    I have Cloned the Dancing2Music application and Installed all the needed Dependency packages also but, I am Facing this below Error:-

    Can I know, how I can resolve this Issue?

    chaithra@chaithra-desktop:~/Dancing2Music$ python3 train_decomp.py --name Decomp 2022-11-22 18:19:21.235734: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2 WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them. Traceback (most recent call last): File “train_decomp.py”, line 41, in os.mkdir(args.log_dir) FileNotFoundError: [Errno 2] No such file or directory: ‘./logs/Decomp’

    Is that souces has some issue ?

    or

    If there is a proper documentation to install the application

    opened by chitrabv0305 0
  • where is the dataset?

    where is the dataset?

    Hello,

    The paper says: Finally, we provide a large-scale paired music and dance dataset, which is available along with the source code and models at our website. Website links to this github. But I can't find any dataset link in the repo. Ideally I'd hope to see the link in the README.md

    Thanks in advance for clarification

    opened by ikamensh 0
  • How to detect kinematic beats?

    How to detect kinematic beats?

    @tmbdev @bkhailany @mjgarland @dumerrill @sdalton1 Hi,thx to share such an outstanding work. I m interested in kinematic beats detection, can you share this work/code, Thank you a lot

    opened by ktxu1224 0
  • error with trian_decomp.py

    error with trian_decomp.py

    demo.py can work well by using the pre-trained mode, but when I tried train_decomp.py I met with:

    [Errno 2] No such file or directory: '../data\zumba/zfwOESZ8CkQ/19/unit/1.npy'

    opened by aryanna384 2
Owner
NVIDIA Research Projects
NVIDIA Research Projects
The personal repository of the work: *DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer*.

DanceNet3D The personal repository of the work: DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer. Dataset and Results Pleas

南嘉Nanga 36 Dec 21, 2022
《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

null 53 Dec 16, 2022
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification (ICCV2021)

CM-NAS Official Pytorch code of paper CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification in ICCV2021. Vis

JDAI-CV 40 Nov 25, 2022
MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieva

Introduction This is the source code of our TCSVT 2021 paper "MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieval". Ple

null 7 Aug 24, 2022
Tensorflow python implementation of "Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos"

Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos This repository is the official tensorflow python implementation

Yasamin Jafarian 287 Jan 6, 2023
Virtual Dance Reality Stage: a feature that offers you to share a stage with another user virtually

Portrait Segmentation using Tensorflow This script removes the background from an input image. You can read more about segmentation here Setup The scr

null 291 Dec 24, 2022
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)

Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of

Wodson de Andrade 8 Dec 24, 2022
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

Update (20 Jan 2020): MODALS on text data is avialable MODALS MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space Table of Conte

null 38 Dec 15, 2022
GluonMM is a library of transformer models for computer vision and multi-modality research

GluonMM is a library of transformer models for computer vision and multi-modality research. It contains reference implementations of widely adopted baseline models and also research work from Amazon Research.

null 42 Dec 2, 2022
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.

DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI

Yizhi Wang 17 Dec 22, 2022
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.

DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI

Yizhi Wang 5 Oct 22, 2021
Code of paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification We provide the codes for repr

null 12 Dec 12, 2022
UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Unified Multi-modal Transformers This repository maintains the official implementation of the paper UMT: Unified Multi-modal Transformers for Joint Vi

Applied Research Center (ARC), Tencent PCG 84 Jan 4, 2023
PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos "

Foley Music: Learning to Generate Music from Videos This repo holds the code for the framework presented on ECCV 2020. Foley Music: Learning to Genera

Chuang Gan 30 Nov 3, 2022
Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Minimal PyTorch implementation of Generative Latent Optimization This is a reimplementation of the paper Piotr Bojanowski, Armand Joulin, David Lopez-

Thomas Neumann 117 Nov 27, 2022
PyTorch implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN in PyTorch PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. * All samples in READM

Taehoon Kim 1k Jan 4, 2023
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training

UC2 UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu,

Mingyang Zhou 28 Dec 30, 2022
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
PyTorch implementation of MuseMorphose, a Transformer-based model for music style transfer.

MuseMorphose This repository contains the official implementation of the following paper: Shih-Lun Wu, Yi-Hsuan Yang MuseMorphose: Full-Song and Fine-

Yating Music, Taiwan AI Labs 142 Jan 8, 2023