Clockwork Variational Autoencoder

Overview

Clockwork Variational Autoencoders (CW-VAE)

Vaibhav Saxena, Jimmy Ba, Danijar Hafner

If you find this code useful, please reference it in your paper:

@article{saxena2021clockworkvae,
  title={Clockwork Variational Autoencoders}, 
  author={Saxena, Vaibhav and Ba, Jimmy and Hafner, Danijar},
  journal={arXiv preprint arXiv:2102.09532},
  year={2021},
}

Method

Clockwork VAEs are deep generative models that learn long-term dependencies in video by leveraging hierarchies of representations that progress at different clock speeds. In contrast to prior video prediction methods, which typically focus on predicting sharp but short sequences into the future, Clockwork VAEs can accurately predict high-level content, such as object positions and identities, for 1000 frames.
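
The clock-speed idea can be pictured with a minimal sketch (plain Python, not the repo's code; k is the temporal abstraction factor):

# Level l ticks only every k**l steps, so higher levels change slowly
# and can carry long-range content such as object identities.
k = 2
levels = 3
states = [None] * levels
for t in range(8):
    active = [l for l in range(levels) if t % (k ** l) == 0]
    for l in reversed(active):  # higher levels tick first, conditioning the lower ones
        states[l] = (l, t)      # placeholder for the level-l state update
    print(t, active)            # t=0 -> [0, 1, 2], t=1 -> [0], t=2 -> [0, 1], ...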

Clockwork VAEs build upon the Recurrent State Space Model (RSSM), so each state contains a deterministic component for long-term memory and a stochastic component for sampling diverse plausible futures. Clockwork VAEs are trained end-to-end to optimize the evidence lower bound (ELBO) that consists of a reconstruction term for each image and a KL regularizer for each stochastic variable in the model.
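
In symbols, the objective has the familiar hierarchical-VAE form (a schematic rendering with assumed notation; see the paper for the exact conditioning of the posterior q and prior p):

\text{ELBO} = \mathbb{E}_q\Big[\textstyle\sum_{t=1}^{T} \log p\big(x_t \mid s_t^{1:L}\big)\Big] - \sum_{l=1}^{L} \sum_{t \in \mathcal{T}_l} \mathbb{E}_q\Big[\mathrm{KL}\big(q(s_t^{l} \mid \cdot)\,\big\|\,p(s_t^{l} \mid \cdot)\big)\Big]

where s_t^l is the level-l latent state at step t and \mathcal{T}_l is the set of steps at which level l is active, so each image contributes one reconstruction term and each stochastic variable one KL term.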

More information: https://arxiv.org/abs/2102.09532

Instructions

This repository contains the code for training the Clockwork VAE model on the datasets minerl, mazes, and mmnist.

The datasets will automatically be downloaded into the --datadir directory.

python3 train.py --logdir /path/to/logdir --datadir /path/to/datasets --config configs/<dataset>.yml 
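
For example, to train on Moving MNIST (the paths here are placeholders):

python3 train.py --logdir ./logdir/mmnist --datadir ./data --config configs/mmnist.yml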

The evaluation script writes open-loop video predictions in both PNG and NPZ format, along with PSNR and SSIM plots, to the data directory.

python3 eval.py --logdir /path/to/logdir
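
To post-process the NPZ predictions yourself, they can be opened with NumPy (the file name and array keys below are hypothetical; list .files on your own outputs to see what is stored):

import numpy as np

preds = np.load('open_loop_predictions.npz')  # hypothetical file name
print(preds.files)                            # keys of the stored arrays
frames = preds[preds.files[0]]                # e.g. an array of predicted frames
print(frames.shape, frames.dtype)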

Comments
  • Question: Does the model condition on future time steps?

    Hi, I read your paper and I found it really interesting. I've been going through the code to understand it, port it and try it out.

    I came across a section that I don't understand though:

    From reading the paper I assumed that the model should only condition on information from the past to predict the future. Is this assumption correct first of all?

    If so: in the convolutional encoder, you're predicting encodings for each RSSM level in the Clockwork VAE stack, and then summing them over time here: https://github.com/vaibhavsaxena11/cwvae/blob/62dd5050d3c20c1c40879539906c54492a756b59/cnns.py#L140

    Does this not mean that some prediction at time step X then has access to information from time step X+N (where N is some positive integer < "temporal abstraction factor" ** level), or am I misunderstanding the data flow?
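
    For concreteness, the summing in question can be pictured with a toy NumPy snippet (hypothetical shapes, not the repo's exact code): per-frame embeddings are grouped into windows of k**level consecutive frames and reduced to one vector per window, which is what makes the look-ahead within a window possible.

    import numpy as np

    T, D, k, level = 8, 4, 2, 2                     # frames, embed size, abstraction factor, level
    emb = np.random.randn(T, D)                     # one embedding per frame
    w = k ** level                                  # window length at this level
    summed = emb.reshape(T // w, w, D).sum(axis=1)  # one input per window of w frames
    print(summed.shape)                             # (2, 4)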

    opened by llucid-97 2
  • potential bug in the encoder

    for level in range(1, self._levels):
        for i_dl in range(self._dense_layers - 1):
            hidden = self.get('h{}_dense'.format(5 + (level - 1) * self._dense_layers + i_dl), tfkl.Dense, self._embed_size, activation=tf.nn.relu)(hidden)
        if self._dense_layers > 0:
            hidden = self.get('h{}_dense'.format(4 + level * self._dense_layers), tfkl.Dense, feat_size, activation=None)(hidden)
        layer = hidden

    From line 39 onwards in the cnns.py Encoder(), the depth of these layers increases with the level because the hidden variable is overwritten. At large n_levels and n_enc_dense_layers this results in a very deep network mapping from the observation embedding to the latent space. I'm not sure it's intentional, and it doesn't seem to have a purpose; is there a reason the higher latent spaces need a deeper function to map from the embedding?

    opened by xmax1 1
  • File encoding error

    While running train.py, I get the following error: Failed to encode example: {'video': 'filepath'}

    ffmpeg and ffmpeg-python have already been installed in the virtual environment.

    opened by sb93 1