Split Variational AutoEncoder

Andrea Asperti

Last update: Sep 2, 2022

Overview

Split-VAE

Introduction

This repository contains an implementation of a Split Variational AutoEncoder (SVAE). In an SVAE the output y is computed as a weighted sum

sigma * y1 + (1-sigma) * y2

where y1 and y2 are two distinct generated images, and sigma is a learned compositional map.
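The weighted sum above can be sketched in a few lines of NumPy. This is only an illustration: the array shapes, the sigmoid used to keep the map in (0, 1), and the choice of one map value per pixel shared across color channels are assumptions, not details taken from the repository.

```python
import numpy as np


def compose(y1, y2, sigma):
    """Blend two generated images with a compositional map sigma."""
    return sigma * y1 + (1.0 - sigma) * y2


rng = np.random.default_rng(0)

# Hypothetical decoder outputs: a batch of 4 RGB images of size 8x8.
y1 = rng.random((4, 8, 8, 3))
y2 = rng.random((4, 8, 8, 3))

# Assumption: one map value per pixel, squashed into (0, 1) by a sigmoid
# and broadcast over the color channels.
logits = rng.normal(size=(4, 8, 8, 1))
sigma = 1.0 / (1.0 + np.exp(-logits))

y = compose(y1, y2, sigma)
```

With sigma in (0, 1), every pixel of y lies between the corresponding pixels of y1 and y2, so the map decides locally which of the two images dominates.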

A Split VAE is trained as a normal VAE: no additional loss is added over the split images y1 and y2.
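To make this concrete, here is a minimal sketch of a standard VAE objective evaluated only on the composed output. The squared-error reconstruction term, the unweighted sum of the two terms, and the function name `svae_loss` are assumptions made for illustration; the repository may balance or define the terms differently.

```python
import numpy as np


def svae_loss(x, y1, y2, sigma, z_mean, z_log_var):
    """Standard VAE loss on the composed image y; y1 and y2 get no term of their own."""
    y = sigma * y1 + (1.0 - sigma) * y2
    # Assumed reconstruction term: squared error summed over pixels, averaged over the batch.
    recon = np.mean(np.sum((x - y) ** 2, axis=(1, 2, 3)))
    # Closed-form KL divergence between N(z_mean, exp(z_log_var)) and N(0, I).
    kl = np.mean(
        0.5 * np.sum(z_mean ** 2 + np.exp(z_log_var) - 1.0 - z_log_var, axis=1)
    )
    return recon + kl


rng = np.random.default_rng(0)
x = rng.random((2, 8, 8, 3))
y2_arbitrary = rng.random((2, 8, 8, 3))
z_mean = np.zeros((2, 16))
z_log_var = np.zeros((2, 16))

# With sigma = 1 the output equals y1, so y2 can be anything without changing the loss.
loss = svae_loss(x, x, y2_arbitrary, np.ones((2, 8, 8, 1)), z_mean, z_log_var)
```

The last call shows the point of the sentence above: since only y enters the loss, a component that the map masks out is not penalized.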

Splitting is meant to offer the network a more flexible way to learn fruitful and independent features. As a result, the variable collapse phenomenon is greatly reduced, and the possibility of exploiting a larger number of latent variables improves the quality and diversity of generated samples.
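One common way to quantify variable collapse is to count the latent dimensions whose average KL divergence from the prior is close to zero. The sketch below follows that idea; the threshold value and the function name `count_collapsed` are hypothetical, and this is not necessarily the measure used by the author.

```python
import numpy as np


def count_collapsed(z_mean, z_log_var, threshold=0.01):
    """Count latent dimensions whose mean KL term falls below a threshold."""
    # Per-sample, per-dimension KL between N(mean, exp(log_var)) and N(0, 1).
    kl_per_dim = 0.5 * (z_mean ** 2 + np.exp(z_log_var) - 1.0 - z_log_var)
    mean_kl = kl_per_dim.mean(axis=0)
    return int(np.sum(mean_kl < threshold)), mean_kl


rng = np.random.default_rng(0)

# Hypothetical encoder outputs for 100 samples and 3 latent variables.
z_mean = rng.normal(size=(100, 3))
z_log_var = np.full((100, 3), -2.0)

# Make the third variable match the prior exactly: it carries no information.
z_mean[:, 2] = 0.0
z_log_var[:, 2] = 0.0

n_collapsed, mean_kl = count_collapsed(z_mean, z_log_var)
```

A dimension whose posterior always equals the prior is ignored by the decoder, so a smaller count means more latent variables are actually in use.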

Types of Splitting

The decomposition is nondeterministic, but it follows two main schemes that we may roughly categorize as either syntactic or semantic.

Syntactic decomposition

In this case, the compositional map tends to exploit the strong correlation between adjacent pixels, splitting the image into two complementary high-frequency sub-images.

Below are some examples of syntactic splitting. In all the following pictures, the first row is the compositional map, followed in order by y1, y2 and y.

Semantic decomposition

In this case, the map typically focuses on the contours of objects, splitting the image into interesting variations of its content, with more marked and distinctive features.

Here are some examples of semantic splitting:

In the case of semantic splitting, the Fréchet Inception Distance (FID) of y1 and y2 is frequently lower (hence better) than that of y, which clearly suffers from being an average of the two.

In a sense, an SVAE forces the Variational Autoencoder to make choices, in contrast with its intrinsic tendency to average between alternatives in order to minimize the reconstruction loss towards a specific sample.
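For reference, FID is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images. The sketch below computes that distance for arbitrary feature arrays; it does not include the Inception network that normally produces the features, and the random arrays are placeholders rather than real data.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two sets of feature vectors."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices; clipping guards against small numerical noise.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)

# Placeholder features: 500 samples with 4 dimensions each.
feats = rng.normal(size=(500, 4))
shift = np.array([1.0, 0.0, 2.0, 0.0])

d_same = frechet_distance(feats, feats)
d_shift = frechet_distance(feats, feats + shift)
```

Identical feature sets give a distance of zero, and a pure shift of the mean adds the squared length of the shift, which is why lower values indicate a closer match to the reference distribution.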

More examples of GENERATED images

Examples of MNIST-like generated digits (FID=7.47)

Here are some additional examples of semantic compositional maps generated for CelebA, quite similar to drawings. The quality and precision of the contours are both unexpected and remarkable.

Here are some generated faces (FID=35.1). Observe in particular the wide variation in pose, illumination, colors, age and expressions.
