Fader Networks: Manipulating Images by Sliding Attributes - NIPS 2017

Overview

PyTorch implementation of Fader Networks (NIPS 2017).

Fader Networks can generate different realistic versions of images by modifying attributes such as gender or age group. They can swap multiple attributes at a time, and continuously interpolate between each attribute value. In this repository we provide the code to reproduce the results presented in the paper, as well as trained models.

Single-attribute swap

Below are some examples of different attribute swaps:

Multi-attribute swap

Fader Networks can also disentangle multiple attributes at a time:

Model

The main branch of the model (the inference model) is an autoencoder of images. Given an image x and an attribute y (e.g. male/female), the decoder is trained to reconstruct the image from the latent state E(x) and y. The other branch (the adversarial component) is a discriminator trained to predict the attribute from the latent state. The encoder of the inference model is trained not only to reconstruct the image, but also to fool the discriminator by removing from E(x) the information related to the attribute. As a result, the decoder has to rely on y to properly reconstruct the image. During training, the model sees the real attribute values, but at test time y can be manipulated to generate variations of the original image.
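
To make the two branches concrete, below is a minimal PyTorch sketch of the two training objectives, following the paper's formulation. All module and function names here are hypothetical, not the repository's actual API:

import torch
import torch.nn.functional as F

def autoencoder_step(encoder, decoder, latent_discriminator, x, y, lambda_lat_dis):
    # Inference model: reconstruct x from the latent state E(x) and the attribute y.
    z = encoder(x)
    x_rec = decoder(z, y)
    rec_loss = F.mse_loss(x_rec, x)
    # Adversarial term: the encoder is rewarded when the discriminator predicts
    # the flipped attribute from z, i.e. when z carries no information about y.
    adv_loss = F.binary_cross_entropy_with_logits(latent_discriminator(z), 1 - y)
    return rec_loss + lambda_lat_dis * adv_loss

def discriminator_step(encoder, latent_discriminator, x, y):
    # Adversarial component: predict the true attribute from a frozen latent state.
    with torch.no_grad():
        z = encoder(x)
    return F.binary_cross_entropy_with_logits(latent_discriminator(z), y)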

Dependencies

  • Python 2/3 with NumPy/SciPy
  • PyTorch
  • OpenCV (used by preprocess.py to resize images)

Installation

Simply clone the repository:

git clone https://github.com/facebookresearch/FaderNetworks.git
cd FaderNetworks

Dataset

Download the aligned and cropped CelebA dataset from http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html. Extract all images and move them to the data/img_align_celeba/ folder. There should be 202599 images. The dataset also provides a file list_attr_celeba.txt containing the list of the 40 attributes associated with each image. Move it to data/. Then simply run:

cd data
./preprocess.py

This script resizes the images and creates two files: images_256_256.pth and attributes.pth. The first contains a tensor of size (202599, 3, 256, 256) with all resized images concatenated; note that you can change the image size in preprocess.py to work at different resolutions. The second is a pre-processed version of the attributes.
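
As a sanity check, both files can be loaded directly with torch.load. The paths below assume you run from the repository root, and the layout of attributes.pth is not documented here beyond "a pre-processed version of the attributes":

import torch

images = torch.load('data/images_256_256.pth')   # tensor of size (202599, 3, 256, 256)
attributes = torch.load('data/attributes.pth')   # pre-processed attributes

print(images.size())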

Pretrained models

You can download pretrained classifiers and Fader Networks by running:

cd models
./download.sh

Train your own models

Train a classifier

To train your own model, you first need a classifier so that swap quality can be evaluated during training. Training a good classifier is relatively simple for most attributes, and a usable model can be trained in a few minutes. We provide a classifier trained on all attributes in models/classifier256.pth. Note that this classifier does not need to be state-of-the-art: it is not part of the training objective and is only used to monitor swap quality. If you want to train your own classifier, you can run classifier.py with the following parameters:

python classifier.py

# Main parameters
--img_sz 256                  # image size
--img_fm 3                    # number of feature maps
--attr "*"                    # attributes list. "*" for all attributes

# Network architecture
--init_fm 32                  # number of feature maps in the first layer
--max_fm 512                  # maximum number of feature maps
--hid_dim 512                 # hidden layer size

# Training parameters
--v_flip False                # randomly flip images vertically (data augmentation)
--h_flip True                 # randomly flip images horizontally (data augmentation)
--batch_size 32               # batch size
--optimizer "adam,lr=0.0002"  # optimizer
--clip_grad_norm 5            # clip gradient L2 norm
--n_epochs 1000               # number of epochs
--epoch_size 50000            # number of images per epoch

# Reload
--reload ""                   # reload a trained classifier
--debug False                 # debug mode (if True, load a small subset of the dataset)
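
For example, a plausible invocation that trains a classifier on a single attribute, leaving the other parameters at the defaults above (the attribute choice is only illustrative):

python classifier.py --attr "Eyeglasses" --img_sz 256 --batch_size 32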

Train a Fader Network

You can train a Fader Network with train.py. The autoencoder can receive feedback from:

  • The image reconstruction loss
  • The latent discriminator loss
  • The PatchGAN discriminator loss
  • The classifier loss

In the paper, only the first two losses are used, but the other two may further improve the results. You can tune the impact of each loss with the lambda_ae, lambda_lat_dis, lambda_ptc_dis, and lambda_clf_dis coefficients. Below is a complete list of all parameters (an example command follows the list):

# Main parameters
--img_sz 256                      # image size
--img_fm 3                        # number of feature maps
--attr "Male"                     # attributes list. "*" for all attributes

# Networks architecture
--instance_norm False             # use instance normalization instead of batch normalization
--init_fm 32                      # number of feature maps in the first layer
--max_fm 512                      # maximum number of feature maps
--n_layers 6                      # number of layers in the encoder / decoder
--n_skip 0                        # number of skip connections
--deconv_method "convtranspose"   # deconvolution method
--hid_dim 512                     # hidden layer size
--dec_dropout 0                   # dropout in the decoder
--lat_dis_dropout 0.3             # dropout in the latent discriminator

# Training parameters
--n_lat_dis 1                     # number of latent discriminator training steps
--n_ptc_dis 0                     # number of PatchGAN discriminator training steps
--n_clf_dis 0                     # number of classifier training steps
--smooth_label 0.2                # smooth discriminator labels
--lambda_ae 1                     # autoencoder loss coefficient
--lambda_lat_dis 0.0001           # latent discriminator loss coefficient
--lambda_ptc_dis 0                # PatchGAN discriminator loss coefficient
--lambda_clf_dis 0                # classifier loss coefficient
--lambda_schedule 500000          # lambda scheduling (0 to disable)
--v_flip False                    # randomly flip images vertically (data augmentation)
--h_flip True                     # randomly flip images horizontally (data augmentation)
--batch_size 32                   # batch size
--ae_optimizer "adam,lr=0.0002"   # autoencoder optimizer
--dis_optimizer "adam,lr=0.0002"  # discriminator optimizer
--clip_grad_norm 5                # clip gradient L2 norm
--n_epochs 1000                   # number of epochs
--epoch_size 50000                # number of images per epoch

# Reload
--ae_reload ""                    # reload pretrained autoencoder
--lat_dis_reload ""               # reload pretrained latent discriminator
--ptc_dis_reload ""               # reload pretrained PatchGAN discriminator
--clf_dis_reload ""               # reload pretrained classifier
--eval_clf ""                     # evaluation classifier (trained with classifier.py)
--debug False                     # debug mode (if True, load a small subset of the dataset)
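
For example, the following command trains a Fader Network on the "Male" attribute with the two losses used in the paper, using the provided pretrained classifier to monitor swap quality (all other parameters at the defaults above):

python train.py --attr "Male" --lambda_ae 1 --lambda_lat_dis 0.0001 --eval_clf models/classifier256.pth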

Generate interpolations

Given a trained model, you can use it to swap attributes of images in the dataset. Below are examples using the pretrained models:

# Narrow Eyes
python interpolate.py --model_path models/narrow_eyes.pth --n_images 10 --n_interpolations 10 --alpha_min 10.0 --alpha_max 10.0 --output_path narrow_eyes.png

# Eyeglasses
python interpolate.py --model_path models/eyeglasses.pth --n_images 10 --n_interpolations 10 --alpha_min 2.0 --alpha_max 2.0 --output_path eyeglasses.png

# Age
python interpolate.py --model_path models/young.pth --n_images 10 --n_interpolations 10 --alpha_min 10.0 --alpha_max 10.0 --output_path young.png

# Gender
python interpolate.py --model_path models/male.pth --n_images 10 --n_interpolations 10 --alpha_min 2.0 --alpha_max 2.0 --output_path male.png

# Pointy nose
python interpolate.py --model_path models/pointy_nose.pth --n_images 10 --n_interpolations 10 --alpha_min 10.0 --alpha_max 10.0 --output_path pointy_nose.png

These commands generate grids of 10 rows and 12 columns of interpolated images. In each row, the first column is the original image, the second is its reconstruction (without alteration of the attribute), and the remaining columns are the interpolations. alpha_min and alpha_max define the range of the interpolation. Values greater than 1 extrapolate beyond the True / False range of the boolean attribute seen during training. Note that the variations of some attributes may only be noticeable for large values of alpha: for the "eyeglasses" or "gender" attributes, alpha_max=2 is usually enough, while for the "age" or "narrow eyes" attributes it is better to go up to alpha_max=10.
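
Conceptually, interpolation decodes a single latent state with the attribute value swept across a range of alphas. The sketch below assumes a boolean attribute encoded as a two-dimensional vector (1 - alpha, alpha) and a symmetric sweep from -alpha_min to alpha_max; the exact mapping is defined in interpolate.py, and all names here are hypothetical:

import torch

def sweep_attribute(encoder, decoder, x, n_interpolations, alpha_min, alpha_max):
    # Encode once, then decode the same latent state with the attribute value
    # swept across the range; |alpha| > 1 extrapolates beyond True/False.
    z = encoder(x.unsqueeze(0))
    outputs = []
    for alpha in torch.linspace(-alpha_min, alpha_max, n_interpolations).tolist():
        y = torch.tensor([[1.0 - alpha, alpha]])  # assumed 2D encoding of a boolean attribute
        outputs.append(decoder(z, y))
    return torch.cat(outputs, 0)  # (n_interpolations, 3, img_sz, img_sz)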

References

If you find this code useful, please consider citing:

Fader Networks: Manipulating Images by Sliding Attributes - G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, M'A. Ranzato

@inproceedings{lample2017fader,
  title={Fader Networks: Manipulating Images by Sliding Attributes},
  author={Lample, Guillaume and Zeghidour, Neil and Usunier, Nicolas and Bordes, Antoine and Denoyer, Ludovic and Ranzato, Marc'Aurelio},
  booktitle={Advances in Neural Information Processing Systems},
  pages={5963--5972},
  year={2017}
}

Contact: [email protected], [email protected]
