ARAE

Code for the paper "Adversarially Regularized Autoencoders (ICML 2018)" by Zhao, Kim, Zhang, Rush and LeCun https://arxiv.org/abs/1706.04223

Disclaimer

Major updates on 06/11/2018:

  • replaced WGAN with WGAN-GP
  • added the 1B Word Benchmark experiment
  • added the Yelp transfer experiment
  • removed unnecessary tricks
  • added both RNNLM and n-gram LM evaluation for forward and reverse PPL (see the KenLM sketch below)
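
For context, forward PPL scores generated text under a language model trained on real data, while reverse PPL scores held-out real text under a language model trained on generated samples. Below is a minimal sketch of the n-gram side using the KenLM Python bindings; the ARPA paths and sentence lists are placeholders, not the repo's actual evaluation script:

    import kenlm

    def corpus_ppl(arpa_path, sentences):
        """Corpus-level perplexity of `sentences` under a KenLM ARPA model."""
        model = kenlm.Model(arpa_path)
        total_log10, total_words = 0.0, 0
        for sent in sentences:
            total_log10 += model.score(sent, bos=True, eos=True)
            total_words += len(sent.split()) + 1  # +1 for </s>
        return 10 ** (-total_log10 / total_words)

    # Forward PPL: LM trained on real data, scored on generated samples.
    # forward_ppl = corpus_ppl("real_lm.arpa", generated_sentences)
    # Reverse PPL: LM trained on generated samples, scored on held-out real text.
    # reverse_ppl = corpus_ppl("generated_lm.arpa", real_test_sentences)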

File structure

  • lang: ARAE for language generation, on both the 1B Word Benchmark and SNLI
  • yelp: ARAE for language style transfer
  • mnist (in Torch): ARAE for discretized MNIST

Reference

@ARTICLE{2017arXiv170604223J,
       author = {{Zhao}, J. and {Kim}, Y. and {Zhang}, K. and {Rush}, A.~M. and {LeCun}, Y.},
        title = "{Adversarially Regularized Autoencoders for Generating Discrete Structures}",
      journal = {ArXiv e-prints},
archivePrefix = "arXiv",
       eprint = {1706.04223},
 primaryClass = "cs.LG",
     keywords = {Computer Science - Learning, Computer Science - Computation and Language, Computer Science - Neural and Evolutionary Computing},
         year = 2017,
        month = jun,
       adsurl = {http://adsabs.harvard.edu/abs/2017arXiv170604223J},
      adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
Comments
  • Runtime error

    conda create -n pytorch python=3.5 anaconda
    source activate pytorch
    conda install pytorch torchvision cuda80 -c soumith
    python train.py --data_path PATH_TO_PROCESSED_DATA --cuda --kenlm_path PATH_TO_KENLM_DIRECTORY
    

    Training...
    Traceback (most recent call last):
      File "train.py", line 516, in <module>
        train_ae(train_data[niter], total_loss_ae, start_time, niter)
      File "train.py", line 352, in train_ae
        output = autoencoder(source, lengths, noise=True)
      File "/home/$USER/anaconda2/envs/pytorch/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/$USER/workspace/ARAE/pytorch/models.py", line 174, in forward
        hidden = self.encode(indices, lengths, noise)
      File "/home/$USER/workspace/ARAE/pytorch/models.py", line 202, in encode
        hidden = torch.div(hidden, norms.expand_as(hidden))
      File "/home/$USER/anaconda2/envs/pytorch/lib/python3.5/site-packages/torch/autograd/variable.py", line 725, in expand_as
        return Expand.apply(self, (tensor.size(),))
      File "/home/$USER/anaconda2/envs/pytorch/lib/python3.5/site-packages/torch/autograd/_functions/tensor.py", line 111, in forward
        result = i.expand(*new_size)
    RuntimeError: The expanded size of the tensor (300) must match the existing size (64) at non-singleton dimension 1. at /opt/conda/conda-bld/pytorch_1502008109146/work/torch/lib/THC/generic/THCTensor.c:323
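
    This looks like the keepdim behaviour change in newer PyTorch releases: torch.norm over dim 1 returns a 1-D tensor of shape (batch,), which expand_as can no longer broadcast against the (batch, nhidden) hidden state. A hedged workaround for the line in models.py, assuming the intent is to normalize each row to unit L2 norm (keyword form as in recent PyTorch):

        # Keep the reduced dimension so the (batch, 1) norms broadcast cleanly
        # against the (batch, nhidden) hidden states.
        norms = torch.norm(hidden, p=2, dim=1, keepdim=True)
        hidden = hidden / norms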

    opened by shaform 4
  • Loading model with low accuracy

    Hi! I found that when I load the model back using autoencoder.load_state_dict(torch.load(ae_path)) etc., it leads to low accuracy (around 0.3), even though the model reaches around 0.8 accuracy in evaluate_autoencoder during training. I have to fall back to torch.save(autoencoder, f) and torch.load(ae_path) to get around this problem. It is probably a PyTorch issue, as discussed here: https://discuss.pytorch.org/t/saving-and-loading-a-model-in-pytorch/2610/6
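
    For reference, a minimal sketch of the two save/load patterns mentioned above; the eval() call is an assumption worth checking, since dropout and input noise behave differently in training mode:

        import torch

        # Pattern 1: save/load parameters only (the state_dict route).
        torch.save(autoencoder.state_dict(), ae_path)
        autoencoder.load_state_dict(torch.load(ae_path))
        autoencoder.eval()  # disable dropout / noise before evaluating

        # Pattern 2: pickle the whole module object (the workaround in the comment).
        torch.save(autoencoder, ae_path)
        autoencoder = torch.load(ae_path)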

    opened by posenhuang 3
  • IndexError for last batch of last epoch

    I got this error while training a full model yesterday:

    Traceback (most recent call last):
      File "train.py", line 526, in <module>
        train_gan_d(train_data[random.randint(0, niter)])
    IndexError: list index out of range
    

    This happened for the last batch of the last epoch.

    The relevant code block:

    # train gan ----------------------------------
    for k in range(niter_gan):

        # train discriminator/critic
        for i in range(args.niters_gan_d):
            # feed a seen sample within this epoch; good for early training
            errD, errD_real, errD_fake = \
                train_gan_d(train_data[random.randint(0, niter)])
    

    Is it possible that train_data is populated with fewer elements than niter in the batchify function?

    The argument list I had was pretty minimal: python train.py --data_path /path/to/billion-word-data --cuda --no_earlystopping
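
    A hedged guess at a guard, assuming the failure comes from random.randint's inclusive upper bound producing an index one past the end of train_data on the final iteration (the variable names mirror the snippet above):

        import random

        # random.randrange excludes its upper bound, so the sampled index can
        # never run past the batches actually present in train_data.
        idx = random.randrange(min(niter + 1, len(train_data)))
        errD, errD_real, errD_fake = train_gan_d(train_data[idx])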

    opened by thvasilo 2
  • errG_real is not backpropagated in the training of the generator

    According to Algorithm 1 in the paper, L_G = errG_real - errG_fake. However, in your PyTorch implementation only errG_fake is backpropagated. Why? How does it affect performance if you backpropagate both terms?
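
    As a point of comparison, a hedged sketch of what including both terms might look like, mirroring the comment's formula; critic_real, critic_fake, and optimizer_gan_g are placeholder names for the critic's scores on encoded real data, its scores on generator samples, and the generator optimizer, not names taken from the repository:

        # Hypothetical generator step that backpropagates both critic terms,
        # i.e. L_G = errG_real - errG_fake as stated in the comment.
        optimizer_gan_g.zero_grad()
        errG = critic_real.mean() - critic_fake.mean()
        errG.backward()
        optimizer_gan_g.step()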

    opened by guxd 0
  • [snli] pretrained weights work, but training from scratch gives very poor results

    I found that the provided weights work quite well when I generate samples. However, if I train the model with the provided script (the same settings), the result is very poor. Are there any changes in the code?

    opened by yanghoonkim 0
  • Is there something wrong with the gan_d_loss?

    I found that in the function train_gan_d(), loss_fake and loss_real point in the opposite direction from the original paper. I was wondering whether this is correct or not.
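
    For orientation, the textbook WGAN-GP critic objective maximizes D(real) - D(fake); written as a minimization it looks like the sketch below. This is the standard formulation with placeholder names (disc, real_h, fake_h), not the repository's train_gan_d, and is meant only as a reference point for checking the sign of each term:

        import torch
        import torch.autograd as autograd

        def wgan_gp_critic_loss(disc, real_h, fake_h, lambda_gp=10.0):
            """Standard WGAN-GP critic loss, minimized by the critic."""
            loss_real = -disc(real_h).mean()   # push scores on real codes up
            loss_fake = disc(fake_h).mean()    # push scores on fake codes down

            # Gradient penalty on random interpolates between real and fake codes.
            alpha = torch.rand(real_h.size(0), 1, device=real_h.device)
            interp = (alpha * real_h + (1 - alpha) * fake_h).requires_grad_(True)
            grads = autograd.grad(disc(interp).sum(), interp, create_graph=True)[0]
            gp = ((grads.norm(2, dim=1) - 1.0) ** 2).mean()

            return loss_real + loss_fake + lambda_gp * gp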

    opened by shizhediao 0
  • How to use "offset vector transformations" to change text like in the paper?

    I am wondering how to use this code to perform the "offset vector transformation" examples in the paper, such as substituting "clapping" with "walking" and altering the entire input sentence to fit the new word.
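
    Not an authoritative answer, but the paper frames this as latent-space arithmetic, so a sketch could look like the following; encode_sentences and decode_codes are hypothetical helpers wrapping the repo's autoencoder, not functions it actually exports:

        import torch

        # Hypothetical offset-vector transformation:
        # 1. Encode sentence groups containing the source word ("clapping")
        #    and the target word ("walking") into the latent space.
        # 2. The offset is the difference of the two group means.
        # 3. Add the offset to an input sentence's code and decode it.
        with torch.no_grad():
            z_src = encode_sentences(sentences_with_clapping)  # (n, nhidden)
            z_tgt = encode_sentences(sentences_with_walking)   # (m, nhidden)
            offset = z_tgt.mean(dim=0) - z_src.mean(dim=0)

            z = encode_sentences([input_sentence])
            transformed = decode_codes(z + offset)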

    opened by styfeng 0
Owner
Junbo (Jake) Zhao
NYU PhD student / Facebook researcher