Code for "Finetuning Pretrained Transformers into Variational Autoencoders"

Overview

transformers-into-vaes

Code for Finetuning Pretrained Transformers into Variational Autoencoders (our submission to NLP Insights Workshop 2021).

Gathering data used in the paper:

  1. Download all data (penn, snli, yahoo, yelp) from this repository.

  2. Change the data path in base_models.py accordingly (a sketch of what this might look like follows the list).
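
A minimal sketch of the kind of change step 2 refers to, assuming base_models.py builds its dataset paths from a base directory; the constant `DATA_DIR` and the helper below are hypothetical names, so adapt them to whatever the file actually uses.

    import os

    # Hypothetical example: point this at the folder containing penn/snli/yahoo/yelp.
    DATA_DIR = "/path/to/downloaded/data"

    def dataset_path(name: str, split: str) -> str:
        # e.g. dataset_path("snli", "train") -> "/path/to/downloaded/data/snli/train.txt"
        return os.path.join(DATA_DIR, name, f"{split}.txt")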

Running experiments:

  1. Install dependencies:
pip install -r requirements.txt
  2. Run phase 1 (encoder-only training; a sketch of the two-phase setup follows this list):
./run_encoder_training snli
  3. Run phase 2 (full training):
./run_training snli <path_to_checkpoint_from_phase_1>
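
The sketch below illustrates the two-phase recipe named in steps 2 and 3, assuming a T5-style encoder-decoder VAE. It is not the repository's actual training script; `model.decoder` and `train_one_epoch` are hypothetical placeholders.

    import torch

    def set_requires_grad(module: torch.nn.Module, flag: bool) -> None:
        for p in module.parameters():
            p.requires_grad = flag

    def two_phase_finetune(model, train_one_epoch, encoder_epochs=5, full_epochs=10):
        # Phase 1: train only the encoder (and latent projections); the decoder stays frozen.
        set_requires_grad(model.decoder, False)
        for _ in range(encoder_epochs):
            train_one_epoch(model)

        # Phase 2: unfreeze the decoder and finetune the full VAE.
        set_requires_grad(model.decoder, True)
        for _ in range(full_epochs):
            train_one_epoch(model)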

Calculating metrics:

python evaluate_all.py -d snli -bs 256 -c <path_to_config_file> -ckpt <path_to_checkpoint_file> 
Comments
  • Mismatch dimension while generate

    Hello!

    I am now trying to run inference with the trained model, but I get a dimension mismatch error. Here are the full logs:

    /opt/conda/lib/python3.7/site-packages/transformers/models/t5/tokenization_t5_fast.py:166: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
    For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
    - Be aware that you SHOULD NOT rely on t5-small automatically truncating your input to 512 when padding/encoding.
    - If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
    - To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
      FutureWarning,
    /opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py:23: LightningDeprecationWarning: pytorch_lightning.core.lightning.LightningModule has been deprecated in v1.7 and will be removed in v1.9. Use the equivalent class from the pytorch_lightning.core.module.LightningModule class instead.
      "pytorch_lightning.core.lightning.LightningModule has been deprecated in v1.7"
    Downloading: 100%|███████████████████████████| 231M/231M [00:04<00:00, 55.9MB/s]
    
    --------------------------------------------------------------------------------
    ./transformers-into-vaes/finetune.py 280 <module>
    use_cache=True,
    
    /kaggle/working/transformers-into-vaes/generate.py 89 generate
    sampled_z=sampled_z,
    
    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    /kaggle/working/transformers-into-vaes/vendor_t5.py 142 forward
    return_dict=return_dict,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...ib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 1044 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 671 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 577 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 500 forward
    hidden_states, self.k, key_value_states, past_key_value[0] if past_key_value is not None else None

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 489 project
    hidden_states = torch.cat([past_key_value, hidden_states], dim=2)

    RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 64 but got size 1 for tensor number 1 in the list.
    

    It seems the problem occurs in the decoding phase and then goes deep into T5's internal classes. I am trying to figure out what the problem is, but I can't find it. Could you please help look into this issue?

    Thank you very much!

    opened by fhrzn 2
  • KL Thresholding

    Hello again! Thank you for the great code. As stated in the paper, there is a KL thresholding technique, but I am having trouble finding this function in the code. Could you please point out where I can find it?

    Thank you very much!

    opened by fhrzn 2
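
Regarding the question above: KL thresholding is commonly implemented as a free-bits style clamp, where each latent dimension's KL term is bounded from below so the optimizer gains nothing from collapsing it further. The snippet below is a hedged illustration of that general pattern, not necessarily the repository's exact function, and `threshold` is a hypothetical hyperparameter.

    import torch

    def thresholded_kl(mu: torch.Tensor, logvar: torch.Tensor, threshold: float = 0.1) -> torch.Tensor:
        # KL( N(mu, diag(exp(logvar))) || N(0, I) ), computed per latent dimension.
        kl_per_dim = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp())
        # Clamp each dimension's KL from below ("free bits"): dimensions whose KL falls
        # under the threshold contribute a constant, so there is no gradient pressure
        # to collapse them further.
        return torch.clamp(kl_per_dim, min=threshold).sum(-1).mean()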
  • training loss always `nan`

    Hello, first of all thanks for your amazing work! I am trying to run your code as-is in Google Colab, but the training loss is always nan when encoder pretraining starts. Out of 15 epochs, training stops after the first epoch because the EarlyStopping callback notices that the loss has been infinite from the beginning. If possible, could you please try to reproduce my error?

    My environment:

    • Transformers == 4.23.1
    • Pytorch Lightning == 1.7.7

    Looking forward to your response, thank you!

    opened by fhrzn 2
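
For debugging runs like the one described above, two standard PyTorch utilities help locate where a loss first becomes non-finite; the wrapper name `check_finite` below is just illustrative, and where to call it in the training step is up to the user.

    import torch

    # Raise an error with a traceback at the autograd op that first produces nan/inf.
    torch.autograd.set_detect_anomaly(True)

    def check_finite(loss: torch.Tensor, step: int) -> None:
        # Call this on the training loss each step to fail fast instead of
        # silently feeding nan into the early-stopping callback.
        if not torch.isfinite(loss):
            raise RuntimeError(f"Non-finite loss {loss.item()} at step {step}")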
  • Confusion about the mutual information metric

    Hello, thank you very much for making the code available. I'm confused about the mutual information math, more specifically about the line

    # E_{q(z|x)}[log q(z|x)] = -0.5 * nz * log(2*pi) - 0.5 * (1 + logvar).sum(-1)
    neg_entropy = (-0.5 * nz * math.log(2 * math.pi) - 0.5 * (1 + logv).sum(-1)).mean()

    When I derive it, it gives me neg_entropy = (-0.5 * nz * math.log(2 * math.pi) - 0.5 * (logv).sum(-1)).mean()

    So I think I must have made a mistake somewhere? Thank you

    opened by smolPixel 2
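
For readers comparing the two expressions above, here is a hedged numerical check (not part of the repository) of the closed-form negative entropy of a diagonal Gaussian; the extra `+1` per dimension comes from E_q[(z - mu)^2 / var] = 1.

    import math
    import torch

    torch.manual_seed(0)
    nz = 32
    mu, logv = torch.randn(nz), torch.randn(nz)
    std = torch.exp(0.5 * logv)

    # Closed form: E_q[log q(z)] = -0.5 * nz * log(2*pi) - 0.5 * sum(1 + logvar)
    closed_form = -0.5 * nz * math.log(2 * math.pi) - 0.5 * (1 + logv).sum()

    # Monte Carlo estimate of E_q[log q(z)] for comparison.
    z = mu + std * torch.randn(200_000, nz)
    log_q = (-0.5 * math.log(2 * math.pi) - 0.5 * logv - 0.5 * (z - mu) ** 2 / logv.exp()).sum(-1)
    print(closed_form.item(), log_q.mean().item())  # the two values agree closely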
  • passing projected latent space

    Thanks for this nice work and the reproducible code. If I understood your approach correctly, I think I am missing the line of code where you pass the latent vector z to the decoder as encoder_hidden_states. I would be glad if you could point out that line. Thanks.

    opened by safakkbilici 2
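
On the question above: a common way to feed a latent vector to a T5 decoder is to project it to the model dimension and pass it through the decoder's `encoder_hidden_states` argument. The sketch below illustrates that general pattern with the Hugging Face T5 classes; it is not the repository's exact code, and `latent_to_hidden` is a hypothetical projection layer.

    import torch
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    latent_dim = 32
    latent_to_hidden = torch.nn.Linear(latent_dim, model.config.d_model)

    z = torch.randn(1, latent_dim)             # sampled latent vector
    hidden = latent_to_hidden(z).unsqueeze(1)  # (batch, 1, d_model): a length-1 "encoder" sequence

    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    decoder_out = model.decoder(input_ids=decoder_input_ids, encoder_hidden_states=hidden)
    logits = model.lm_head(decoder_out.last_hidden_state)  # next-token logits over the vocabulary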
Owner
Seongmin Park
NLP Researcher at actionpower.kr. Maintainer of @hamanlp.