Code for "Finetuning Pretrained Transformers into Variational Autoencoders"

Overview

transformers-into-vaes

Code for Finetuning Pretrained Transformers into Variational Autoencoders (our submission to NLP Insights Workshop 2021).

Gathering data used in the paper:

  1. Download all data (penn, snli, yahoo, yelp) from this repository.

  2. Change the data path in base_models.py accordingly (a sketch of what this might look like follows the list).
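
A minimal sketch of the kind of change step 2 refers to, assuming base_models.py builds its dataset paths from a base directory; the constant `DATA_DIR` and the helper below are hypothetical names, so adapt them to whatever the file actually uses.

    import os

    # Hypothetical example: point this at the folder containing penn/snli/yahoo/yelp.
    DATA_DIR = "/path/to/downloaded/data"

    def dataset_path(name: str, split: str) -> str:
        # e.g. dataset_path("snli", "train") -> "/path/to/downloaded/data/snli/train.txt"
        return os.path.join(DATA_DIR, name, f"{split}.txt")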

Running experiments:

  1. Install dependencies:
pip install -r requirements.txt
  2. Run phase 1 (encoder-only training; a sketch of the two-phase setup follows this list):
./run_encoder_training snli
  3. Run phase 2 (full training):
./run_training snli <path_to_checkpoint_from_phase_1>
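
The sketch below illustrates the two-phase recipe named in steps 2 and 3, assuming a T5-style encoder-decoder VAE. It is not the repository's actual training script; `model.decoder` and `train_one_epoch` are hypothetical placeholders.

    import torch

    def set_requires_grad(module: torch.nn.Module, flag: bool) -> None:
        for p in module.parameters():
            p.requires_grad = flag

    def two_phase_finetune(model, train_one_epoch, encoder_epochs=5, full_epochs=10):
        # Phase 1: train only the encoder (and latent projections); the decoder stays frozen.
        set_requires_grad(model.decoder, False)
        for _ in range(encoder_epochs):
            train_one_epoch(model)

        # Phase 2: unfreeze the decoder and finetune the full VAE.
        set_requires_grad(model.decoder, True)
        for _ in range(full_epochs):
            train_one_epoch(model)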

Calculating metrics:

python evaluate_all.py -d snli -bs 256 -c <path_to_config_file> -ckpt <path_to_checkpoint_file> 
Comments
  • Mismatch dimension while generate

    Hello!

    I am now trying to run inference with the trained model, but I get a dimension mismatch error. Here are the full logs:

    /opt/conda/lib/python3.7/site-packages/transformers/models/t5/tokenization_t5_fast.py:166: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
    For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
    - Be aware that you SHOULD NOT rely on t5-small automatically truncating your input to 512 when padding/encoding.
    - If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
    - To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
      FutureWarning,
    /opt/conda/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py:23: LightningDeprecationWarning: pytorch_lightning.core.lightning.LightningModule has been deprecated in v1.7 and will be removed in v1.9. Use the equivalent class from the pytorch_lightning.core.module.LightningModule class instead.
      "pytorch_lightning.core.lightning.LightningModule has been deprecated in v1.7"
    Downloading: 100%|███████████████████████████| 231M/231M [00:04<00:00, 55.9MB/s]
    
    --------------------------------------------------------------------------------
    ./transformers-into-vaes/finetune.py 280 <module>
    use_cache=True,
    
    /kaggle/working/transformers-into-vaes/generate.py 89 generate
    sampled_z=sampled_z,
    
    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    /kaggle/working/transformers-into-vaes/vendor_t5.py 142 forward
    return_dict=return_dict,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...ib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 1044 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 671 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 577 forward
    output_attentions=output_attentions,

    .../conda/lib/python3.7/site-packages/torch/nn/modules/module.py 1110 _call_impl
    return forward_call(*input, **kwargs)

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 500 forward
    hidden_states, self.k, key_value_states, past_key_value[0] if past_key_value is not None else None

    ...lib/python3.7/site-packages/transformers/models/t5/modeling_t5.py 489 project
    hidden_states = torch.cat([past_key_value, hidden_states], dim=2)

    RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 64 but got size 1 for tensor number 1 in the list.
    

    It seems the problem occurs in the decoding phase and then goes deep into T5's internal classes. I am trying to figure out what the problem is, but I can't find it. Could you please help look into this issue?

    Thank you very much!

    opened by fhrzn 2
  • KL Thresholding

    Hello again! Thank you for the great code. As stated in the paper, there is a KL thresholding technique, but I am having trouble finding this function in the code. Could you please point out where I can find it?

    Thank you very much!

    opened by fhrzn 2
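
Regarding the question above: KL thresholding is commonly implemented as a free-bits style clamp, where each latent dimension's KL term is bounded from below so the optimizer gains nothing from collapsing it further. The snippet below is a hedged illustration of that general pattern, not necessarily the repository's exact function, and `threshold` is a hypothetical hyperparameter.

    import torch

    def thresholded_kl(mu: torch.Tensor, logvar: torch.Tensor, threshold: float = 0.1) -> torch.Tensor:
        # KL( N(mu, diag(exp(logvar))) || N(0, I) ), computed per latent dimension.
        kl_per_dim = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp())
        # Clamp each dimension's KL from below ("free bits"): dimensions whose KL falls
        # under the threshold contribute a constant, so there is no gradient pressure
        # to collapse them further.
        return torch.clamp(kl_per_dim, min=threshold).sum(-1).mean()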
  • training loss always `nan`

    Hello, first of all thanks for your amazing work! I am trying to run your code as-is in Google Colab, but the training loss is always nan when encoder pretraining starts. Out of 15 epochs, training stops after the first epoch because the EarlyStopping callback notices that the loss has been infinite from the beginning. If possible, could you please try to reproduce my error?

    My environment:

    • Transformers == 4.23.1
    • Pytorch Lightning == 1.7.7

    Looking forward to your response, thank you!

    opened by fhrzn 2
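
For debugging runs like the one described above, two standard PyTorch utilities help locate where a loss first becomes non-finite; the wrapper name `check_finite` below is just illustrative, and where to call it in the training step is up to the user.

    import torch

    # Raise an error with a traceback at the autograd op that first produces nan/inf.
    torch.autograd.set_detect_anomaly(True)

    def check_finite(loss: torch.Tensor, step: int) -> None:
        # Call this on the training loss each step to fail fast instead of
        # silently feeding nan into the early-stopping callback.
        if not torch.isfinite(loss):
            raise RuntimeError(f"Non-finite loss {loss.item()} at step {step}")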
  • Confusion about the mutual information metric

    Hello, thank you very much for making the code available. I'm confused about the mutual information math, more specifically about the line

    # E_{q(z|x)}[log q(z|x)] = -0.5 * nz * log(2*pi) - 0.5 * (1 + logvar).sum(-1)
    neg_entropy = (-0.5 * nz * math.log(2 * math.pi) - 0.5 * (1 + logv).sum(-1)).mean()

    When I derive it, it gives me neg_entropy = (-0.5 * nz * math.log(2 * math.pi) - 0.5 * (logv).sum(-1)).mean()

    So I think I must have made a mistake somewhere? Thank you

    opened by smolPixel 2
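
For readers comparing the two expressions above, here is a hedged numerical check (not part of the repository) of the closed-form negative entropy of a diagonal Gaussian; the extra `+1` per dimension comes from E_q[(z - mu)^2 / var] = 1.

    import math
    import torch

    torch.manual_seed(0)
    nz = 32
    mu, logv = torch.randn(nz), torch.randn(nz)
    std = torch.exp(0.5 * logv)

    # Closed form: E_q[log q(z)] = -0.5 * nz * log(2*pi) - 0.5 * sum(1 + logvar)
    closed_form = -0.5 * nz * math.log(2 * math.pi) - 0.5 * (1 + logv).sum()

    # Monte Carlo estimate of E_q[log q(z)] for comparison.
    z = mu + std * torch.randn(200_000, nz)
    log_q = (-0.5 * math.log(2 * math.pi) - 0.5 * logv - 0.5 * (z - mu) ** 2 / logv.exp()).sum(-1)
    print(closed_form.item(), log_q.mean().item())  # the two values agree closely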
  • passing projected latent space

    Thanks for this nice work and the reproducible code. If I understood your approach correctly, I think I am missing the line of code where you pass the latent vector z to the decoder as encoder_hidden_states. I would be glad if you could point out that line. Thanks.

    opened by safakkbilici 2
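
On the question above: a common way to feed a latent vector to a T5 decoder is to project it to the model dimension and pass it through the decoder's `encoder_hidden_states` argument. The sketch below illustrates that general pattern with the Hugging Face T5 classes; it is not the repository's exact code, and `latent_to_hidden` is a hypothetical projection layer.

    import torch
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    latent_dim = 32
    latent_to_hidden = torch.nn.Linear(latent_dim, model.config.d_model)

    z = torch.randn(1, latent_dim)             # sampled latent vector
    hidden = latent_to_hidden(z).unsqueeze(1)  # (batch, 1, d_model): a length-1 "encoder" sequence

    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    decoder_out = model.decoder(input_ids=decoder_input_ids, encoder_hidden_states=hidden)
    logits = model.lm_head(decoder_out.last_hidden_state)  # next-token logits over the vocabulary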
Owner
Seongmin Park
NLP Researcher at actionpower.kr. Maintainer of @hamanlp.