Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

Overview

SongNet

SongNet: SongCi + Song (Lyrics) + Sonnet + etc.

@inproceedings{li-etal-2020-rigid,
    title = "Rigid Formats Controlled Text Generation",
    author = "Li, Piji and Zhang, Haisong and Liu, Xiaojiang and Shi, Shuming",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.68",
    doi = "10.18653/v1/2020.acl-main.68",
    pages = "742--751"
}

Run

  • python prepare_data.py
  • ./train.sh

Evaluation

  • Modify test.py: set m_path to the checkpoint that performs best on the dev set (a sketch follows this list)
  • ./test.sh
  • python metrics.py
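
A minimal sketch of the m_path edit, assuming a checkpoint name of the epoch{E}_batch_{B} form seen in the training logs below (the exact path is hypothetical; init_model and the vocab path are as they appear in test.py):

    # test.py -- point m_path at the checkpoint that scored best on dev.
    m_path = "./ckpt/epoch10_batch_1499"  # hypothetical path; substitute your own best checkpoint
    # test.py then loads it via (call as shown in the issue tracebacks below):
    # lm_model, lm_vocab, lm_args = init_model(m_path, gpu, "./model/vocab.txt")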

Polish

  • ./polish.sh

Download

Reference

Comments
  • How should incremental_state in the transformer module be understood?

    Hello, I have recently been doing some research on AI poetry generation and have been studying your code. When reading your transformer module, however, I could not quite follow your implementation of incremental_state, since I had not seen a similar implementation in other papers or transformer codebases (possibly just because I have not read that much code). How should your incremental_state, and the 'bidx' entry inside it, be understood? Is this transformer implementation a known way of speeding up the original transformer, and is there a paper or write-up about it? Any guidance would be appreciated!

    Thanks!
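
    For readers with the same question: in fairseq-style decoders, incremental_state names a per-layer cache of the keys and values of already-generated positions, so each decoding step only computes attention for the newest token instead of re-running the whole prefix. A minimal sketch of that convention, on the assumption that SongNet's transformer.py follows it (illustrative, not the repo's code):

    import torch

    def attend_incremental(q_t, k_t, v_t, cache):
        """One decoding step. q_t/k_t/v_t: [batch, heads, 1, dim] for the new token."""
        if "prev_key" in cache:                                # reuse cached history
            k = torch.cat([cache["prev_key"], k_t], dim=2)     # [B, H, t, D]
            v = torch.cat([cache["prev_value"], v_t], dim=2)
        else:                                                  # first step: no history yet
            k, v = k_t, v_t
        cache["prev_key"], cache["prev_value"] = k, v          # extend the cache
        attn = torch.softmax(q_t @ k.transpose(-2, -1) / k.shape[-1] ** 0.5, dim=-1)
        return attn @ v                                        # [B, H, 1, D]

    Under this reading, the bidx indexing that shows up in a traceback further down (prev_key = prev_key[bidx]) would be a batch index keeping only the cache rows of sequences still being decoded; that interpretation is an assumption, not confirmed by the author.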

    opened by Imposingapple 5
  • Mismatch between ys_tpl and xs_tpl at training vs. generation time

    At training time, ys_tpl and xs_tpl (which carry the format and rhyme information) contain only c0, c2, and c1, but at generation (polish) time the corresponding ys_tpl and xs_tpl also contain already-given words, e.g. C={c0,c0,love,c1,,bends,c0,remove,c1,,}. 1) Can the model really pick up the word information in ys_tpl and xs_tpl at generation time, given that it never saw such input during training? And if only 20% or even fewer of the words are missing, can the model still generate properly? 2) In the code, the placeholder for a missing word is uniformly replaced with c1; doesn't this overlook the case where the missing word sits at a rhyming position, i.e. where it should be replaced with c2?
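
    To make the contrast concrete, a hypothetical sketch of the two template shapes described above (assuming, per this issue, that c2 marks a rhyming slot and c0/c1 the remaining format slots; this is not the repo's prepare code):

    # Training-style template: format symbols only (c2 assumed to mark the
    # rhyming slot, c0/c1 the remaining format slots, per this issue).
    train_tpl = ["c0", "c0", "c0", "c2", "c1"]
    # Polish-style template: some slots already carry fixed words;
    # "love" is kept verbatim while the remaining slots are generated.
    polish_tpl = ["c0", "c0", "love", "c2", "c1"]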

    opened by jackbyebye 3
  • How to train with multiple GPUs?

    I tried setting both world_size and gpus in train.sh to 8 and got this error:

    label_smoothing.py", line 15, in __init__
        self.one_hot = torch.full((1, size), self.smoothing_value).to(device)
    RuntimeError: CUDA error: invalid device ordinal

    What should I do?

    PS: I also noticed a small problem: no matter how CUDA_VISIBLE_DEVICES is set, single-GPU training always uses the second GPU. I am trying to work that out.
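
    A hedged note on the usual cause: "invalid device ordinal" means a process requested a CUDA index at or beyond torch.cuda.device_count(). In multi-process training each worker should select its local GPU index before any .to(device) call; a sketch (argument names are illustrative, not the repo's):

    import torch

    def pick_device(local_rank):
        n = torch.cuda.device_count()
        assert local_rank < n, f"local_rank {local_rank}, but only {n} visible GPUs"
        torch.cuda.set_device(local_rank)        # make cuda:<local_rank> the default
        return torch.device("cuda", local_rank)

    # label_smoothing.py's buffer would then land on the right card, e.g.:
    # self.one_hot = torch.full((1, size), self.smoothing_value).to(pick_device(rank))

    On the PS: CUDA_VISIBLE_DEVICES is only honored if it is set before the process initializes CUDA, so export it in the shell before launching rather than assigning it in Python after torch has already touched the GPU.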

    opened by ChaooMa 3
  • Running ./train.sh on Google Colab raises an error; not sure whether it is caused by the torch version


    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [136, 16, 2304]], which is output 0 of AddBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    version: torch-1.7.0+cu101, python-3.6.9

    Prof. Li, could you tell us which torch version you used?
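
    Until the author's version is known, the error message's own hint can localize the offending op. A self-contained sketch of the failure class and the debugging switch (toy tensors, not SongNet's code):

    import torch

    torch.autograd.set_detect_anomaly(True)  # traceback will now name the bad op

    x = torch.randn(4, requires_grad=True)
    y = x.exp()   # exp() saves its output y for the backward pass
    y += 1        # in-place update bumps y's version; backward will fail
    try:
        y.sum().backward()
    except RuntimeError as e:
        print(e)  # "... modified by an inplace operation ..."

    # The fix is to make the offending update out-of-place, e.g. y = y + 1.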

    opened by smartmark-pro 3
  • Error when running ./test.sh

    Hello. After training finished, I changed m_path in test.py to the path of the latest checkpoint in the results.

    But running ./test.sh raises this error:

    Traceback (most recent call last):
      File "test.py", line 34, in <module>
        lm_model, lm_vocab, lm_args = init_model(m_path, gpu, "./model/vocab.txt")
      File "test.py", line 28, in init_model
        lm_model.load_state_dict(ckpt['model'])
      File "/usr/local/anaconda3/envs/GPT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for BIGLM:
        size mismatch for tok_embed.weight: copying a param with shape torch.Size([6410, 768]) from checkpoint, the shape in current model is torch.Size([28781, 768]).
        size mismatch for out_proj.weight: copying a param with shape torch.Size([6410, 768]) from checkpoint, the shape in current model is torch.Size([28781, 768]).
        size mismatch for out_proj.bias: copying a param with shape torch.Size([6410]) from checkpoint, the shape in current model is torch.Size([28781]).

    What could be causing this? Many thanks.
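
    A hedged diagnostic: the two sizes (6410 vs 28781 rows of tok_embed.weight) indicate the checkpoint and the model in test.py were built from different vocab.txt files. A sketch that compares the two counts directly (key names follow the traceback above; the vocab file may additionally gain special tokens in code):

    import torch

    m_path = "epoch10_batch_1499"  # hypothetical; the same path set in test.py
    ckpt = torch.load(m_path, map_location="cpu")
    print("checkpoint vocab size:", ckpt["model"]["tok_embed.weight"].shape[0])

    with open("./model/vocab.txt", encoding="utf8") as f:  # path from init_model
        print("vocab.txt lines:", sum(1 for _ in f))

    # If the two numbers disagree, test.py is paired with a vocab.txt that does
    # not match the one in effect when the checkpoint was trained (e.g. the
    # released model's vocab vs. one regenerated by prepare_data.py).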

    opened by Asuka0002 2
  • Bad results..

    Hi, I trained on a small dataset:

    /content/SongNet
    9
    2300 667 2599
    7
    2
    9
    vocab
    done
    vocab.size = 1215
    batch_acm 99, loss 5.277, acc 0.102, nll 6.265, ppl 86.928, x_acm 1584, lr 0.000002
    batch_acm 199, loss 3.566, acc 0.262, nll 4.273, ppl 20.652, x_acm 3179, lr 0.000005
    batch_acm 299, loss 2.664, acc 0.323, nll 3.223, ppl 9.619, x_acm 4774, lr 0.000008
    batch_acm 399, loss 2.123, acc 0.396, nll 2.580, ppl 6.136, x_acm 6374, lr 0.000010
    batch_acm 499, loss 1.807, acc 0.452, nll 2.195, ppl 4.706, x_acm 7969, lr 0.000013
    validating...
    epoch-3-acm-499 nll= 1.846888825275015 ppl= 3.7909837519747205 count= 667.0
    batch_acm 599, loss 1.580, acc 0.502, nll 1.921, ppl 3.879, x_acm 9564, lr 0.000015
    batch_acm 699, loss 1.370, acc 0.561, nll 1.673, ppl 3.265, x_acm 11164, lr 0.000018
    batch_acm 799, loss 1.151, acc 0.631, nll 1.418, ppl 2.733, x_acm 12759, lr 0.000020
    batch_acm 899, loss 0.934, acc 0.701, nll 1.167, ppl 2.290, x_acm 14354, lr 0.000023
    batch_acm 999, loss 0.695, acc 0.787, nll 0.890, ppl 1.882, x_acm 15954, lr 0.000025
    validating...
    epoch-6-acm-999 nll= 0.3973974670427314 ppl= 1.3300093074609851 count= 667.0
    batch_acm 1099, loss 0.494, acc 0.856, nll 0.652, ppl 1.589, x_acm 17549, lr 0.000028
    batch_acm 1199, loss 0.332, acc 0.913, nll 0.460, ppl 1.384, x_acm 19144, lr 0.000030
    batch_acm 1299, loss 0.223, acc 0.945, nll 0.330, ppl 1.260, x_acm 20739, lr 0.000033
    batch_acm 1399, loss 0.157, acc 0.966, nll 0.252, ppl 1.192, x_acm 22339, lr 0.000035
    batch_acm 1499, loss 0.117, acc 0.975, nll 0.208, ppl 1.156, x_acm 23934, lr 0.000038
    validating...
    epoch-10-acm-1499 nll= 0.08431253172289664 ppl= 1.060383505013393 count= 667.0
    training time: 453sec.
    

    and the test result is unreadable text after executing polish.sh with my checkpoint epoch10_batch_1499 and my vocab.txt.

    PS: my edited polish_tpl.txt:

    ['Gufd<s1>327711<s2>_____,____ менять.______ _____ сейчас. _________ любимый. ______ _____ много.']
    0.7558178901672363
    
    

    result:

    Gufd<s1>327711<s2>_____,____ менять.______ _____ сейчас. _________ любимый. ______ _____ много.
    <bos>По-ти, мув менять.шозыха сйшаб с</s>
    
    opened by pavelxx1 1
  • Could you please explain more about Integrity evaluation?

    I cannot find the code for evaluating Integrity. Could you please release it, along with the pretrained GPT2 model mentioned in your paper? Thanks.

    opened by DuYooho 1
  • How to generate text containing fixed text information?

    Hello, as shown in Table 6 of the paper, you mention that "our model has the ability of refining and polishing given the format C which contains some fixed text information". Could you please tell me how to do this specifically? 😊 You could reply in Chinese if you like, thanks a lot!

    opened by cdxeve 1
  • Running ./test raises "IndexError: The shape of the mask [1] at index 0 does not match"

    Hello Prof. Li. Your current code runs without any problems for me, but when I move to data I collected myself, an error occurs.

    The exact traceback:

    Traceback (most recent call last):
      File "test.py", line 359, in <module>
        res = top_k_inc(enc, src_padding_mask, ys_tpl, ys_seg, ys_pos, s)
      File "test.py", line 61, in top_k_inc
        incremental_state)
      File "/content/SongNet/biglm.py", line 91, in work_incremental
        incremental_state=incremental_state)
      File "/content/SongNet/transformer.py", line 73, in work_incremental
        attn_mask=self_attn_mask, incremental_state=incremental_state)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/SongNet/transformer.py", line 156, in forward
        prev_key = prev_key[bidx]
    IndexError: The shape of the mask [1] at index 0 does not match the shape of the indexed tensor [2, 12, 1, 64] at index 0
    

    Symptoms: train, eval, and polish all work fine, but when test reaches certain examples this error appears; that is, some examples predict and print normally and some do not.

    Temporary workaround: I added a try/except in test to skip the failing examples.

    Hoping for a fix: I read the code around prev_key in the transformer but could not make sense of the error. If you ran into something similar while debugging, could you give a hint on how to solve it?
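
    For anyone needing the same stopgap, a sketch of that try/except around the failing call (variable names are exactly those in the traceback's test.py line 359; this skips the bidx problem rather than fixing it):

    try:
        res = top_k_inc(enc, src_padding_mask, ys_tpl, ys_seg, ys_pos, s)
    except IndexError as e:
        print("skipping one example:", e)  # some inputs trip prev_key[bidx]
        res = None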

    opened by smartmark-pro 0
  • Evaluation criteria

    The paper lists many criteria, including rhyme- and tpl-related losses, but the only thing actually backpropagated in the code seems to be nll/ppl. During pre-training, are the format- and rhyme-related losses computed and backpropagated as well?

    Also, the paper mentions beam-search generation, but in the code it only appears in test.py. Did you find that top-k sampling generates better results than beam search? And why is beam search not used for polish?

    opened by MianWang123 0