Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

Overview

SongNet

SongNet: SongCi + Song (Lyrics) + Sonnet + etc.

@inproceedings{li-etal-2020-rigid,
    title = "Rigid Formats Controlled Text Generation",
    author = "Li, Piji and Zhang, Haisong and Liu, Xiaojiang and Shi, Shuming",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.68",
    doi = "10.18653/v1/2020.acl-main.68",
    pages = "742--751"
}

Run

  • python prepare_data.py
  • ./train.sh

Evaluation

  • Modify test.py: set m_path to the checkpoint that performs best on the dev set (a sketch follows this list)
  • ./test.sh
  • python metrics.py
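
A minimal sketch of the m_path edit, assuming a checkpoint name of the epoch{E}_batch_{B} form seen in the training logs below (the exact path is hypothetical; init_model and the vocab path are as they appear in test.py):

    # test.py -- point m_path at the checkpoint that scored best on dev.
    m_path = "./ckpt/epoch10_batch_1499"  # hypothetical path; substitute your own best checkpoint
    # test.py then loads it via (call as shown in the issue tracebacks below):
    # lm_model, lm_vocab, lm_args = init_model(m_path, gpu, "./model/vocab.txt")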

Polish

  • ./polish.sh

Download

Reference

Comments
  • How should incremental_state in the transformer module be understood?

    Hello, I have recently been doing some research on AI poetry generation and have been studying your code. When reading your transformer module, however, I could not quite follow your implementation of incremental_state, since I had not seen a similar implementation in other papers or transformer codebases (possibly just because I have not read that much code). How should your incremental_state, and the 'bidx' entry inside it, be understood? Is this transformer implementation a known way of speeding up the original transformer, and is there a paper or write-up about it? Any guidance would be appreciated!

    Thanks!
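
    For readers with the same question: in fairseq-style decoders, incremental_state names a per-layer cache of the keys and values of already-generated positions, so each decoding step only computes attention for the newest token instead of re-running the whole prefix. A minimal sketch of that convention, on the assumption that SongNet's transformer.py follows it (illustrative, not the repo's code):

    import torch

    def attend_incremental(q_t, k_t, v_t, cache):
        """One decoding step. q_t/k_t/v_t: [batch, heads, 1, dim] for the new token."""
        if "prev_key" in cache:                                # reuse cached history
            k = torch.cat([cache["prev_key"], k_t], dim=2)     # [B, H, t, D]
            v = torch.cat([cache["prev_value"], v_t], dim=2)
        else:                                                  # first step: no history yet
            k, v = k_t, v_t
        cache["prev_key"], cache["prev_value"] = k, v          # extend the cache
        attn = torch.softmax(q_t @ k.transpose(-2, -1) / k.shape[-1] ** 0.5, dim=-1)
        return attn @ v                                        # [B, H, 1, D]

    Under this reading, the bidx indexing that shows up in a traceback further down (prev_key = prev_key[bidx]) would be a batch index keeping only the cache rows of sequences still being decoded; that interpretation is an assumption, not confirmed by the author.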

    opened by Imposingapple 5
  • Mismatch between ys_tpl and xs_tpl at training vs. generation time

    At training time, ys_tpl and xs_tpl (which carry the format and rhyme information) contain only c0, c2, and c1, but at generation (polish) time the corresponding ys_tpl and xs_tpl also contain already-given words, e.g. C={c0,c0,love,c1,,bends,c0,remove,c1,,}. 1) Can the model really pick up the word information in ys_tpl and xs_tpl at generation time, given that it never saw such input during training? And if only 20% or even fewer of the words are missing, can the model still generate properly? 2) In the code, the placeholder for a missing word is uniformly replaced with c1; doesn't this overlook the case where the missing word sits at a rhyming position, i.e. where it should be replaced with c2?
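
    To make the contrast concrete, a hypothetical sketch of the two template shapes described above (assuming, per this issue, that c2 marks a rhyming slot and c0/c1 the remaining format slots; this is not the repo's prepare code):

    # Training-style template: format symbols only (c2 assumed to mark the
    # rhyming slot, c0/c1 the remaining format slots, per this issue).
    train_tpl = ["c0", "c0", "c0", "c2", "c1"]
    # Polish-style template: some slots already carry fixed words;
    # "love" is kept verbatim while the remaining slots are generated.
    polish_tpl = ["c0", "c0", "love", "c2", "c1"]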

    opened by jackbyebye 3
  • How to train with multiple GPUs?

    I tried setting both world_size and gpus in train.sh to 8 and got this error:

    label_smoothing.py", line 15, in __init__
        self.one_hot = torch.full((1, size), self.smoothing_value).to(device)
    RuntimeError: CUDA error: invalid device ordinal

    What should I do?

    PS: I also noticed a small problem: no matter how CUDA_VISIBLE_DEVICES is set, single-GPU training always uses the second GPU. I am trying to work that out.
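
    A hedged note on the usual cause: "invalid device ordinal" means a process requested a CUDA index at or beyond torch.cuda.device_count(). In multi-process training each worker should select its local GPU index before any .to(device) call; a sketch (argument names are illustrative, not the repo's):

    import torch

    def pick_device(local_rank):
        n = torch.cuda.device_count()
        assert local_rank < n, f"local_rank {local_rank}, but only {n} visible GPUs"
        torch.cuda.set_device(local_rank)        # make cuda:<local_rank> the default
        return torch.device("cuda", local_rank)

    # label_smoothing.py's buffer would then land on the right card, e.g.:
    # self.one_hot = torch.full((1, size), self.smoothing_value).to(pick_device(rank))

    On the PS: CUDA_VISIBLE_DEVICES is only honored if it is set before the process initializes CUDA, so export it in the shell before launching rather than assigning it in Python after torch has already touched the GPU.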

    opened by ChaooMa 3
  • Running ./train.sh on Google Colab raises an error; not sure whether it is caused by the torch version


    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [136, 16, 2304]], which is output 0 of AddBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

    version: torch-1.7.0+cu101, python-3.6.9

    Prof. Li, could you tell us which torch version you used?
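
    Until the author's version is known, the error message's own hint can localize the offending op. A self-contained sketch of the failure class and the debugging switch (toy tensors, not SongNet's code):

    import torch

    torch.autograd.set_detect_anomaly(True)  # traceback will now name the bad op

    x = torch.randn(4, requires_grad=True)
    y = x.exp()   # exp() saves its output y for the backward pass
    y += 1        # in-place update bumps y's version; backward will fail
    try:
        y.sum().backward()
    except RuntimeError as e:
        print(e)  # "... modified by an inplace operation ..."

    # The fix is to make the offending update out-of-place, e.g. y = y + 1.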

    opened by smartmark-pro 3
  • Error when running ./test.sh

    Hello. After training finished, I changed m_path in test.py to the path of the latest checkpoint in the results.

    But running ./test.sh raises this error:

    Traceback (most recent call last):
      File "test.py", line 34, in <module>
        lm_model, lm_vocab, lm_args = init_model(m_path, gpu, "./model/vocab.txt")
      File "test.py", line 28, in init_model
        lm_model.load_state_dict(ckpt['model'])
      File "/usr/local/anaconda3/envs/GPT/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for BIGLM:
        size mismatch for tok_embed.weight: copying a param with shape torch.Size([6410, 768]) from checkpoint, the shape in current model is torch.Size([28781, 768]).
        size mismatch for out_proj.weight: copying a param with shape torch.Size([6410, 768]) from checkpoint, the shape in current model is torch.Size([28781, 768]).
        size mismatch for out_proj.bias: copying a param with shape torch.Size([6410]) from checkpoint, the shape in current model is torch.Size([28781]).

    What could be causing this? Many thanks.
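
    A hedged diagnostic: the two sizes (6410 vs 28781 rows of tok_embed.weight) indicate the checkpoint and the model in test.py were built from different vocab.txt files. A sketch that compares the two counts directly (key names follow the traceback above; the vocab file may additionally gain special tokens in code):

    import torch

    m_path = "epoch10_batch_1499"  # hypothetical; the same path set in test.py
    ckpt = torch.load(m_path, map_location="cpu")
    print("checkpoint vocab size:", ckpt["model"]["tok_embed.weight"].shape[0])

    with open("./model/vocab.txt", encoding="utf8") as f:  # path from init_model
        print("vocab.txt lines:", sum(1 for _ in f))

    # If the two numbers disagree, test.py is paired with a vocab.txt that does
    # not match the one in effect when the checkpoint was trained (e.g. the
    # released model's vocab vs. one regenerated by prepare_data.py).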

    opened by Asuka0002 2
  • Bad results..

    Hi, I trained on a small dataset:

    /content/SongNet
    9
    2300 667 2599
    7
    2
    9
    vocab
    done
    vocab.size = 1215
    batch_acm 99, loss 5.277, acc 0.102, nll 6.265, ppl 86.928, x_acm 1584, lr 0.000002
    batch_acm 199, loss 3.566, acc 0.262, nll 4.273, ppl 20.652, x_acm 3179, lr 0.000005
    batch_acm 299, loss 2.664, acc 0.323, nll 3.223, ppl 9.619, x_acm 4774, lr 0.000008
    batch_acm 399, loss 2.123, acc 0.396, nll 2.580, ppl 6.136, x_acm 6374, lr 0.000010
    batch_acm 499, loss 1.807, acc 0.452, nll 2.195, ppl 4.706, x_acm 7969, lr 0.000013
    validating...
    epoch-3-acm-499 nll= 1.846888825275015 ppl= 3.7909837519747205 count= 667.0
    batch_acm 599, loss 1.580, acc 0.502, nll 1.921, ppl 3.879, x_acm 9564, lr 0.000015
    batch_acm 699, loss 1.370, acc 0.561, nll 1.673, ppl 3.265, x_acm 11164, lr 0.000018
    batch_acm 799, loss 1.151, acc 0.631, nll 1.418, ppl 2.733, x_acm 12759, lr 0.000020
    batch_acm 899, loss 0.934, acc 0.701, nll 1.167, ppl 2.290, x_acm 14354, lr 0.000023
    batch_acm 999, loss 0.695, acc 0.787, nll 0.890, ppl 1.882, x_acm 15954, lr 0.000025
    validating...
    epoch-6-acm-999 nll= 0.3973974670427314 ppl= 1.3300093074609851 count= 667.0
    batch_acm 1099, loss 0.494, acc 0.856, nll 0.652, ppl 1.589, x_acm 17549, lr 0.000028
    batch_acm 1199, loss 0.332, acc 0.913, nll 0.460, ppl 1.384, x_acm 19144, lr 0.000030
    batch_acm 1299, loss 0.223, acc 0.945, nll 0.330, ppl 1.260, x_acm 20739, lr 0.000033
    batch_acm 1399, loss 0.157, acc 0.966, nll 0.252, ppl 1.192, x_acm 22339, lr 0.000035
    batch_acm 1499, loss 0.117, acc 0.975, nll 0.208, ppl 1.156, x_acm 23934, lr 0.000038
    validating...
    epoch-10-acm-1499 nll= 0.08431253172289664 ppl= 1.060383505013393 count= 667.0
    training time: 453sec.
    

    and the test result is unreadable text after executing polish.sh with my checkpoint epoch10_batch_1499 and my vocab.txt.

    PS: my edited polish_tpl.txt:

    ['Gufd<s1>327711<s2>_____,____ менять.______ _____ сейчас. _________ любимый. ______ _____ много.']
    0.7558178901672363
    
    

    result:

    Gufd<s1>327711<s2>_____,____ менять.______ _____ сейчас. _________ любимый. ______ _____ много.
    <bos>По-ти, мув менять.шозыха сйшаб с</s>
    
    opened by pavelxx1 1
  • Could you please explain more about Integrity evaluation?

    I cannot find the code for evaluating Integrity. Could you please release it, along with the pretrained GPT2 model mentioned in your paper? Thanks.

    opened by DuYooho 1
  • How to generate text containing fixed text information?

    Hello, as shown in Table 6 of the paper, you mention that "our model has the ability of refining and polishing given the format C which contains some fixed text information". Could you please tell me how to do this specifically? 😊 You could reply in Chinese if you like, thanks a lot!

    opened by cdxeve 1
  • Running ./test raises "IndexError: The shape of the mask [1] at index 0 does not match"

    Hello Prof. Li. Your current code runs without any problems for me, but when I move to data I collected myself, an error occurs.

    The exact traceback:

    Traceback (most recent call last):
      File "test.py", line 359, in <module>
        res = top_k_inc(enc, src_padding_mask, ys_tpl, ys_seg, ys_pos, s)
      File "test.py", line 61, in top_k_inc
        incremental_state)
      File "/content/SongNet/biglm.py", line 91, in work_incremental
        incremental_state=incremental_state)
      File "/content/SongNet/transformer.py", line 73, in work_incremental
        attn_mask=self_attn_mask, incremental_state=incremental_state)
      File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/content/SongNet/transformer.py", line 156, in forward
        prev_key = prev_key[bidx]
    IndexError: The shape of the mask [1] at index 0 does not match the shape of the indexed tensor [2, 12, 1, 64] at index 0
    

    Symptoms: train, eval, and polish all work fine, but when test reaches certain examples this error appears; that is, some examples predict and print normally and some do not.

    Temporary workaround: I added a try/except in test to skip the failing examples.

    Hoping for a fix: I read the code around prev_key in the transformer but could not make sense of the error. If you ran into something similar while debugging, could you give a hint on how to solve it?
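
    For anyone needing the same stopgap, a sketch of that try/except around the failing call (variable names are exactly those in the traceback's test.py line 359; this skips the bidx problem rather than fixing it):

    try:
        res = top_k_inc(enc, src_padding_mask, ys_tpl, ys_seg, ys_pos, s)
    except IndexError as e:
        print("skipping one example:", e)  # some inputs trip prev_key[bidx]
        res = None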

    opened by smartmark-pro 0
  • Evaluation criteria

    The paper lists many criteria, including rhyme- and tpl-related losses, but the only thing actually backpropagated in the code seems to be nll/ppl. During pre-training, are the format- and rhyme-related losses computed and backpropagated as well?

    Also, the paper mentions beam-search generation, but in the code it only appears in test.py. Did you find that top-k sampling generates better results than beam search? And why is beam search not used for polish?

    opened by MianWang123 0