Massive-scale Decoding for Text Generation using Lattices
TL;DR: a new search algorithm to construct lattices encoding many generation options; two key technical contributions: (1) best-first search, (2) path recombination.
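To give a feel for how the two ideas fit together, here is a minimal, purely illustrative Python sketch of best-first lattice decoding with path recombination. Everything in it is an assumption for illustration: the toy uniform "model" (toy_next_token_logprobs), the suffix-based merge rule, and all names are hypothetical and do not reflect this repository's actual API or scoring.

```python
# Illustrative sketch only: best-first expansion with n-gram-suffix
# path recombination. Not the repository's implementation.
import heapq
import itertools
import math

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def toy_next_token_logprobs(prefix):
    """Stand-in for a real seq2seq model: uniform distribution over VOCAB."""
    p = 1.0 / len(VOCAB)
    return {tok: math.log(p) for tok in VOCAB}

def best_first_lattice(max_expansions=50, ngram_suffix=2, max_len=6):
    counter = itertools.count()            # tie-breaker for the heap
    edges = set()                          # (parent_path, token, child_path)
    suffix_to_node = {}                    # last-n-gram suffix -> canonical node
    frontier = [(0.0, next(counter), ())]  # (negated score, tiebreak, path)
    finished = []

    for _ in range(max_expansions):
        if not frontier:
            break
        neg_score, _, path = heapq.heappop(frontier)
        if path and (path[-1] == "<eos>" or len(path) >= max_len):
            finished.append((-neg_score, path))
            continue
        for tok, lp in toy_next_token_logprobs(path).items():
            child = path + (tok,)
            suffix = child[-ngram_suffix:]
            if len(suffix) == ngram_suffix and suffix in suffix_to_node:
                # Path recombination: reuse the existing node whose last
                # n tokens match instead of growing a separate branch.
                edges.add((path, tok, suffix_to_node[suffix]))
            else:
                suffix_to_node[suffix] = child
                edges.add((path, tok, child))
                heapq.heappush(frontier, (neg_score - lp, next(counter), child))
    return edges, finished

edges, finished = best_first_lattice()
print(f"{len(edges)} lattice edges, {len(finished)} finished paths")
```

Because merged continuations are shared rather than re-expanded, the same expansion budget encodes far more distinct paths than beam search would.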
Visualization
We provide a few examples in the vis folder and on my homepage. You need to download the HTML files to view and interact with the model outputs.
The complete set of outputs is available on Box.
Getting started
model contains all of the methods, including baselines like beam search and nucleus sampling, as well as our methods.
evaluation contains scripts for evaluation.
command contains the prompts and shell scripts we use to run the experiments.
Beam Search:
PYTHONPATH=./ python src/recom_search/command/run_pipeline.py -nexample 100 -ngram_suffix 4 -beam_size 16 -min_len 10 -max_len 35 -model bs
Best-first Search:
PYTHONPATH=./ python src/recom_search/command/run_pipeline.py -nexample 100 -ngram_suffix 4 -beam_size 16 -min_len 10 -max_len 35 -model astar_baseline
Best-first Search with Recomb:
PYTHONPATH=./ python src/recom_search/command/run_pipeline.py -nexample 100 -ngram_suffix 4 -beam_size 16 -min_len 10 -max_len 35 -model astar -merge imp -avg_score 0.75 -adhoc
Best-first Search with Zip:
PYTHONPATH=./ python src/recom_search/command/run_pipeline.py -nexample 100 -ngram_suffix 4 -beam_size 16 -min_len 10 -max_len 35 -model astar -merge zip -avg_score 0.75 -adhoc
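To sweep over all four configurations above in one go, a small wrapper like the following may be convenient. It is not part of the repository; it simply re-issues the exact commands listed above, so the script path and flags are taken verbatim from this README.

```python
# Convenience wrapper (not part of the repository): runs the four
# decoding configurations listed above with the shared flags.
import os
import subprocess

COMMON = ("-nexample 100 -ngram_suffix 4 -beam_size 16 "
          "-min_len 10 -max_len 35").split()

CONFIGS = {
    "beam_search":       ["-model", "bs"],
    "best_first":        ["-model", "astar_baseline"],
    "best_first_recomb": ["-model", "astar", "-merge", "imp",
                          "-avg_score", "0.75", "-adhoc"],
    "best_first_zip":    ["-model", "astar", "-merge", "zip",
                          "-avg_score", "0.75", "-adhoc"],
}

for name, extra in CONFIGS.items():
    print(f"=== running {name} ===")
    subprocess.run(
        ["python", "src/recom_search/command/run_pipeline.py"] + COMMON + extra,
        check=True,
        env={**os.environ, "PYTHONPATH": "./"},
    )
```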
More detailed instructions coming soon!
Citation
@misc{xu-durrett-2021-massive,
title={Massive-scale Decoding for Text Generation using Lattices},
author={Jiacheng Xu and Greg Durrett},
year={2021},
eprint={2112.07660},
archivePrefix={arXiv},
primaryClass={cs.CL}
}