Plan-then-Generate: Controlled Data-to-Text Generation via Planning
Authors: Yixuan Su, David Vandyke, Sihui Wang, Yimai Fang, and Nigel Collier
Code for the EMNLP 2021 paper "Plan-then-Generate: Controlled Data-to-Text Generation via Planning"
1. Environment Setup:
(1) Hardware Requirement:
The code in this repo has been thoroughly tested on our machine with a single Nvidia V100 GPU (16GB).
(2) Installation:
chmod +x ./config_setup.sh
./config_setup.sh
2. ToTTo Data Preprocessing:
Option (1): Preprocess the ToTTo data from scratch by yourself:
cd ./data
chmod +x ./prepare_data.sh
./prepare_data.sh
This process can take up to 1 hour.
Option (2): Download our processed data (data.zip):
Unzip data.zip and replace the empty ./data folder with the extracted one.
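The replacement step in Option (2) could be scripted roughly as follows. This is only a sketch: the helper name install_processed_data is hypothetical and not part of this repo, and it assumes that data.zip unpacks to a top-level data/ directory, which should be checked against the actual archive.

```shell
# Hypothetical helper (not part of this repo): replace the empty ./data
# folder with the contents of the downloaded archive.
# ASSUMPTION: the archive unpacks to a top-level "data/" directory.
install_processed_data() {
    local archive="${1:-data.zip}"   # path to the downloaded archive
    local target="${2:-./data}"      # folder to be replaced
    local staging

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # Unpack into a scratch directory first, so the existing folder is
    # only removed once the archive is known to contain data/.
    staging="$(mktemp -d)"
    if ! unzip -q "$archive" -d "$staging" || [ ! -d "$staging/data" ]; then
        echo "could not find a data/ folder inside $archive" >&2
        rm -rf "$staging"
        return 1
    fi

    rm -rf "$target"
    mv "$staging/data" "$target"
    rm -rf "$staging"
}
```

From the repo root, this would be called as: install_processed_data data.zip ./data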
For more details about the ToTTo dataset, please refer to the original Google Research repo.
3. Content Planner:
Please refer to README.md in ./content_planner folder
4. Sequence Generator:
Please refer to README.md in ./generator folder
5. Citation
If you find our paper and resources useful, please kindly cite our paper:
@inproceedings{su2021plangen,
title = "Plan-then-Generate: Controlled Data-to-Text Generation via Planning",
author = "Yixuan Su and David Vandyke and Sihui Wang and Yimai Fang and Nigel Collier",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
month = nov,
year = "2021",
publisher = "Association for Computational Linguistics",
}