Character Transformations for Non-Autoregressive GEC Tagging
Milan Straka, Jakub Náplava, Jana Straková
Charles University
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics
This repository contains the supplementary source code for the paper Character Transformations for Non-Autoregressive GEC Tagging. Consider it a research prototype, not an off-the-shelf product.
Structure
The repository contains two main components:
- The rules directory contains the scripts for generating transformations from aligned GEC data, encoding gold data using transformations, and applying the transformations to input data.
- The training directory contains the scripts for training a BERT-like model on gold data encoded with transformations.
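
To make the three steps handled by the rules scripts (generate, encode, apply) more concrete, here is a minimal Python sketch. It is an illustration only and does not reproduce the repository's scripts or the paper's transformation inventory: the suffix-only rule format, the function names `derive_rule` and `apply_rule`, and the toy token pairs are all assumptions made for this example.

```python
from collections import Counter
from typing import Tuple

# Hypothetical rule format (an assumption for this sketch): drop N characters
# from the end of the token, then append a suffix.
Rule = Tuple[int, str]


def derive_rule(source: str, target: str) -> Rule:
    """Derive a suffix rule that rewrites `source` into `target`."""
    prefix = 0
    limit = min(len(source), len(target))
    # Length of the longest common prefix of both tokens.
    while prefix < limit and source[prefix] == target[prefix]:
        prefix += 1
    return len(source) - prefix, target[prefix:]


def apply_rule(token: str, rule: Rule) -> str:
    """Apply a suffix rule; leave the token unchanged if it is too short."""
    drop, suffix = rule
    if drop > len(token):
        return token
    return token[:len(token) - drop] + suffix


# Toy aligned (erroneous, corrected) token pairs, invented for illustration.
aligned = [("walk", "walks"), ("run", "runs"), ("studyed", "studied"), ("cat", "cat")]

# 1. Generate the rule inventory together with rule frequencies.
rules = Counter(derive_rule(src, tgt) for src, tgt in aligned)

# 2. Encode the gold data as one rule (tag) per input token.
tags = [derive_rule(src, tgt) for src, tgt in aligned]

# 3. Apply the tags to the input tokens to obtain the corrected tokens.
restored = [apply_rule(src, tag) for (src, _), tag in zip(aligned, tags)]
```

In this sketch the rule `(0, "s")` is shared by two token pairs, which is what lets a tagger predict it as a single label and then apply it to tokens it has not seen, for example `apply_rule("talk", (0, "s"))`.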
Citation
@inproceedings{straka-etal-2021-character,
title = "Character Transformations for Non-Autoregressive {GEC} Tagging",
author = "Straka, Milan and N{\'a}plava, Jakub and Strakov{\'a}, Jana",
booktitle = "Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)",
month = nov,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.wnut-1.46",
pages = "417--422",
}