This repository includes the code of the sequence-to-sequence model for discontinuous constituent parsing described in paper Discontinuous Grammar as a Foreign Language.

Related tags

Deep Learning Disco-Seq2seq-Parser

Overview

Discontinuous Grammar as a Foreign Language

This repository includes the code of the sequence-to-sequence model for discontinuous constituent parsing described in paper Discontinuous Grammar as a Foreign Language. In particular, it uses the in-order+SWAP linearization to deal with discontinuities and yields 95.47 F1 on the English Discontinuous Penn Treebank (DPTB). This implementation is based on the system by Fernandez Astudillo et al. (2020) and reuses part of its code.

Requirements

This implementation was tested on Python 3.6.9, PyTorch 1.1.0 and CUDA 9.0.176. Please run the following command to proceed with the installation:

    cd Disco-Seq2seq-Parser
    pip install -r requirements.txt

For the evaluation, script DISCODOP must be also installed following steps described in https://github.com/andreasvc/disco-dop.

Data

To get shift-reduce linearizations from discontinuous constituent treebanks (for instance, the DPTB), please include train, dev and test splits in discbracket format in the disco_data folder and name them as train.discbracket, dev.discbracket and test.discbracket. Then use the following script:

    ./linearization/generate.sh DPTB

Experiments

To train a model for the DPTB treebank, just execute the following script:

   ./scripts/stack-transformer/con_experiment.sh configs/ptb_roberta.large.sh

To test the trained model on the test split, please run the following command:

    ./scripts/stack-transformer/con_test-test.sh configs/test_roberta_large.sh DATA/dep-parsing/models/DPTB_RoBERTa-large_stnp6x6-seed44/checkpoint_top3-average.pt DATA/dep-parsing/models/DPTB_RoBERTa-large_stnp6x6-seed44/epoch-tests-test/dec-checkpoint-top3-average

Citation

@misc{fernándezgonzález2021discontinuous,
      title={Discontinuous Grammar as a Foreign Language},
      author={Daniel Fernández-González and Carlos Gómez-Rodríguez},
      year={2021},
      eprint={2110.10431},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
    }

Acknowledgments

We acknowledge the European Research Council (ERC), which has funded this research under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150), MINECO (ANSWER-ASAP, TIN2017-85160-C2-1-R), MICINN (SCANNER, PID2020-113230RB-C21) Xunta de Galicia (ED431C 2020/11), and Centro de Investigación de Galicia "CITIC", funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014-2020 Program), by grant ED431G 2019/01.

You might also like...

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

VoiceLoop PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop. VoiceLoop is a n

873 Dec 15, 2022

Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

8 Nov 1, 2022

Implementation of the method described in the Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations Implementation of the method described in the Speech Resynthesis from Di

4 Mar 11, 2022

This repository includes the code of the sequence-to-sequence model for discontinuous constituent parsing described in paper Discontinuous Grammar as a Foreign Language.

Related tags

Overview

Discontinuous Grammar as a Foreign Language

Requirements

Data

Experiments

Citation

Acknowledgments

You might also like...

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

Grammar Induction using a Template Tree Approach

An intelligent, flexible grammar of machine learning.

a grammar based feedback fuzzer

The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment

Empower Sequence Labeling with Task-Aware Language Model

Generative Query Network (GQN) in PyTorch as described in "Neural Scene Representation and Rendering"

Implementation of the method described in the Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.

Owner

Daniel Fernández-González

Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Implementation of SETR model, Original paper: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers.

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

This is the code for our KILT leaderboard submission to the T-REx and zsRE tasks. It includes code for training a DPR model then continuing training with RAG.

This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds

An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"

PyTorch implementation of the R2Plus1D convolution based ResNet architecture described in the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.