An implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

seq-to-mind

Last update: Dec 11, 2022

Overview

Introduction

This repository is an implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".
It parses raw input text from scratch and returns both the EDU segmentation and the parsed tree structure.
The model supports both sentence-level and document-level RST discourse parsing.
This repo and the pre-trained model are for research use only.

Package Requirements

pytorch==1.7.1
transformers==4.8.2
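
The pinned versions above can be checked from Python before running the parser. The following is a minimal sketch, not part of the repository: `check_requirements` and `REQUIRED` are hypothetical names, and it assumes the packages were installed under their pip names (`torch` rather than `pytorch`).

```python
from importlib import metadata

# Versions listed in this README, keyed by pip distribution name (assumption:
# "pytorch" is installed as the pip package "torch").
REQUIRED = {"torch": "1.7.1", "transformers": "4.8.2"}


def check_requirements(required=REQUIRED, get_version=metadata.version):
    """Return a dict of package -> problem description; empty if all match."""
    problems = {}
    for package, wanted in required.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            problems[package] = "not installed"
            continue
        # Ignore local build tags such as "+cu110" when comparing.
        if installed.split("+")[0] != wanted:
            problems[package] = f"found {installed}, expected {wanted}"
    return problems
```

Calling `check_requirements()` with no arguments inspects the current environment; an empty dict means both pinned versions were found.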

Supported Languages

We trained and evaluated the model on a multilingual collection of RST discourse treebanks, and it natively supports six languages: English, Portuguese, Spanish, German, Dutch, and Basque. Interested users can also try other languages.

Data Format

[Input] InputSentence: The input document or sentence. The raw text is tokenized and encoded by the xlm-roberta-base language backbone. '|| ' denotes the EDU boundary positions.
- Although the report, || which has released || before the stock market opened, || didn't trigger the 190.58 point drop in the Dow Jones Industrial Average, || analysts said || it did play a role in the market's decline. ||
[Output] EDU_Breaks: The indices of the EDU boundary tokens, including the last word of the sentence.
- [2, 5, 10, 22, 24, 33]
[Output] tree_parsing_output: The predicted discourse parsing tree, in the following format.
- (1:Satellite=Contrast:4,5:Nucleus=span:6) (1:Nucleus=Same-Unit:3,4:Nucleus=Same-Unit:4) (5:Satellite=Attribution:5,6:Nucleus=span:6) (1:Satellite=span:1,2:Nucleus=Elaboration:3) (2:Nucleus=span:2,3:Satellite=Temporal:3)
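
The formats above can be read back into Python structures for downstream use. The sketch below is an illustration only: `Node`, `parse_tree_output`, and `split_edus` are hypothetical helpers, and the reading of each group as `start:Nuclearity=Relation:end` for a left and a right EDU span is inferred from the example, not taken from the repository code.

```python
import re
from typing import NamedTuple


class Node(NamedTuple):
    start: int       # index of the first EDU in the span (1-based in the example)
    end: int         # index of the last EDU in the span
    nuclearity: str  # "Nucleus" or "Satellite"
    relation: str    # e.g. "Contrast", "span", "Same-Unit"


# One parenthesized group describes a split into a left and a right span.
SPLIT_PATTERN = re.compile(
    r"\((\d+):(\w+)=([^:]+):(\d+),(\d+):(\w+)=([^:]+):(\d+)\)"
)


def parse_tree_output(text):
    """Turn a tree_parsing_output string into a list of (left, right) Nodes."""
    splits = []
    for match in SPLIT_PATTERN.finditer(text):
        l_start, l_nuc, l_rel, l_end, r_start, r_nuc, r_rel, r_end = match.groups()
        left = Node(int(l_start), int(l_end), l_nuc, l_rel)
        right = Node(int(r_start), int(r_end), r_nuc, r_rel)
        splits.append((left, right))
    return splits


def split_edus(marked_text, marker="||"):
    """Split text annotated with '||' boundaries into a list of EDU strings."""
    return [edu.strip() for edu in marked_text.split(marker) if edu.strip()]


tree = "(1:Satellite=Contrast:4,5:Nucleus=span:6) (5:Satellite=Attribution:5,6:Nucleus=span:6)"
for left, right in parse_tree_output(tree):
    print(left, right)
```

Keeping each split as a pair of named tuples leaves the order of the model output untouched, so a tree can be rebuilt later from the span indices if needed.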

How to use it for parsing

Put the text paragraph in the file ./data/text_for_inference.txt.
Run the script MUL_main_Infer.py to obtain the RST parsing result. See the script for the detailed model output.
We recommend running the parser in a GPU-equipped environment.
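
The two steps above can also be scripted. This is a minimal sketch under stated assumptions: `write_inference_input` and `inference_command` are hypothetical helpers, and the one-document-per-line layout of the input file is an assumption, so check MUL_main_Infer.py for how the file is actually read.

```python
import sys
from pathlib import Path


def write_inference_input(documents, path="./data/text_for_inference.txt"):
    """Write the documents to the inference file, one per line (assumed layout)."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Collapse internal newlines and repeated spaces so each document
    # stays on a single line; skip documents that are empty.
    lines = [" ".join(doc.split()) for doc in documents if doc.strip()]
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


def inference_command(script="MUL_main_Infer.py"):
    """Build the command that runs the inference script with this interpreter."""
    return [sys.executable, script]
```

From the repository root, `write_inference_input([...])` followed by `subprocess.run(inference_command(), check=True)` would prepare the input and launch the parser in one go.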

Citation

@article{liu2021dmrst,
  title={DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy F},
  journal={arXiv preprint arXiv:2110.04518},
  year={2021}
}

@inproceedings{liu2020multilingual,
  title={Multilingual Neural RST Discourse Parsing},
  author={Liu, Zhengyuan and Shi, Ke and Chen, Nancy},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics},
  pages={6730--6738},
  year={2020}
}

Comments

Dimension out of range at pred = torch.argmax(outputs, dim=1)

Hi,

Congrats on the great work. I've been using your trained model to classify arguments in over 800 news articles in both Spanish and Portuguese. I modified the original MUL_main_Infer.py (the only file I changed) so that it loads a group of texts from a directory one by one, splits them into sentences, predicts the labels for each sentence, and finally saves the output as a single CSV file.

I am now predicting for 800 txt files. It works well as long as only the first 2 files are in the directory, but the error appears whenever I add all of the files. According to the debugging output, the error is at the pred = torch.argmax(outputs, dim=1) line in module.py.

This is the error in question:

  File "C:\Users\diego\Escritorio\DMRST\MUL_main_Infer.py", line 114, in <module>
    input_sentences, all_segmentation_pred, all_tree_parsing_pred = inference(model, bert_tokenizer, Test_InputSentences, batch_size)
  File "C:\Users\diego\Escritorio\DMRST\MUL_main_Infer.py", line 63, in inference
    _, _, SPAN_batch, _, predict_EDU_breaks = model.TestingLoss(input_sen_batch, input_EDU_breaks=None, LabelIndex=None,
  File "C:\Users\diego\Escritorio\DMRST\model_depth.py", line 184, in TestingLoss
    EncoderOutputs, Last_Hiddenstates, _, predict_edu_breaks = self.encoder(input_sentence, input_EDU_breaks, is_test=use_pred_segmentation)
  File "C:\Users\diego\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\diego\Escritorio\DMRST\module.py", line 81, in forward
    predict_edu_breaks = self.segmenter.test_segment_loss(embeddings.squeeze())
  File "C:\Users\diego\Escritorio\DMRST\module.py", line 355, in test_segment_loss
    pred = torch.argmax(outputs, dim=1).detach().cpu().numpy().tolist()
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Can you please help me solve this issue? I would be extremely thankful.

opened by diegopaucarv 3
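
A note on the traceback above: the message "expected to be in range of [-1, 0]" means that `outputs` was one-dimensional when `torch.argmax(outputs, dim=1)` ran. Calling `squeeze()` without arguments drops every dimension of size 1, so one possible cause (a guess from the trace, not confirmed against the repository) is an input that encodes to a single token, such as a very short or empty sentence produced by the sentence splitter. Under that assumption, a simple workaround is to set such inputs aside before inference. `split_parsable` and `min_tokens` are hypothetical names, and whitespace tokens are only a rough stand-in for the subword count.

```python
def split_parsable(sentences, min_tokens=2):
    """Separate inputs with enough tokens from ones that may be too short.

    Returns (keep, skipped), where skipped holds (index, sentence) pairs so
    the caller can still write a row for them in the final output.
    """
    keep, skipped = [], []
    for index, sentence in enumerate(sentences):
        # Whitespace tokens are a rough proxy for the model's subword tokens.
        if len(sentence.split()) >= min_tokens:
            keep.append(sentence)
        else:
            skipped.append((index, sentence))
    return keep, skipped
```

Logging the `skipped` list for the files that trigger the error would show whether short inputs are in fact the cause.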

the relation numbers

Hello, I just want to ask about the number of relations. The paper reports 18 relations, but I can only find 16 in your references. Are the other 2 span and same-unit?

opened by jing86736238 1
