This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"

Last update: Dec 27, 2022

Overview

Word-Level Coreference Resolution

This is a repository with the code to reproduce the experiments described in the paper of the same name, which was accepted to EMNLP 2021. The paper is available here.

Preparation
Training
Evaluation

Preparation

The following instructions have been tested with Python 3.7 on an Ubuntu 20.04 machine.

You will need:

OntoNotes 5.0 corpus (download here, registration needed)
Python 2.7 to run conll-2012 scripts
Java runtime to run Stanford Parser
Python 3.7+ to run the model
Perl to run conll-2012 evaluation scripts
CUDA-enabled machine (48 GB to train, 4 GB to evaluate)
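Before starting, it can help to confirm that the command-line tools from the list above are on PATH. The following is a minimal sketch using only the standard library. The helper name `missing_tools` and the executable names `java` and `perl` are assumptions for illustration and are not part of the repository.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Executable names are assumptions; adjust them to your system.
missing = missing_tools(["java", "perl"])
if missing:
    print("Missing tools:", ", ".join(missing))
if sys.version_info < (3, 7):
    print("The model needs Python 3.7+, found", sys.version.split()[0])
```

Run it inside the environment you intend to use for the model; the Python 2.7 environment used for the conll scripts will naturally report an older interpreter.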

Extract the OntoNotes 5.0 archive. If it is in the repo's root directory:
```
 tar -xzvf ontonotes-release-5.0_LDC2013T19.tgz
```
Switch to a Python 2.7 environment (one where python runs version 2.7). This is necessary for the conll scripts to run correctly. To do it with conda:
```
 conda create -y --name py27 python=2.7 && conda activate py27
```

Run the conll data preparation scripts (~30min):

 sh get_conll_data.sh ontonotes-release-5.0 data

Download conll scorers and Stanford Parser:
```
 sh get_third_party.sh
```

Prepare your environment. To do it with conda:

 conda create -y --name wl-coref python=3.7 openjdk perl
 conda activate wl-coref
 python -m pip install -r requirements.txt

Build the corpus in jsonlines format (~20 min):

 python convert_to_jsonlines.py data/conll-2012/ --out-dir data
 python convert_to_heads.py
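
To sanity-check the converted corpus, the jsonlines files can be read one JSON document per line. This is a minimal sketch that assumes the usual jsonlines layout. The helper `read_jsonlines` and the path in the usage comment are hypothetical, so substitute whichever file the conversion scripts actually wrote to the data directory.

```python
import json


def read_jsonlines(path):
    """Yield one parsed document (a dict) per non-empty line of `path`."""
    with open(path, encoding="utf-8") as fin:
        for line in fin:
            line = line.strip()
            if line:
                yield json.loads(line)


# Hypothetical usage; the path is a placeholder, not a confirmed file name:
# for doc in read_jsonlines("data/<your_file>.jsonlines"):
#     print(sorted(doc.keys()))
```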

You're all set!

Training

If you have completed all the steps in the previous section, then just run:

python run.py train roberta

Use the -h flag to see more parameters, and the CUDA_VISIBLE_DEVICES environment variable to limit the CUDA devices visible to the script. Refer to config.toml to modify existing model configurations or create your own.
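
As a sketch of the CUDA_VISIBLE_DEVICES advice, the training command can also be launched from Python with a restricted device list. The helper names `train_command` and `env_with_devices` are made up for illustration; only `run.py train roberta` and the environment variable come from the text above.

```python
import os
import subprocess
import sys


def train_command(section):
    """Build the training command for one section of config.toml."""
    return [sys.executable, "run.py", "train", section]


def env_with_devices(devices, base_env=None):
    """Copy the environment and expose only the given CUDA device ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(d) for d in devices)
    return env


# Assumed usage, run from the repository root, training on GPU 0 only:
# subprocess.run(train_command("roberta"), env=env_with_devices([0]), check=True)
```

Copying the environment rather than mutating `os.environ` keeps the restriction local to the child process.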

Evaluation

Make sure that you have successfully completed all steps of the Preparation section.

Download and save the pretrained model to the data directory.

 https://www.dropbox.com/s/vf7zadyksgj40zu/roberta_%28e20_2021.05.02_01.16%29_release.pt?dl=0

Generate the conll-formatted output:

 python run.py eval roberta --data-split test

Run the conll-2012 scripts to obtain the metrics:
```
 python calculate_conll.py roberta test 20
```
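The three arguments to calculate_conll.py appear to be the config section, the data split, and the number of trained epochs. Judging only from the file names quoted in the comments further down (data/conll_logs/roberta_test_e20.gold.conll and its .pred.conll counterpart), the files written by the eval step seem to follow a `<section>_<split>_e<epochs>` pattern. The helper below is a hypothetical sketch of that naming pattern, handy for checking that both files exist before scoring; it is not taken from the repository.

```python
def conll_log_paths(section, split, epochs, log_dir="data/conll_logs"):
    """Return the assumed (gold, pred) CoNLL file paths for one evaluation run."""
    # Naming pattern inferred from paths quoted in the comments, not from the code.
    prefix = f"{log_dir}/{section}_{split}_e{epochs}"
    return prefix + ".gold.conll", prefix + ".pred.conll"


gold_path, pred_path = conll_log_paths("roberta", "test", 20)
print(gold_path)
print(pred_path)
```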

Comments

about the training process

Here is the error I ran into: Epoch 1: bc/cnn/00/cnn_0001 c_loss: 2.11580 s_loss: 0.57502: 14% 394/2802 [01:04<04:54, 8.18docs/s] It seems the training process stopped. Can you tell me why? Thanks.

opened by leileilin 27
some confusions about convert_to_head.py

Hello, I have a new question about the convert_to_heads.py file, in which some spans and clusters are deleted. Is this the case you described in #2? There you said: "In those cases "A" and "A & B" are different spans with the same head word, "A". In our implementation such cases were simply discarded from the training set, because they were few and we were able to perform well, even though we couldn't predict any of such cases during inference." Thanks.
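
To make the collision concrete: if each span is reduced to a single head word, two different spans can map to the same head. The toy sketch below only detects such collisions. The `(start, end)` span tuples, the `head_of` mapping, and the function name are all illustrative assumptions and do not reproduce the logic of convert_to_heads.py.

```python
from collections import defaultdict


def colliding_heads(spans, head_of):
    """Map each head index to its spans, keeping only heads shared by 2+ spans."""
    by_head = defaultdict(list)
    for span in spans:
        by_head[head_of[span]].append(span)
    return {head: group for head, group in by_head.items() if len(group) > 1}


# "A" is word 0; "A & B" covers words 0..2 (end-exclusive) with head 0;
# "C" is word 5 with head 5.
spans = [(0, 1), (0, 3), (5, 6)]
head_of = {(0, 1): 0, (0, 3): 0, (5, 6): 5}
print(colliding_heads(spans, head_of))  # {0: [(0, 1), (0, 3)]}
```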

opened by leileilin 14
about chinese dataset

Hello, thank you for your great open-source work. I want to process Chinese datasets following your pipeline, but the convert_to_jsonlines.py step reports an error. Do you know why? Thanks.

opened by leileilin 14
what is the equivalent of "edu.stanford.nlp.trees.EnglishGrammaticalStructure" for arabic coreference resolution task

Hi,

I can't find an ArabicGrammaticalStructure class in the Stanford NLP package. The conversion works for English data but not for Arabic.

Converting constituents to dependencies...
development: 0% 0/44 [00:00<?, ?docs/s]
Exception in thread "main" java.lang.IllegalArgumentException: No head rule defined for PV+PVSUFF using class edu.stanford.nlp.trees.SemanticHeadFinder in PV+PVSUFF-39
    at edu.stanford.nlp.trees.AbstractCollinsHeadFinder.determineNonTrivialHead(AbstractCollinsHeadFinder.java:222)
    at edu.stanford.nlp.trees.SemanticHeadFinder.determineNonTrivialHead(SemanticHeadFinder.java:348)
    at edu.stanford.nlp.trees.AbstractCollinsHeadFinder.determineHead(AbstractCollinsHeadFinder.java:179)
    at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:476)
    at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:474)
    at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:474)
    at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:474)
    at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:474)
    at edu.stanford.nlp.trees.GrammaticalStructure.<init>(GrammaticalStructure.java:94)
    at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:86)
    at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:66)
    at edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams.getGrammaticalStructure(EnglishTreebankParserParams.java:2271)
    at edu.stanford.nlp.trees.GrammaticalStructure$TreeBankGrammaticalStructureWrapper$GsIterator.primeGs(GrammaticalStructure.java:1361)
    at edu.stanford.nlp.trees.GrammaticalStructure$TreeBankGrammaticalStructureWrapper$GsIterator.<init>(GrammaticalStructure.java:1348)
    at edu.stanford.nlp.trees.GrammaticalStructure$TreeBankGrammaticalStructureWrapper.iterator(GrammaticalStructure.java:1325)
    at edu.stanford.nlp.trees.GrammaticalStructure.main(GrammaticalStructure.java:1604)
development: 0% 0/44 [00:00<?, ?docs/s]
Traceback (most recent call last):
  File "convert_to_jsonlines.py", line 392, in <module>
    convert_con_to_dep(args.tmp_dir, conll_filenames)
  File "convert_to_jsonlines.py", line 195, in convert_con_to_dep
    subprocess.run(cmd, check=True, stdout=out)
  File "/home/souid/anaconda3/envs/wl-coref/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-cp', 'downloads/stanford-parser.jar', 'edu.stanford.nlp.trees.EnglishGrammaticalStructure', '-basic', '-keepPunct', '-conllx', '-treeFile', 'temp/data/conll-2012/v4/data/development/data/arabic/annotations/nw/ann/00/ann_0010.v4_gold_conll']' returned non-zero exit status 1.

opened by aymen-souid 11
Conll perl script refusing to score because of "too many repeated mentions (>10) in the response"

I ran the preparation scripts successfully.

Downloaded the roberta checkpoint from dropbox link, and placed it in data folder.

Ran the command: python calculate_conll.py roberta test 20

I noticed some errors due to subprocess because I was using python3.6 instead of python3.7.

Error was: unexpected keyword argument 'capture_output'

Fixed the issue with this

But then I got another error, originating at line 15: 'NoneType' object has no attribute 'group'

I ran the perl script directly in bash: perl reference-coreference-scorers/scorer.pl all data/conll_logs/roberta_test_e20.gold.conll data/conll_logs/roberta_test_e20.pred.conll none

MUC came out to be 86 (f1) but while calculating b3, I got this error: Found too many repeated mentions (> 10) in the response, so refusing to score. Please fix the output

I think it is because of this error only that the line 15 above was throwing that error (because output was empty).

How to proceed forward now? How to evaluate the results?
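
On the capture_output error: that keyword was added to `subprocess.run` in Python 3.7, so on Python 3.6 the same effect needs explicit pipes. Below is a minimal sketch of a 3.6-compatible call. The wrapper name `run_captured` is hypothetical, and this is not necessarily the fix the commenter applied.

```python
import subprocess
import sys


def run_captured(cmd):
    """Run `cmd` and capture stdout/stderr as text (works on Python 3.6+)."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # the `text=` alias only exists from 3.7
    )


result = run_captured([sys.executable, "-c", "print('ok')"])
print(result.stdout.strip())
```

Checking `result.stdout` for emptiness before applying a regular expression to it would also turn the 'NoneType' failure into a clearer message.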

opened by ritwikmishra 9

Inference from the box?

Hi! Thank you for posting the model. Could you please explain how to run inference out of the box? If I understood correctly, the model from Dropbox has already been trained, so we should be able to run it directly, but by design the model requires the original data and builds optimizers in order to run:

class CorefModel:
    """
    Attributes:
        config (coref.config.Config): the model's configuration,
            see config.toml for the details
        epochs_trained (int): number of epochs the model has been trained for
        trainable (Dict[str, torch.nn.Module]): trainable submodules with their
            names used as keys
        training (bool): used to toggle train/eval modes

    Submodules (in the order of their usage in the pipeline):
        tokenizer (transformers.AutoTokenizer)
        bert (transformers.AutoModel)
        we (WordEncoder)
        rough_scorer (RoughScorer)
        pw (PairwiseEncoder)
        a_scorer (AnaphoricityScorer)
        sp (SpanPredictor)
    """
    def __init__(self,
                 config_path: str,
                 section: str,
                 epochs_trained: int = 0):
        """
        A newly created model is set to evaluation mode.

        Args:
            config_path (str): the path to the toml file with the configuration
            section (str): the selected section of the config file
            epochs_trained (int): the number of epochs finished
                (useful for warm start)
        """
        self.config = CorefModel._load_config(config_path, section)
        self.epochs_trained = epochs_trained
        self._docs: Dict[str, List[Doc]] = {}
        self._build_model()
        self._build_optimizers()
        self._set_training(False)
        self._coref_criterion = CorefLoss(self.config.bce_loss_weight)
        self._span_criterion = torch.nn.CrossEntropyLoss(reduction="sum")

So maybe there is an option to run it without all this stuff?

opened by Dzz1th 7

Questions_dataset-representation
Based on my observation in this code base, training use the following features, e.g: cased_words, sent_id, speaker, pos, deprel, head, clusters.

then converted into: cased_words, sent_id, speaker, pos, deprel, head, head2span, word_clusters, span_clusters.

while in inference data example, the feature used only cased_words, sent_id, and optionally speaker information.
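
For illustration, an inference document with just these fields could be assembled from pre-tokenized sentences as sketched below. Treat the details as assumptions: `sent_id` is taken here to hold one sentence index per word, `speaker` one label per word, and the `document_id` key and the function name `build_doc` are hypothetical. Compare the result with the sample input file in the repository before relying on it.

```python
import json


def build_doc(doc_id, sentences, speakers=None):
    """Flatten tokenized sentences into a word-level document dict."""
    doc = {"document_id": doc_id, "cased_words": [], "sent_id": []}
    if speakers is not None:
        doc["speaker"] = []
    for idx, words in enumerate(sentences):
        doc["cased_words"].extend(words)
        # Assumption: every word carries the index of its sentence.
        doc["sent_id"].extend([idx] * len(words))
        if speakers is not None:
            # Assumption: one speaker label per sentence, repeated per word.
            doc["speaker"].extend([speakers[idx]] * len(words))
    return doc


doc = build_doc("example_doc", [["Hello", "there", "."], ["I", "am", "here", "."]])
print(json.dumps(doc))  # one such JSON object per line makes a jsonlines file
```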

My questions are:

How do we get the pos, deprel, head, and clusters data in inference mode? Are they derived from cased_words or not?

In training mode, are the speaker, pos, deprel, head, and clusters data used as well?

Thank you
opened by fajarmuslim 7
shall I use convert_to_heads when using CoNLL-U?

Hi, thanks so much for your work! I have a question regarding convert_to_heads.py script. I'm trying to make RoBERTa learn coreference resolution, but my data is in .conllu format. I have quite hard time trying to preprocess data/modify some of your code to make it work. Can you share some insights/thoughts on that? I would be very much obliged.

Cheers

opened by brgsk 7
about the training data format

Hello, I'd like to ask about the .jsonlines file produced by convert_to_jsonlines.py. Can the model still be trained successfully if some attributes in the jsonlines file, such as speaker and pos, are discarded?

opened by leileilin 5
Inference on conversation.
Hello, great work.

I had two questions:

What is sent_id in the sample input file supposed to refer to?

If I want to run inference on dialogue, like the tc genre, what should the conversation format be?
opened by maherr13 5
Questions about training

Currently, when running this source code, I get a CUDA out-of-memory error, since my single GPU has only 32 GB of memory.

On the other hand, I have access to a server with 8 GPUs (each with 32 GB of memory). Can I run this training experiment in parallel mode?

If so, how can I achieve that?

thanks in advance..

opened by fajarmuslim 5
Reduce training memory requirement

CUDA-enabled machine (48 GB to train, 4 GB to evaluate)

@vdobrovolskii friendly ping. Are 48 GB really needed to train? Can't we train for longer (how much longer?) with less memory? Couldn't your project leverage FP16, FP8, and other optimizations? You can get them out of the box if you use roberta from the Transformers library (https://github.com/huggingface/transformers). There is also accelerate (https://huggingface.co/docs/accelerate/index).

I have a 3070 with 8GB of GDDR6 :/

opened by LifeIsStrange 2
