A toolkit for document-level event extraction, containing some SOTA model implementations

Tong Zhu（朱桐）

Last update: Dec 22, 2022

Overview

❤️ A Toolkit for Document-level Event Extraction with & without Triggers

Hi there 👋. Thanks for stopping by this repo.

This project aims at building a universal toolkit for extracting events automatically from documents 📄 (long texts). The details can be found in our paper: Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph

🔥 We have an online demo [here] (available in 9:00-17:00 UTC+8).

Currently, this repo contains PTPCG, Doc2EDAG and GIT models, and these models are all designed for document-level event extraction without triggers. Here are some basic descriptions to help you understand the characteristics of each model:

PTPCG is a fast and lightweight model that needs only 3.6% of the GPU training resources GIT requires and is 8.5x faster at inference. PTPCG also outperforms GIT and Doc2EDAG on o2o (one instance per doc) docs. Its overall scores are higher than Doc2EDAG's and competitive with GIT's. We tested this model on the LIC'21 information extraction shared task and won a top-5 prize 🏆 (team: 广告位招租). Its availability is confirmed. Supplements are included here (including detailed examples, the BK algorithm, hyperparameters and additional experiment results).
GIT is the SOTA model (a modification of Doc2EDAG), which performs strongly on o2m (multiple instances of the same event type per doc) and m2m (multiple instances with multiple event types per doc) docs. GIT is slower than Doc2EDAG and needs more resources for training and inference.
Doc2EDAG is an auto-regressive model that is well suited to DocEE without triggers and is a widely used DocEE baseline. This repo is developed based on Doc2EDAG.

⚙️ Installation

Make sure you have the following dependencies installed.

Python 3.7.7
- torch==1.5.1 # should be OK with higher torch version
- pytorch-mcrf==0.0.3 # for MaskedCRF
- gpu-watchmen==0.3.8 # if you wanna wait for a vacant GPU via gpu-watchmen
- loguru==0.5.3
- matplotlib==3.3.0
- numpy==1.19.0
- transformers==4.9.1
- dgl-cu92==0.6.1 # find a version that is compatible with your CUDA version
- tqdm==4.53.0
- networkx==2.4
- tensorboard==2.4.1

# don't forget to install the dee package
$ git clone https://github.com/Spico197/DocEE.git
$ pip install -e .
# or install directly from git
$ pip install git+https://github.com/Spico197/DocEE.git

🚀 Quick Start

💾 Data Preprocessing

# ChFinAnn
## You can download Data.zip from the original repo: https://github.com/dolphin-zs/Doc2EDAG
$ unzip Data.zip
$ cd Data
# generate data with doc type (o2o, o2m, m2m) for better evaluation
$ python stat.py

# DuEE-fin
## If you want to win the test, you should check the codes and make further modifications,
## since each role may refer to multiple entities in DuEE-fin.
## Our PTPCG can help with this situation, all you need is to check the data preprocessing
## and check `predict_span_role()` method in `event_table.py`.
## We **do not** perform such magic tricks in the paper to make fair comparisons with Doc2EDAG and GIT.
$ # downloading datasets from https://aistudio.baidu.com/aistudio/competition/detail/65
$ cd Data/DuEEData  # paste train.json and dev.json into Data/DuEEData folder and run:
$ python build_data.py

📋 To Reproduce Results in Paper

Doc2EDAG and GIT are already integrated in this repo, and more models are planned to be added.

If you want to reproduce the PTPCG results, or run other trials, please follow the instructions below.

Before running any bash script, please ensure bert_model has been correctly set.

Doc2EDAG

Tip: At least 4 * NVIDIA V100 GPU (at least 16GB) cards are required to run Doc2EDAG models.

# run on ChFinAnn dataset
$ nohup bash scripts/run_doc2edag.sh 1>Logs/Doc2EDAG_reproduction.log 2>&1 &
$ tail -f Logs/Doc2EDAG_reproduction.log

# run on DuEE-fin dataset without trigger
$ nohup bash scripts/run_doc2edag_dueefin.sh 1>Logs/Doc2EDAG_DuEE_fin.log 2>&1 &
$ tail -f Logs/Doc2EDAG_DuEE_fin.log

# run on DuEE-fin dataset with trigger
$ nohup bash scripts/run_doc2edag_dueefin_withtgg.sh 1>Logs/Doc2EDAG_DuEE_fin_with_trigger.log 2>&1 &
$ tail -f Logs/Doc2EDAG_DuEE_fin_with_trigger.log

GIT

Tip: At least 4 * NVIDIA V100 GPU (32GB) cards are required to run GIT models.

# run on ChFinAnn dataset
$ nohup bash scripts/run_git.sh 1>Logs/GIT_reproduction.log 2>&1 &
$ tail -f Logs/GIT_reproduction.log

# run on DuEE-fin dataset without trigger
$ nohup bash scripts/run_git_dueefin.sh 1>Logs/GIT_DuEE_fin.log 2>&1 &
$ tail -f Logs/GIT_DuEE_fin.log

# run on DuEE-fin dataset with trigger
$ nohup bash scripts/run_git_dueefin_withtgg.sh 1>Logs/GIT_DuEE_fin_with_trigger.log 2>&1 &
$ tail -f Logs/GIT_DuEE_fin_with_trigger.log

PTPCG

Tip: At least 1 * 1080Ti (at least 9GB) card is required to run PTPCG.

Default: |R| = 1, which means only the first (pseudo) trigger is selected.

# run on ChFinAnn dataset (to reproduce |R|=1 results in Table 1 of the PTPCG paper)
$ nohup bash scripts/run_ptpcg.sh 1>Logs/PTPCG_R1_reproduction.log 2>&1 &
$ tail -f Logs/PTPCG_R1_reproduction.log

# run on DuEE-fin dataset without annotated trigger (to reproduce |R|=1, Tgg=× results in Table 3 of the PTPCG paper)
$ nohup bash scripts/run_ptpcg_dueefin.sh 1>Logs/PTPCG_P1-DuEE_fin.log 2>&1 &
$ tail -f Logs/PTPCG_P1-DuEE_fin.log

# run on DuEE-fin dataset with annotated trigger and without pseudo trigger (to reproduce |R|=0, Tgg=√ results in Table 3 of the PTPCG paper)
$ nohup bash scripts/run_ptpcg_dueefin_withtgg.sh 1>Logs/PTPCG_T1-DuEE_fin.log 2>&1 &
$ tail -f Logs/PTPCG_T1-DuEE_fin.log

# run on DuEE-fin dataset with annotated trigger and one pseudo trigger (to reproduce |R|=1, Tgg=√ results in Table 3 of the PTPCG paper)
$ nohup bash scripts/run_ptpcg_dueefin_withtgg_withptgg.sh 1>Logs/PTPCG_P1T1-DuEE_fin.log 2>&1 &
$ tail -f Logs/PTPCG_P1T1-DuEE_fin.log

#PseudoTgg	Setting	Log	Task Dump
1	189Cloud	189Cloud	189Cloud

Explanations of the PTPCG hyperparameters in the executable script:

# whether to use max clique decoding strategy, brute-force if set to False
max_clique_decode = True
# number of triggers when training, to make all arguments as pseudo triggers, set to higher numbers like `10`
num_triggers = 1
# number of triggers when evaluating, set to `-1` to make all arguments as pseudo triggers
eval_num_triggers = 1
# put additional pseudo triggers into the graph, make full use of the pseudo triggers
with_left_trigger = True
# make the trigger graph to be directed
directed_trigger_graph = True
# run mode is used in `dee/tasks/dee_task.py/DEETaskSetting`
run_mode = 'full'
# at least one combination (see paper for more information)
at_least_one_comb = True
# whether to include regex matched entities
include_complementary_ents = True
# event schemas, check `dee/event_types` for all support schemas
event_type_template = 'zheng2019_trigger_graph'
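
The `max_clique_decode` option and the BK (Bron-Kerbosch) algorithm mentioned in the overview both refer to finding maximal cliques in a predicted graph. The sketch below is a generic, self-contained illustration of that idea and is not the code used in this repo: the helper names `adjacency_from_matrix` and `maximal_cliques`, the toy matrix, and the choice to treat the graph as undirected are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set


def adjacency_from_matrix(mat: Sequence[Sequence[int]]) -> Dict[int, Set[int]]:
    """Turn a 0/1 matrix into an undirected neighbour map (self-loops ignored)."""
    n = len(mat)
    return {
        i: {j for j in range(n) if j != i and (mat[i][j] or mat[j][i])}
        for i in range(n)
    }


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[Set[int]]:
    """Enumerate all maximal cliques with Bron-Kerbosch plus pivoting."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        if not p and not x:
            cliques.append(r)  # r cannot be extended any further
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques


# Toy graph over 5 entity mentions (hypothetical data):
# edges 0-1, 0-2, 1-2, 2-3; mention 4 is isolated.
matrix = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
combinations = sorted(sorted(c) for c in maximal_cliques(adjacency_from_matrix(matrix)))
print(combinations)  # [[0, 1, 2], [2, 3], [4]]
```

Each maximal clique plays the role of one candidate argument combination. A brute-force alternative would enumerate subsets of mentions instead, which grows exponentially with the number of mentions.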

⚽ Find Pseudo Triggers

Please check Data/trigger.py for more details. In general, you should first convert your data into an acceptable format (like typed_train.json after building ChFinAnn).

Then, you can run the command below to generate event schemas with pseudo triggers and importance scores:

$ cd Data
$ python trigger.py <max number of pseudo triggers>

📚 Instructions

dee has evolved into a toolkit package, so make sure to install the package first: pip install -e .
Please change the path to BERT so that the tokenizer can be loaded.
To run on the ChFinAnn dataset, first unzip Data.zip into the Data folder, then generate the typed_(train|dev|test).json files via cd Data && python stat.py.
The model is not a DDP model by default. If you want to train across multiple devices, add a --parallel_decorate flag after python run_dee_task.py.
Comments starting with tzhu were added by Tong Zhu to help readers understand the code; they are not in the original Doc2EDAG repo.
For trials on the DuEE-fin dataset, if you want to submit generated files to the online platform, check dueefin_post_process.py for the further post-processing needed to meet the format requirements.
I tried many models that did not work out, which left redundancies in the code. To make the code easier to understand and free of distractions, I deleted them from this repo. Some redundancies may remain, and you may find unused methods or models; feel free to contact me so that we can make the repo cleaner and nicer together~ Note that some issues may arise because files were removed directly. Feel free to reach me by opening an issue or by email. I check GitHub notifications every day, and emails are received instantly on weekdays.

🙋 FAQ

Q: What's the evaluation strategy used to calculate the final micro-F1 scores?
- A: Micro-F1 scores are calculated by counting the TPs, FPs and FNs over the final event role predictions.
Q: What does teacher_prob do?
- A: It's used in the scheduled sampling strategy and indicates the probability of using the gold_span. If teacher_prob == 0.7, there is a 70% probability of using gold_span during training. teacher_prob decreases during training.
Q: What's GreedyDec?
- A: Greedy decoding is a prediction generation strategy. It fills in the event table by finding the first corresponding entity for each field (argument role), which is why it's called a greedy method.
Q: How to make predictions and get readable results with a trained model?
- A: An inference interface is provided in dee/tasks/dee_task.py/DEETask.predict_one() (a convenient online serving interface).
Q: What is o2o, o2m and m2m?
- A: They are abbreviations for one-type one-instance per doc, one-type with multiple instances per doc and multiple types per doc.
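
Several of the answers above can be sketched in a few lines of Python. The functions below are illustrative only and are not taken from the dee package: the names `micro_f1`, `use_gold_span` and `doc_type`, the `(TP, FP, FN)` tuple layout, and the event type strings in the usage lines are assumptions made for this example.

```python
import random
from typing import List, Tuple


def micro_f1(counts: List[Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Micro precision, recall and F1 from a list of (TP, FP, FN) counts."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def use_gold_span(teacher_prob: float, rng: random.Random) -> bool:
    """Scheduled sampling: True means feed the gold span, False the predicted one."""
    return rng.random() < teacher_prob


def doc_type(event_types: List[str]) -> str:
    """Classify a document from the event types of its annotated instances."""
    if not event_types:
        raise ValueError("a document needs at least one event instance")
    if len(set(event_types)) > 1:
        return "m2m"  # multiple event types in one doc
    return "o2o" if len(event_types) == 1 else "o2m"


# Counts are pooled over all roles before computing the scores.
print(micro_f1([(8, 2, 2), (2, 0, 6)]))
# With teacher_prob == 0.7, roughly 70% of the draws pick the gold span.
rng = random.Random(0)
print(sum(use_gold_span(0.7, rng) for _ in range(1000)))
print(doc_type(["EquityPledge", "EquityPledge"]))
```

Pooling the counts before dividing is what makes the score "micro": frequent roles weigh more than rare ones, unlike a macro average of per-role F1 scores.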

📜 Citation

This work has not been published yet. Please cite the arXiv preprint for now 😉

@misc{zhu-et-al-2021-ptpcg,
  title={Efficient Document-level Event Extraction via Pseudo-Trigger-aware Pruned Complete Graph}, 
  author={Tong Zhu and Xiaoye Qu and Wenliang Chen and Zhefeng Wang and Baoxing Huai and Nicholas Jing Yuan and Min Zhang},
  year={2021},
  eprint={2112.06013},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

🔑 Licence

MIT Licence

🤘 Furthermore

This repo is still under development. If you find any bugs, don't hesitate to drop us an issue.

Thanks~

Comments

Which code in this repository is used by the PTPCG algorithm?
Agreement

[x] Fill the space in brackets with x to check the agreement items.

[ ] Before submitting this issue, I've fully checked the instructions in README.md.

[ ] Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.

[ ] This issue is about the toolkit itself, not Python, pip or other programming basics.

[ ] I understand if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

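I would like to read only the parts related to this algorithm and nothing else. Is there a historical branch of the project code for this?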

Environment

| Environment | Values |
| :------------------------ | :------------ |
| System | Windows/Linux |
| GPU Device | |
| CUDA Version | |
| Python Version | |
| PyTorch Version | |
| dee (the Toolkit) Version | |

Full Log

Log:
discussion question
opened by xxllp 29
Training on a new dataset
Problem

How should I get started with processing the training data for my own new dataset? Is there a concrete step-by-step guide?

question
opened by xxllp 18
Model training question
Agreement

[x] Fill the space in brackets with x to check the agreement items.

[x] Before submitting this issue, I've fully checked the instructions in README.md.

[x] Before submitting this issue, I'd searched in the issue area and didn't find a solved issue that covers my problem.

[x] This issue is about the toolkit itself, not Python, pip or other programming basics.

[x] I understand if I do not check all the agreement items above, my issue MAY BE CLOSED OR REMOVED WITHOUT FURTHER EXPLANATIONS.

Problem

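Hello, I would like to retrain the PTPCG model. When running run_ptpcg.sh I found that my computer is not powerful enough, so I plan to apply for a cloud platform to speed things up. I have read dee_task.py. If I run run_dee_task.py from the shell, will the model I want be saved in the Exps folder? (For some reason, save_cpt_flag=False in dee_task.train(save_cpt_flag=in_argv.save_cpt_flag). Does that mean the model is not saved?)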
question
opened by sauceplus 11
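Question about the flexibility of PTPCG at prediction time

Hi, I have read about the PTPCG model and have a question about the prediction process. Suppose the adjacency matrix has been predicted, so the corresponding combinations are also determined. Argument roles are then predicted for each predicted event type and each combination. Doesn't this carry an implicit assumption that every predicted event type has the same number of combinations, i.e. every event_type has the same number of event_objects? Please let me know if I have misunderstood.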
discussion

opened by Rover912 11
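Question about triggers
I'm back with another question:

1. Following the pipeline in the paper, when training a model with a single trigger only (not a pseudo trigger), the trigger is first recognized by NER and then used as a node when building the graph. During subgraph decomposition, is this trigger node treated as a maximal clique? 2. How does the code decide which mentions are pseudo triggers (or triggers)? I need to get their indices in span_context_list.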

discussion
opened by xxllp 10
Reproduction of Doc2EDAG

** Idea sharing ** While sharing what you want to do, make sure to protect your ideas.

** Problems ** If you have any questions about event extraction, make sure you have read the latest papers or searched on the Internet.

** Others ** Other things you may want to share or discuss. Hello, Spico! I'm very glad to talk with you about event extraction. Is the order of event types (o2o, o2m, m2m) in the training data important for model performance? I find that the reproduction of Doc2EDAG in your paper reaches (P=86.2, R=70.8, F=79.0, overall scores), but my reproduction only reaches (P=79.7, R=73.2, F=76.3, overall scores). I simply cloned the code from the GitHub repo in the Doc2EDAG paper and ran it without modifying the data preprocessing.
help wanted discussion

opened by CarlanLark 9

Efficiency of PTPCG distributed training

** Idea sharing ** While sharing what you want to do, make sure to protect your ideas.

** Problems ** Following other run commands as a reference, I executed the command below:

TASK_NAME='PTPCG_R1_reproduction'
CUDA='0,1,2,3'
NUM_GPU=4
MODEL_NAME='TriggerAwarePrunedCompleteGraph'


CUDA_VISIBLE_DEVICES=${CUDA} ./scripts/train_multi.sh ${NUM_GPU} --task_name ${TASK_NAME}\
    --use_bert=False \
    --bert_model='/data/xxl/roberta-base-chinese/' \
    --model_type=${MODEL_NAME} \
    --cpt_file_name=${MODEL_NAME} \
    --resume_latest_cpt=False \
	--save_cpt_flag=False \
    --save_best_cpt=True \
    --remove_last_cpt=True \
    --resume_latest_cpt=False \
    --optimizer='adam' \
    --learning_rate=0.0005 \
    --dropout=0.1 \
    --gradient_accumulation_steps=8 \
    --train_batch_size=64 \
    --eval_batch_size=16 \
    --max_clique_decode=True \
    --num_triggers=1 \
    --eval_num_triggers=1 \
    --with_left_trigger=True \
    --directed_trigger_graph=True \
    --use_scheduled_sampling=True \
    --schedule_epoch_start=10 \
    --schedule_epoch_length=10 \
    --num_train_epochs=100 \
    --run_mode='full' \
    --skip_train=False \
	--filtered_data_types='o2o,o2m,m2m' \
    --re_eval_flag=False \
    --add_greedy_dec=False \
    --num_lstm_layers=2 \
    --hidden_size=768 \
    --biaffine_hidden_size=512 \
    --biaffine_hard_threshold=0.5 \
    --at_least_one_comb=True \
    --include_complementary_ents=True \
    --event_type_template='zheng2019_trigger_graph' \
    --use_span_lstm=True \
    --span_lstm_num_layer=2 \
    --role_by_encoding=True \
    --use_token_role=True \
    --ment_feature_type='concat' \
    --ment_type_hidden_size=32 \
    --parallel_decorate

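As far as I can see, all of the GPUs are in use.

However, the overall speed did not improve (20 min); it is even slightly slower than on a single GPU. I do not fully understand this part. Is something missing?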

discussion

opened by xxllp 8

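About "Before running any bash script, please ensure bert_model has been correctly set"
Hello, I fixed some problems following your advice, and I am glad the project now runs. There are still a few small problems that I cannot solve and would like to ask about, described below.

Does the BERT model referred to by "Before running any bash script, please ensure bert_model has been correctly set" in your README mean Google's officially released Chinese model (https://github.com/google-research/bert)?

The tokenization in my results looks wrong (see Figure 1): every role contains only a single character or punctuation mark. I suspect this is because BERT was not loaded, since I did not modify "bert_model": "bert-base-chinese" in the task_setting.json from your dump. Is my suspicion reasonable?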

question
opened by sauceplus 8
Take Model as API to Extract event in Document

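Hello, I work on other NLP tasks but am very interested in extracting events from documents, and I came across your work.

After reading the README I found very detailed reproduction instructions. Is there a publicly released trained model and an inference API? That would make it convenient to use the toolkit directly as a data preprocessing step to obtain the events in documents from my own data, without retraining or reading the code.

Thank you very much for your advice.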
documentation discussion

opened by Ricardokevins 7
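Why doesn't `ner_token_labels` include the extended OtherType entities?

** Problems ** Why do the ner_token_labels fed into the model during NER training not include the extended entities mentioned in the paper, such as Money and Time?

I found that an `in` check is performed on the entity label here, based on a dict that comes from DEEExample. However, this list does not contain B-OtherType or I-OtherType.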
bug documentation discussion

opened by chenxshuo 6
DDP issue - IndexError: Caught IndexError in replica 0 on device 0

Hello, when using multiple GPUs on a single machine, the following error occurs:

Traceback (most recent call last):
  File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 587, in get_loss_on_batch
    teacher_prob=teacher_prob,
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/data/home/qianbenchen/envs/torch/venv/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/home/qianbenchen/DocEE-main/dee/models/trigger_aware.py", line 172, in forward
    ent_fix_mode=self.config.ent_fix_mode,
  File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 305, in get_doc_arg_rel_info_list
    ) = get_span_mention_info(span_dranges_list, doc_token_type_mat)
  File "/data/home/qianbenchen/DocEE-main/dee/modules/doc_info.py", line 16, in get_span_mention_info
    mention_type_list.append(doc_token_type_list[sent_idx][char_s])
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_dee_task.py", line 274, in <module>
    dee_task.train(save_cpt_flag=in_argv.save_cpt_flag)
  File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 656, in train
    base_epoch_idx=resume_base_epoch,
  File "/data/home/qianbenchen/DocEE-main/dee/tasks/base_task.py", line 693, in base_train
    total_loss = get_loss_func(self, batch, **kwargs_dict1)
  File "/data/home/qianbenchen/DocEE-main/dee/tasks/dee_task.py", line 598, in get_loss_on_batch
    raise Exception("Cannot get the loss")

Has this been resolved? Thank you!
question

opened by chenqianben 6
Read this first before opening a new issue when an error occurs

For toolkit usage errors, you must strictly follow the Toolkit usage issue template to open a new issue.

Otherwise, your issue may be closed directly without further explanation.

The template can be found when you open a new issue.

opened by Spico197 0

Releases(v0.3.1)

v0.3.1(May 26, 2022)
2022/5/26 - v0.3.1: add more docs, change instance evaluation with event type included as mentioned in #7.

2022/5/26 - v0.3.0: add DEPPNModel (beta), change luge_* templates into dueefin_*, add OtherType as default common_fields in dueefin_(w|wo)_tgg templates, add isort tool to help formatting

Source code(tar.gz)
Source code(zip)
v0.2.2(Dec 16, 2021)
remove LSTMMTL2EDAGModel, EventTableForIndependentTypeCombination, DEEMultiStepTriggeringFeatureConverter and DEEMultiStepTriggeringFeature which are redundant

Update test cases via zheng2019_trigger_graph schema

Codes are formatted by black

Source code(tar.gz)
Source code(zip)
dee-0.2.2-py3-none-any.whl(157.50 KB)
dee-0.2.2.tar.gz(137.86 KB)

Owner

Tong Zhu（朱桐）

GitHub

Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

146 Nov 29, 2022

Implementation for our AAAI2021 paper (Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).

SSAN Introduction This is the pytorch implementation of the SSAN model (see our AAAI2021 paper: Entity Structure Within and Throughout: Modeling Menti

69 Nov 15, 2022

Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification

STAM - Pytorch Implementation of STAM (Space Time Attention Model), yet another pure and simple SOTA attention model that bests all previous models in

109 Dec 28, 2022

Siamese-nn-semantic-text-similarity - A repository containing comprehensive Neural Networks based PyTorch implementations for the semantic text similarity task

Siamese Deep Neural Networks for Semantic Text Similarity PyTorch A repository c

32 Dec 15, 2022

Generic Event Boundary Detection: A Benchmark for Event Segmentation

Generic Event Boundary Detection: A Benchmark for Event Segmentation We release our data annotation & baseline codes for detecting generic event bound

47 Nov 22, 2022

Scikit-event-correlation - Event Correlation and Forecasting over High Dimensional Streaming Sensor Data algorithms

scikit-event-correlation Event Correlation and Changing Detection Algorithm Theo

5 Oct 30, 2022

Event-forecasting - Event Forecasting Algorithms With Python

event-forecasting Event Forecasting Algorithms Theory Correlating events in comp

4 Feb 15, 2022

Event sourced bank - A wide-and-shallow example using the Python event sourcing library

Event Sourced Bank A "wide but shallow" example of using the Python event sourci

3 Mar 9, 2022

Cross-media Structured Common Space for Multimedia Event Extraction (ACL2020)

Cross-media Structured Common Space for Multimedia Event Extraction Table of Contents Overview Requirements Data Quickstart Citation Overview The code

49 Nov 21, 2022

An implementation for `Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction`

Text2Event An implementation for Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction Please contact Yaojie Lu (@

153 Jan 7, 2023

SOTA model in CIFAR10

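A PyTorch Implementation of CIFAR Tricks. This project surveys and implements various tricks, data augmentation techniques and regularization methods on the CIFAR10 dataset. The project has come to a pause for now; if you have better ideas or would like to help maintain it, please open an issue or find my contact information on my homepage. 0. Requirement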

58 Dec 21, 2022

Key information extraction from invoice document with Graph Convolution Network

Key Information Extraction from Scanned Invoices Key information extraction from invoice document with Graph Convolution Network Related blog post fro

39 Dec 16, 2022

Code and dataset for ACL2018 paper "Exploiting Document Knowledge for Aspect-level Sentiment Classification"

Aspect-level Sentiment Classification Code and dataset for ACL2018 [paper] ‘‘Exploiting Document Knowledge for Aspect-level Sentiment Classification’’

146 Nov 29, 2022

One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Introduction One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing". Users

18 Dec 11, 2022

Independent and minimal implementations of some reinforcement learning algorithms using PyTorch (including PPO, A3C, A2C, ...).

PyTorch RL Minimal Implementations There are implementations of some reinforcement learning algorithms, whose characteristics are as follow: Less pack

4 Dec 31, 2022

Pytorch Implementations of large number classical backbone CNNs, data enhancement, torch loss, attention, visualization and some common algorithms.

Torch-template-for-deep-learning Pytorch implementations of some **classical backbone CNNs, data enhancement, torch loss, attention, visualization and

270 Dec 31, 2022

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.

1.2k Jan 4, 2023

[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction

REval Table of Contents Introduction Overview Requirements Installation Probing Usage Citation License Introduction REval is a simple framework for

13 Jan 6, 2023

Code for technical report "An Improved Baseline for Sentence-level Relation Extraction".

RE_improved_baseline Code for technical report "An Improved Baseline for Sentence-level Relation Extraction". Requirements torch >= 1.8.1 transformers

74 Nov 29, 2022
