EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

Last update: Oct 28, 2022

Related tags

Deep Learning clads

Overview

This repository contains data and code for our EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation. Please contact me at [email protected] for any question.

Please cite this paper if you use our code or data.

@InProceedings{clads-emnlp,
  author =      "Laura Perez-Beltrachini and Mirella Lapata",
  title =       "Models and Datasets for Cross-Lingual Summarisation",
  booktitle =   "Proceedings of The 2021 Conference on Empirical Methods in Natural Language Processing ",
  year =        "2021",
  address =     "Punta Cana, Dominican Republic",
}

The XWikis Corpus

You can create the corpus using the instructions below. The original XWikis corpus is available at XWikis.

Instructions to re-create our corpus and extract other languages are available here.

Cross-lingual Summarisation Code

Our code is based on Fairseq and mBART/mBART50. You'll find our clone of Fairseq and the code extension to implement our models here and instructions to pre-process the data, and train and evaluate our models here.

Models' Outputs

For AILAB: Cross Lingual Retrieval on Yelp Search Engine

Cross-lingual Information Retrieval Model for Document Search Train Phase CUDA_VISIBLE_DEVICES="0,1,2,3" \ python -m torch.distributed.launch --nproc_

104 Nov 12, 2022

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

16 Jul 16, 2022

“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

18 Sep 10, 2022

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization [Paper] accepted at the EMNLP 2021: Vision Guided Genera

42 Jan 7, 2023

Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

57 Jan 1, 2023

Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Maths from examples - Learning advanced mathematical computations from examples This is the source code and data sets relevant to the paper Learning a

171 Nov 23, 2022

《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

53 Dec 16, 2022

EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein EMN

42 Nov 3, 2022

Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"

Storium GPT-2 Models This is the official repository for the GPT-2 models described in the EMNLP 2020 paper [STORIUM: A Dataset and Evaluation Platfor

27 Dec 20, 2022

EMNLP 2021 paper Models and Datasets for Cross-Lingual Summarisation.

Related tags

Overview

The XWikis Corpus

Cross-lingual Summarisation Code

Models' Outputs

You might also like...

For AILAB: Cross Lingual Retrieval on Yelp Search Engine

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".

Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"

Owner

Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)

Code for ACL2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

Meta Representation Transformation for Low-resource Cross-lingual Learning

This is the official implementation of "One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval".

PyTorch original implementation of Cross-lingual Language Model Pretraining.