Emergent Translation in Multi-Agent Communication
PyTorch implementation of the models described in the paper Emergent Translation in Multi-Agent Communication.
We present code for training and decoding both word- and sentence-level models and baselines, as well as preprocessed datasets.
Dependencies
Python
- Python 2.7
- PyTorch 0.2
- Numpy
GPU
- CUDA (we recommend the latest version; version 8.0 was used in all our experiments)
Related code
- For preprocessing, we used scripts from Moses and Subword-NMT.
Downloading Datasets
The original corpora can be downloaded from their respective sources (Bergsma500, Multi30k, MS COCO). For the preprocessed corpora, see the table below.
Dataset | Download
---|---
Bergsma500 | Data
Multi30k | Data
MS COCO | Data
Before you run the code
- Download the datasets and place them in `/data/word` (Bergsma500) and `/data/sentence` (Multi30k and MS COCO)
- Set the correct path in `scr_path()` from `/src/word/util.py`, and in `scr_path()`, `multi30k_reorg_path()` and `coco_path()` from `/src/sentence/util.py` (a sketch of these helpers follows this list)
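In the repository these helpers simply return hard-coded directory strings. The bodies below are an illustrative sketch, not the actual code; the return values are assumptions that you should point at wherever you placed the data.

```python
# Sketch only: edit the returned paths to match your local setup.

# /src/word/util.py
def scr_path():
    # Root of the preprocessed word-level (Bergsma500) data
    return "/path/to/data/word"

# /src/sentence/util.py
def scr_path():
    # Root of the preprocessed sentence-level data
    return "/path/to/data/sentence"

def multi30k_reorg_path():
    # Directory holding the reorganized Multi30k data (assumed layout)
    return "/path/to/data/sentence/multi30k"

def coco_path():
    # Directory holding the preprocessed MS COCO data (assumed layout)
    return "/path/to/data/sentence/coco"
```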
Word-level Models
Running nearest neighbour baselines
$ python word/bergsma_bli.py
Running our models
$ python word/train_word_joint.py --l1 <L1> --l2 <L2>
where `<L1>` and `<L2>` are any of {en, de, es, fr, it, nl}.
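For example, to train a word-level model for the English-German pair:

$ python word/train_word_joint.py --l1 en --l2 de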
Sentence-level Models
Baseline 1 : Nearest neighbour
$ python sentence/baseline_nn.py --dataset <DATASET> --task <TASK> --src <SRC> --trg <TRG>
Baseline 2 : NMT with neighbouring sentence pairs
$ python sentence/nmt.py --dataset <DATASET> --task <TASK> --src <SRC> --trg <TRG> --nn_baseline
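For example, to run both baselines on Multi30k (the placeholder values are explained under "Aligned NMT" below; the `en`/`de` language codes are an assumption for the Multi30k pair):

$ python sentence/baseline_nn.py --dataset multi30k --task 1 --src en --trg de
$ python sentence/nmt.py --dataset multi30k --task 1 --src en --trg de --nn_baseline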
Baseline 3 : Nakayama and Nishida, 2017
$ python sentence/train_naka_encdec.py --dataset <DATASET> --task <TASK> --src <SRC> --trg <TRG> --train_enc_how <ENC_HOW> --train_dec_how <DEC_HOW>
where `<ENC_HOW>` is either `two` or `three`, and `<DEC_HOW>` is either `img`, `des`, or `both`.
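For example, assuming the Multi30k Task 1 English-German setup:

$ python sentence/train_naka_encdec.py --dataset multi30k --task 1 --src en --trg de --train_enc_how two --train_dec_how both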
Our models :
$ python sentence/train_seq_joint.py --dataset <DATASET> --task <TASK>
Aligned NMT :
$ python sentence/nmt.py --dataset <DATASET> --task <TASK> --src <SRC> --trg <TRG>
where `<DATASET>` is `multi30k` or `coco`, and `<TASK>` is either 1 or 2 (only applicable for Multi30k).
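For example, to train our model and the aligned NMT baseline on Multi30k Task 1 (the `en`/`de` codes are again an assumption):

$ python sentence/train_seq_joint.py --dataset multi30k --task 1
$ python sentence/nmt.py --dataset multi30k --task 1 --src en --trg de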
Dataset & Related Code Attribution
- Moses is licensed under the LGPL, and Subword-NMT is licensed under the MIT License.
- MS COCO and Multi30k are licensed under Creative Commons licenses.
Citation
If you find the resources in this repository useful, please consider citing:
@inproceedings{Lee:18,
author = {Jason Lee and Kyunghyun Cho and Jason Weston and Douwe Kiela},
title = {Emergent Translation in Multi-Agent Communication},
year = {2018},
booktitle = {Proceedings of the International Conference on Learning Representations},
}