Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021

This repository contains the code for training mCOLT/mRASP2, a multilingual NMT training framework built on fairseq.

mRASP2: paper

mRASP: paper, code


News

We have released two versions; this version is the original one. In this implementation:

  • You should first merge all data, prepending a language token to each sentence to indicate its language.
  • AA/RAS must be done offline (before binarization); check this toolkit.
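
As an illustration of the merging step, here is a minimal sketch that tags each sentence with a language token before the corpora are concatenated. The `LANG_TOK_` prefix and the upper-cased language code follow the inference command and the generation logs shown elsewhere on this page; the function names and the tuple layout are hypothetical and are not taken from the repository's own preprocessing scripts.

```python
def lang_token(lang: str) -> str:
    """Build a language token such as LANG_TOK_EN from a language code."""
    return "LANG_TOK_" + lang.strip().upper()


def prepend_lang_token(sentence: str, lang: str) -> str:
    """Prepend the language token to one (already tokenized) sentence."""
    return f"{lang_token(lang)} {sentence.strip()}"


def merge_parallel(pairs):
    """Merge several language pairs into one tagged corpus.

    `pairs` is an iterable of (src_lang, tgt_lang, src_sentence, tgt_sentence)
    tuples (a hypothetical layout). Returns a list of (tagged_src, tagged_tgt).
    """
    return [
        (prepend_lang_token(src, src_lang), prepend_lang_token(tgt, tgt_lang))
        for src_lang, tgt_lang, src, tgt in pairs
    ]


if __name__ == "__main__":
    corpus = [
        ("en", "fr", "Hello world .", "Bonjour le monde ."),
        ("en", "de", "Good morning .", "Guten Morgen ."),
    ]
    for tagged_src, tagged_tgt in merge_parallel(corpus):
        print(tagged_src, "|||", tagged_tgt)
```

In a real pipeline the tagging would presumably happen after BPE and before binarization, so that the language token is never split into subwords.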

New implementation: https://github.com/PANXiao1994/mRASP2/tree/new_impl

  • Acknowledgement: This work is supported by Bytedance. We thank Chengqi for uploading all files and checkpoints.

Introduction

mRASP2/mCOLT, representing multilingual Contrastive Learning for Transformer, is a multilingual neural machine translation model that supports complete many-to-many multilingual machine translation. It employs both parallel corpora and multilingual corpora in a unified training framework. For detailed information please refer to the paper.
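
To give a rough feel for the contrastive part of the objective, the following is a minimal pure-Python sketch of an in-batch contrastive loss. It assumes mean-pooled sentence vectors, cosine similarity, and a temperature, and it sums the loss over sentences; the temperature value and all function names are hypothetical. The paper and mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py remain the authoritative references, and this sketch may differ from them in detail.

```python
import math


def mean_pool(token_vectors):
    """Average a sentence's token vectors into one sentence vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[d] for vec in token_vectors) / count for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, temperature=0.1):
    """In-batch contrastive loss, summed over sentences.

    anchors[i] and positives[i] are sentence vectors of one translation pair;
    every positives[j] with j != i serves as a negative for anchors[i].
    The default temperature is an illustrative value only.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, pos) / temperature for pos in positives]
        # Numerically stable log-softmax, evaluated at the matching index i.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]
    return total


if __name__ == "__main__":
    src = [mean_pool([[1.0, 0.0], [0.8, 0.2]]), mean_pool([[0.0, 1.0]])]
    tgt = [mean_pool([[0.9, 0.1]]), mean_pool([[0.1, 0.9]])]
    print("aligned batch :", contrastive_loss(src, tgt))
    print("shuffled batch:", contrastive_loss(src, tgt[::-1]))
```

The aligned batch should yield a much smaller loss than the shuffled one, which is the behaviour the training signal relies on.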

Pre-requisite

pip install -r requirements.txt

Training Data and Checkpoints

We release our preprocessed training data and checkpoints below.

Dataset

We merge 32 English-centric language pairs, resulting in 64 directed translation pairs in total. The original 32 language pairs corpus contains about 197M pairs of sentences. We get about 262M pairs of sentences after applying RAS, since we keep both the original sentences and the substituted sentences. We release both the original dataset and dataset after applying RAS.

| Dataset | #Pair |
| --- | --- |
| 32-lang-pairs-TRAIN | 197603294 |
| 32-lang-pairs-RAS-TRAIN | 262662792 |
| mono-split-a | - |
| mono-split-b | - |
| mono-split-c | - |
| mono-split-d | - |
| mono-split-e | - |
| mono-split-de-fr-en | - |
| mono-split-nl-pl-pt | - |
| 32-lang-pairs-DEV-en-centric | - |
| 32-lang-pairs-DEV-many-to-many | - |
| Vocab | - |
| BPE Code | - |

Checkpoints & Results

  • Please note that the provided checkpoints are slightly different from those in the paper. In the following sections, we report the results of the provided checkpoints.

English-centric Directions

We report tokenized BLEU in the following table. (check eval.sh for details)

| Direction/test set | 6e6d-no-mono | 12e12d-no-mono | 12e12d |
| --- | --- | --- | --- |
| en2cs/wmt16 | 21.0 | 22.3 | 23.8 |
| cs2en/wmt16 | 29.6 | 32.4 | 33.2 |
| en2fr/wmt14 | 42.0 | 43.3 | 43.4 |
| fr2en/wmt14 | 37.8 | 39.3 | 39.5 |
| en2de/wmt14 | 27.4 | 29.2 | 29.5 |
| de2en/wmt14 | 32.2 | 34.9 | 35.2 |
| en2zh/wmt17 | 33.0 | 34.9 | 34.1 |
| zh2en/wmt17 | 22.4 | 24.0 | 24.4 |
| en2ro/wmt16 | 26.6 | 28.1 | 28.7 |
| ro2en/wmt16 | 36.8 | 39.0 | 39.1 |
| en2tr/wmt16 | 18.6 | 20.3 | 21.2 |
| tr2en/wmt16 | 22.2 | 25.5 | 26.1 |
| en2ru/wmt19 | 17.4 | 18.5 | 19.2 |
| ru2en/wmt19 | 22.0 | 23.2 | 23.6 |
| en2fi/wmt17 | 20.2 | 22.1 | 22.9 |
| fi2en/wmt17 | 26.1 | 29.5 | 29.7 |
| en2es/wmt13 | 32.8 | 34.1 | 34.6 |
| es2en/wmt13 | 32.8 | 34.6 | 34.7 |
| en2it/wmt09 | 28.9 | 30.0 | 30.8 |
| it2en/wmt09 | 31.4 | 32.7 | 32.8 |

Unsupervised Directions

We report tokenized BLEU in the following table. (check eval.sh for details)

| Direction/test set | 12e12d |
| --- | --- |
| en2pl/wmt20 | 6.2 |
| pl2en/wmt20 | 13.5 |
| en2nl/iwslt14 | 8.8 |
| nl2en/iwslt14 | 27.1 |
| en2pt/opus100 | 18.9 |
| pt2en/opus100 | 29.2 |

Zero-shot Directions

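  • row: source language
  • column: target language

We report sacreBLEU in the following table.

| 12e12d | ar | zh | nl | fr | de | ru |
| --- | --- | --- | --- | --- | --- | --- |
| ar | - | 32.5 | 3.2 | 22.8 | 11.2 | 16.7 |
| zh | 6.5 | - | 1.9 | 32.9 | 7.6 | 23.7 |
| nl | 1.7 | 8.2 | - | 7.5 | 10.2 | 2.9 |
| fr | 6.2 | 42.3 | 7.5 | - | 18.9 | 24.4 |
| de | 4.9 | 21.6 | 9.2 | 24.7 | - | 14.4 |
| ru | 7.1 | 40.6 | 4.5 | 29.9 | 13.5 | - |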

Training

export NUM_GPU=4 && bash train_w_mono.sh ${model_config}
  • An example ${model_config} is provided at ${PROJECT_REPO}/examples/configs/parallel_mono_12e12d_contrastive.yml

Inference

  • You must prepend the corresponding language token to the source side before binarizing the test data.
fairseq-generate ${test_path} \
    --user-dir ${repo_dir}/mcolt \
    -s ${src} \
    -t ${tgt} \
    --skip-invalid-size-inputs-valid-test \
    --path ${ckpts} \
    --max-tokens ${batch_size} \
    --task translation_w_langtok \
    ${options} \
    --lang-prefix-tok "LANG_TOK_"`echo "${tgt} " | tr '[a-z]' '[A-Z]'` \
    --max-source-positions ${max_source_positions} \
    --max-target-positions ${max_target_positions} \
    --nbest 1 | grep -E '[S|H|P|T]-[0-9]+' > ${final_res_file}
python3 ${repo_dir}/scripts/utils.py ${res_file} ${ref_file} || exit 1;
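
For post-processing the file produced by the grep above, the sketch below collects the hypothesis lines in sentence order, drops the leading language token, and undoes BPE. It assumes the usual tab-separated fairseq-generate layout (`H-<id>`, score, text) and subword-nmt style `@@ ` markers; the function names are hypothetical, and the repository's scripts/utils.py may handle this differently.

```python
import re

# Matches a BPE continuation marker followed by a space or the end of the line.
BPE_RE = re.compile(r"@@( |$)")


def remove_bpe(text: str) -> str:
    """Undo subword-nmt style BPE by dropping the '@@ ' continuation markers."""
    return BPE_RE.sub("", text).strip()


def strip_lang_token(text: str) -> str:
    """Drop a leading LANG_TOK_XX token if one is present."""
    first, _, rest = text.partition(" ")
    return rest if first.startswith("LANG_TOK_") else text


def collect_hypotheses(lines):
    """Return BPE-free hypotheses ordered by sentence id."""
    hyps = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("H-"):
            continue  # skip S-, T- and P- lines
        tag, _score, text = line.split("\t", 2)
        hyps[int(tag[2:])] = remove_bpe(strip_lang_token(text))
    return [hyps[i] for i in sorted(hyps)]


if __name__ == "__main__":
    sample = [
        "S-1\tLANG_TOK_HI ...",
        "H-1\t-0.59\tLANG_TOK_EN Ital@@ y is cur@@ rent@@ ly th@@ ir@@ d .",
        "H-0\t-0.21\tLANG_TOK_EN Hel@@ lo wor@@ ld .",
    ]
    for hyp in collect_hypotheses(sample):
        print(hyp)
```

Sorting by id matters because fairseq-generate batches sentences by length, so the output order generally differs from the input order.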

Synonym dictionaries

We use the bilingual synonym dictionaries provided by MUSE.

We generate multilingual synonym dictionaries using this script, and apply RAS using this script.

| Description | File | Size |
| --- | --- | --- |
| dep=1 | synonym_dict_raw_dep1 | 138.0 M |
| dep=2 | synonym_dict_raw_dep2 | 1.6 G |
| dep=3 | synonym_dict_raw_dep3 | 2.2 G |
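
As a rough illustration of how a synonym dictionary can drive substitution, here is a minimal sketch of random word replacement. The dictionary layout (token mapped to a list of candidate synonyms), the substitution probability, the choice to substitute on the source side only, and the rule of keeping a substituted copy only when something changed are all assumptions made for the example; the released scripts and dictionary files may differ.

```python
import random


def random_aligned_substitution(tokens, synonym_dict, prob=0.3, rng=None):
    """Replace each token that has synonyms with a random one, with probability `prob`.

    `synonym_dict` maps a token to a list of candidate synonyms, possibly in
    other languages (hypothetical layout). `prob` is an illustrative value.
    """
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        candidates = synonym_dict.get(tok)
        if candidates and rng.random() < prob:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def augment_pair(src_tokens, tgt_tokens, synonym_dict, prob=0.3, rng=None):
    """Keep the original pair and add a substituted copy if anything changed."""
    pairs = [(src_tokens, tgt_tokens)]
    substituted = random_aligned_substitution(src_tokens, synonym_dict, prob, rng)
    if substituted != src_tokens:
        pairs.append((substituted, tgt_tokens))
    return pairs


if __name__ == "__main__":
    toy_dict = {"hello": ["bonjour", "hallo"], "world": ["monde", "Welt"]}
    src = ["LANG_TOK_EN", "hello", "world", "!"]
    tgt = ["LANG_TOK_FR", "bonjour", "le", "monde", "!"]
    for pair in augment_pair(src, tgt, toy_dict, prob=0.5, rng=random.Random(0)):
        print(pair)
```

Passing a seeded `random.Random` keeps the offline preprocessing reproducible across runs.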

Contact

Please contact me via e-mail [email protected] or via wechat/zhihu

Citation

Please cite as:

@inproceedings{mrasp2,
  title = {Contrastive Learning for Many-to-many Multilingual Neural Machine Translation},
  author= {Xiao Pan and
           Mingxuan Wang and
           Liwei Wu and
           Lei Li},
  booktitle = {Proceedings of ACL 2021},
  year = {2021},
}
Comments
  • The script in new_impl cannot be downloaded

    For example, running the following command returns a 403 error:

    wget -c http://sf3-ttcdn-tos.pstatp.com/obj/nlp-opensource/acl2021/mrasp2/parallel_pub100_bin/download.sh
    
    opened by YinAoXiong 8
  • Could you give more information about how to run the code?

    Thank you for your great work! Could you give more information about how to run the code, such as the data format and the training and inference scripts? Thank you very much!

    opened by HqWu-HITCS 6
  • Clarifications in training config

    Hello

    Thank you for your excellent work, and well-documented repo. I am trying to use your code to train a new model from scratch, and require some clarification on certain parts that are unclear to me, especially regarding the config.

    (Please note I am referring to this config on the new_impl branch as an example of how I could create my own)

    1. data (under meta): What does this refer to? Is this the directory that contains binarized versions of one multilingual parallel dataset made by concatenating datasets from several language pairs (eg. en-es, en-fr, en-it) or does it contain language pair-specific binary files in its subdirectories?

    2. I can see that in load_config.sh variables starting with meta_ are not written to the options variable, and both monolingual and parallel data are provided separately in train_multilingual_w_mono.sh. This seems to suggest that paths are expected in the form of data_1, data_2 etc.. If so, could you please confirm what these paths refer to? I.e. how does data_1 differ from data_2?

    3. What is mono_dae? This is referred to repeatedly in the codebase, at various places. Would I need to set mono_key in the config file to mono_dae?

    4. Lastly, I have parallel and monolingual datasets that I have already preprocessed (with RAS substitution and language token prefixes). Would I need to set variables like langtoks, encoder_langtok and decoder_langtok?

    Hope I can receive some assistance on this issue soon. Thanks!

    opened by Remorax 4
  • Link broken of datasets

    Hi, some of the links to the datasets seem to be broken; they report the error "upstream server error". Are new links available? Thanks!

    opened by clarenceguan 4
  • What is "meta.ras_dict" in the config?

    The last line of "meta" in the example config says that there should be a JSON file called "data/lang150/dicts/id_dict_1.json". Is this file in the repo, or is there an example of what it should contain?

    opened by lidh15 3
  • fairseq-train config question

    In the example config, what does min_lr stand for? When I run the command provided in the README, it gives me an error (screenshot not preserved). I checked the fairseq documentation and found an entry there (screenshot not preserved). Should I use this argument instead? Thank you for answering!

    opened by wying8349 3
  • Where can I get trained models?

    Hi, I'm very interested in your work and want to run additional experiments with the model.

    Where can I get the trained one?

    Thank you for your great job!

    opened by jaehyoyoo 3
  • dataset

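    What does the paper mean by a monolingual (single-sided) corpus? My understanding is that a parallel corpus consists of sentence pairs with the same meaning, such as <English, Chinese>. A monolingual corpus has no corresponding labels, so how are the cross-entropy loss and the contrastive loss computed? Also, in the code the data appear as <String, Coding> pairs, which does not match my understanding. I have thought about this for a long time and still cannot work out how the paired data are processed into that form.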

    opened by limbo520 2
  • Inference Error

    Hey, this is really great work. But I ran into a problem when using the model for inference.

    You have released three models: 6e6d-no-mono, 12e12d-no-mono and 12e12d. I try to use 12e12d-no-mono and 12e12d to translate Hindi to English. However, this problem is encountered: sometimes 12e12d cannot decode the token correctly, but 12e12d-no-mono can decode it correctly. The following is my test sample and the token predicted by the model:

    model: 12e12d

    S-6 LANG_TOK_HI इस समय आ@@ ठ अं@@ को के साथ इ@@ ट@@ ली पू@@ ल C में ती@@ स@@ रे नं@@ बर पर हैं और इ@@ ट@@ ली को 29 सि@@ तं@@ बर को स@@ ्@@ कॉ@@ ट@@ ल@@ ै@@ ंड के खि@@ ला@@ फ@@ ़ दूस@@ रे मै@@ च में कड@@ ़@@ ी ट@@ क@@ ्@@ कर मि@@ ली ।
    H-6 -0.6864292621612549 LANG_TOK_EN Ital@@ y is now on the th@@ ir@@ d spo@@ t in Po@@ ol C with eig@@ ht points and Ital@@ y fo@@ und a tie on September 29 against Sc@@ ot@@ land in a sec@@ ond mat@@ ch .
    S-7 LANG_TOK_HI न@@ ्@@ यू@@ ज@@ ़@@ ी@@ ल@@ ै@@ ंड ग@@ ्@@ रु@@ प में प@@ ्@@ रथम श@@ ्@@ रे@@ णी पर , स@@ ्@@ कॉ@@ ट@@ ल@@ ै@@ ंड से 10 प@@ ॉ@@ इं@@ ट से आ@@ गे रहा ।
    H-7 -0.6589236855506897 ् न ् यू@@ जी@@ ल@@ ै@@ ंड सम@@ ू@@ ह में पहले श ् रे@@ णी पर , स ् कॉ@@ ट@@ ल@@ ै@@ ंड से 10 प@@ ॉ@@ इं@@ ट से आ@@ गे रहा ।

    model: 12e12d-no-mono

    S-6 LANG_TOK_HI इस समय आ@@ ठ अं@@ को के साथ इ@@ ट@@ ली पू@@ ल C में ती@@ स@@ रे नं@@ बर पर हैं और इ@@ ट@@ ली को 29 सि@@ तं@@ बर को स@@ ्@@ कॉ@@ ट@@ ल@@ ै@@ ंड के खि@@ ला@@ फ@@ ़ दूस@@ रे मै@@ च में कड@@ ़@@ ी ट@@ क@@ ्@@ कर मि@@ ली ।
    H-6 -0.5951337218284607 LANG_TOK_EN Ital@@ y is cur@@ rent@@ ly th@@ ir@@ d in Po@@ ol C with eig@@ ht points and scor@@ ed a tie against Sc@@ ot@@ land in the sec@@ ond mat@@ ch on September 29 .
    S-7 LANG_TOK_HI न@@ ्@@ यू@@ ज@@ ़@@ ी@@ ल@@ ै@@ ंड ग@@ ्@@ रु@@ प में प@@ ्@@ रथम श@@ ्@@ रे@@ णी पर , स@@ ्@@ कॉ@@ ट@@ ल@@ ै@@ ंड से 10 प@@ ॉ@@ इं@@ ट से आ@@ गे रहा ।
    H-7 -0.6146384477615356 LANG_TOK_EN In the New Ze@@ al@@ and gro@@ up , it was 10 points a@@ head of Sc@@ ot@@ land in the first clas@@ s .

    The following is my script:

    model: 12e12d

    fairseq-generate ./test_data/bin \
        --user-dir ./mcolt \
        -s hi \
        -t en \
        --path ./model/12e12d_last.pt \
        --max-tokens 1024 \
        --task translation_w_langtok \
        --lang-prefix-tok "LANG_TOK_"`echo "en " | tr '[a-z]' '[A-Z]'` \
        --max-source-positions 1024 \
        --max-target-positions 1024 \
        --nbest 1 | grep -E '[S|H|P|T]-[0-9]+' > ./test_data/trans_res/en_12e12d_last.txt

    model: 12e12d-no-mono

    fairseq-generate ./test_data/bin \
        --user-dir ./mcolt \
        -s hi \
        -t en \
        --path ./model/12e12d_no_mono.pt \
        --max-tokens 1024 \
        --task translation_w_langtok \
        --lang-prefix-tok "LANG_TOK_"`echo "en " | tr '[a-z]' '[A-Z]'` \
        --max-source-positions 1024 \
        --max-target-positions 1024 \
        --nbest 1 | grep -E '[S|H|P|T]-[0-9]+' > ./test_data/trans_res/en_12e12d_no_mono.txt

    It can be found that the tokens predicted by the two models for H-7 are completely inconsistent. The first position should be LANG_TOK_EN, but the model is decoded to . Of course, the tokens after LANG_TOK_ are neither fully source language tokens nor target language tokens. In my testset, there are other sentences that have also been decoded into this situation. And their first token is .

    Why does this happen? Did I not input the parameters expected by 12e12d correctly?

    opened by zhuyl96 2
  • loss average

    Hi, in this line https://github.com/PANXiao1994/mRASP2/blob/36c17003dcd642affbe8290c8f26231fec77794a/mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py#L105 why is the loss summed rather than averaged? Does the fairseq library automatically average over the batch? Sorry, I am not familiar with this framework. I also notice that the reduce function in compute_loss is a sum https://github.com/pytorch/fairseq/blob/14c5bd027f04aae9dbb32f1bd7b34591b61af97f/fairseq/criterions/label_smoothed_cross_entropy.py#L46 and that ntokens/nsentences is the average number of tokens per sentence within a batch, right? https://github.com/PANXiao1994/mRASP2/blob/36c17003dcd642affbe8290c8f26231fec77794a/mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py#L66 Could you please share the loss values from the early training stage? In my own experiments, the contrastive loss is already of the same order of magnitude without multiplying it by ntokens/nsentences. Thanks so much!

    opened by Hannibal046 2
  • Release of synonym dictionary

    Hi, this is really a great paper. In the paper, you said you would release the synonym dictionary. May I ask when you will release it? In addition, is it a multilingual synonym dictionary? Do you have a monolingual synonym dictionary, e.g. for English only?

    opened by BaohaoLiao 2
  • Project dependencies may have API risk issues

    Hi, In mRASP2, inappropriate dependency versioning constraints can cause risks.

    Below are the dependencies and version constraints that the project is using

    subword-nmt
    sacrebleu
    sacremoses
    kytea
    six
    

    The version constraint == will introduce the risk of dependency conflicts because the scope of dependencies is too strict. The version constraint No Upper Bound and * will introduce the risk of the missing API Error because the latest version of the dependencies may remove some APIs.

    After further analysis, in this project, The version constraint of dependency sacrebleu can be changed to >=1.1.0,<=1.1.1. The version constraint of dependency sacrebleu can be changed to >=1.1.3,<=1.4.5.

    The above modification suggestions can reduce the dependency conflicts as much as possible, and introduce the latest version as much as possible without calling Error in the projects.

    The invocation of the current project includes all the following methods.

    The calling methods from the sacrebleu
    sacrebleu.corpus_bleu
    sacrebleu.compute_bleu
    
    The calling methods from the all methods
    get_hypo_and_ref
    numpy.array
    counts.append
    fairseq.utils.strip_pad
    max
    all_dataset_upsample_ratio.strip
    fairseq.data.PrependTokenDataset
    self.swap_sample
    FileNotFoundError
    self.tgt_dict.string
    model.encoder.forward.transpose
    torch.no_grad
    inspect.getfullargspec
    log.get
    float
    tqdm.tqdm
    hyps.append
    json.loads
    torch.cat
    f.read.split
    self.temperature.anchor_dot_contrast.torch.div.nn.LogSoftmax.diag
    cls.load_dictionary.index
    eval.readlines
    self.dataset.size
    self.temperature.anchor_dot_contrast.torch.div.nn.LogSoftmax.diag.sum
    isinstance
    torch.LongTensor
    _sentence_embedding
    self.inference_step
    fairseq.criterions.label_smoothed_cross_entropy.LabelSmoothedCrossEntropyCriterion.add_args
    open.close
    mask.float
    cls
    recover_bpe
    open
    hasattr
    id_num.score_dict.append
    self.dataset.prefetch
    fairseq.data.TruncateDataset
    eval.read
    numpy.array.sum
    src_list.append
    fairseq.models.transformer.transformer_wmt_en_de_big_t2t
    self.tgt_dict.pad
    eval
    toks.int
    src_datasets.append
    format
    bpe_symbol.line.replace.rstrip
    cls.load_dictionary.eos
    fairseq.data.data_utils.infer_language_pair
    torch.cat.contiguous
    logging.getLogger.info
    argparse.Namespace
    fairseq.models.register_model_architecture
    j.line.split
    similarity_function
    str
    Exception
    ValueError
    self.set_epoch
    open.write
    mask.float.sum.unsqueeze
    fairseq.data.AppendTokenDataset
    fairseq.data.encoders.build_tokenizer
    super.set_epoch
    super.__init__
    cls.load_dictionary.unk
    size_ratio.dataset.len.np.ceil.astype
    super
    self.padding_idx.src_tokens.int.sum
    numpy.argsort
    super.reduce_metrics
    self.padding_idx.src_tokens.int
    itertools.count
    os.path.join
    self.padding_idx.target.int
    super.build_model
    generator.generate
    super.valid_step
    round
    int
    len
    fairseq.data.indexed_dataset.dataset_exists
    refs.append
    os.path.dirname
    torch.nn.LogSoftmax
    toks.int.cpu
    logging.getLogger
    re.compile
    mask.unsqueeze
    self.tokenizer.decode
    numpy.ceil
    remove_bpe_fn
    fairseq.tasks.register_task
    fairseq.tasks.translation.TranslationTask.add_args
    re.search.span
    torch.nn.CosineSimilarity
    self.dataset.num_tokens
    totals.append
    fairseq.utils.deprecation_warning
    self.compute_loss
    cls.load_dictionary
    self.target_dictionary.index
    prefix_tokens.to.to
    split_exists
    fairseq.utils.eval_bool
    remove_bpe
    torch.transpose
    self.len.np.random.permutation.astype
    getattr
    fairseq.tasks.translation.load_langpair_dataset
    torch.div
    re.search
    target.contiguous
    sum_logs
    fairseq.metrics.log_scalar
    self.padding_idx.target.int.sum
    contrast_feature.expand
    numpy.random.permutation
    tgt_list.append
    self.dataset.__getitem__
    numpy.random.RandomState
    cls.load_dictionary.bos
    src_tokens.size
    numpy.random.RandomState.choice
    load_langpair_dataset
    bpe_symbol.line.replace.rstrip.replace
    fairseq.data.data_utils.load_indexed_dataset
    cls.load_dictionary.pad
    sacrebleu.compute_bleu
    fairseq.options.eval_bool
    mask.float.sum
    map
    self.get_contrastive_loss
    fairseq.data.StripTokenDataset
    self.build_generator
    fairseq.utils.split_paths
    fairseq.data.ConcatDataset
    fairseq.metrics.log_derived
    decode
    data.SubsampleLanguagePairDataset
    model
    join
    parser.add_argument
    id_num.hypothesis_dict.append
    tgt_datasets.append
    math.log
    fairseq.data.plasma_utils.PlasmaArray
    prefix_tokens.to.expand
    self.similarity_function
    all_dataset_upsample_ratio.strip.split
    fairseq.data.LanguagePairDataset
    id_num.pos_score_dict.append
    mask.unsqueeze.encoder_output.sum
    numpy.arange
    fairseq.utils.item
    o.write
    sacrebleu.corpus_bleu
    reprocess
    sample.size
    re.search.group
    fairseq.models.transformer.transformer_wmt_en_de
    fairseq.criterions.register_criterion
    self._inference_with_bleu
    mono_datas.append
    range
    anchor_feature.expand
    prefix_tokens.torch.LongTensor.unsqueeze
    sum
    model.encoder.forward
    

    @developer Could please help me check this issue? May I pull a request to fix it? Thank you very much.

    opened by PyDeps 0
  • Error downloading dataset

    Hello, our team thinks highly of your work, but we ran into problems while reproducing it. After changing the domain name in the .sh download file, the download fails with a 404 error. Could you look into this at your convenience? We would appreciate it.

    opened by surviveMiao 2
  • Question on WMT16 en-ro

    On the WMT16 en->ro benchmark, the result reported on this page (28.7) is quite different from the one reported in your paper (38.0). Would it be possible for you to release your BPE-tokenized WMT16 en-ro test set? I am trying to reproduce your results on this benchmark but cannot achieve comparable performance.

    Thanks a lot!

    opened by gpengzhi 2
  • About swap sample

    Hi Xiao,

    I have a question about the swap_sample function in label_smoothed_cross_entropy_with_contrastive.py https://github.com/PANXiao1994/mRASP2/blob/d4d627b8442af062a5b6607a459fe53c6b516695/mcolt/criterions/label_smoothed_cross_entropy_with_contrastive.py#L39 Here, after swapping the sample, src_tokens are the same as the original target tokens.

    However, the padding positions for source and target are different. src uses left padding while tgt uses right padding (see details below) https://github.com/facebookresearch/fairseq/blob/a0ceabc287e26f64517fadb13a54c83b71e8e469/fairseq/tasks/translation.py#L200

    Thus, why not do the left padding for new source tokens (old target tokens), which are originally with the right padding?

    opened by Mao-KU 0