Overview

ConSERT

Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

Requirements

torch==1.6.0
cudatoolkit==10.0.103
cudnn==7.6.5
sentence-transformers==0.3.9
transformers==3.4.0
tensorboardX==2.1
pandas==1.1.5
sentencepiece==0.1.85
matplotlib==3.4.1
apex==0.1.0

Get Started

  1. Download a pre-trained language model (e.g. bert-base-uncased) from the HuggingFace library (see the sketch after this list)
  2. Download the STS datasets to the ./data folder using the SentEval toolkit
  3. Run the following command to launch the unsupervised experiment:
    python3 main.py --no_pair --seed 1 --use_apex_amp --apex_amp_opt_level O1 --batch_size 96 --max_seq_length 64 --evaluation_steps 200 --add_cl --cl_loss_only --cl_rate 0.15 --temperature 0.1 --learning_rate 0.0000005 --train_data stssick --num_epochs 10 --da_final_1 feature_cutoff --da_final_2 shuffle --cutoff_rate_final_1 0.2 --model_name_or_path [PRETRAINED_BERT_FOLDER] --model_save_path ./output/unsup-base-feature_cutoff-shuffle --force_del --no_dropout --patience 10
    where [PRETRAINED_BERT_FOLDER] should be replaced with the path to the folder containing the downloaded pre-trained language model
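
For step 1, a minimal sketch of fetching the model and saving it locally with the pinned transformers version (the target folder name here is only an example):

    # Download bert-base-uncased once and save it to a local folder, which can
    # then be passed as [PRETRAINED_BERT_FOLDER].
    from transformers import AutoModel, AutoTokenizer

    AutoModel.from_pretrained("bert-base-uncased").save_pretrained("./bert-base-uncased")
    AutoTokenizer.from_pretrained("bert-base-uncased").save_pretrained("./bert-base-uncased")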

Citation

@article{yan2021consert,
  title={ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer},
  author={Yan, Yuanmeng and Li, Rumei and Wang, Sirui and Zhang, Fuzheng and Wu, Wei and Xu, Weiran},
  journal={arXiv preprint arXiv:2105.11741},
  year={2021}
}
Comments
  • On sentence representations

    Q1: In unsupervised training, the two data augmentation methods produce two representations of each sentence, but at evaluation time the paper says the sentence representation is obtained by averaging the token embeddings of the last two layers. Which sentence does this refer to: the original sentence fed through the transformer, or an augmented one?

    Q2: In the supervised tasks, a downstream-task loss is added. When training in the joint mode, does the data augmentation in Figure 2 still use two augmented views, or is one view the original sentence and the other an augmented one? And likewise, which sentence's representation is used at evaluation time? Thanks!
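
    For reference, a sketch of what "averaging the token embeddings of the last two layers" could look like with a plain HuggingFace BertModel; this is an illustration, not the repo's exact evaluation code:

      import torch
      from transformers import BertModel, BertTokenizer

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

      inputs = tokenizer("an example sentence", return_tensors="pt")
      with torch.no_grad():
          # with the tuple outputs of transformers 3.4, the third element holds
          # the hidden states of the embedding layer and every encoder layer
          _, _, hidden_states = model(**inputs)

      # average the token embeddings of the last two layers, ignoring padding
      last_two = (hidden_states[-1] + hidden_states[-2]) / 2.0
      mask = inputs["attention_mask"].unsqueeze(-1).float()
      sentence_rep = (last_two * mask).sum(dim=1) / mask.sum(dim=1)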

    opened by zhangxiaowei5346 1
  • cpu ram memory leak

    I've been re-implementing ConSERT these days.

    Out of curiosity, I removed early stopping to check whether it makes a difference in the scores.

    I found that this code might have a CPU memory leak: when I run it, total CPU memory usage keeps increasing until the machine shuts down.

    Have you experienced this kind of situation with this code as well?

    opened by qmin2 0
  • Reproducing with unsup-consert-base.sh gives results about 10 points below those in the paper

    similarity: mean 0.6198720335960388, std 0.23218630254268646, max 0.9888665676116943, min -0.11419621855020523
    labels: mean 0.5215833187103271, std 0.30510348081588745, max 1.0, min 0.0

    I'm not sure whether I've overlooked some detail. Any advice would be appreciated.
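
    For context, STS results in this line of work are usually reported as the Spearman correlation between the model's cosine similarities and the gold labels; a minimal sketch (not necessarily this repo's exact evaluation code):

      from scipy.stats import spearmanr

      # illustrative values; in the real evaluation these come from the model
      similarities = [0.62, 0.98, -0.11, 0.45]  # model cosine similarities
      labels = [0.52, 1.0, 0.0, 0.4]            # gold scores scaled to [0, 1]
      score, _ = spearmanr(similarities, labels)
      print(f"Spearman correlation: {score:.4f}")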

    opened by Xiaoyingzi09 0
  • AttributeError: module 'torch.distributed' has no attribute '_all_gather_base'

    Neither torch 1.6.0 nor torch 1.8.1 works; both raise the error in the title.

    Traceback (most recent call last):
      File "main.py", line 14, in <module>
        from sentence_transformers import models, losses
      File "/root/ConSERT/sentence_transformers/__init__.py", line 3, in <module>
        from .datasets import SentencesDataset, SentenceLabelDataset, ParallelSentencesDataset
      File "/root/ConSERT/sentence_transformers/datasets/__init__.py", line 1, in <module>
        from .sampler import *
      File "/root/ConSERT/sentence_transformers/datasets/sampler/__init__.py", line 1, in <module>
        from .LabelSampler import *
      File "/root/ConSERT/sentence_transformers/datasets/sampler/LabelSampler.py", line 6, in <module>
        from ...datasets import SentenceLabelDataset
      File "/root/ConSERT/sentence_transformers/datasets/SentenceLabelDataset.py", line 8, in <module>
        from .. import SentenceTransformer
      File "/root/ConSERT/sentence_transformers/SentenceTransformer.py", line 11, in <module>
        import transformers
      File "/root/ConSERT/transformers/__init__.py", line 22, in <module>
        from .integrations import (  # isort:skip
      File "/root/ConSERT/transformers/integrations.py", line 58, in <module>
        from .file_utils import is_torch_tpu_available
      File "/root/ConSERT/transformers/file_utils.py", line 140, in <module>
        from apex import amp  # noqa: F401
      File "/root/miniconda3/lib/python3.8/site-packages/apex/__init__.py", line 27, in <module>
        from . import transformer
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/__init__.py", line 4, in <module>
        from apex.transformer import pipeline_parallel
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/pipeline_parallel/__init__.py", line 1, in <module>
        from apex.transformer.pipeline_parallel.schedules import get_forward_backward_func
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/pipeline_parallel/schedules/__init__.py", line 3, in <module>
        from apex.transformer.pipeline_parallel.schedules.fwd_bwd_no_pipelining import (
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/pipeline_parallel/schedules/fwd_bwd_no_pipelining.py", line 10, in <module>
        from apex.transformer.pipeline_parallel.schedules.common import Batch
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/pipeline_parallel/schedules/common.py", line 9, in <module>
        from apex.transformer.pipeline_parallel.p2p_communication import FutureTensor
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/pipeline_parallel/p2p_communication.py", line 25, in <module>
        from apex.transformer.utils import split_tensor_into_1d_equal_chunks
      File "/root/miniconda3/lib/python3.8/site-packages/apex/transformer/utils.py", line 11, in <module>
        torch.distributed.all_gather_into_tensor = torch.distributed._all_gather_base
    AttributeError: module 'torch.distributed' has no attribute '_all_gather_base'

    This error comes from apex: https://github.com/NVIDIA/apex/issues/1526

    The apex version doesn't match the torch version. Can you tell me which torch version you used?

    opened by SkullFang 1
  • A question about dropout

    Hello, I'd like to ask two questions. 1. In the code, only unsup-consert-base.sh uses the no_dropout flag, while the other scripts don't set BERT's built-in dropout to 0. Why is that? 2. When BERT's dropout is disabled, is it disabled for both the original sentence and the augmented sentence, or only for the augmented one?
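
    For what it's worth, the repo appears to zero out BERT's dropout by overriding the two dropout probabilities in the model config (cf. the models.Transformer(..., attention_probs_dropout_prob=0.0, hidden_dropout_prob=0.0) call visible in the traceback of the issue below); a sketch of the same effect with plain transformers:

      from transformers import BertConfig, BertModel

      # override both dropout probabilities, mirroring what --no_dropout does
      config = BertConfig.from_pretrained(
          "bert-base-uncased",
          attention_probs_dropout_prob=0.0,  # disable attention dropout
          hidden_dropout_prob=0.0,           # disable hidden-layer dropout
      )
      model = BertModel.from_pretrained("bert-base-uncased", config=config)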

    opened by LBJ6666 2
  • OSError when running main.py

    I've been running into this issue when I run bash scripts/unsup-consert-base.sh.

    Traceback (most recent call last):
      File "main.py", line 327, in <module>
        main(args)
      File "main.py", line 185, in main
        word_embedding_model = models.Transformer(args.model_name_or_path, attention_probs_dropout_prob=0.0, hidden_dropout_prob=0.0)
      File "/home/qmin/ConSERT/sentence_transformers/models/Transformer.py", line 36, in __init__
        self.auto_model = AutoModel.from_pretrained(model_name_or_path, config=config, cache_dir=cache_dir)
      File "/home/qmin/ConSERT/transformers/modeling_auto.py", line 629, in from_pretrained
        pretrained_model_name_or_path, *model_args, config=config, **kwargs
      File "/home/qmin/ConSERT/transformers/modeling_utils.py", line 954, in from_pretrained
        "Unable to load weights from pytorch checkpoint file. "
    
    OSError: Unable to load weights from pytorch checkpoint file. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True. 
    

    Is there any workaround?
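
    One workaround suggested by the error message itself, assuming the folder really holds TF 2.0 weights (this also requires TensorFlow to be installed); a sketch, not a confirmed fix:

      from transformers import AutoModel

      # pass from_tf=True when pytorch_model.bin is absent but TF weights exist
      model = AutoModel.from_pretrained("[PRETRAINED_BERT_FOLDER]", from_tf=True)

    If the folder is supposed to contain PyTorch weights, the same error can also come from an incomplete or corrupted pytorch_model.bin, so re-downloading the checkpoint is worth trying first.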

    opened by qmin2 2
  • 'BertModel' object has no attribute 'set_flag'

    The specific error is:

      File "/data2/work2/chenzhihao/NLP/nlp/sentence_transformers/SentenceTransformer.py", line 594, in fit
        loss_value = loss_model(features, labels)
      File "/root/anaconda3/envs/NLP_py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "/data2/work2/chenzhihao/NLP/nlp/sentence_transformers/losses/AdvCLSoftmaxLoss.py", line 775, in forward
        rep_a_view1 = self._data_aug(sentence_feature_a, self.data_augmentation_strategy_final_1,
      File "/data2/work2/chenzhihao/NLP/nlp/sentence_transformers/losses/AdvCLSoftmaxLoss.py", line 495, in _data_aug
        self.model[0].auto_model.set_flag("data_aug_cutoff", True)
      File "/root/anaconda3/envs/NLP_py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1185, in __getattr__
        raise AttributeError("'{}' object has no attribute '{}'".format(
    AttributeError: 'BertModel' object has no attribute 'set_flag'

    I'm loading the hfl/chinese-roberta-wwm-ext model.

    opened by zhihao-chen 4
  • How to use the model with sentence-transformer for inference?

    Cannot load the model.

    Code:

      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("../../models/consbert/unsup-consert-base-atec_ccks")  # the model path

    Error message:

      Traceback (most recent call last):
        File "/home/qhd/PythonProjects/GraduationProject/code/preprocess_unlabeled_second/sentence-bert.py", line 16, in <module>
          model = SentenceTransformer("../../models/cosbert/unsup-consert-base-atec_ccks")
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 87, in __init__
          modules = self._load_sbert_model(model_path)
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/sentence_transformers/SentenceTransformer.py", line 824, in _load_sbert_model
          module = module_class.load(os.path.join(model_path, module_config['path']))
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py", line 123, in load
          return Transformer(model_name_or_path=input_path, **config)
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/sentence_transformers/models/Transformer.py", line 30, in __init__
          self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path if tokenizer_name_or_path is not None else model_name_or_path, cache_dir=cache_dir, **tokenizer_args)
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 445, in from_pretrained
          return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1719, in from_pretrained
          return cls._from_pretrained(
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1791, in _from_pretrained
          tokenizer = cls(*init_inputs, **init_kwargs)
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/transformers/models/bert/tokenization_bert_fast.py", line 177, in __init__
          super().__init__(
        File "/home/qhd/anaconda3/envs/qhdpython39/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 96, in __init__
          fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
      Exception: No such file or directory (os error 2)
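
    The traceback shows a pip-installed sentence-transformers (much newer than the pinned 0.3.9) trying to load the checkpoint. A sketch of loading it through the copy bundled in this repo instead (the ConSERT path here is an illustrative assumption):

      import sys
      sys.path.insert(0, "/path/to/ConSERT")  # so the repo's bundled sentence_transformers is imported
      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("../../models/consbert/unsup-consert-base-atec_ccks")
      embeddings = model.encode(["a test sentence"])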

    opened by qhd1996 2
Owner
Yan Yuanmeng
A student at Beijing University of Posts and Telecommunications.