ConSERT
Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer
Requirements
torch==1.6.0
cudatoolkit==10.0.130
cudnn==7.6.5
sentence-transformers==0.3.9
transformers==3.4.0
tensorboardX==2.1
pandas==1.1.5
sentencepiece==0.1.85
matplotlib==3.4.1
apex==0.1.0
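The pinned versions above can be installed along these lines; this is only a sketch, assuming pip and an existing CUDA 10.0 machine (cudatoolkit and cudnn normally come from conda or the system CUDA install rather than pip):

```bash
# Install the pinned Python dependencies (cudatoolkit/cudnn are not pip packages;
# they are assumed to come from conda or the system CUDA install).
pip install torch==1.6.0 sentence-transformers==0.3.9 transformers==3.4.0 \
    tensorboardX==2.1 pandas==1.1.5 sentencepiece==0.1.85 matplotlib==3.4.1

# apex is built from NVIDIA's repository; the PyPI package named "apex" is unrelated.
git clone https://github.com/NVIDIA/apex
cd apex && pip install -v --no-cache-dir . && cd ..
```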
Get Started
- Download a pre-trained language model (e.g. bert-base-uncased) from the HuggingFace model hub
- Download the STS datasets to the `./data` folder using the SentEval toolkit (see the sketch at the end of this section)
- Run the following script to launch the unsupervised experiment:
python3 main.py --no_pair --seed 1 --use_apex_amp --apex_amp_opt_level O1 --batch_size 96 --max_seq_length 64 --evaluation_steps 200 --add_cl --cl_loss_only --cl_rate 0.15 --temperature 0.1 --learning_rate 0.0000005 --train_data stssick --num_epochs 10 --da_final_1 feature_cutoff --da_final_2 shuffle --cutoff_rate_final_1 0.2 --model_name_or_path [PRETRAINED_BERT_FOLDER] --model_save_path ./output/unsup-base-feature_cutoff-shuffle --force_del --no_dropout --patience 10
`[PRETRAINED_BERT_FOLDER]` should be replaced with the path to the folder that contains the downloaded pre-trained language model.
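For the dataset step, one possible way to fetch the STS and SICK data is SentEval's bundled download script (script name per the SentEval repository); this is a sketch, and the exact layout the data loader expects under `./data` may differ:

```bash
# Download the downstream-task data (STS12-16, STS-B, SICK, ...) via SentEval,
# then copy it into ./data. Paths are illustrative; adjust to your checkout.
git clone https://github.com/facebookresearch/SentEval
(cd SentEval/data/downstream && bash get_transfer_data.bash)
mkdir -p ./data
cp -r SentEval/data/downstream/* ./data/
```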
Citation
@article{yan2021consert,
title={ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer},
author={Yan, Yuanmeng and Li, Rumei and Wang, Sirui and Zhang, Fuzheng and Wu, Wei and Xu, Weiran},
journal={arXiv preprint arXiv:2105.11741},
year={2021}
}