Contrastive Tension
State of the art Semantic Sentence Embeddings
Published Paper · Huggingface Models · Report Bug
Overview
This is the official code accompanying the paper Semantic Re-Tuning via Contrastive Tension.
The paper was accepted at ICLR-2021 and official reviews and responses can be found at OpenReview.
Contrastive Tension (CT) is a fully self-supervised algorithm for re-tuning already pre-trained transformer language models, and achieves state-of-the-art (SOTA) sentence embeddings for Semantic Textual Similarity (STS). Hence, all that is required is a pre-trained model and a modestly sized text corpus. The results presented in the paper sampled text data from Wikipedia.
This repository contains:
- Tensorflow 2 implementation of the CT algorithm
- State of the art pre-trained STS models
- Tensorflow 2 inference code
- PyTorch inference code
Requirements
While other versions may work equally well, we have worked with the following:
- Python = 3.6.9
- Transformers = 4.1.1
Usage
All the models and tokenizers are available via the Huggingface interface, and can be loaded for both Tensorflow and PyTorch:
```python
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('Contrastive-Tension/RoBerta-Large-CT-STSb')
TF_model = transformers.TFAutoModel.from_pretrained('Contrastive-Tension/RoBerta-Large-CT-STSb')
PT_model = transformers.AutoModel.from_pretrained('Contrastive-Tension/RoBerta-Large-CT-STSb')
```
Inference
To perform inference with the pre-trained models (or other Huggingface models), please see the script ExampleBatchInference.py.
The most important thing to remember when running inference is to apply the attention_mask to the batch output vectors before mean pooling, as is done in the example script.
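For illustration, here is a minimal PyTorch sketch of such masked mean pooling (the example sentences are placeholders; ExampleBatchInference.py remains the reference implementation):

```python
import torch
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('Contrastive-Tension/RoBerta-Large-CT-STSb')
model = transformers.AutoModel.from_pretrained('Contrastive-Tension/RoBerta-Large-CT-STSb')

sentences = ["A man is playing a guitar.", "Someone plays an instrument."]  # placeholder input
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, hidden)

# Zero out the padding positions before averaging, then divide by the true token counts.
mask = batch['attention_mask'].unsqueeze(-1).float()  # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
```

The resulting sentence_embeddings can then be compared with, for example, cosine similarity.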
CT Training
To run CT on your own models and text data, see ExampleTraining.py for a comprehensive example. This file currently creates a dummy corpus of random text; simply replace this with whatever corpus you like.
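To give an idea of what the training loop does, below is a simplified PyTorch sketch of the CT objective: two independent copies of the same pre-trained model are tuned so that the dot product of their sentence embeddings is high for identical sentence pairs and low for randomly paired sentences, using a binary cross-entropy loss. The base model, optimizer, learning rate, number of negatives, and dummy corpus here are illustrative assumptions; ExampleTraining.py contains the actual Tensorflow 2 implementation.

```python
import random
import torch
import transformers

# Illustrative choices (assumptions): base model, negatives per positive, optimizer settings.
MODEL_NAME = 'bert-base-uncased'
NEGATIVES_PER_POSITIVE = 7

tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME)
model_1 = transformers.AutoModel.from_pretrained(MODEL_NAME)  # two independent copies
model_2 = transformers.AutoModel.from_pretrained(MODEL_NAME)

def embed(model, sentences):
    # Masked mean pooling over the token embeddings (see the inference sketch above).
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
    hidden = model(**batch).last_hidden_state
    mask = batch['attention_mask'].unsqueeze(-1).float()
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

corpus = ["dummy sentence %d" % i for i in range(1000)]  # replace with your own text corpus
optimizer = torch.optim.RMSprop(list(model_1.parameters()) + list(model_2.parameters()), lr=1e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()

for step in range(100):
    anchor = random.choice(corpus)
    # One identical pair (label 1) and K randomly sampled pairs (label 0).
    lefts = [anchor] * (1 + NEGATIVES_PER_POSITIVE)
    rights = [anchor] + random.sample(corpus, NEGATIVES_PER_POSITIVE)
    labels = torch.tensor([1.0] + [0.0] * NEGATIVES_PER_POSITIVE)

    logits = (embed(model_1, lefts) * embed(model_2, rights)).sum(dim=-1)  # dot products
    loss = loss_fn(logits, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After training, one of the two models is kept as the final sentence encoder; see the paper for the exact training details.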
Pre-trained Models
Note that these models are not trained with the exact hyperparameters disclosed in the original CT paper. Rather, the parameters are from a short follow-up paper currently under review, which further pushes the SOTA.
All evaluation is done using the SentEval framework and reports (Pearson / Spearman) correlations.
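Concretely, each reported pair is the (Pearson / Spearman) correlation between the cosine similarity of the two sentence embeddings and the human-annotated similarity scores. A minimal sketch of that computation, with placeholder embeddings and scores, could look like:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Placeholder embeddings and gold scores; in practice these come from the model
# and from an STS dataset such as STS-b.
emb_a = np.random.randn(5, 768)
emb_b = np.random.randn(5, 768)
gold = np.array([4.5, 1.0, 3.2, 0.5, 2.8])  # human similarity annotations

cosine = (emb_a * emb_b).sum(axis=1) / (
    np.linalg.norm(emb_a, axis=1) * np.linalg.norm(emb_b, axis=1)
)

pearson, _ = pearsonr(cosine, gold)
spearman, _ = spearmanr(cosine, gold)
print(f"Pearson: {pearson:.4f} / Spearman: {spearman:.4f}")
```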
Unsupervised / Zero-Shot
As both the training of BERT and CT itself are fully self-supervised, the models tuned only with CT require no labeled data whatsoever.
The NLI models, however, are first fine-tuned on a natural language inference task, which requires labeled data.
Model | Avg Unsupervised STS | STS-b | #Parameters |
---|---|---|---|
Fully Unsupervised | |||
BERT-Distil-CT | 75.12 / 75.04 | 78.63 / 77.91 | 66 M |
BERT-Base-CT | 73.55 / 73.36 | 75.49 / 73.31 | 108 M |
BERT-Large-CT | 77.12 / 76.93 | 80.75 / 79.82 | 334 M |
Using NLI Data | |||
BERT-Distil-NLI-CT | 76.65 / 76.63 | 79.74 / 81.01 | 66 M |
BERT-Base-NLI-CT | 76.05 / 76.28 | 79.98 / 81.47 | 108 M |
BERT-Large-NLI-CT | 77.42 / 77.41 | 80.92 / 81.66 | 334 M |
Supervised
These models are fine-tuned directly with STS data, using a modified version of the supervised training objective proposed by S-BERT.
To our knowledge, our RoBerta-Large-CT-STSb is the current SOTA model for STS via sentence embeddings.
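For context, the base S-BERT objective this builds on regresses the cosine similarity of the two sentence embeddings onto the gold STS score. A minimal sketch of that base objective is given below; the [0, 1] score normalization and plain MSE loss are assumptions for this sketch, and the modifications used for the models in the table are described in the follow-up paper.

```python
import torch

def sts_regression_loss(emb_a, emb_b, gold_scores):
    # emb_a / emb_b: (batch, hidden) sentence embeddings, e.g. from masked mean pooling.
    # gold_scores: STS-b annotations in [0, 5].
    cosine = torch.nn.functional.cosine_similarity(emb_a, emb_b)
    target = torch.as_tensor(gold_scores, dtype=cosine.dtype) / 5.0  # normalize to [0, 1] (assumption)
    return torch.nn.functional.mse_loss(cosine, target)

# Example with random placeholder embeddings and scores:
loss = sts_regression_loss(torch.randn(8, 768), torch.randn(8, 768), torch.rand(8) * 5)
```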
Model | STS-b | #Parameters |
---|---|---|
BERT-Distil-CT-STSb | 84.85 / 85.46 | 66 M |
BERT-Base-CT-STSb | 85.31 / 85.76 | 108 M |
BERT-Large-CT-STSb | 85.86 / 86.47 | 334 M |
RoBerta-Large-CT-STSb | 87.56 / 88.42 | 334 M |
Other Languages
Model | Language | #Parameters |
---|---|---|
BERT-Base-Swe-CT-STSb | Swedish | 108 M |
License
Distributed under the MIT License. See LICENSE for more information.
Contact
If you have questions regarding the paper, please consider creating a comment via the official OpenReview submission.
If you have questions regarding the code or otherwise related to this Github page, please open an issue.
For other purposes, feel free to contact me directly at: [email protected]