Siamese-nn-semantic-text-similarity - A repository containing comprehensive Neural Networks based PyTorch implementations for the semantic text similarity task

Shahrukh Khan

Last update: Dec 15, 2022

Overview

Siamese Deep Neural Networks for Semantic Text Similarity PyTorch

A repository of neural-network-based PyTorch implementations for the semantic text similarity task, including the following architectures:

Siamese LSTM
Siamese BiLSTM with Attention
Siamese Transformer
Siamese BERT
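
The idea shared by all four architectures is weight sharing: both sentences pass through the same encoder, and the two resulting vectors are compared. A minimal sketch of that idea, assuming a toy mean-of-embeddings encoder and cosine similarity; `TinySiamese` and its sizes are illustrative names and are not part of this repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySiamese(nn.Module):
    """Illustrative only: one shared encoder applied to both inputs."""

    def __init__(self, vocab_size: int = 100, embedding_size: int = 16):
        super().__init__()
        # padding_idx=0 keeps the padding token's embedding at zero
        self.embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=0)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) -> sentence vectors: (batch, embedding_size)
        mask = (token_ids != 0).unsqueeze(-1).float()
        summed = (self.embedding(token_ids) * mask).sum(dim=1)
        return summed / mask.sum(dim=1).clamp(min=1.0)

    def forward(self, sent_a: torch.Tensor, sent_b: torch.Tensor) -> torch.Tensor:
        # the same weights encode both sentences
        return F.cosine_similarity(self.encode(sent_a), self.encode(sent_b), dim=-1)


torch.manual_seed(0)
model = TinySiamese()
a = torch.tensor([[5, 8, 2, 0], [7, 7, 3, 1]])  # 0 is the padding id
b = torch.tensor([[5, 8, 2, 0], [9, 4, 0, 0]])
scores = model(a, b)  # one similarity score per sentence pair
```

Because the encoder is shared, swapping the two inputs yields the same scores.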

Usage

Install dependencies:

pip install -r requirements.txt

Download the spaCy English model for tokenization:

python -m spacy download en

Siamese LSTM

Siamese LSTM Example

    import torch

    ## init siamese lstm
    siamese_lstm = SiameseLSTM(
        batch_size=batch_size,
        output_size=output_size,
        hidden_size=hidden_size,
        vocab_size=vocab_size,
        embedding_size=embedding_size,
        embedding_weights=embedding_weights,
        lstm_layers=lstm_layers,
        device=device,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=siamese_lstm.parameters())

    ## train model
    train_model(
        model=siamese_lstm,
        optimizer=optimizer,
        dataloader=sick_dataloaders,
        data=sick_data,
        max_epochs=max_epochs,
        config_dict={"device": device, "model_name": "siamese_lstm"},
    )
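
For intuition about what such a model and a single optimization step involve, here is a self-contained sketch. It assumes a Manhattan-distance similarity `exp(-||h_a - h_b||_1)` and an MSE loss against gold scores scaled to [0, 1]. The class `ToySiameseLSTM`, the sizes, and the random data are hypothetical and do not reflect the internals of the repository's `SiameseLSTM` or `train_model`:

```python
import torch
import torch.nn as nn


class ToySiameseLSTM(nn.Module):
    """Hypothetical stand-in: shared LSTM encoder + Manhattan similarity."""

    def __init__(self, vocab_size=50, embedding_size=8, hidden_size=16, lstm_layers=1):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size)
        self.lstm = nn.LSTM(
            embedding_size, hidden_size, num_layers=lstm_layers, batch_first=True
        )

    def encode(self, token_ids):
        _, (h_n, _) = self.lstm(self.embedding(token_ids))
        return h_n[-1]  # final hidden state of the top layer: (batch, hidden_size)

    def forward(self, sent_a, sent_b):
        h_a, h_b = self.encode(sent_a), self.encode(sent_b)
        # similarity in (0, 1]; equals 1 when both encodings are identical
        return torch.exp(-torch.sum(torch.abs(h_a - h_b), dim=1))


torch.manual_seed(0)
toy_model = ToySiameseLSTM()
toy_optimizer = torch.optim.Adam(params=toy_model.parameters())

# random stand-ins for a batch of tokenized sentence pairs and gold scores
sent_a = torch.randint(0, 50, (4, 6))
sent_b = torch.randint(0, 50, (4, 6))
gold = torch.rand(4)

# one optimization step
loss_before = nn.functional.mse_loss(toy_model(sent_a, sent_b), gold)
toy_optimizer.zero_grad()
loss_before.backward()
toy_optimizer.step()
loss_after = nn.functional.mse_loss(toy_model(sent_a, sent_b), gold)
```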

Siamese BiLSTM with Attention

Siamese BiLSTM with Attention Example

    ## init siamese bilstm with attention
    siamese_lstm_attention = SiameseBiLSTMAttention(
        batch_size=batch_size,
        output_size=output_size,
        hidden_size=hidden_size,
        vocab_size=vocab_size,
        embedding_size=embedding_size,
        embedding_weights=embedding_weights,
        lstm_layers=lstm_layers,
        self_attention_config=self_attention_config,
        fc_hidden_size=fc_hidden_size,
        device=device,
        bidirectional=bidirectional,
    )
    
    ## define optimizer
    optimizer = torch.optim.Adam(params=siamese_lstm_attention.parameters())

    ## train model
    train_model(
        model=siamese_lstm_attention,
        optimizer=optimizer,
        dataloader=sick_dataloaders,
        data=sick_data,
        max_epochs=max_epochs,
        config_dict={
            "device": device,
            "model_name": "siamese_lstm_attention",
            "self_attention_config": self_attention_config,
        },
    )
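
The attention variant replaces "take the last hidden state" with a learned weighted sum over all BiLSTM outputs. A sketch of additive self-attention pooling, assuming a single attention head; the repository's `self_attention_config` may configure something different, and `AttentionPooling` and all sizes below are hypothetical:

```python
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    """Hypothetical single-head additive attention over a sequence of states."""

    def __init__(self, input_size: int, attention_size: int = 32):
        super().__init__()
        self.proj = nn.Linear(input_size, attention_size)
        self.score = nn.Linear(attention_size, 1, bias=False)

    def forward(self, outputs: torch.Tensor):
        # outputs: (batch, seq_len, input_size)
        scores = self.score(torch.tanh(self.proj(outputs)))  # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)  # normalize over time steps
        pooled = (weights * outputs).sum(dim=1)  # (batch, input_size)
        return pooled, weights.squeeze(-1)


torch.manual_seed(0)
toy_hidden = 16
bilstm = nn.LSTM(input_size=8, hidden_size=toy_hidden, batch_first=True, bidirectional=True)
pool = AttentionPooling(input_size=2 * toy_hidden)  # forward + backward states

embedded = torch.randn(3, 5, 8)      # stand-in for embedded tokens
outputs, _ = bilstm(embedded)        # (3, 5, 32)
sentence_vec, attn = pool(outputs)   # (3, 32) and (3, 5)
```

The attention weights for each sentence sum to 1, so they can be inspected to see which tokens contribute most to the sentence vector.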

Siamese Transformer

Siamese Transformer Example

    ## init siamese transformer
    siamese_transformer = SiameseTransformer(
        batch_size=batch_size,
        vocab_size=vocab_size,
        embedding_size=embedding_size,
        nhead=attention_heads,
        hidden_size=hidden_size,
        transformer_layers=transformer_layers,
        embedding_weights=embedding_weights,
        device=device,
        dropout=dropout,
        max_sequence_len=max_sequence_len,
    )

    ## define optimizer
    optimizer = torch.optim.Adam(params=siamese_transformer.parameters())

    ## train model
    train_model(
        model=siamese_transformer,
        optimizer=optimizer,
        dataloader=sick_dataloaders,
        data=sick_data,
        max_epochs=max_epochs,
        config_dict={"device": device, "model_name": "siamese_transformer"},
    )
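
To show how the constructor arguments above could map onto PyTorch building blocks, here is a sketch of a Transformer sentence encoder with learned positional embeddings, a padding mask, and masked mean pooling. This is an assumption about one reasonable design, not the repository's `SiameseTransformer`. `ToyTransformerEncoder` is a hypothetical name, and dropout defaults to 0.0 only to keep the sketch deterministic:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyTransformerEncoder(nn.Module):
    """Hypothetical sentence encoder built on nn.TransformerEncoder."""

    def __init__(self, vocab_size=100, embedding_size=32, nhead=4, hidden_size=64,
                 transformer_layers=2, dropout=0.0, max_sequence_len=20):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.pos = nn.Embedding(max_sequence_len, embedding_size)
        layer = nn.TransformerEncoderLayer(
            d_model=embedding_size, nhead=nhead, dim_feedforward=hidden_size,
            dropout=dropout, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=transformer_layers)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        pad_mask = token_ids == 0  # True where the token is padding
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok(token_ids) + self.pos(positions)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        keep = (~pad_mask).unsqueeze(-1).float()
        # mean over real tokens only
        return (x * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)


torch.manual_seed(0)
toy_encoder = ToyTransformerEncoder()
batch_a = torch.tensor([[5, 6, 7, 0, 0], [9, 8, 7, 6, 5]])
batch_b = torch.tensor([[5, 6, 4, 3, 0], [2, 3, 4, 5, 6]])

sentence_vecs = toy_encoder(batch_a)  # (2, 32)
# Siamese use: the same encoder embeds both sides of each pair
similarity = F.cosine_similarity(sentence_vecs, toy_encoder(batch_b), dim=-1)
```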

Siamese BERT

Siamese BERT Example

    import transformers

    from siamese_sts.siamese_net.siamese_bert import BertForSequenceClassification

    ## init siamese bert
    siamese_bert = BertForSequenceClassification.from_pretrained(model_name)

    ## train model
    trainer = transformers.Trainer(
        model=siamese_bert,
        args=transformers.TrainingArguments(
            output_dir="./output",
            overwrite_output_dir=True,
            learning_rate=1e-5,
            do_train=True,
            num_train_epochs=num_epochs,
            # Adjust batch size if this doesn't fit on the Colab GPU
            per_device_train_batch_size=batch_size,
            save_steps=3000,
        ),
        # note: transformers.Trainer expects a dataset object for train_dataset
        train_dataset=sick_dataloader,
    )
    trainer.train()

Comments

Bump nltk from 3.4.5 to 3.6.6
Bumps nltk from 3.4.5 to 3.6.6.

dependencies
opened by dependabot[bot] 0

Siamese BERT seems to be a cross encoder rather than a bi encoder.

I believe that, in the Siamese BERT example, you are implementing a cross encoder rather than a Siamese network.

opened by rohit1998 1
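
The distinction raised in this issue can be illustrated with a toy sketch. A bi-encoder (Siamese) encodes each sentence independently with shared weights and compares the resulting vectors, whereas a cross-encoder concatenates the pair and scores it in a single forward pass. `ToyEncoder`, `bi_encoder_score`, `cross_encoder_score`, and the separator id are hypothetical and are not taken from the repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

SEP_ID = 1  # hypothetical separator token id


class ToyEncoder(nn.Module):
    """Hypothetical sequence encoder standing in for BERT."""

    def __init__(self, vocab_size: int = 100, dim: int = 16):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        _, h_n = self.gru(self.embedding(token_ids))
        return h_n[-1]  # (batch, dim)


def bi_encoder_score(encoder, sent_a, sent_b):
    # two independent passes through the same encoder, then a vector comparison
    return F.cosine_similarity(encoder(sent_a), encoder(sent_b), dim=-1)


def cross_encoder_score(encoder, head, sent_a, sent_b):
    # one pass over the concatenated pair, then a scoring head
    sep = torch.full((sent_a.size(0), 1), SEP_ID, dtype=sent_a.dtype)
    pair = torch.cat([sent_a, sep, sent_b], dim=1)
    return head(encoder(pair)).squeeze(-1)


torch.manual_seed(0)
encoder = ToyEncoder()
head = nn.Linear(16, 1)
a = torch.randint(2, 100, (3, 5))
b = torch.randint(2, 100, (3, 7))

bi = bi_encoder_score(encoder, a, b)              # shape (3,)
cross = cross_encoder_score(encoder, head, a, b)  # shape (3,)
```

A practical consequence of the difference: bi-encoder sentence vectors can be computed once and cached, while a cross-encoder has to run once for every pair being compared.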
