Awesome Semantic-Search
Logo made by @createdbytango.
This repository aims to serve as a meta-repository for Semantic Search and Semantic Similarity related tasks.
Semantic Search isn't limited to text! It can be done with images, speech, etc., so it has numerous use cases and applications.
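As a rough illustration of why the same idea carries across modalities: once items are embedded as vectors, search reduces to ranking them by similarity to a query vector. The sketch below is a minimal example under stated assumptions. The 3-dimensional vectors are hand-written placeholders rather than the output of any real model, and `cosine_similarity` and `semantic_search` are hypothetical helper names, not part of any library listed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, corpus, top_k=2):
    """Return the top_k (item_id, score) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vec, vec), item_id)
        for item_id, vec in corpus.items()
    ]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Hypothetical embeddings: in practice these would come from a text,
# image, or audio encoder that maps every item into one shared space.
corpus = {
    "text: how to train a puppy": [0.9, 0.1, 0.0],
    "image: photo of a dog": [0.8, 0.2, 0.1],
    "audio: jazz piano recording": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]  # placeholder embedding of a dog-related query
results = semantic_search(query, corpus)
for item_id, score in results:
    print(f"{score:.3f}  {item_id}")
```

For large collections, the exhaustive scan above is usually replaced by an approximate nearest neighbor index, which is what several of the libraries listed further down provide.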
Contributions / Milestones
Have a look at the project board for the task list.
Table Of Contents
Papers
2014
2015
- Skip-Thought Vectors
2016
- Bag of Tricks for Efficient Text Classification
- Enriching Word Vectors with Subword Information
π - Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs
- On Approximately Searching for Similar Word Embeddings
2017
2018
- Universal Sentence Encoder
- Learning Semantic Textual Similarity from Conversations
- Google AI Blog: Advances in Semantic Textual Similarity
- Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data
2019
- LASER: Language Agnostic Sentence Representations
- Document Expansion by Query Prediction
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
- Multi-Stage Document Ranking with BERT
2020
- Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned
- Passage Re-ranking with BERT
- CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization
- LaBSE: Language-agnostic BERT Sentence Embedding
- Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
- DeText: A deep NLP framework for intelligent text understanding
- Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
- Pretrained Transformers for Text Ranking: BERT and Beyond
2021
- Augmented SBERT
- BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
- Compatibility-aware Heterogeneous Visual Search
?2021/2022?
Libraries and Tools
- fastText
- Universal Sentence Encoder
- SBERT
- LaBSE
- LASER
- Haystack
- Jina.AI
- SentEval Toolkit
- BEIR: Benchmarking IR
- Which Frame?
- PySerini
- milvus
- weaviate
- natural-language-youtube-search
- same.energy
- ScaNN
- annoy
- faiss
- DPR
- rank_BM25
- nearPy
- vearch
- PyNNDescent
- pgANN