535 Repositories
Python information-retrieval Libraries
Continual Learning of Long Topic Sequences in Neural Information Retrieval
ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you
A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documentation, Google Colab tutorials, and an online demo for beginners.
Code for ECIR'20 paper Diagnosing BERT with Retrieval Heuristics
Bert Axioms This is the repository with the code for the Paper Diagnosing BERT with Retrieval Heuristics Required Data In order to run this code, you
Sorting-Algorithms - All the information you need about sorting algorithms, plus a code tracer for visualizing them
Supporting information (calculation outputs, structures)
Python library for datamining glitch information from Gen 1 Pokémon GameBoy ROMs
g1utils This is a Python library for datamining information about various glitches (glitch Pokémon, glitch maps, etc.) from Gen 1 Pokémon ROMs. TODO A
Display relevant information for the amazing Banano coin.
Display relevant information for the amazing Banano coin. It'll also show your current Folding@Home stats (because you're likely folding for Bananos!)
Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)
TDEER 🦌 🦒 Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021) Overview TDEE
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.
Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval
CLIP4CMR A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval The original data and pre-calculate
Microservice to extract structured information on EVM smart contracts.
Contract Serializer Microservice to extract structured information on EVM smart contract. Why? Modern NFT contracts may have different names for getPr
ticguide: quick + painless TESS observing information
ticguide: quick + painless TESS observing information Complementary to the TESS observing tool tvguide (see also WTV), which tells you if your target
Image-retrieval-baseline - MUGE Multimodal Retrieval Baseline
MUGE Multimodal Retrieval Baseline This repo is implemented based on the open_cl
PyTorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients
LSF-SAC PyTorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy G
This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' published at ECIR'22.
Paragraph Aggregation Retrieval Model (PARM) for Dense Document-to-Document Retrieval This repository contains the code for the paper PARM: A Paragrap
HuSpaCy: industrial-strength Hungarian natural language processing
HuSpaCy: Industrial-strength Hungarian NLP HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing faciliti
FMA: A Dataset For Music Analysis
FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information
An all-purpose Discord bot written in Python featuring a diverse collection of practical utilities.
GlazeGlopBot Table of Contents About Setup Usage Commands Command Errors Cog Management Local Sound Files Cogs Mod QR RNG VC Weather Proposed Features
Tenssens framework focused on gathering information from free tools or resources. The intention is to help people find free OSINT resources.
GDIT: Geometry Dash Info Tool
GDIT: Geometry Dash Info Tool This is the first large script that allows you to quickly get information from the Geometry Dash server
Automatically download and crop key information from the arxiv daily paper. (cpu version)
TakeInfoatNistforICS - Take Information in NIST NVD for ICS
Take Information in NIST NVD for ICS This project was developed with Python. When yo
Cherche (search in French) allows you to create a neural search pipeline using retrievers and pre-trained language models as rankers.
Cherche (search in French) allows you to create a neural search pipeline using retrievers and pre-trained language models as rankers. Cherche is meant to be used with small to medium sized corpora. Cherche's main strength is its ability to build diverse and end-to-end pipelines.
A ninja python package that unifies the Google Earth Engine ecosystem.
A Python package that unifies the Google Earth Engine ecosystem. EarthEngine.jl | rgee | rgee+ | eemont GitHub: https://github.com/r-earthengine/ee_ex
labsecurity is a tool that brings together python scripts made for ethical hacking, in a single tool, through a console interface
labsecurity labsecurity is a tool that brings together python scripts made for ethical hacking, in a single tool, through a console interface. Warning
Establishing Strong Baselines for TripClick Health Retrieval; ECIR 2022
TripClick Baselines with Improved Training Data Welcome 🙌 to the hub-repo of our paper: Establishing Strong Baselines for TripClick Health Retrieval
Code and dataset for AAAI 2021 paper FixMyPose: Pose Correctional Describing and Retrieval Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal.
FixMyPose / फिक्समाइपोज़ Code and dataset for AAAI 2021 paper "FixMyPose: Pose Correctional Describing and Retrieval" Hyounghun Kim*, Abhay Zala*, Grah
Better-rtti-parser - IDA script to parse RTTI information in executable
RTTI parser Parses RTTI information from executable. Example HexRays decompiler view Before: After: Functions window Before: After: Structs window Ins
IP Rover - An excellent OSINT tool to get information about any IP address
IP Rover - An excellent OSINT tool to get information about any IP address. All details are explained in the screenshot below
Awesome-AI-books - Some awesome AI related books and pdfs for learning and downloading
Awesome AI books Some awesome AI related books and pdfs for downloading and learning. Preface This repo is only for learning; do not use in business
The Master's in Data Science Program run by the Faculty of Mathematics and Information Science
The Master's in Data Science Program run by the Faculty of Mathematics and Information Science is among the first European programs in Data Science and is fully focused on data engineering and data analytics.
Sie_banxico - A python class for the Economic Information System (SIE) API of Banco de México
sie_banxico A python class for the Economic Information System (SIE) API of Banco de México. Args: token (str): A query token from Banco de México id_
VQMIVC - Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion (Interspeech
Vector space based Information Retrieval System for Text Processing - Information retrieval
Information Retrieval: Text Processing Group 13 Sequence of operations Install Requirements Add given wikipedia files to the corpus directory. Downloa
Locationinfo - A script that helps the user show network information such as IP address
Description This script helps the user show network information such as ip ad
Fake news detector filters - A smart filter project that classifies the quality of information and web pages
fake-news-detector-1.0 Lists, lists and more lists... Spam filter list, quality keyword list, stoplist list, top-domains urls list, news agencies webs
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh
CCGL: Contrastive Cascade Graph Learning.
CCGL: Contrastive Cascade Graph Learning This repo provides a reference implementation of Contrastive Cascade Graph Learning (CCGL) framework as descr
This script is intended to crawl license information of repositories through the GitHub API.
GithubLicenseCrawler This script is intended to crawl license information of repositories through the GitHub API. Taking a csv file with requirements.
Personal implementation of paper "Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval"
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval This repo provides personal implementation of paper Approximate Ne
Graph Representation Learning via Graphical Mutual Information Maximization
GMI (Graphical Mutual Information) Graph Representation Learning via Graphical Mutual Information Maximization (Peng Z, Huang W, Luo M, et al., WWW 20
Unsupervised Attributed Multiplex Network Embedding (AAAI 2020)
Unsupervised Attributed Multiplex Network Embedding (DMGI) Overview Nodes in a multiplex network are connected by multiple types of relations. However
Graph InfoClust: Leveraging cluster-level node information for unsupervised graph representation learning
Graph-InfoClust-GIC [PAKDD 2021] PAKDD'21 version Graph InfoClust: Maximizing Coarse-Grain Mutual Information in Graphs Preprint version Graph InfoClu
Quickly download, clean up, and install public datasets into a database management system
Finding data is one thing. Getting it ready for analysis is another. Acquiring, cleaning, standardizing and importing publicly available data is time
Python information display framework aimed at e-ink devices
My display, using a Raspberry Pi Zero W and Waveshare 6" e-paper hat infodisplay Modular information display framework aimed at e-ink devices. Built u
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
ALPRO Align and Prompt: Video-and-Language Pre-training with Entity Prompts [Paper] Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C.H
Official code for "InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization" (ICLR 2020, spotlight)
InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization Authors: Fan-yun Sun, Jordan Hoffm
Code for "SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism"
SUGAR Code for "SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism" Overview train.py: the cor
NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions
NeoDTI NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions (Bioinformatics).
PyTorch implementation for the ICLR 2020 paper "Understanding the Limitations of Variational Mutual Information Estimators"
Smoothed Mutual Information "Lower Bound" Estimator PyTorch implementation for the ICLR 2020 paper Understanding the Limitations of Variational Mutu
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
News December 27: v1.1.0 New loss functions: CentroidTripletLoss and VICRegLoss Mean reciprocal rank + per-class accuracies See the release notes Than
Based on the selenium automatic test framework of python, the program crawls the score information of the educational administration system of a unive
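whpu_spider This program uses Python's selenium automated testing framework to crawl grade information from a university's educational administration system in real time. When it detects a grade update, it emails the updated grades to the user as text, so the user gets grade updates in real time without logging in to the educational administration website manually. This program is for learning and exchange only and must not be used for malicious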
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)
Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive
This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation)
This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation) Usage example python dynamic_inverted_softmax.py --sims_train
Free & open source API service for obtaining information about +9600 universities worldwide.
Sequence lineage information extracted from RKI sequence data repo
Pango lineage information for German SARS-CoV-2 sequences This repository contains a join of the metadata and pango lineage tables of all German SARS-
This Spider/Bot is developed using Python and based on Scrapy Framework to Fetch some items information from Amazon
- Hello, This Project Contains Amazon Web-bot. - I've developed this bot for fetching some items information on Amazon. - Scrapy Framework in Python is
OCR powered screen-capture tool to capture information instead of images
NormCap OCR powered screen-capture tool to capture information instead of images. Links: Repo | PyPi | Releases | Changelog | FAQs Content: Quickstart
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Joint Versus Independent Multiview Hashing for Cross-View Retrieval (IEEE TCYB 2021, PyTorch Code)
Thanks to the low storage cost and high query speed, cross-view hashing (CVH) has been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all views to learn a common Hamming space, thus making it difficult to handle the data with increasing views or a large number of views.
Fast and robust date extraction from web pages, with Python or on the command-line
Find original and updated publication dates of any web page. From the command-line or within Python, all the steps needed from web page download to HTML parsing, scraping, and text analysis are included.
Legal text retrieval for python
legal-text-retrieval Overview This system contains 2 steps: generate training data containing negative sample found by mixture score of cosine(tfidf)
Whoisss is a website information gathering tool.
Whoisss Whoisss is a website information gathering tool. You can use it to collect information about a website. Usage apt-get update apt-get upgrade pkg
Get information about what a Python frame is currently doing, particularly the AST node being executed
executing This mini-package lets you get information about what a frame is currently doing, particularly the AST node being executed. Usage Getting th
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie
Telegram bot to provide Telegram user/group/channel information
Whois-TeLeTiPs Telegram bot to provide Telegram user/group/channel information Deployment Methods Heroku Config Vars API_ID : Telegram API_ID, get it
A simple Discord bot written in Python. Kizmeow lets you track your NFT project and display some useful information
Kizmeow-OpenSea-and-Etherscan-Discord-Bot A Discord bot written in Python. Kizmeow lets you track your NFT project and display some u
SOCMINT tool to get personal infos from an Instagram account via analysis of its followers and/or following
S T E R R A 🔭 A SOCMINT tool to get infos from an Instagram acc via its Followers / Following Allows you to analyse someone's followers, following, a
Preview title and other information about links sent to chats.
Link Preview A small plugin for Nicotine+ to display preview information like title and description about links sent in chats. Plugin created with Nic
A discord bot with information and template tracking for pxls.space.
pyCharity A discord bot with information and template tracking for pxls.space. Inspired by Mikarific's Charity bot. Try out the beta version on your s
People log into different sites every day to get information and browse through these sites one by one
HyperLink People log into different sites every day to get information and browse through these sites one by one. And they are exposed to advertisemen
Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.
OMNIZART Omnizart is a Python library that aims for democratizing automatic music transcription. Given polyphonic music, it is able to transcribe pitc
Script for getting information from Discord
User-info.py Script for getting information from https://discord.com/ Installation: apt-get update -y apt-get upgrade -y apt-get install git pkg install
DaProfiler allows you to get emails, social media accounts, addresses, jobs and more on your target using web scraping and Google dorking techniques
DaProfiler allows you to get emails, social media accounts, addresses, jobs and more on your target using web scraping and Google dorking techniques, for targets based in France only. The distinguishing feature of this program is its ability to find your target's e-mail addresses.
An application pulls configuration information from JSON files generated
AP Provisioning Automation An application pulls configuration information from JSON files generated by Ekahau and then uses Netmiko to configure the l
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021
Fine-grained Post-training for Multi-turn Response Selection Implements the model described in the following paper Fine-grained Post-training for Impr
labsecurity is a framework and its use is for ethical hacking and computer security
labsecurity labsecurity is a framework and its use is for ethical hacking and computer security. Warning This tool is only for educational purpose. If
🔍 📊 Look up information about anime, manga and much more directly in Discord!
AniSearch The source code of the AniSearch Discord Bot. Contribute You have an idea or found a bug? Open a new issue with detailed explanation. You wa
Operational information regarding the vulnerability in the Log4j logging library.
Log4j Vulnerability (CVE-2021-44228) This repo contains operational information regarding the vulnerability in the Log4j logging library (CVE-2021-442
A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization
University1652-Baseline This
Autoregressive Entity Retrieval
The GENRE (Generative ENtity REtrieval) system as presented in Autoregressive Entity Retrieval implemented in pytorch. @inproceedings{decao2020autoreg
A python package that extends Google Earth Engine.
A python package that extends Google Earth Engine GitHub: https://github.com/davemlz/eemont Documentation: https://eemont.readthedocs.io/ PyPI: https:
Powerful unsupervised domain adaptation method for dense retrieval.
A toolkit for document-level event extraction, containing some SOTA model implementations
❤️ A Toolkit for Document-level Event Extraction with & without Triggers Hi, there 👋 . Thanks for your stay in this repo. This project aims at buildi
Joint learning of images and text via maximization of mutual information
mutual_info_img_txt Joint learning of images and text via maximization of mutual information. This repository incorporates the algorithms presented in
DeepDiffusion: Unsupervised Learning of Retrieval-adapted Representations via Diffusion-based Ranking on Latent Feature Manifold
DeepDiffusion Introduction This repository provides the code of the DeepDiffusion algorithm for unsupervised learning of retrieval-adapted representat
A toolkit for document-level event extraction, containing some SOTA model implementations
Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker Source code for ACL-IJCNLP 2021 Long paper: Document-le
RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
RDFLib RDFLib is a pure Python package for working with RDF. RDFLib contains most things you need to work with RDF, including: parsers and serializers
Official implementation of the article "Unsupervised JPEG Domain Adaptation For Practical Digital Forensics"
Unsupervised JPEG Domain Adaptation for Practical Digital Image Forensics @WIFS2021 (Montpellier, France) Rony Abecidan, Vincent Itier, Jeremie Boulan
A Python tool to display geolocation information in the traceroute.
IP2Trace Python IP2Trace Python is a Python tool allowing user to get IP address information such as country, region, city, latitude, longitude, zip c
A dead simple crawler to get book information from Douban.
Introduction A dead simple crawler to get book information from Douban. Prerequisites Python 3 Install dependencies from requirements.txt (Optional)
Official implementation of the AAAI 2022 paper "Learning Token-based Representation for Image Retrieval"
Token: Token-based Representation for Image Retrieval PyTorch training code for Token-based Representation for Image Retrieval. We propose a joint loc
Ethereum transactions and wallet information for people you follow on Twitter.
ethFollowing Ethereum transactions and wallet information for people you follow on Twitter. Set up Setup python environment (requires python 3.8): vir
Code, final versions, and information on the SparkFun Graphical Datasheets
Graphical Datasheets Code, final versions, and information on the SparkFun Graphical Datasheets. Generated Cells After Running Script Example Complete
GTK and Python based, system performance and usage monitoring tool
System Monitoring Center GTK3 and Python 3 based, system performance and usage monitoring tool. Features: Detailed system performance and usage
A Python package for Misty II development
Misty2py Misty2py is a Python 3 package for Misty II development using Misty's REST API. Read the full documentation here! Installation Poetry To inst
This repository accompanies the ACM TOIS paper "What can I cook with these ingredients?" - Understanding cooking-related information needs in conversational search
In this repository you find data that has been gathered when conducting in-situ experiments in a conversational cooking setting. These data include tr