901 Repositories
Python search-engine-scraper Libraries
Automated Machine Learning with scikit-learn
auto-sklearn auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator. Find the documentation here
mlpack: a scalable C++ machine learning library --
a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack
一款高性能敏感词(非法词/脏字)检测过滤组件,附带繁体简体互换,支持全角半角互换,汉字转拼音,模糊搜索等功能。
一款高性能非法词(敏感词)检测组件,附带繁体简体互换,支持全角半角互换,获取拼音首字母,获取拼音字母,拼音模糊搜索等功能。
A web scraper that exports your entire WhatsApp chat history.
WhatSoup 🍲 A web scraper that exports your entire WhatsApp chat history. Table of Contents Overview Demo Prerequisites Instructions Frequen
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
LightSpeech UnOfficial PyTorch implementation of LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search.
dirmaker is a simple, opinionated static site generator for quickly publishing directory websites.
dirmaker is a simple, opinionated static site generator for publishing directory websites (eg: Indic.page, env.wiki It takes entries from a YAML file and generates a categorised, paginated directory website.
Abilian Social Business Engine - an enterprise social networking / collaboration platform.
About Abilian SBE (Social Business Engine) is a platform for social business applications, and more specifically collaborative / enterprise 2.0 busine
Animation engine for explanatory math videos
Manim is an engine for precise programatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This repo
Data intensive science for everyone.
The latest information about Galaxy can be found on the Galaxy Community Hub. Community support is available at Galaxy Help. Galaxy Quickstart Galaxy
MoinMoin Wiki Development (2.0+), unstable, for production please use 1.9.x.
MoinMoin - a wiki engine in Python MoinMoin is an easy to use, full-featured and extensible wiki software package written in Python. It can fulfill a
A free & open modern, fast email client with user-friendly encryption and privacy features
Welcome to Mailpile! Introduction Mailpile (https://www.mailpile.is/) is a modern, fast web-mail client with user-friendly encryption and privacy feat
Abilian Social Business Engine - an enterprise social networking / collaboration platform.
About Abilian SBE (Social Business Engine) is a platform for social business applications, and more specifically collaborative / enterprise 2.0 busine
Scan, index, and archive all of your paper documents
[ en | de | el ] Important news about the future of this project It's been more than 5 years since I started this project on a whim as an effort to tr
:bookmark: Browser-independent bookmark manager
buku buku in action! Introduction buku is a powerful bookmark manager written in Python3 and SQLite3. When I started writing it, I couldn't find a fle
:mag: Ambar: Document Search Engine
🔍 Ambar: Document Search Engine Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search. Am
API & Webapp to answer questions about COVID-19. Using NLP (Question Answering) and trusted data sources.
This open source project serves two purposes. Collection and evaluation of a Question Answering dataset to improve existing QA/search methods - COVID-
Sandwich Batch Normalization
Sandwich Batch Normalization Code for Sandwich Batch Normalization. Introduction We present Sandwich Batch Normalization (SaBN), an extremely easy imp
Model search is a framework that implements AutoML algorithms for model architecture search at scale
Model search (MS) is a framework that implements AutoML algorithms for model architecture search at scale. It aims to help researchers speed up their exploration process for finding the right model architecture for their classification problems (i.e., DNNs with different types of layers).
Full text search for flask.
flask-msearch Installation To install flask-msearch: pip install flask-msearch # when MSEARCH_BACKEND = "whoosh" pip install whoosh blinker # when MSE
rewise is an unofficial wrapper for google search's auto-complete feature
rewise is an unofficial wrapper for google search's auto-complete feature
Knowledge Management for Humans using Machine Learning & Tags
HyperTag HyperTag helps humans intuitively express how they think about their files using tags and machine learning.
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.
CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models
Minecraft clone using Python Ursina game engine!
Minecraft clone using Python Ursina game engine!
Knowledge Management for Humans using Machine Learning & Tags
HyperTag helps humans intuitively express how they think about their files using tags and machine learning. Represent how you think using tags. Find what you look for using semantic search for your text documents (yes, even PDF's) and images.
Cookiecutter API for creating Custom Skills for Azure Search using Python and Docker
cookiecutter-spacy-fastapi Python cookiecutter API for quick deployments of spaCy models with FastAPI Azure Search The API interface is compatible wit
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT
NeuralQA: A Usable Library for (Extractive) Question Answering on Large Datasets with BERT Still in alpha, lots of changes anticipated. View demo on n
Extract Keywords from sentence or Replace keywords in sentences.
FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install
:mag: End-to-End Framework for building natural language search interfaces to data by utilizing Transformers and the State-of-the-Art of NLP. Supporting DPR, Elasticsearch, HuggingFace’s Modelhub and much more!
Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
ChatterBot is a machine learning, conversational dialog engine for creating chat bots
ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Project DeepSpeech DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Spee
CPU inference engine that delivers unprecedented performance for sparse models
The DeepSparse Engine is a CPU runtime that delivers unprecedented performance by taking advantage of natural sparsity within neural networks to reduce compute required as well as accelerate memory bound workloads. It is focused on model deployment and scaling machine learning pipelines, fitting seamlessly into your existing deployments as an inference backend.
Darkdump - Search The Deep Web Straight From Your Terminal
Darkdump - Search The Deep Web Straight From Your Terminal About Darkdump Darkdump is a simple script written in Python3.9 in which it allows users to
NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT
NeuralQA: A Usable Library for (Extractive) Question Answering on Large Datasets with BERT Still in alpha, lots of changes anticipated. View demo on n
Extract Keywords from sentence or Replace keywords in sentences.
FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install
:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...
Haystack is an end-to-end framework for Question Answering & Neural search that enables you to ... ... ask questions in natural language and find gran
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
An easier way to build neural search on the cloud
An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g
ChatterBot is a machine learning, conversational dialog engine for creating chat bots
ChatterBot ChatterBot is a machine-learning based conversational dialog engine build in Python which makes it possible to generate responses based on
mlpack: a scalable C++ machine learning library --
a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op
Looks at Python code to search for things which look "dodgy" such as passwords or diffs
dodgy Dodgy is a very basic tool to run against your codebase to search for "dodgy" looking values. It is a series of simple regular expressions desig
Pysolr — Python Solr client
pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.
High level Python client for Elasticsearch
Elasticsearch DSL Elasticsearch DSL is a high-level library whose aim is to help with writing and running queries against Elasticsearch. It is built o
Official Python low-level client for Elasticsearch
Python Elasticsearch Client Official low-level client for Elasticsearch. Its goal is to provide common ground for all Elasticsearch-related code in Py
A Python library for the Docker Engine API
Docker SDK for Python A Python library for the Docker Engine API. It lets you do anything the docker command does, but from within Python apps – run c
Full-text multi-table search application for Django. Easy to install and use, with good performance.
django-watson django-watson is a fast multi-model full-text search plugin for Django. It is easy to install and use, and provides high quality search
Reusable workflow library for Django
django-viewflow Viewflow is a lightweight reusable workflow library that helps to organize people collaboration business logic in django applications.
Modular search for Django
Haystack Author: Daniel Lindsley Date: 2013/07/28 Haystack provides modular search for Django. It features a unified, familiar API that allows you to
Full text search for flask.
flask-msearch Installation To install flask-msearch: pip install flask-msearch # when MSEARCH_BACKEND = "whoosh" pip install whoosh blinker # when MSE
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)
trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
AutoScraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python This project is made for automatic web scraping to make scraping easy. It
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li
:incoming_envelope: IMAP/SMTP sync system with modern APIs
Nylas Sync Engine The Nylas Sync Engine provides a RESTful API on top of a powerful email sync platform, making it easy to build apps on top of email.
Python Script to download hundreds of images from 'Google Images'. It is a ready-to-run code!
Google Images Download Python Script for 'searching' and 'downloading' hundreds of Google images to the local hard disk! Documentation Documentation H
IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies
IMDbPY is a Python package for retrieving and managing the data of the IMDb movie database about movies, people and companies. Revamp notice Starting
scrapes medias, likes, followers, tags and all metadata. Inspired by instagram-php-scraper,bot
instagram_scraper This is a minimalistic Instagram scraper written in Python. It can fetch media, accounts, videos, comments etc. `Comment` and `Like`
Scrapes an instagram user's photos and videos
Instagram Scraper instagram-scraper is a command-line application written in Python that scrapes and downloads an instagram user's photos and videos.
ASGI support for the Tartiflette GraphQL engine
tartiflette-asgi is a wrapper that provides ASGI support for the Tartiflette Python GraphQL engine. It is ideal for serving a GraphQL API over HTTP, o
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine, do not hesitate to take a look of the Tartiflette project.
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine. You can take a look at the Tartiflette API documentation. U
GraphQL Engine built with Python 3.6+ / asyncio
Tartiflette is a GraphQL Server implementation built with Python 3.6+. Summary Motivation Status Usage Installation Installation dependencies Tartifle
GraphQL is a query language and execution engine tied to any backend service.
GraphQL The GraphQL specification is edited in the markdown files found in /spec the latest release of which is published at https://graphql.github.io
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search
CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is
A universal package of scraper scripts for humans
Scrapera is a completely Chromedriver free package that provides access to a variety of scraper scripts for most commonly used machine learning and data science domains.
Python binding to Modest engine (fast HTML5 parser with CSS selectors).
A fast HTML5 parser with CSS selectors using Modest engine. Installation From PyPI using pip: pip install selectolax Development version from github:
🛠️ Learn a technology X by doing a project - Search engine of project-based learning
Learn X by doing Y 🛠️ Learn a technology X by doing a project Y Website You can contribute by adding projects to the CSV file.
This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.
OpenVINO Inference API This is a repository for an object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operati
Search for documents in a domain through Google. The objective is to extract metadata
MetaFinder - Metadata search through Google _____ __ ___________ .__ .___ / \
[ICLR 2021] "Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective" by Wuyang Chen, Xinyu Gong, Zhangyang Wang
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [PDF] Wuyang Chen, Xinyu Gong, Zhangyang Wang In ICLR 2
Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"
Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search This is an implementation for our paper Contextual Non-Loca
Jinja is a fast, expressive, extensible templating engine.
Jinja is a fast, expressive, extensible templating engine. Special placeholders in the template allow writing code similar to Python syntax.
Python wrapper for Wikipedia
Wikipedia API Wikipedia-API is easy to use Python wrapper for Wikipedias' API. It supports extracting texts, sections, links, categories, translations
Tesseract Open Source OCR Engine (main repository)
Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM
🔍 Google Search unofficial API for Python with no external dependencies
Python Google Search API Unofficial Google Search API for Python. It uses web scraping in the background and is compatible with both Python 2 and 3. W
Embed the Duktape JS interpreter in Python
Introduction Pyduktape is a python wrapper around Duktape, an embeddable Javascript interpreter. On top of the interpreter wrapper, pyduktape offers e
A Python library for the Docker Engine API
Docker SDK for Python A Python library for the Docker Engine API. It lets you do anything the docker command does, but from within Python apps – run c
The unofficial Amazon search CLI & Python API
amzSear The unofficial Amazon Product CLI & API. Easily search the amazon product directory from the command line without the need for an Amazon API k
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Newspaper3k: Article scraping & curation Inspired by requests for its simplicity and powered by lxml for its speed: "Newspaper is an amazing python li
Pysolr — Python Solr client
pysolr pysolr is a lightweight Python client for Apache Solr. It provides an interface that queries the server and returns results based on the query.
High level Python client for Elasticsearch
Elasticsearch DSL Elasticsearch DSL is a high-level library whose aim is to help with writing and running queries against Elasticsearch. It is built o
Modular search for Django
Haystack Author: Daniel Lindsley Date: 2013/07/28 Haystack provides modular search for Django. It features a unified, familiar API that allows you to
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
Determined: Deep Learning Training Platform
Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det
A Comparative Framework for Multimodal Recommender Systems
Cornac Cornac is a comparative framework for multimodal recommender systems. It focuses on making it convenient to work with models leveraging auxilia
A Sklearn-like Framework for Hyperparameter Tuning and AutoML in Deep Learning projects. Finally have the right abstractions and design patterns to properly do AutoML. Let your pipeline steps have hyperparameter spaces. Enable checkpoints to cut duplicate calculations. Go from research to production environment easily.
Neuraxle Pipelines Code Machine Learning Pipelines - The Right Way. Neuraxle is a Machine Learning (ML) library for building machine learning pipeline
Best Practices on Recommendation Systems
Recommenders What's New (February 4, 2021) We have a new relase Recommenders 2021.2! It comes with lots of bug fixes, optimizations and 3 new algorith
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer
Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world of scientific Python packages (numpy, scipy, matplotlib).
Crab - A Recommendation Engine library for Python Crab is a flexible, fast recommender engine for Python that integrates classic information filtering r
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Couler What is Couler? Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo
A powerful workflow engine implemented in pure Python
Spiff Workflow Summary Spiff Workflow is a workflow engine implemented in pure Python. It is based on the excellent work of the Workflow Patterns init
The easiest way to automate your data
Hello, world! 👋 We've rebuilt data engineering for the data science era. Prefect is a new workflow management system, designed for modern infrastruct
Small, fast HTTP client library for Python. Features persistent connections, cache, and Google App Engine support. Originally written by Joe Gregorio, now supported by community.
Introduction httplib2 is a comprehensive HTTP client library, httplib2.py supports many features left out of other HTTP libraries. HTTP and HTTPS HTTP
Safely add untrusted strings to HTML/XML markup.
MarkupSafe MarkupSafe implements a text object that escapes characters so it is safe to use in HTML and XML. Characters that have special meanings are
ASGI support for the Tartiflette GraphQL engine
tartiflette-asgi is a wrapper that provides ASGI support for the Tartiflette Python GraphQL engine. It is ideal for serving a GraphQL API over HTTP, o
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine, do not hesitate to take a look of the Tartiflette project.
tartiflette-aiohttp is a wrapper of aiohttp which includes the Tartiflette GraphQL Engine. You can take a look at the Tartiflette API documentation. U
A code-completion engine for Vim
YouCompleteMe: a code-completion engine for Vim NOTE: Minimum Requirements Have Changed Our policy is to support the Vim version that's in the latest
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear