10315 Repositories
Python algorithms-and-data-structures Libraries
DELTA is a deep learning based natural language and speech processing platform.
DELTA - A DEep learning Language Technology plAtform What is DELTA? DELTA is a deep learning based end-to-end natural language and speech processing p
Text preprocessing, representation and visualization from zero to hero.
Text preprocessing, representation and visualization from zero to hero. From zero to hero • Installation • Getting Started • Examples • API • FAQ • Co
Python package for performing Entity and Text Matching using Deep Learning.
DeepMatcher DeepMatcher is a Python package for performing entity and text matching using deep learning. It provides built-in neural networks and util
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Texar is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar provides
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
A Deep Learning NLP/NLU library by Intel® AI Lab Overview | Models | Installation | Examples | Documentation | Tutorials | Contributing NLP Architect
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
A full spaCy pipeline and models for scientific/biomedical documents.
This repository contains custom pipes and models related to using spaCy for scientific documents. In particular, there is a custom tokenizer that adds
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy This package provides spaCy components and architectures to use tr
Extract Keywords from sentence or Replace keywords in sentences.
FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install
⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡
Translations 🇩🇪 DE 🇫🇷 FR 🇭🇺 HU 🇮🇩 ID 🇮🇹 IT 🇳🇱 NL 🇧🇷 PT-BR 🇷🇺 RU 🇨🇳 ZH ➡️ Documentation | Discord | Installation Guide ⬅️ Fully autom
Python module (C extension and plain python) implementing Aho-Corasick algorithm
pyahocorasick pyahocorasick is a fast and memory efficient library for exact or approximate multi-pattern string search meaning that you can find mult
Python implementation of TextRank for phrase extraction and summarization of text documents
PyTextRank PyTextRank is a Python implementation of TextRank as a spaCy pipeline extension, used to: extract the top-ranked phrases from text document
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
fastNLP fastNLP是一款轻量级的自然语言处理(NLP)工具包,目标是快速实现NLP任务以及构建复杂模型。 fastNLP具有如下的特性: 统一的Tabular式数据容器,简化数据预处理过程; 内置多种数据集的Loader和Pipe,省去预处理代码; 各种方便的NLP工具,例如Embedd
Module for automatic summarization of text documents and HTML pages.
Automatic text summarizer Simple library and command line utility for extracting summary from HTML pages or plain texts. The package also contains sim
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
ParlAI (pronounced “par-lay”) is a python framework for sharing, training and testing dialogue models, from open-domain chitchat, to task-oriented dia
:mag: End-to-End Framework for building natural language search interfaces to data by utilizing Transformers and the State-of-the-Art of NLP. Supporting DPR, Elasticsearch, HuggingFace’s Modelhub and much more!
Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want
🎐 a python library for doing approximate and phonetic matching of strings.
jellyfish Jellyfish is a python library for doing approximate and phonetic matching of strings. Written by James Turk [email protected] and Michael
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
VADER-Sentiment-Analysis VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifica
Fixes mojibake and other glitches in Unicode text, after the fact.
ftfy: fixes text for you print(fix_encoding("(ง'⌣')ง")) (ง'⌣')ง Full documentation: https://ftfy.readthedocs.org Testimonials “My life is li
An open source library for deep learning end-to-end dialog systems and chatbots.
DeepPavlov is an open-source conversational AI library built on TensorFlow, Keras and PyTorch. DeepPavlov is designed for development of production re
NLP, before and after spaCy
textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig
:id: A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
Dedupe Python Library dedupe is a python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok
Snowball compiler and stemming algorithms
Snowball is a small string processing language for creating stemming algorithms for use in Information Retrieval, plus a collection of stemming algori
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It
An open-source NLP research library, built on PyTorch.
An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.data: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vecto
Library for fast text representation and classification.
fastText fastText is a library for efficient learning of word representations and sentence classification. Table of contents Resources Models Suppleme
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
Rasa Open Source Rasa is an open source machine learning framework to automate text-and voice-based conversations. With Rasa, you can build contextual
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0 🤗 Transformers provides thousands of pretrained models to perform tasks o
💫 Industrial-strength Natural Language Processing (NLP) in Python
spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc
Dimensionality reduction in very large datasets using Siamese Networks
ivis Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets. Ivis
Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
pivottablejs: the Python module Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js Installation pip install pivot
A grammar of graphics for Python
plotnine Latest Release License DOI Build Status Coverage Documentation plotnine is an implementation of a grammar of graphics in Python, it is based
The open-source tool for building high-quality datasets and computer vision models
The open-source tool for building high-quality datasets and computer vision models. Website • Docs • Try it Now • Tutorials • Examples • Blog • Commun
🧇 Make Waffle Charts in Python.
PyWaffle PyWaffle is an open source, MIT-licensed Python package for plotting waffle charts. It provides a Figure constructor class Waffle, which coul
Automatically Visualize any dataset, any size with a single line of code. Created by Ram Seshadri. Collaborators Welcome. Permission Granted upon Request.
AutoViz Automatically Visualize any dataset, any size with a single line of code. AutoViz performs automatic visualization of any dataset with one lin
A python package for animating plots build on matplotlib.
animatplot A python package for making interactive as well as animated plots with matplotlib. Requires Python = 3.5 Matplotlib = 2.2 (because slider
An open-source plotting library for statistical data.
Lets-Plot Lets-Plot is an open-source plotting library for statistical data. It is implemented using the Kotlin programming language. The design of Le
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
HiPlot makes understanding high dimensional data easy
HiPlot - High dimensional Interactive Plotting HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and
Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:
JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f
IPython/Jupyter notebook module for Vega and Vega-Lite
IPython Vega IPython/Jupyter notebook module for Vega 5, and Vega-Lite 4. Notebooks with embedded visualizations can be viewed on GitHub and nbviewer.
Bokeh Plotting Backend for Pandas and GeoPandas
Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of
Parallel t-SNE implementation with Python and Torch wrappers.
Multicore t-SNE This is a multicore modification of Barnes-Hut t-SNE by L. Van der Maaten with python and Torch CFFI-based wrappers. This code also wo
Visualizations for machine learning datasets
Introduction The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive
Python library that makes it easy for data scientists to create charts.
Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?
A Python toolbox for gaining geometric insights into high-dimensional data
"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview
3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
PyVista Deployment Build Status Metrics Citation License Community 3D plotting and mesh analysis through a streamlined interface for the Visualization
Streaming pivot visualization via WebAssembly
Perspective is an interactive visualization component for large, real-time datasets. Originally developed for J.P. Morgan's trading business, Perspect
Library for exploring and validating machine learning data
TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig
Missing data visualization module for Python.
missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha
Quickly and accurately render even the largest data.
Turn even the largest data into images, accurately Build Status Coverage Latest dev release Latest release Docs Support What is it? Datashader is a da
With Holoviews, your data visualizes itself.
HoloViews Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data a
Fast data visualization and GUI tools for scientific / engineering applications
PyQtGraph A pure-Python graphics library for PyQt5/PyQt6/PySide2/PySide6 Copyright 2020 Luke Campagnola, University of North Carolina at Chapel Hill h
Uniform Manifold Approximation and Projection
UMAP Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, bu
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Interactive Data Visualization in the browser, from Python
Bokeh is an interactive visualization library for modern web browsers. It provides elegant, concise construction of versatile graphics, and affords hi
Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Dash Dash is the most downloaded, trusted Python framework for building ML & data science web apps. Built on top of Plotly.js, React and Flask, Dash t
Statistical data visualization using matplotlib
seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing
:sound: Play and Record Sound with Python :snake:
Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu
commonfate 📦commonfate 📦 - Common Fate Model and Transform.
Common Fate Transform and Model for Python This package is a python implementation of the Common Fate Transform and Model to be used for audio source
Speech recognition module for Python, supporting several engines and APIs, online and offline.
SpeechRecognition Library for performing speech recognition, with support for several engines and APIs, online and offline. Speech recognition engine/
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
Audiomentations A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio a
:speech_balloon: SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
SpeechPy Official Project Documentation Table of Contents Documentation Which Python versions are supported Citation How to Install? Local Installatio
This library provides common speech features for ASR including MFCCs and filterbank energies.
python_speech_features This library provides common speech features for ASR including MFCCs and filterbank energies. If you are not sure what MFCCs ar
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.
Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack
A fast MDCT implementation using SciPy and FFTs
MDCT A fast MDCT implementation using SciPy and FFTs Installation As usual pip install mdct Dependencies NumPy SciPy STFT Usage import mdct spectrum
Data manipulation and transformation for audio signal processing, powered by PyTorch
torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the
Python library for handling audio datasets.
AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener
Read music meta data and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with python 2 or 3
tinytag tinytag is a library for reading music meta data of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with python Install pip install tinytag
Formatting of dates and times in Flask templates using moment.js.
Flask-Moment This extension enhances Jinja2 templates with formatting of dates and times using moment.js. Quick Start Step 1: Initialize the extension
Mixer -- Is a fixtures replacement. Supported Django, Flask, SqlAlchemy and custom python objects.
The Mixer is a helper to generate instances of Django or SQLAlchemy models. It's useful for testing and fixture replacement. Fast and convenient test-
Doing the OAuth dance with style using Flask, requests, and oauthlib.
Flask-Dance Doing the OAuth dance with style using Flask, requests, and oauthlib. Currently, only OAuth consumers are supported, but this project coul
A flask extension for managing permissions and scopes
Flask-Pundit A simple flask extension to organize resource authorization and scoping. This extension is heavily inspired by the ruby Pundit library. I
The ultimate Python library in building OAuth, OpenID Connect clients and servers. JWS,JWE,JWK,JWA,JWT included.
Authlib The ultimate Python library in building OAuth and OpenID Connect servers. JWS, JWK, JWA, JWT are included. Authlib is compatible with Python2.
Strong, Simple, and Precise security for Flask APIs (using jwt)
flask-praetorian Strong, Simple, and Precise security for Flask APIs API security should be strong, simple, and precise like a Roman Legionary. This p
Simple extension that provides Basic, Digest and Token HTTP authentication for Flask routes
Flask-HTTPAuth Simple extension that provides Basic and Digest HTTP authentication for Flask routes. Installation The easiest way to install this is t
Customizable User Authorization & User Management: Register, Confirm, Login, Change username/password, Forgot password and more.
Flask-User v1.0 Attention: Flask-User v1.0 is a Production/Stable version. The previous version is Flask-User v0.6. User Authentication and Management
Quick and simple security for Flask applications
Note This project is non maintained anymore. Consider the Flask-Security-Too project as an alternative. Flask-Security It quickly adds security featur
Flask-Potion is a RESTful API framework for Flask and SQLAlchemy, Peewee or MongoEngine
Flask-Potion Description Flask-Potion is a powerful Flask extension for building RESTful JSON APIs. Potion features include validation, model resource
Fully featured framework for fast, easy and documented API development with Flask
Flask RestPlus IMPORTANT NOTICE: This project has been forked to Flask-RESTX and will be maintained by by the python-restx organization. Flask-RESTPlu
Flower is a web based tool for monitoring and administrating Celery clusters.
Real-time monitor and web admin for Celery distributed task queue
Drop-in MessagePack support for ASGI applications and frameworks
msgpack-asgi msgpack-asgi allows you to add automatic MessagePack content negotiation to ASGI applications (Starlette, FastAPI, Quart, etc.), with a s
Run your jupyter notebooks as a REST API endpoint. This isn't a jupyter server but rather just a way to run your notebooks as a REST API Endpoint.
Jupter Notebook REST API Run your jupyter notebooks as a REST API endpoint. This isn't a jupyter server but rather just a way to run your notebooks as
Opentracing support for Starlette and FastApi
Starlette-OpenTracing OpenTracing support for Starlette and FastApi. Inspired by: Flask-OpenTracing OpenTracing implementations exist for major distri
Prometheus exporter for Starlette and FastAPI
starlette_exporter Prometheus exporter for Starlette and FastAPI. The middleware collects basic metrics: Counter: starlette_requests_total Histogram:
Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.
starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat
A rate limiter for Starlette and FastAPI
SlowApi A rate limiting library for Starlette and FastAPI adapted from flask-limiter. Note: this is alpha quality code still, the API may change, and
FastAPI application and service structure for a more maintainable codebase
Abstracting FastAPI Services See this article for more information: https://camillovisini.com/article/abstracting-fastapi-services/ Poetry poetry inst
A tool to convert AWS EC2 instances back and forth between On-Demand and Spot billing models.
ec2-spot-converter This tool converts existing AWS EC2 instances back and forth between On-Demand and 'persistent' Spot billing models while preservin
A tax calculator for stocks and dividends activities.
Revolut Stocks calculator for Bulgarian National Revenue Agency Information Processing and calculating the required information about stock possession
Code and yara rules to detect and analyze Cobalt Strike
Cobalt Strike Resources This repository contains: analyze.py: a script to analyze a Cobalt Strike beacon (python analyze.py BEACON) extract.py; extrac
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language This repository contains UA-GEC data and an accompanying Python lib
Code for the Shortformer model, from the paper by Ofir Press, Noah A. Smith and Mike Lewis.
Shortformer This repository contains the code and the final checkpoint of the Shortformer model. This file explains how to run our experiments on the