3246 Repositories
Python data-app-performance Libraries
A standard framework for modelling Deep Learning Models for tabular data
PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike.
CPU inference engine that delivers unprecedented performance for sparse models
The DeepSparse Engine is a CPU runtime that delivers unprecedented performance by taking advantage of natural sparsity within neural networks to reduce compute required as well as accelerate memory bound workloads. It is focused on model deployment and scaling machine learning pipelines, fitting seamlessly into your existing deployments as an inference backend.
Global COVID-19 vaccination data at the regional level.
COVID-19 vaccination data at the subnational level. The source data is carefully verified to ensure it comes from official sources.
This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data.
Persistent remote applications for X11; screen sharing for X11, MacOS and MSWindows.
Xpra is known as "screen for X": its seamless mode allows you to run X11 programs, usually on a
Topic Modelling for Humans
gensim: Topic Modelling in Python. Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Targ
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Texar is a toolkit aiming to support a broad set of machine learning, especially natural language processing and text generation tasks. Texar provides
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
Basic Utilities for PyTorch Natural Language Processing (NLP)
Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor
Extract Keywords from sentence or Replace keywords in sentences.
FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm. Install
An open-source NLP research library, built on PyTorch.
An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic
Data loaders and abstractions for text and NLP
torchtext This repository consists of: torchtext.data: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vecto
Industrial-strength Natural Language Processing (NLP) in Python
spaCy: Industrial-strength NLP spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest researc
Dimensionality reduction in very large datasets using Siamese Networks
ivis Implementation of the ivis algorithm as described in the paper Structure-preserving visualisation of high dimensional single-cell datasets. Ivis
Drag'n'drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
pivottablejs: the Python module. Drag'n'drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js Installation pip install pivot
A grammar of graphics for Python
plotnine is an implementation of a grammar of graphics in Python, it is based
Make Waffle Charts in Python.
PyWaffle PyWaffle is an open source, MIT-licensed Python package for plotting waffle charts. It provides a Figure constructor class Waffle, which coul
A Python package for animating plots, built on matplotlib.
animatplot A Python package for making interactive as well as animated plots with matplotlib. Requires Python >= 3.5, Matplotlib >= 2.2 (because slider
An open-source plotting library for statistical data.
Lets-Plot Lets-Plot is an open-source plotting library for statistical data. It is implemented using the Kotlin programming language. The design of Le
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
HiPlot makes understanding high dimensional data easy
HiPlot - High dimensional Interactive Plotting HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and
Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:
JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f
Visualizations for machine learning datasets
Introduction The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive
Python library that makes it easy for data scientists to create charts.
Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l
A Python toolbox for gaining geometric insights into high-dimensional data
"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview
Streaming pivot visualization via WebAssembly
Perspective is an interactive visualization component for large, real-time datasets. Originally developed for J.P. Morgan's trading business, Perspect
Library for exploring and validating machine learning data
TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig
Missing data visualization module for Python.
missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha
Quickly and accurately render even the largest data.
Turn even the largest data into images, accurately. What is it? Datashader is a da
With Holoviews, your data visualizes itself.
HoloViews Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data a
Fast data visualization and GUI tools for scientific / engineering applications
PyQtGraph A pure-Python graphics library for PyQt5/PyQt6/PySide2/PySide6 Copyright 2020 Luke Campagnola, University of North Carolina at Chapel Hill h
Uniform Manifold Approximation and Projection
UMAP Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, bu
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling: Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Interactive Data Visualization in the browser, from Python
Bokeh is an interactive visualization library for modern web browsers. It provides elegant, concise construction of versatile graphics, and affords hi
Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Dash Dash is the most downloaded, trusted Python framework for building ML & data science web apps. Built on top of Plotly.js, React and Flask, Dash t
Statistical data visualization using matplotlib
seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this
A toolkit for making real world machine learning and data analysis applications in C++
dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl
fklearn: Functional Machine Learning
fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th
Shōgun
The SHOGUN machine learning toolbox Unified and efficient Machine Learning since 1999. Latest release: Cite Shogun: Develop branch build status: Donat
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
Welcome to ktrain. News and Announcements: 2020-11-08: ktrain v0.25.x is released and
Deep learning library featuring a higher-level API for TensorFlow.
TFLearn: Deep learning library featuring a higher-level API for TensorFlow. TFlearn is a modular and transparent deep learning library built on top of
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
CatBoost is a machine learning method based on gradient boosting over decision tree
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a
Deep Learning for humans
Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op
scikit-learn: machine learning in Python
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started
Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.
starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat
Instrument your FastAPI app
Prometheus FastAPI Instrumentator A configurable and modular Prometheus Instrumentator for your FastAPI. Install prometheus-fastapi-instrumentator fro
Scalene: a high-performance, high-precision CPU and memory profiler for Python
scalene: a high-performance CPU and memory profiler for Python by Emery Berger 中文版本 (Chinese version) About Scalene % pip install -U scalene Scalen
Call stack profiler for Python. Shows you why your code is slow!
pyinstrument Pyinstrument is a Python profiler. A profiler is a tool to help you 'optimize' your code - make it faster. It sounds obvious, but to get
Yet Another Python Profiler, but this time thread&coroutine&greenlet aware.
Yappi Yet Another Python Profiler, but this time thread&coroutine&greenlet aware. Highlights Fast: Yappi is fast. It is completely written in C and lo
Sampling profiler for Python programs
py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe
Create cross-platform standalone executables from Python scripts, with the same performance.
About cx_Freeze cx_Freeze creates standalone executables from Python scripts, with the same performance, is cross-platform and should work on any plat
Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, and 3.9. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
Nuitka User Manual Contents Overview Usage Requirements Command Line Installation License Tutorial Setup and build on Windows Setup Install Python Ins
pytest plugin for manipulating test data directories and files
pytest-datadir pytest plugin for manipulating test data directories and files. Usage pytest-datadir will look for a directory with the name of your
py.test fixture for benchmarking code
A pytest fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer. See c
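The idea of rounds calibrated to a timer can be sketched with the standard library alone. The following is a minimal illustration using `timeit`, not the pytest-benchmark fixture itself; the helper name `benchmark` and its return keys are hypothetical.

```python
import timeit


def benchmark(func, rounds=5):
    """Time func in calibrated rounds (hypothetical helper, stdlib only)."""
    timer = timeit.Timer(func)
    # autorange() picks a loop count large enough that one round
    # takes at least 0.2 seconds with the default timer.
    loops, _ = timer.autorange()
    # Run the calibrated loop count once per round.
    timings = timer.repeat(repeat=rounds, number=loops)
    return {
        "loops": loops,
        "rounds": rounds,
        "best_per_call": min(timings) / loops,
    }


if __name__ == "__main__":
    print(benchmark(lambda: sum(range(100)), rounds=3))
```

Reporting the minimum over rounds is one common choice, since it is the least affected by interference from other processes.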
Data-Driven Tests for Python Unittest
DDT (Data-Driven Tests) allows you to multiply one test case by running it with different test data, and make it appear as multiple test cases. Instal
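For comparison, a similar data-driven pattern is possible with `unittest` alone. The sketch below uses the standard library's `subTest`, not the ddt package; unlike ddt, it still counts as a single test case, although each failing data row is reported separately. The class name, data, and `run_suite` helper are made up for illustration.

```python
import unittest


class TestSquare(unittest.TestCase):
    # Hypothetical (input, expected) pairs driving one test method.
    CASES = [(0, 0), (2, 4), (-3, 9)]

    def test_square(self):
        for value, expected in self.CASES:
            # Each data row gets its own sub-test context, so one
            # failing row does not hide the results of the others.
            with self.subTest(value=value):
                self.assertEqual(value * value, expected)


def run_suite():
    """Run TestSquare and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSquare)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    print(run_suite().wasSuccessful())
```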
Display tabular data in a visually appealing ASCII table format
PrettyTable Installation Install via pip: python -m pip install -U prettytable Install latest development version: python -m pip install -U git+https
Command-line tool that instantly fetches Stack Overflow results when an exception is thrown
rebound Rebound is a command-line tool that instantly fetches Stack Overflow results when an exception is thrown. Just use the rebound command to exec
Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
python-tabulate Pretty-print tabular data in Python, a library and a command-line utility. The main use cases of the library are: printing small table
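To make "printing small tables" concrete, here is a minimal standard-library sketch of column-aligned output. It does not use python-tabulate and does not reproduce its output formats; the function name `format_table` is hypothetical.

```python
def format_table(headers, rows):
    """Return headers and rows as a left-aligned plain-text table."""
    # Convert every cell to text so widths can be measured.
    cells = [[str(cell) for cell in row] for row in [headers, *rows]]
    # Each column is as wide as its widest cell.
    widths = [max(len(row[i]) for row in cells) for i in range(len(headers))]

    def fmt(row):
        padded = (cell.ljust(width) for cell, width in zip(row, widths))
        return "  ".join(padded).rstrip()

    rule = "  ".join("-" * width for width in widths)
    return "\n".join([fmt(cells[0]), rule, *(fmt(row) for row in cells[1:])])


if __name__ == "__main__":
    print(format_table(["name", "qty"], [["apple", 3], ["kiwi", 12]]))
```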
Lazydata: Scalable data dependencies for Python projects
lazydata: scalable data dependencies lazydata is a minimalist library for including data dependencies into Python projects. Problem: Keeping all data
MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the
Pandas Google BigQuery
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
Kivy Innovative user interfaces made easy. Kivy is an open source, cross-platform Python framework for the development of applications that make use o
Configuration Management for Python
dynaconf - Configuration Management for Python. Features Inspired by the 12-factor application guide Settings management (default values, validation,
Python ASN.1 library with a focus on performance and a pythonic API
asn1crypto A fast, pure Python library for parsing and serializing ASN.1 structures. Features Why Another Python ASN.1 Library? Related Crypto Librari
Better directory iterator and faster os.walk(), now in the Python 3.5 stdlib
scandir, a better directory iterator and faster os.walk() scandir() is a directory iteration function like os.listdir(), except that instead of return
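As a rough illustration, the sketch below uses `os.scandir()` from the Python 3.5+ standard library (not the standalone scandir package) to total file sizes in a directory tree. The helper names `tree_size` and `demo` are hypothetical.

```python
import os
import tempfile


def tree_size(path):
    """Return the total size in bytes of regular files under path."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # DirEntry objects expose type and stat information directly,
            # which often avoids extra system calls per file.
            if entry.is_dir(follow_symlinks=False):
                total += tree_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def demo():
    """Build a small temporary tree and measure it."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.bin"), "wb") as f:
            f.write(b"x" * 10)
        with open(os.path.join(root, "sub", "b.bin"), "wb") as f:
            f.write(b"y" * 5)
        return tree_size(root)


if __name__ == "__main__":
    print(demo())
```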
A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".
the problem What directory should your app use for storing user data? If running on macOS, you should use: ~/Library/Application Support/AppName If
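The platform-specific choice described above can be sketched as follows. This is a simplified standard-library illustration, not the library's own API: the function name `guess_user_data_dir` and its parameters are hypothetical, and the Windows (`APPDATA`) and Linux (`XDG_DATA_HOME`) branches are assumptions based on common conventions rather than on the excerpt.

```python
import os
import sys


def guess_user_data_dir(app_name, platform=None, env=None, home=None):
    """Guess a per-user data directory (hypothetical helper)."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = os.path.expanduser("~") if home is None else home

    if platform == "darwin":
        # macOS convention quoted in the excerpt above.
        return os.path.join(home, "Library", "Application Support", app_name)
    if platform.startswith("win"):
        # Assumption: roaming application data on Windows.
        base = env.get("APPDATA") or os.path.join(home, "AppData", "Roaming")
        return os.path.join(base, app_name)
    # Assumption: XDG base directory convention elsewhere.
    base = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    return os.path.join(base, app_name)


if __name__ == "__main__":
    print(guess_user_data_dir("AppName"))
```

Passing `platform`, `env`, and `home` explicitly keeps the function easy to test without changing the real environment.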
Lightweight data validation and adaptation Python library.
Valideer Lightweight data validation and adaptation library for Python. At a Glance: Supports both validation (check if a value is valid) and adaptati
Python Data Structures for Humans™.
Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar
Typical: Fast, simple, & correct data-validation using Python 3 typing.
typical: Python's Typing Toolkit Introduction Typical is a library devoted to runtime analysis, inference, validation, and enforcement of Python types
A simple, fast, extensible python library for data validation.
Validr A simple, fast, extensible python library for data validation. Simple and readable schema 10X faster than jsonschema, 40X faster than schematic
Python Data Validation for Humans™.
validators Python data validation for Humans. Python has all kinds of data validation tools, but every one of them seems to require defining a schema
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.
CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt
Lightweight, extensible data validation library for Python
Cerberus Cerberus is a lightweight and extensible data validation library for Python. v = Validator({'name': {'type': 'string'}}) v.validate({
Data parsing and validation using Python type hints
pydantic Data validation and settings management using Python type hinting. Fast and extensible, pydantic plays nicely with your linters/IDE/brain. De
Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
srsly: Modern high-performance serialization utilities for Python This package bundles some of the best Python serialization libraries into one standa
Protocol Buffers - Google's data interchange format
Protocol Buffers - Google's data interchange format Copyright 2008 Google Inc. https://developers.google.com/protocol-buffers/ Overview Protocol Buffe
A calendaring app for Django. It is now stable; please feel free to use it. Active development has been taken over by bartekgorny.
Django-schedule A calendaring/scheduling application, featuring: one-time and recurring events calendar exceptions (occurrences changed or cancelled)
A Django app to accept payments from various payment processors via Pluggable backends.
Django-Merchant Django-Merchant is a django application that enables you to use multiple payment processors from a single API. Gateways Following gate
Yet another Django audit log app, hopefully the simplest one.
django-easy-audit Yet another Django audit log app, hopefully the easiest one. This app allows you to keep track of every action taken by your users.
Full-text multi-table search application for Django. Easy to install and use, with good performance.
django-watson django-watson is a fast multi-model full-text search plugin for Django. It is easy to install and use, and provides high quality search
A Django app for managing robots.txt files following the robots exclusion protocol
Django Robots This is a basic Django application to manage robots.txt files following the robots exclusion protocol, complementing the Django Sitemap
No effort, no worry, maximum performance.
Django Cachalot Caches your Django ORM queries and automatically invalidates them. Documentation: http://django-cachalot.readthedocs.io Table of Conte
Easily share data across your company via SQL queries. From Grove Collab.
SQL Explorer SQL Explorer aims to make the flow of data between people fast, simple, and confusion-free. It is a Django-based application that you can
A simple app that provides django integration for RQ (Redis Queue)
Django-RQ Django integration with RQ, a Redis based Python queuing library. Django-RQ is a simple app that allows you to configure your queues in djan
A reusable Django model field for storing ad-hoc JSON data
jsonfield jsonfield is a reusable model field that allows you to store validated JSON, automatically handling serialization to and from the database.
django-tables2 - An app for creating HTML tables
django-tables2 - An app for creating HTML tables django-tables2 simplifies the task of turning sets of data into HTML tables. It has native support fo
Django application and library for importing and exporting data with admin integration.
django-import-export django-import-export is a Django application and library for importing and exporting data with included admin integration. Featur
Django app for handling the server headers required for Cross-Origin Resource Sharing (CORS)
django-cors-headers A Django App that adds Cross-Origin Resource Sharing (CORS) headers to responses. This allows in-browser requests to your Django a
Easily integrate socket.io with your FastAPI app
fastapi-socketio Easily integrate socket.io with your FastAPI app. Installation Install this plugin using pip: $ pip install fastapi-socketio Usage To
This code generator creates FastAPI app from an openapi file.
fastapi-code-generator This code generator creates FastAPI app from an openapi file. This project is in an experimental phase. fastapi-code-generator use
Seamlessly serve the static assets of your Flask app from Amazon S3
flask-s3 Seamlessly serve the static assets of your Flask app from Amazon S3. Maintainers Flask-S3 is maintained by @e-dard, @eriktaubeneck and @SunDw