2909 Python Website-to-Json-Data Libraries

An open-source plotting library for statistical data.

Lets-Plot Lets-Plot is an open-source plotting library for statistical data. It is implemented using the Kotlin programming language. The design of Le

820 Jan 6, 2023

Visualize and compare datasets, target values and associations, with one line of code.

In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat

2.3k Jan 5, 2023

HiPlot makes understanding high dimensional data easy

HiPlot - High dimensional Interactive Plotting HiPlot is a lightweight interactive visualization tool to help AI researchers discover correlations and

2.4k Jan 4, 2023

Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:

JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f

462 Jan 2, 2023

Visualizations for machine learning datasets

Introduction The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive

7.1k Jan 7, 2023

Python library that makes it easy for data scientists to create charts.

Chartify Chartify is a Python library that makes it easy for data scientists to create charts. Why use Chartify? Consistent input data format: Spend l

3.2k Jan 4, 2023

A Python toolbox for gaining geometric insights into high-dimensional data

"To deal with hyper-planes in a 14 dimensional space, visualize a 3D space and say 'fourteen' very loudly. Everyone does it." - Geoff Hinton Overview

1.8k Dec 29, 2022

Streaming pivot visualization via WebAssembly

Perspective is an interactive visualization component for large, real-time datasets. Originally developed for J.P. Morgan's trading business, Perspect

The Fintech Open Source Foundation (www.finos.org)

5.1k Dec 27, 2022

Library for exploring and validating machine learning data

TensorFlow Data Validation TensorFlow Data Validation (TFDV) is a library for exploring and validating machine learning data. It is designed to be hig

688 Jan 3, 2023

Missing data visualization module for Python.

missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha

3.4k Dec 29, 2022

Quickly and accurately render even the largest data.

Turn even the largest data into images, accurately Build Status Coverage Latest dev release Latest release Docs Support What is it? Datashader is a da

2.9k Dec 28, 2022

With Holoviews, your data visualizes itself.

HoloViews Stop plotting your data - annotate your data and let it visualize itself. HoloViews is an open-source Python library designed to make data a

2.3k Jan 2, 2023

Fast data visualization and GUI tools for scientific / engineering applications

2.3k Feb 13, 2021

Uniform Manifold Approximation and Projection

UMAP Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, bu

6k Jan 8, 2023

Create HTML profiling reports from pandas DataFrame objects

Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great

10k Jan 1, 2023

Interactive Data Visualization in the browser, from Python

Bokeh is an interactive visualization library for modern web browsers. It provides elegant, concise construction of versatile graphics, and affords hi

14.7k Feb 13, 2021

Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.

Dash Dash is the most downloaded, trusted Python framework for building ML & data science web apps. Built on top of Plotly.js, React and Flask, Dash t

13.9k Feb 13, 2021

Statistical data visualization using matplotlib

seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing

8.1k Feb 13, 2021

A toolkit for making real world machine learning and data analysis applications in C++

dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl

11.6k Jan 1, 2023

fklearn: Functional Machine Learning

fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th

1.4k Dec 7, 2022

Shōgun

The SHOGUN machine learning toolbox Unified and efficient Machine Learning since 1999. Latest release: Cite Shogun: Develop branch build status: Donat

2.8k Feb 13, 2021

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

2.8k Feb 12, 2021

ktrain is a Python library that makes deep learning and AI more accessible and easier to apply

1.1k Jan 2, 2023

Deep learning library featuring a higher-level API for TensorFlow.

TFLearn: Deep learning library featuring a higher-level API for TensorFlow. TFlearn is a modular and transparent deep learning library built on top of

9.5k Feb 12, 2021

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

5.7k Feb 12, 2021

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

11.9k Feb 13, 2021

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

14.5k Jan 8, 2023

Deep Learning for humans

Keras: Deep Learning for Python Under Construction In the near future, this repository will be used once again for developing the Keras codebase. For

50.7k Feb 12, 2021

Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

8.1k Jan 2, 2023

Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

34.7k Jan 4, 2023

scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started

52.5k Jan 8, 2023

Fully featured framework for fast, easy and documented API development with Flask

Flask RestPlus IMPORTANT NOTICE: This project has been forked to Flask-RESTX and will be maintained by by the python-restx organization. Flask-RESTPlu

2.7k Jan 4, 2023

NO LONGER MAINTAINED - A Flask extension for creating simple ReSTful JSON APIs from SQLAlchemy models.

NO LONGER MAINTAINED This repository is no longer maintained due to lack of time. You might check out the fork https://github.com/mrevutskyi/flask-res

1k Jan 4, 2023

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat

300 Dec 26, 2022

pytest plugin for manipulating test data directories and files

pytest-datadir pytest plugin for manipulating test data directories and files. Usage pytest-datadir will look up for a directory with the name of your

191 Dec 21, 2022

Data-Driven Tests for Python Unittest

DDT (Data-Driven Tests) allows you to multiply one test case by running it with different test data, and make it appear as multiple test cases. Instal

424 Nov 28, 2022

Display tabular data in a visually appealing ASCII table format

PrettyTable Installation Install via pip: python -m pip install -U prettytable Install latest development version: python -m pip install -U git+https

924 Jan 5, 2023

Json Formatter for the standard python logger

Overview This library is provided to allow standard python logging to output log data as json objects. With JSON we can make our logs more readable by

1.4k Jan 4, 2023

Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.

python-tabulate Pretty-print tabular data in Python, a library and a command-line utility. The main use cases of the library are: printing small table

1.5k Jan 6, 2023

Lazydata: Scalable data dependencies for Python projects

lazydata: scalable data dependencies lazydata is a minimalist library for including data dependencies into Python projects. Problem: Keeping all data

629 Nov 21, 2022

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl

1.9k Jan 4, 2023

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

3.3k Dec 31, 2022

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s

1.9k Jan 4, 2023

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the

4.2k Jan 2, 2023

Pandas Google BigQuery

pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda

345 Dec 28, 2022

A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".

the problem What directory should your app use for storing user data? If running on macOS, you should use: ~/Library/Application Support/AppName If

948 Dec 31, 2022

Lightweight data validation and adaptation Python library.

Valideer Lightweight data validation and adaptation library for Python. At a Glance: Supports both validation (check if a value is valid) and adaptati

258 Nov 22, 2022

Python Data Structures for Humans™.

Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar

2.5k Dec 28, 2022

Typical: Fast, simple, & correct data-validation using Python 3 typing.

typical: Python's Typing Toolkit Introduction Typical is a library devoted to runtime analysis, inference, validation, and enforcement of Python types

171 Jan 2, 2023

A simple, fast, extensible python library for data validation.

Validr A simple, fast, extensible python library for data validation. Simple and readable schema 10X faster than jsonschema, 40X faster than schematic

209 Sep 19, 2022

Python Data Validation for Humans™.

validators Python data validation for Humans. Python has all kinds of data validation tools, but every one of them seems to require defining a schema

670 Jan 9, 2023

CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

1.8k Dec 31, 2022

Lightweight, extensible data validation library for Python

Cerberus Cerberus is a lightweight and extensible data validation library for Python. v = Validator({'name': {'type': 'string'}}) v.validate({

2.9k Dec 27, 2022

An(other) implementation of JSON Schema for Python

jsonschema jsonschema is an implementation of JSON Schema for Python. from jsonschema import validate # A sample schema, like what we'd get f

4k Jan 4, 2023

Data parsing and validation using Python type hints

pydantic Data validation and settings management using Python type hinting. Fast and extensible, pydantic plays nicely with your linters/IDE/brain. De

12.1k Jan 6, 2023

Python bindings for the simdjson project.

pysimdjson Python bindings for the simdjson project, a SIMD-accelerated JSON parser. If SIMD instructions are unavailable a fallback parser is used, m

562 Jan 8, 2023

Python wrapper around rapidjson

python-rapidjson Python wrapper around RapidJSON Authors: Ken Robbins [email protected] Lele Gaifax [email protected] License: MIT License Sta

469 Jan 4, 2023

🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)

srsly: Modern high-performance serialization utilities for Python This package bundles some of the best Python serialization libraries into one standa

329 Dec 28, 2022

Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

orjson orjson is a fast, correct JSON library for Python. It benchmarks as the fastest Python library for JSON and is more correct than the standard j

4.1k Dec 30, 2022

Python library for serializing any arbitrary object graph into JSON. It can take almost any Python object and turn the object into JSON. Additionally, it can reconstitute the object back into Python.

jsonpickle jsonpickle is a library for the two-way conversion of complex Python objects and JSON. jsonpickle builds upon the existing JSON encoders, s

1.1k Jan 2, 2023

simplejson is a simple, fast, extensible JSON encoder/decoder for Python

simplejson simplejson is a simple, fast, complete, correct and extensible JSON http://json.org encoder and decoder for Python 3.3+ with legacy suppo

1.5k Dec 31, 2022

Ultra fast JSON decoder and encoder written in C with Python bindings

UltraJSON UltraJSON is an ultra fast JSON encoder and decoder written in pure C with bindings for Python 3.6+. Install with pip: $ python -m pip insta

3.9k Jan 2, 2023

FlatBuffers: Memory Efficient Serialization Library

FlatBuffers FlatBuffers is a cross platform serialization library architected for maximum memory efficiency. It allows you to directly access serializ

19.6k Jan 1, 2023

Protocol Buffers - Google's data interchange format

57.6k Jan 3, 2023

Easily share data across your company via SQL queries. From Grove Collab.

SQL Explorer SQL Explorer aims to make the flow of data between people fast, simple, and confusion-free. It is a Django-based application that you can

2.1k Dec 30, 2022

A reusable Django model field for storing ad-hoc JSON data

jsonfield jsonfield is a reusable model field that allows you to store validated JSON, automatically handling serialization to and from the database.

1.1k Jan 3, 2023

Django application and library for importing and exporting data with admin integration.

django-import-export django-import-export is a Django application and library for importing and exporting data with included admin integration. Featur

2.6k Dec 26, 2022

FastAPI framework plugins

Plugins for FastAPI framework, high performance, easy to learn, fast to code, ready for production fastapi-plugins FastAPI framework plugins Cache Mem

239 Dec 28, 2022

A basic JSON-RPC implementation for your Flask-powered sites

Flask JSON-RPC A basic JSON-RPC implementation for your Flask-powered sites. Some reasons you might want to use: Simple, powerful, flexible and python

272 Jan 4, 2023

SqlAlchemy Flask-Restful Swagger Json:API OpenAPI

SAFRS: Python OpenAPI & JSON:API Framework Overview Installation JSON:API Interface Resource Objects Relationships Methods Custom Methods Class Method

365 Jan 6, 2023

A flask extension using pyexcel to read, manipulate and write data in different excel formats: csv, ods, xls, xlsx and xlsm.

Flask-Excel - Let you focus on data, instead of file formats Support the project If your company has embedded pyexcel and its components into a revenu

247 Dec 27, 2022

Django Smuggler is a pluggable application for Django Web Framework that helps you to import/export fixtures via the automatically-generated administration interface.

Django Smuggler Django Smuggler is a pluggable application for Django Web Framework to easily dump/load fixtures via the automatically-generated admin

373 Dec 26, 2022

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat

300 Dec 26, 2022

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

GoAccess What is it? GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through y

15.6k Jan 2, 2023

Library to scrape and clean web pages to create massive datasets.

lazynlp A straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets. Using this libr

2.1k Jan 6, 2023

Extract embedded metadata from HTML markup

extruct extruct is a library for extracting embedded metadata from HTML markup. Currently, extruct supports: W3C's HTML Microdata embedded JSON-LD Mic

725 Jan 3, 2023

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Parsel Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with re

859 Dec 29, 2022

IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies

IMDbPY is a Python package for retrieving and managing the data of the IMDb movie database about movies, people and companies. Revamp notice Starting

1.1k Jan 2, 2023

TuShare is a utility for crawling historical data of China stocks

TuShare Tushare Pro版已发布，请访问新的官网了解和查询数据接口！ https://tushare.pro TuShare是实现对股票/期货等金融数据从数据采集、清洗加工到数据存储过程的工具，满足金融量化分析师和学习数据分析的人在数据获取方面的需求，它的特点是数据覆盖范围广，接口

11.9k Dec 30, 2022

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

Best-of Machine Learning with Python 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly. This curated list contains 840 awe

12.2k Jan 4, 2023

Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source.

Mockoon Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source. It has been built wi

4.4k Dec 30, 2022

Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

71 Oct 4, 2022

Boltstream Live Video Streaming Website + Backend

Boltstream Self-hosted Live Video Streaming Website + Backend Reference

1.7k Dec 28, 2022

🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr

2.7k Jan 3, 2023

DeiT: Data-efficient Image Transformers

DeiT: Data-efficient Image Transformers This repository contains PyTorch evaluation code, training code and pretrained models for DeiT (Data-Efficient

3.2k Jan 6, 2023

Deploy a ML inference service on a budget in less than 10 lines of code.

BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.

1.3k Dec 25, 2022

This is a database of 180.000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.

Finance Database As a private investor, the sheer amount of information that can be found on the internet is rather daunting.

1.4k Dec 31, 2022

JSON Web Token Authentication support for Django REST Framework

REST framework JWT Auth Notice This project is currently unmaintained. Check #484 for more details and suggested alternatives. JSON Web Token Authenti

3.2k Dec 31, 2022

A JSON Web Token authentication plugin for the Django REST Framework.

Simple JWT Abstract Simple JWT is a JSON Web Token authentication plugin for the Django REST Framework. For full documentation, visit django-rest-fram

3.3k Jan 1, 2023

An interactive command-line HTTP and API testing client built on top of HTTPie featuring autocomplete, syntax highlighting, and more. https://twitter.com/httpie

HTTP Prompt HTTP Prompt is an interactive command-line HTTP client featuring autocomplete and syntax highlighting, built on HTTPie and prompt_toolkit.

8.6k Dec 31, 2022

As easy as /aitch-tee-tee-pie/ 🥧 Modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more. https://twitter.com/httpie

HTTPie: human-friendly CLI HTTP client for the API era HTTPie (pronounced aitch-tee-tee-pie) is a command-line HTTP client. Its goal is to make CLI in

25.4k Jan 1, 2023

Python AsyncIO data API to manage billions of resources

Introduction Please read the detailed docs This is the working project of the next generation Guillotina server based on asyncio. Dependencies Python

183 Nov 15, 2022

⚾🤖⚾ Automatic baseball pitching overlay in realtime

⚾ Automatically overlaying pitch motion and trajectory with machine learning! This project takes your baseball pitching clips and automatically genera

240 Dec 5, 2022

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, from tumor identification to wildlife monitoring to poverty mapping.

437 Dec 30, 2022

Graphing communities on Twitch.tv in a visually intuitive way

VisualizingTwitchCommunities This project maps communities of streamers on Twitch.tv based on shared viewership. The data is collected from the Twitch

312 Jan 7, 2023

Converts Betaflight blackbox gyro to MP4 GoPro Meta data so it can be used with ReelSteady GO

Here are a bunch of scripts that I created some time ago as a proof of concept that Betaflight blackbox gyro data can be converted to GoPro Metadata F

108 Oct 5, 2022

Export your data from Xiami

Xiami Exporter 导出虾米音乐的个人数据，功能：导出歌曲为 json 收藏歌曲收藏专辑播放列表导出收藏艺人为 json 导出收藏专辑为 json 导出播放列表为 json (个人和收藏) 将导出的数据整理至 sqlite 数据库收藏歌曲收藏艺人收藏专辑播放列表下载已导出

59 Nov 13, 2021

An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild

Hyrule Compendium API An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild. B

116 Dec 1, 2022

Soda SQL Data testing, monitoring and profiling for SQL accessible data.

Soda SQL Data testing, monitoring and profiling for SQL accessible data. What does Soda SQL do? Soda SQL allows you to Stop your pipeline when bad dat

51 Jan 1, 2023

Python based scripts for obtaining system information from Linux.

sysinfo Python based scripts for obtaining system information from Linux. Python2 and Python3 compatible Output in JSON format Simple scripts and exte

70 Dec 20, 2022

Python Website-to-Json-Data Resources

Python Website-to-Json-Data Libraries

An open-source plotting library for statistical data.

Visualize and compare datasets, target values and associations, with one line of code.

HiPlot makes understanding high dimensional data easy

Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:

Visualizations for machine learning datasets

Python library that makes it easy for data scientists to create charts.

A Python toolbox for gaining geometric insights into high-dimensional data

Streaming pivot visualization via WebAssembly

Library for exploring and validating machine learning data

Missing data visualization module for Python.

Quickly and accurately render even the largest data.

With Holoviews, your data visualizes itself.

Fast data visualization and GUI tools for scientific / engineering applications

Uniform Manifold Approximation and Projection

Create HTML profiling reports from pandas DataFrame objects

Interactive Data Visualization in the browser, from Python

Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.

Statistical data visualization using matplotlib

A toolkit for making real world machine learning and data analysis applications in C++

fklearn: Functional Machine Learning

Shōgun

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

ktrain is a Python library that makes deep learning and AI more accessible and easier to apply

Deep learning library featuring a higher-level API for TensorFlow.

A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Deep Learning for humans

Statsmodels: statistical modeling and econometrics in Python

Apache Spark - A unified analytics engine for large-scale data processing

scikit-learn: machine learning in Python

Fully featured framework for fast, easy and documented API development with Flask

NO LONGER MAINTAINED - A Flask extension for creating simple ReSTful JSON APIs from SQLAlchemy models.

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

pytest plugin for manipulating test data directories and files

Data-Driven Tests for Python Unittest

Display tabular data in a visually appealing ASCII table format

Json Formatter for the standard python logger

Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.

Lazydata: Scalable data dependencies for Python projects

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

Pandas Google BigQuery

A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".

Lightweight data validation and adaptation Python library.

Python Data Structures for Humans™.

Typical: Fast, simple, & correct data-validation using Python 3 typing.

A simple, fast, extensible python library for data validation.

Python Data Validation for Humans™.

CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

Lightweight, extensible data validation library for Python

An(other) implementation of JSON Schema for Python

Data parsing and validation using Python type hints

Python bindings for the simdjson project.

Python wrapper around rapidjson

🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)

Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy

Python library for serializing any arbitrary object graph into JSON. It can take almost any Python object and turn the object into JSON. Additionally, it can reconstitute the object back into Python.

simplejson is a simple, fast, extensible JSON encoder/decoder for Python

Ultra fast JSON decoder and encoder written in C with Python bindings

FlatBuffers: Memory Efficient Serialization Library

Protocol Buffers - Google's data interchange format

Easily share data across your company via SQL queries. From Grove Collab.

A reusable Django model field for storing ad-hoc JSON data

Django application and library for importing and exporting data with admin integration.

FastAPI framework plugins

A basic JSON-RPC implementation for your Flask-powered sites

SqlAlchemy Flask-Restful Swagger Json:API OpenAPI

A flask extension using pyexcel to read, manipulate and write data in different excel formats: csv, ods, xls, xlsx and xlsm.

Django Smuggler is a pluggable application for Django Web Framework that helps you to import/export fixtures via the automatically-generated administration interface.

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

Library to scrape and clean web pages to create massive datasets.

Extract embedded metadata from HTML markup

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies