3178 Python Genomic-data-analysis Libraries

Apache Spark - A unified analytics engine for large-scale data processing

Apache Spark Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an op

34.7k Jan 4, 2023

scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started

52.5k Jan 8, 2023

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat

300 Dec 26, 2022

Scalene: a high-performance, high-precision CPU and memory profiler for Python

scalene: a high-performance CPU and memory profiler for Python by Emery Berger 中文版本 (Chinese version) About Scalene % pip install -U scalene Scalen

138 Dec 30, 2022

Sampling profiler for Python programs

py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe

9.5k Jan 8, 2023

pytest plugin for manipulating test data directories and files

pytest-datadir pytest plugin for manipulating test data directories and files. Usage pytest-datadir will look up for a directory with the name of your

191 Dec 21, 2022

Data-Driven Tests for Python Unittest

DDT (Data-Driven Tests) allows you to multiply one test case by running it with different test data, and make it appear as multiple test cases. Instal

424 Nov 28, 2022

Display tabular data in a visually appealing ASCII table format

PrettyTable Installation Install via pip: python -m pip install -U prettytable Install latest development version: python -m pip install -U git+https

924 Jan 5, 2023

Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.

python-tabulate Pretty-print tabular data in Python, a library and a command-line utility. The main use cases of the library are: printing small table

1.5k Jan 6, 2023

McCabe complexity checker for Python

McCabe complexity checker Ned's script to check McCabe complexity. This module provides a plugin for flake8, the Python code checker. Installation You

527 Dec 21, 2022

Various code metrics for Python code

Radon Radon is a Python tool that computes various metrics from the source code. Radon can compute: McCabe's complexity, i.e. cyclomatic complexity ra

1.4k Jan 7, 2023

Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.

Dlint Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure. The most important thing I have done as a progra

127 Dec 27, 2022

A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

This project is no longer maintained March 2020 Update: Please go see the amazing Pysa tutorial that should get you up to speed finding security vulne

2.1k Dec 25, 2022

Bandit is a tool designed to find common security issues in Python code.

A security linter from PyCQA Free software: Apache license Documentation: https://bandit.readthedocs.io/en/latest/ Source: https://github.com/PyCQA/ba

4.8k Dec 31, 2022

Awesome autocompletion, static analysis and refactoring library for python

Jedi - an awesome autocompletion, static analysis and refactoring library for Python Jedi is a static analysis tool for Python that is typically used

5.3k Dec 29, 2022

A static-analysis bot for Github

Imhotep, the peaceful builder. What is it? Imhotep is a tool which will comment on commits coming into your repository and check for syntactic errors

221 Nov 10, 2022

Custom Python linting through AST expressions

bellybutton bellybutton is a customizable, easy-to-configure linting engine for Python. What is this good for? Tools like pylint and flake8 provide, o

249 Dec 31, 2022

Automated security testing using bandit and flake8.

flake8-bandit Automated security testing built right into your workflow! You already use flake8 to lint all your code for errors, ensure docstrings ar

96 Jan 1, 2023

coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.

"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a

3.4k Dec 29, 2022

Pylint plugin for improving code analysis for when using Django

pylint-django About pylint-django is a Pylint plugin for improving code analysis when analysing code using Django. It is also used by the Prospector t

544 Jan 6, 2023

A static type analyzer for Python code

pytype - 🦆 ✔ Pytype checks and infers types for your Python code - without requiring type annotations. Pytype can: Lint plain Python code, flagging c

4k Dec 31, 2022

Performant type-checking for python.

Pyre is a performant type checker for Python compliant with PEP 484. Pyre can analyze codebases with millions of lines of code incrementally – providi

6.2k Jan 4, 2023

It's not just a linter that annoys you!

README for Pylint - https://pylint.pycqa.org/ Professional support for pylint is available as part of the Tidelift Subscription. Tidelift gives softwa

4.4k Jan 4, 2023

The official GitHub mirror of https://gitlab.com/pycqa/flake8

Flake8 Flake8 is a wrapper around these tools: PyFlakes pycodestyle Ned Batchelder's McCabe script Flake8 runs all the tools by launching the single f

2.6k Jan 3, 2023

Lazydata: Scalable data dependencies for Python projects

lazydata: scalable data dependencies lazydata is a minimalist library for including data dependencies into Python projects. Problem: Keeping all data

629 Nov 21, 2022

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

mongo-connector The mongo-connector project originated as a MongoDB mongo-labs project and is now community-maintained under the custody of YouGov, Pl

1.9k Jan 4, 2023

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana

3.3k Dec 31, 2022

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s

1.9k Jan 4, 2023

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the

4.2k Jan 2, 2023

Pandas Google BigQuery

pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda

345 Dec 28, 2022

A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".

the problem What directory should your app use for storing user data? If running on macOS, you should use: ~/Library/Application Support/AppName If

948 Dec 31, 2022

Lightweight data validation and adaptation Python library.

Valideer Lightweight data validation and adaptation library for Python. At a Glance: Supports both validation (check if a value is valid) and adaptati

258 Nov 22, 2022

Python Data Structures for Humans™.

Schematics Python Data Structures for Humans™. About Project documentation: https://schematics.readthedocs.io/en/latest/ Schematics is a Python librar

2.5k Dec 28, 2022

Typical: Fast, simple, & correct data-validation using Python 3 typing.

typical: Python's Typing Toolkit Introduction Typical is a library devoted to runtime analysis, inference, validation, and enforcement of Python types

171 Jan 2, 2023

A simple, fast, extensible python library for data validation.

Validr A simple, fast, extensible python library for data validation. Simple and readable schema 10X faster than jsonschema, 40X faster than schematic

209 Sep 19, 2022

Python Data Validation for Humans™.

validators Python data validation for Humans. Python has all kinds of data validation tools, but every one of them seems to require defining a schema

670 Jan 9, 2023

CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

1.8k Dec 31, 2022

Lightweight, extensible data validation library for Python

Cerberus Cerberus is a lightweight and extensible data validation library for Python. v = Validator({'name': {'type': 'string'}}) v.validate({

2.9k Dec 27, 2022

Data parsing and validation using Python type hints

pydantic Data validation and settings management using Python type hinting. Fast and extensible, pydantic plays nicely with your linters/IDE/brain. De

12.1k Jan 6, 2023

Protocol Buffers - Google's data interchange format

57.6k Jan 3, 2023

Easily share data across your company via SQL queries. From Grove Collab.

SQL Explorer SQL Explorer aims to make the flow of data between people fast, simple, and confusion-free. It is a Django-based application that you can

2.1k Dec 30, 2022

A reusable Django model field for storing ad-hoc JSON data

jsonfield jsonfield is a reusable model field that allows you to store validated JSON, automatically handling serialization to and from the database.

1.1k Jan 3, 2023

Django application and library for importing and exporting data with admin integration.

django-import-export django-import-export is a Django application and library for importing and exporting data with included admin integration. Featur

2.6k Dec 26, 2022

a flask profiler which watches endpoint calls and tries to make some analysis.

Flask-profiler version: 1.8 Flask-profiler measures endpoints defined in your flask application; and provides you fine-grained report through a web in

718 Dec 20, 2022

A flask extension using pyexcel to read, manipulate and write data in different excel formats: csv, ods, xls, xlsx and xlsm.

Flask-Excel - Let you focus on data, instead of file formats Support the project If your company has embedded pyexcel and its components into a revenu

247 Dec 27, 2022

Django Smuggler is a pluggable application for Django Web Framework that helps you to import/export fixtures via the automatically-generated administration interface.

Django Smuggler Django Smuggler is a pluggable application for Django Web Framework to easily dump/load fixtures via the automatically-generated admin

373 Dec 26, 2022

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

starlette context Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automat

300 Dec 26, 2022

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

GoAccess What is it? GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through y

15.6k Jan 2, 2023

Library to scrape and clean web pages to create massive datasets.

lazynlp A straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets. Using this libr

2.1k Jan 6, 2023

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Parsel Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with re

859 Dec 29, 2022

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

Computational Linguistics Research Group

8.4k Jan 8, 2023

IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies

IMDbPY is a Python package for retrieving and managing the data of the IMDb movie database about movies, people and companies. Revamp notice Starting

1.1k Jan 2, 2023

TuShare is a utility for crawling historical data of China stocks

TuShare Tushare Pro版已发布，请访问新的官网了解和查询数据接口！ https://tushare.pro TuShare是实现对股票/期货等金融数据从数据采集、清洗加工到数据存储过程的工具，满足金融量化分析师和学习数据分析的人在数据获取方面的需求，它的特点是数据覆盖范围广，接口

11.9k Dec 30, 2022

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

Best-of Machine Learning with Python 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly. This curated list contains 840 awe

12.2k Jan 4, 2023

Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source.

Mockoon Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source. It has been built wi

4.4k Dec 30, 2022

Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

71 Oct 4, 2022

🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr

2.7k Jan 3, 2023

DeiT: Data-efficient Image Transformers

DeiT: Data-efficient Image Transformers This repository contains PyTorch evaluation code, training code and pretrained models for DeiT (Data-Efficient

3.2k Jan 6, 2023

Deploy a ML inference service on a budget in less than 10 lines of code.

BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.

1.3k Dec 25, 2022

This is a database of 180.000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.

Finance Database As a private investor, the sheer amount of information that can be found on the internet is rather daunting.

1.4k Dec 31, 2022

Python AsyncIO data API to manage billions of resources

Introduction Please read the detailed docs This is the working project of the next generation Guillotina server based on asyncio. Dependencies Python

183 Nov 15, 2022

⚾🤖⚾ Automatic baseball pitching overlay in realtime

⚾ Automatically overlaying pitch motion and trajectory with machine learning! This project takes your baseball pitching clips and automatically genera

240 Dec 5, 2022

APT-Hunter is Threat Hunting tool for windows event logs

APT-Hunter is Threat Hunting tool for windows event logs which made by purple team mindset to provide detect APT movements hidden in the sea of windows event logs to decrease the time to uncover suspicious activity

824 Jan 8, 2023

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, from tumor identification to wildlife monitoring to poverty mapping.

437 Dec 30, 2022

Graphing communities on Twitch.tv in a visually intuitive way

VisualizingTwitchCommunities This project maps communities of streamers on Twitch.tv based on shared viewership. The data is collected from the Twitch

312 Jan 7, 2023

Converts Betaflight blackbox gyro to MP4 GoPro Meta data so it can be used with ReelSteady GO

Here are a bunch of scripts that I created some time ago as a proof of concept that Betaflight blackbox gyro data can be converted to GoPro Metadata F

108 Oct 5, 2022

Export your data from Xiami

Xiami Exporter 导出虾米音乐的个人数据，功能：导出歌曲为 json 收藏歌曲收藏专辑播放列表导出收藏艺人为 json 导出收藏专辑为 json 导出播放列表为 json (个人和收藏) 将导出的数据整理至 sqlite 数据库收藏歌曲收藏艺人收藏专辑播放列表下载已导出

59 Nov 13, 2021

This program goes thru reddit, finds the most mentioned tickers and uses Vader SentimentIntensityAnalyzer to calculate the ticker compound value.

195 Dec 13, 2022

An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild

Hyrule Compendium API An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild. B

116 Dec 1, 2022

Various capabilities for static malware analysis.

Malchive The malchive serves as a compendium for a variety of capabilities mainly pertaining to malware analysis, such as scripts supporting day to da

64 Nov 22, 2022

Soda SQL Data testing, monitoring and profiling for SQL accessible data.

Soda SQL Data testing, monitoring and profiling for SQL accessible data. What does Soda SQL do? Soda SQL allows you to Stop your pipeline when bad dat

51 Jan 1, 2023

A framework for joint super-resolution and image synthesis, without requiring real training data

SynthSR This repository contains code to train a Convolutional Neural Network (CNN) for Super-resolution (SR), or joint SR and data synthesis. The met

83 Jan 1, 2023

data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

C2F-FWN data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer" (https://arxiv.org/abs/

46 Dec 14, 2022

This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

Data Structure and Algorithms with Python This repository is related to the Arabic tutorial here, within the tutorial we discuss the common data struc

33 Dec 2, 2022

A Pythonic client for the official https://data.gov.gr API.

pydatagovgr An unofficial Pythonic client for the official data.gov.gr API. Aims to be an easy, intuitive and out-of-the-box way to: find data publish

40 Nov 10, 2022

PyCG: Practical Python Call Graphs

PyCG - Practical Python Call Graphs PyCG generates call graphs for Python code using static analysis. It efficiently supports Higher order functions T

185 Dec 29, 2022

A minimalistic library designed to provide native access to YNAB data from Python

pYNAB A minimalistic library designed to provide native access to YNAB data from Python. Install The simplest way is to install the latest version fro

92 Apr 6, 2022

Python wrapper for the Sportradar APIs ⚽️🏈

Sportradar APIs This is a Python wrapper for the sports APIs provided by Sportradar. You'll need to sign up for an API key to use the service. Sportra

39 Jan 1, 2023

Python client for the Socrata Open Data API

sodapy sodapy is a python client for the Socrata Open Data API. Installation You can install with pip install sodapy. If you want to install from sour

368 Dec 9, 2022

An API Client package to access the APIs for NBA.com

nba_api An API Client package to access the APIs for NBA.com Development Version: v1.1.9 nba_api is an API Client for www.nba.com. This package is mea

1.4k Jan 1, 2023

A Python API to retrieve and read MLB GameDay data

mlbgame mlbgame is a Python API to retrieve and read MLB GameDay data. mlbgame works with real time data, getting information as games are being playe

493 Dec 13, 2022

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

14.5k Jan 8, 2023

:snake: A simple library to fetch data from the iTunes Store API made for Python = 3.5

itunespy itunespy is a simple library to fetch data from the iTunes Store API made for Python 3.5 and beyond. Important: Since version 1.6 itunespy no

56 Dec 22, 2022

Backtest 1000s of minute-by-minute trading algorithms for training AI with automated pricing data from: IEX, Tradier and FinViz. Datasets and trading performance automatically published to S3 for building AI training datasets for teaching DNNs how to trade. Runs on Kubernetes and docker-compose. 150 million trading history rows generated from +5000 algorithms. Heads up: Yahoo's Finance API was disabled on 2019-01-03 https://developer.yahoo.com/yql/

Stock Analysis Engine Build and tune investment algorithms for use with artificial intelligence (deep neural networks) with a distributed stack for ru

828 Dec 28, 2022

Python SDK for IEX Cloud

iexfinance Python SDK for IEX Cloud. Architecture mirrors that of the IEX Cloud API (and its documentation). An easy-to-use toolkit to obtain data for

640 Jan 7, 2023

Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.

Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes

3.8k Jan 1, 2023

Faker is a Python package that generates fake data for you.

Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in yo

15.2k Jan 1, 2023

create custom test databases that are populated with fake data

About Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql

2.2k Jan 6, 2023

Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.

Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes

3.8k Dec 29, 2022

Faker is a Python package that generates fake data for you.

Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in yo

15.2k Jan 1, 2023

create custom test databases that are populated with fake data

About Generate fake but valid data filled databases for test purposes using most popular patterns(AFAIK). Current support is sqlite, mysql, postgresql

2.2k Jan 4, 2023

Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files

pyexcel - Let you focus on data, instead of file formats Support the project If your company has embedded pyexcel and its components into a revenue ge

1.1k Dec 29, 2022

Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

8k Dec 29, 2022

Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an

7.2k Dec 30, 2022

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

917 Jan 3, 2023

Rich Python data types for Redis

Created by Stephen McDonald Introduction HOT Redis is a wrapper library for the redis-py client. Rather than calling the Redis commands directly from

281 Nov 10, 2022

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the

4.2k Dec 26, 2022

Safely pass trusted data to untrusted environments and back.

ItsDangerous ... so better sign this Various helpers to pass data to untrusted environments and to get it back safe and sound. Data is cryptographical

2.6k Jan 1, 2023

🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.

Boltons boltons should be builtins. Boltons is a set of over 230 BSD-licensed, pure-Python utilities in the same spirit as — and yet conspicuously mis

6k Jan 6, 2023

Debugging, monitoring and visualization for Python Machine Learning and Data Science

Welcome to TensorWatch TensorWatch is a debugging and visualization tool designed for data science, deep learning and reinforcement learning from Micr

3.3k Dec 27, 2022

Python Genomic-data-analysis Resources

Python genomic-data-analysis Libraries

Apache Spark - A unified analytics engine for large-scale data processing

scikit-learn: machine learning in Python

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

Scalene: a high-performance, high-precision CPU and memory profiler for Python

Sampling profiler for Python programs

pytest plugin for manipulating test data directories and files

Data-Driven Tests for Python Unittest

Display tabular data in a visually appealing ASCII table format

Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.

McCabe complexity checker for Python

Various code metrics for Python code

Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.

A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

Bandit is a tool designed to find common security issues in Python code.

Awesome autocompletion, static analysis and refactoring library for python

A static-analysis bot for Github

Custom Python linting through AST expressions

Automated security testing using bandit and flake8.

coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.

Pylint plugin for improving code analysis for when using Django

A static type analyzer for Python code

Performant type-checking for python.

It's not just a linter that annoys you!

The official GitHub mirror of https://gitlab.com/pycqa/flake8

Lazydata: Scalable data dependencies for Python projects

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)

Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

Pandas Google BigQuery

A small Python module for determining appropriate platform-specific dirs, e.g. a "user data dir".

Lightweight data validation and adaptation Python library.

Python Data Structures for Humans™.

Typical: Fast, simple, & correct data-validation using Python 3 typing.

A simple, fast, extensible python library for data validation.

Python Data Validation for Humans™.

CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

Lightweight, extensible data validation library for Python

Data parsing and validation using Python type hints

Protocol Buffers - Google's data interchange format

Easily share data across your company via SQL queries. From Grove Collab.

A reusable Django model field for storing ad-hoc JSON data

Django application and library for importing and exporting data with admin integration.

a flask profiler which watches endpoint calls and tries to make some analysis.

A flask extension using pyexcel to read, manipulate and write data in different excel formats: csv, ods, xls, xlsx and xlsm.

Django Smuggler is a pluggable application for Django Web Framework that helps you to import/export fixtures via the automatically-generated administration interface.

Middleware for Starlette that allows you to store and access the context data of a request. Can be used with logging so logs automatically use request headers such as x-request-id or x-correlation-id.

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

Library to scrape and clean web pages to create massive datasets.

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies

TuShare is a utility for crawling historical data of China stocks

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

Mockoon is the easiest and quickest way to run mock APIs locally. No remote deployment, no account required, open source.

Script used to download data for stocks.

🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

DeiT: Data-efficient Image Transformers

Deploy a ML inference service on a budget in less than 10 lines of code.

This is a database of 180.000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.

Python AsyncIO data API to manage billions of resources

⚾🤖⚾ Automatic baseball pitching overlay in realtime

APT-Hunter is Threat Hunting tool for windows event logs

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

Graphing communities on Twitch.tv in a visually intuitive way

Converts Betaflight blackbox gyro to MP4 GoPro Meta data so it can be used with ReelSteady GO

Export your data from Xiami

This program goes thru reddit, finds the most mentioned tickers and uses Vader SentimentIntensityAnalyzer to calculate the ticker compound value.

An API serving data on all creatures, monsters, materials, equipment, and treasure in The Legend of Zelda: Breath of the Wild

Various capabilities for static malware analysis.

Soda SQL Data testing, monitoring and profiling for SQL accessible data.

A framework for joint super-resolution and image synthesis, without requiring real training data

data/code repository of "C2F-FWN: Coarse-to-Fine Flow Warping Network for Spatial-Temporal Consistent Motion Transfer"

This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

A Pythonic client for the official https://data.gov.gr API.

PyCG: Practical Python Call Graphs

A minimalistic library designed to provide native access to YNAB data from Python

Python wrapper for the Sportradar APIs ⚽️🏈