261 Repositories
Python pandas-profiling Libraries
The easy way to write your own flavor of Pandas
Pandas Flavor The easy way to write your own flavor of Pandas Pandas 0.23 added a (simple) API for registering accessors with Pandas objects. Pandas-f
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
swifter A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner. Blog posts Release 1.0.0 Fir
Modin: Speed up your Pandas workflows by changing a single line of code
Scale your pandas workflows by changing one line of code To use Modin, replace the pandas import: # import pandas as pd import modin.pandas as pd Inst
Koalas: pandas API on Apache Spark
pandas API on Apache Spark Explore Koalas docs » Live notebook · Issues · Mailing list Help Thirsty Koalas Devastated by Recent Fires The Koalas proje
High performance datastore for time series and tick data
Arctic TimeSeries and Tick store Arctic is a high performance datastore for numeric data. It supports Pandas, numpy arrays and pickled objects out-of-
Universal 1d/2d data containers with Transformers functionality for data analysis.
XPandas (extended Pandas) implements 1D and 2D data containers for storing type-heterogeneous tabular data of any type, and encapsulates feature extra
Pandas Google BigQuery
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
sqldf for pandas
pandasql pandasql allows you to query pandas DataFrames using SQL syntax. It works similarly to sqldf in R. pandasql seeks to provide a more familiar
NumPy and Pandas interface to Big Data
Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar inte
cuDF - GPU DataFrame Library
cuDF - GPU DataFrames NOTE: For the latest stable README.md ensure you are on the main branch. Built based on the Apache Arrow columnar memory format,
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
daily report of @arkinvest ETF activity + data collection
ark_invest daily weekday report of @arkinvest ETF activity + data collection This script was created to: Extract and save daily csv's from ARKInvest's
🍊 :bar_chart: :bulb: Orange: Interactive data analysis
Orange Data Mining Orange is a data mining and visualization toolbox for novice and expert alike. To explore data with Orange, one requires no program
xarray: N-D labeled arrays and datasets
xarray is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun!
Interactive plotting for Pandas using Vega-Lite
pdvega: Vega-Lite plotting for Pandas Dataframes pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas datafra
Draw interactive NetworkX graphs with Altair
nx_altair Draw NetworkX graphs with Altair nx_altair offers a similar draw API to NetworkX but returns Altair Charts instead. If you'd like to contrib
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:
JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f
Bokeh Plotting Backend for Pandas and GeoPandas
Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of
A GUI for Pandas DataFrames
PandasGUI A GUI for analyzing Pandas DataFrames. Demo Installation Install latest release from PyPi: pip install pandasgui Install directly from Githu
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?
Productivity Tools for Plotly + Pandas
Cufflinks This library binds the power of plotly with the flexibility of pandas for easy plotting. This library is available on https://github.com/san
Missing data visualization module for Python.
missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Statistical data visualization using matplotlib
seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing
Interactive plotting for Pandas using Vega-Lite
pdvega: Vega-Lite plotting for Pandas Dataframes pdvega is a library that allows you to quickly create interactive Vega-Lite plots from Pandas datafra
Draw interactive NetworkX graphs with Altair
nx_altair Draw NetworkX graphs with Altair nx_altair offers a similar draw API to NetworkX but returns Altair Charts instead. If you'd like to contrib
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:
JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f
Bokeh Plotting Backend for Pandas and GeoPandas
Pandas-Bokeh provides a Bokeh plotting backend for Pandas, GeoPandas and Pyspark DataFrames, similar to the already existing Visualization feature of
A GUI for Pandas DataFrames
PandasGUI A GUI for analyzing Pandas DataFrames. Demo Installation Install latest release from PyPi: pip install pandasgui Install directly from Githu
A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
hvPlot A high-level plotting API for the PyData ecosystem built on HoloViews. Build Status Coverage Latest dev release Latest release Docs What is it?
Productivity Tools for Plotly + Pandas
Cufflinks This library binds the power of plotly with the flexibility of pandas for easy plotting. This library is available on https://github.com/san
Missing data visualization module for Python.
missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Statistical data visualization using matplotlib
seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing
pprofile + matplotlib = Python program profiled as an awesome heatmap!
pyheat Profilers are extremely helpful tools. They help us dig deep into code, find and understand performance bottlenecks. But sometimes we just want
Was an interactive continuous Python profiler.
☠ This project is not maintained anymore. We highly recommend switching to py-spy which provides better performance and usability. Profiling The profi
Scalene: a high-performance, high-precision CPU and memory profiler for Python
scalene: a high-performance CPU and memory profiler for Python by Emery Berger 中文版本 (Chinese version) About Scalene % pip install -U scalene Scalen
Line-by-line profiling for Python
line_profiler and kernprof NOTICE: This is the official line_profiler repository. The most recent version of line-profiler on pypi points to this repo
Sampling profiler for Python programs
py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe
A plugin for Flake8 that checks pandas code
pandas-vet pandas-vet is a plugin for flake8 that provides opinionated linting for pandas code. It began as a project during the PyCascades 2019 sprin
Mypy stubs, i.e., type information, for numpy, pandas and matplotlib
Mypy type stubs for NumPy, pandas, and Matplotlib This is a PEP-561-compliant stub-only package which provides type information for matplotlib, numpy
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana
Pandas Google BigQuery
pandas-gbq pandas-gbq is a package providing an interface to the Google BigQuery API from pandas Installation Install latest release version via conda
A pandas-like deferred expression system, with first-class SQL support
Ibis: Python data analysis framework for Hadoop and SQL engines Service Status Documentation Conda packages PyPI Azure Coverage Ibis is a toolbox to b
TuShare is a utility for crawling historical data of China stocks
TuShare Tushare Pro版已发布,请访问新的官网了解和查询数据接口! https://tushare.pro TuShare是实现对股票/期货等金融数据从数据采集、清洗加工 到 数据存储过程的工具,满足金融量化分析师和学习数据分析的人在数据获取方面的需求,它的特点是数据覆盖范围广,接口
Silky smooth profiling for Django
Silk Silk is a live profiling and inspection tool for the Django framework. Silk intercepts and stores HTTP requests and database queries before prese
Soda SQL Data testing, monitoring and profiling for SQL accessible data.
Soda SQL Data testing, monitoring and profiling for SQL accessible data. What does Soda SQL do? Soda SQL allows you to Stop your pipeline when bad dat
A compendium of useful, interesting, inspirational usage of pandas functions, each example will be an ipynb file
Pandas_by_examples A compendium of useful/interesting/inspirational usage of pandas functions, each example will be an ipynb file What is this reposit
An end-to-end machine learning web app to predict rugby scores (Pandas, SQLite, Keras, Flask, Docker)
Rugby score prediction An end-to-end machine learning web app to predict rugby scores Overview An demo project to provide a high-level overview of the
Python SDK for IEX Cloud
iexfinance Python SDK for IEX Cloud. Architecture mirrors that of the IEX Cloud API (and its documentation). An easy-to-use toolkit to obtain data for
A Python module for creating Excel XLSX files.
XlsxWriter XlsxWriter is a Python module for writing files in the Excel 2007+ XLSX file format. XlsxWriter can be used to write text, numbers, formula
Sparkling Pandas
SparklingPandas SparklingPandas aims to make it easy to use the distributed computing power of PySpark to scale your data analysis with Pandas. Sparkl
NumPy and Pandas interface to Big Data
Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar inte
Sampling profiler for Python programs
py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe
(OLD REPO) Line-by-line profiling for Python - Current repo -
line_profiler and kernprof line_profiler is a module for doing line-by-line profiling of functions. kernprof is a convenient script for running either
Statistical data visualization using matplotlib
seaborn: statistical data visualization Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing
:truck: Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
To launch a live notebook server to test optimus using binder or Colab, click on one of the following badges: Optimus is the missing framework to prof
NumPy and Pandas interface to Big Data
Blaze translates a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze allows Python users a familiar inte
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
AWS Data Wrangler Pandas on AWS Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretMana