This curated list contains 840 awesome open-source projects with a total of 2.7M stars, grouped into 32 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
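Counts throughout this list are abbreviated, for example the 2.7M stars mentioned above. If you want to compare or sort such values yourself, the following is a minimal sketch for turning them back into integers. The function name `parse_count` and the suffix table are illustrative assumptions and not part of this project's tooling. Because the abbreviated values are rounded, the result is only approximate.

```python
from decimal import Decimal

# Multipliers for the suffixes that appear in this list (e.g. "3.5K", "2.7M").
# Assumption: only "K" and "M" need to be handled.
_SUFFIXES = {"K": 1_000, "M": 1_000_000}


def parse_count(text: str) -> int:
    """Convert an abbreviated count such as '2.7M' or '840' to an integer."""
    cleaned = text.strip().upper()
    if not cleaned:
        raise ValueError("empty count")
    has_suffix = cleaned[-1] in _SUFFIXES
    multiplier = _SUFFIXES[cleaned[-1]] if has_suffix else 1
    number = cleaned[:-1] if has_suffix else cleaned
    # Decimal avoids binary floating-point rounding when scaling the value.
    return int(Decimal(number) * multiplier)


if __name__ == "__main__":
    for raw in ("2.7M", "150K", "840"):
        print(raw, "->", parse_count(raw))
```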
Contents
- Machine Learning Frameworks 54 projects
- Data Visualization 48 projects
- Text Data & NLP 82 projects
- Image Data 49 projects
- Graph Data 29 projects
- Audio Data 23 projects
- Geospatial Data 22 projects
- Financial Data 23 projects
- Time Series Data 20 projects
- Medical Data 19 projects
- Optical Character Recognition 11 projects
- Data Containers & Structures 28 projects
- Data Loading & Extraction 23 projects
- Web Scraping & Crawling 1 project
- Data Pipelines & Streaming 35 projects
- Distributed Machine Learning 26 projects
- Hyperparameter Optimization & AutoML 45 projects
- Reinforcement Learning 19 projects
- Recommender Systems 14 projects
- Privacy Machine Learning 6 projects
- Workflow & Experiment Tracking 35 projects
- Model Serialization & Conversion 11 projects
- Model Interpretability 45 projects
- Vector Similarity Search (ANN) 12 projects
- Probabilistics & Statistics 21 projects
- Adversarial Robustness 8 projects
- GPU Utilities 18 projects
- Tensorflow Utilities 13 projects
- Sklearn Utilities 17 projects
- Pytorch Utilities 27 projects
- Database Clients 1 project
- Others 52 projects
Explanation
- 🥇🥈🥉 Combined project-quality score
- ⭐️ Star count from GitHub
- 🐣 New project (less than 6 months old)
- 💤 Inactive project (6 months no activity)
- 💀 Dead project (12 months no activity)
- 📈📉 Project is trending up or down
- ➕ Project was recently added
- ❗️ Warning (e.g. missing/risky license)
- 👨‍💻 Contributors count from GitHub
- 🔀 Fork count from GitHub
- 📋 Issue count from GitHub
- ⏱️ Last update timestamp on package manager
- 📥 Download count from package manager
- 📦 Number of dependent projects
- Tensorflow related project
- Sklearn related project
- PyTorch related project
- MxNet related project
- Apache Spark related project
- Jupyter related project
- PaddlePaddle related project
- Pandas related project
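The legend above marks a project as inactive after 6 months without activity and as dead after 12 months. The sketch below shows one way to apply those thresholds to the day.month.year timestamps used in this list. It is an illustration only: the function names, the whole-month arithmetic, and the choice of reference date are assumptions, and the list's own tooling may compute this differently.

```python
from datetime import date, datetime


def parse_list_date(text: str) -> date:
    """Parse a timestamp in the day.month.year form used in this list."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def months_between(earlier: date, later: date) -> int:
    """Number of whole calendar months from `earlier` to `later` (never negative)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month has not been completed yet
    return max(months, 0)


def activity_status(last_activity: str, today: date) -> str:
    """Classify a project using the 6-month and 12-month thresholds from the legend."""
    idle_months = months_between(parse_list_date(last_activity), today)
    if idle_months >= 12:
        return "dead"
    if idle_months >= 6:
        return "inactive"
    return "active"


if __name__ == "__main__":
    reference = date(2021, 2, 4)  # hypothetical "today" for the example
    for stamp in ("04.02.2021", "18.07.2020", "25.06.2018"):
        print(stamp, "->", activity_status(stamp, reference))
```

Passing the reference date explicitly keeps the function deterministic, which makes it easier to test than calling `date.today()` inside it.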
Machine Learning Frameworks
General-purpose machine learning and deep learning frameworks.
Tensorflow (
π₯
44 Β·
β
150K) - An Open Source Machine Learning Framework for Everyone. Apache-2
-
GitHub (
π¨βπ» 3.5K Β·π 84K Β·π¦ 120K Β·π 30K - 14% open Β·β±οΈ 04.02.2021):git clone https://github.com/tensorflow/tensorflow
-
PyPi (
π₯ 3.6M / month Β·π¦ 23K Β·β±οΈ 21.01.2021):pip install tensorflow
-
Conda (
π₯ 2.3M Β·β±οΈ 15.07.2020):conda install -c conda-forge tensorflow
-
Docker Hub (
π₯ 47M Β·β 1.8K Β·β±οΈ 03.02.2021):docker pull tensorflow/tensorflow
PyTorch (
π₯
39 Β·
β
46K) - Tensors and Dynamic neural networks in Python with strong GPU.. BSD-3
scikit-learn (
π₯
37 Β·
β
44K) - scikit-learn: machine learning in Python. BSD-3
-
GitHub (
π¨βπ» 2.1K Β·π 21K Β·π₯ 650 Β·π¦ 190K Β·π 9K - 25% open Β·β±οΈ 04.02.2021):git clone https://github.com/scikit-learn/scikit-learn
-
PyPi (
π₯ 7.2M / month Β·π¦ 38K Β·β±οΈ 19.01.2021):pip install scikit-learn
-
Conda (
π₯ 6.8M Β·β±οΈ 21.01.2021):conda install -c conda-forge scikit-learn
StatsModels (
π₯
36 Β·
β
6K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3
-
GitHub (
π¨βπ» 300 Β·π 2.2K Β·π₯ 25 Β·π¦ 37K Β·π 4.3K - 48% open Β·β±οΈ 31.01.2021):git clone https://github.com/statsmodels/statsmodels
-
PyPi (
π₯ 2.2M / month Β·π¦ 6.7K Β·β±οΈ 02.02.2021):pip install statsmodels
-
Conda (
π₯ 3.3M Β·β±οΈ 02.02.2021):conda install -c conda-forge statsmodels
XGBoost (
π₯
35 Β·
β
21K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2
-
GitHub (
π¨βπ» 500 Β·π 7.9K Β·π₯ 1.9K Β·π¦ 14K Β·π 3.9K - 6% open Β·β±οΈ 04.02.2021):git clone https://github.com/dmlc/xgboost
-
PyPi (
π₯ 1.8M / month Β·π¦ 1.6K Β·β±οΈ 20.01.2021):pip install xgboost
-
Conda (
π₯ 1.3M Β·β±οΈ 10.12.2020):conda install -c conda-forge xgboost
LightGBM (
π₯
35 Β·
β
12K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT
-
GitHub (
π¨βπ» 200 Β·π 3.2K Β·π₯ 92K Β·π¦ 5.6K Β·π 2.1K - 4% open Β·β±οΈ 03.02.2021):git clone https://github.com/microsoft/LightGBM
-
PyPi (
π₯ 970K / month Β·π¦ 560 Β·β±οΈ 08.12.2020):pip install lightgbm
-
Conda (
π₯ 480K Β·β±οΈ 15.01.2021):conda install -c conda-forge lightgbm
Theano (
π₯
34 Β·
β
9.3K) - Theano is a Python library that allows you to define, optimize, and.. BSD-3
MXNet (
π₯
33 Β·
β
19K) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning.. Apache-2
-
GitHub (
π¨βπ» 950 Β·π 6.8K Β·π₯ 23K Β·π¦ 1.8K Β·π 9.4K - 19% open Β·β±οΈ 03.02.2021):git clone https://github.com/apache/incubator-mxnet
-
PyPi (
π₯ 78K / month Β·π¦ 440 Β·β±οΈ 28.08.2020):pip install mxnet
-
Conda (
π₯ 5.8K Β·β±οΈ 29.02.2020):conda install -c anaconda mxnet
pytorch-lightning (
π₯
33 Β·
β
12K) - The lightweight PyTorch wrapper for high-performance.. Apache-2
-
GitHub (
π¨βπ» 380 Β·π 1.4K Β·π₯ 79 Β·π¦ 1.8K Β·π 2.8K - 11% open Β·β±οΈ 04.02.2021):git clone https://github.com/PyTorchLightning/pytorch-lightning
-
PyPi (
π₯ 91K / month Β·π¦ 14 Β·β±οΈ 03.02.2021):pip install pytorch-lightning
-
Conda (
π₯ 28K Β·β±οΈ 03.02.2021):conda install -c conda-forge pytorch-lightning
Thinc (
π₯
32 Β·
β
2.2K) - A refreshing functional take on deep learning, compatible with your favorite.. MIT
jax (
π₯
31 Β·
β
11K) - Composable transformations of Python+NumPy programs: differentiate,.. Apache-2
PaddlePaddle (
π₯
30 Β·
β
14K Β·
π
) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2
Vowpal Wabbit (
π₯
30 Β·
β
7.4K) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3
Catboost (
π₯
30 Β·
β
5.7K) - A fast, scalable, high performance Gradient Boosting on Decision.. Apache-2
-
GitHub (
π¨βπ» 730 Β·π 860 Β·π₯ 49K Β·π 1.4K - 23% open Β·β±οΈ 04.02.2021):git clone https://github.com/catboost/catboost
-
PyPi (
π₯ 420K / month Β·π¦ 160 Β·β±οΈ 27.12.2020):pip install catboost
-
Conda (
π₯ 560K Β·β±οΈ 29.12.2020):conda install -c conda-forge catboost
Turi Create (
π₯
29 Β·
β
10K) - Turi Create simplifies the development of custom machine learning.. BSD-3
TFlearn (
π₯
29 Β·
β
9.5K) - Deep learning library featuring a higher-level API for TensorFlow. MIT
tensorpack (
π₯
28 Β·
β
5.9K) - A Neural Net Training Interface on TensorFlow, with focus.. Apache-2
CNTK (
π₯
27 Β·
β
17K Β·
π€
) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. MIT
Ignite (
π₯
27 Β·
β
3.2K) - High-level library to help with training and evaluating neural.. BSD-3
Jina (
π₯
27 Β·
β
2.1K) - An easier way to build neural search in the cloud. Apache-2
-
GitHub (
π¨βπ» 85 Β·π 340 Β·π¦ 46 Β·π 630 - 7% open Β·β±οΈ 04.02.2021):git clone https://github.com/jina-ai/jina
-
PyPi (
π₯ 2.4K / month Β·β±οΈ 04.02.2021):pip install jina
-
Docker Hub (
π₯ 100K Β·β±οΈ 04.02.2021):docker pull jinaai/jina
Flax (
π₯
27 Β·
β
1.4K) - Flax is a neural network ecosystem for JAX that is designed for.. Apache-2 (jax related project)
Ludwig (
π₯
25 Β·
β
7.5K) - Ludwig is a toolbox that allows to train and evaluate deep.. Apache-2
Neural Network Libraries (
π₯
25 Β·
β
2.4K) - Neural Network Libraries. Apache-2
xLearn (
π₯
24 Β·
β
2.8K Β·
π€
) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
einops (
π₯
24 Β·
β
2.3K) - Deep learning operations reinvented (for pytorch, tensorflow, jax and.. MIT
ktrain (
π₯
24 Β·
β
730) - ktrain is a Python library that makes deep learning and AI more.. Apache-2
tensorflow-upstream (
π₯
24 Β·
β
540) - TensorFlow ROCm port. Apache-2
SHOGUN (
π₯
23 Β·
β
2.8K) - Unified and efficient Machine Learning. BSD-3
-
GitHub (
π¨βπ» 250 Β·π 1K Β·π 1.6K - 33% open Β·β±οΈ 08.12.2020):git clone https://github.com/shogun-toolbox/shogun
-
Conda (
π₯ 90K Β·β±οΈ 25.06.2018):conda install -c conda-forge shogun
-
Docker Hub (
π₯ 1.4K Β·β 1 Β·β±οΈ 31.01.2019):docker pull shogun/shogun
mace (
π₯
21 Β·
β
4.3K) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
-
GitHub (
π¨βπ» 54 Β·π 760 Β·π₯ 1.3K Β·π 620 - 6% open Β·β±οΈ 02.02.2021):git clone https://github.com/XiaoMi/mace
Neural Tangents (
π₯
21 Β·
β
1.3K) - Fast and Easy Infinite Neural Networks in Python. Apache-2
ThunderSVM (
π₯
20 Β·
β
1.3K) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
Haiku (
π₯
20 Β·
β
930) - JAX-based neural network library. Apache-2
-
GitHub (
π¨βπ» 35 Β·π 64 Β·π¦ 58 Β·π 66 - 27% open Β·β±οΈ 01.02.2021):git clone https://github.com/deepmind/dm-haiku
Objax (
π₯
19 Β·
β
560 Β·
π£
) - Objax is a machine learning framework that provides an Object.. Apache-2 (jax related project)
Torchbearer (
π₯
17 Β·
β
580 Β·
π€
) - torchbearer: A model fitting library for PyTorch. MIT
ThunderGBM (
π₯
16 Β·
β
580) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2
elegy (
π₯
16 Β·
β
160) - Elegy is a framework-agnostic Trainer interface for the Jax.. Apache-2 (jax related project)
NeoML (
π₯
13 Β·
β
560) - Machine learning framework for both deep learning and traditional.. Apache-2
-
GitHub (
π¨βπ» 16 Β·π 79 Β·π 29 - 55% open Β·β±οΈ 03.02.2021):git clone https://github.com/neoml-lib/neoml
Show 7 hidden projects...
- dlib (
π₯ 32 Β·β 9.9K) - A toolkit for making real world machine learning and data analysis..βοΈBSL-1.0
- NuPIC (
π₯ 24 Β·β 6.2K Β·π ) - Numenta Platform for Intelligent Computing is an implementation..βοΈAGPL-3.0
- Lasagne (
π₯ 24 Β·β 3.8K Β·π ) - Lightweight library to build and train neural networks in Theano.MIT
- neon (
π₯ 22 Β·β 3.9K Β·π ) - Intel Nervana reference deep learning framework committed to best..Apache-2
- MindsDB (
π₯ 20 Β·β 3.2K) - Predictive AI layer for existing databases.βοΈGPL-3.0
- NeuPy (
π₯ 20 Β·β 660 Β·π ) - NeuPy is a Tensorflow based python library for prototyping and building..MIT
- StarSpace (
π₯ 13 Β·β 3.5K Β·π ) - Learning embeddings for classification, retrieval and ranking.MIT
Data Visualization
General-purpose and task-specific data visualization libraries.
Matplotlib (
π₯
41 Β·
β
13K) - matplotlib: plotting with Python. Python-2.0
-
GitHub (
π¨βπ» 1.2K Β·π 5.6K Β·π¦ 320K Β·π 7.6K - 21% open Β·β±οΈ 04.02.2021):git clone https://github.com/matplotlib/matplotlib
-
PyPi (
π₯ 7M / month Β·π¦ 79K Β·β±οΈ 28.01.2021):pip install matplotlib
-
Conda (
π₯ 8.1M Β·β±οΈ 28.01.2021):conda install -c conda-forge matplotlib
Plotly (
π₯
35 Β·
β
8.8K) - The interactive graphing library for Python (includes Plotly Express). MIT
-
GitHub (
π¨βπ» 160 Β·π 1.7K Β·π¦ 5 Β·π 1.9K - 43% open Β·β±οΈ 14.01.2021):git clone https://github.com/plotly/plotly.py
-
PyPi (
π₯ 1.8M / month Β·π¦ 5K Β·β±οΈ 12.01.2021):pip install plotly
-
Conda (
π₯ 1.2M Β·β±οΈ 12.01.2021):conda install -c conda-forge plotly
-
NPM (
π₯ 200K / month Β·π¦ 4 Β·β±οΈ 12.01.2021):npm install plotlywidget
Seaborn (
π₯
35 Β·
β
8.1K) - Statistical data visualization using matplotlib. BSD-3
-
GitHub (
π¨βπ» 150 Β·π 1.4K Β·π₯ 130 Β·π¦ 80K Β·π 1.8K - 4% open Β·β±οΈ 03.02.2021):git clone https://github.com/mwaskom/seaborn
-
PyPi (
π₯ 1.4M / month Β·π¦ 13K Β·β±οΈ 20.12.2020):pip install seaborn
-
Conda (
π₯ 1.9M Β·β±οΈ 28.01.2021):conda install -c conda-forge seaborn
dash (
π₯
34 Β·
β
14K) - Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required. MIT
wordcloud (
π₯
31 Β·
β
7.8K) - A little word cloud generator in Python. MIT
-
GitHub (
π¨βπ» 58 Β·π 2K Β·π¦ 8.8K Β·π 440 - 22% open Β·β±οΈ 11.11.2020):git clone https://github.com/amueller/word_cloud
-
PyPi (
π₯ 170K / month Β·π¦ 1.1K Β·β±οΈ 11.11.2020):pip install wordcloud
-
Conda (
π₯ 190K Β·β±οΈ 14.01.2021):conda install -c conda-forge wordcloud
bqplot (
π₯
30 Β·
β
3K) - Plotting library for IPython/Jupyter notebooks. Apache-2
-
GitHub (
π¨βπ» 51 Β·π 400 Β·π¦ 1.2K Β·π 510 - 36% open Β·β±οΈ 21.01.2021):git clone https://github.com/bqplot/bqplot
-
PyPi (
π₯ 12K / month Β·π¦ 110 Β·β±οΈ 14.01.2021):pip install bqplot
-
Conda (
π₯ 480K Β·β±οΈ 14.01.2021):conda install -c conda-forge bqplot
-
NPM (
π₯ 140K / month Β·π¦ 10 Β·β±οΈ 14.01.2021):npm install bqplot
pandas-profiling (
π₯
29 Β·
β
6.7K) - Create HTML profiling reports from pandas DataFrame.. MIT
-
GitHub (
π¨βπ» 65 Β·π 990 Β·π¦ 3K Β·π 420 - 15% open Β·β±οΈ 04.02.2021):git clone https://github.com/pandas-profiling/pandas-profiling
-
PyPi (
π₯ 130K / month Β·π¦ 160 Β·β±οΈ 03.09.2020):pip install pandas-profiling
-
Conda (
π₯ 100K Β·β±οΈ 09.01.2021):conda install -c conda-forge pandas-profiling
PyQtGraph (
π₯
29 Β·
β
2.3K) - Fast data visualization and GUI tools for scientific / engineering.. MIT
HoloViews (
π₯
29 Β·
β
1.8K) - With Holoviews, your data visualizes itself. BSD-3
-
GitHub (
π¨βπ» 100 Β·π 300 Β·π 2.5K - 27% open Β·β±οΈ 02.02.2021):git clone https://github.com/holoviz/holoviews
-
PyPi (
π₯ 65K / month Β·π¦ 170 Β·β±οΈ 28.01.2021):pip install holoviews
-
Conda (
π₯ 410K Β·β±οΈ 28.12.2020):conda install -c conda-forge holoviews
-
NPM (
π₯ 8.1K / month Β·β±οΈ 24.05.2020):npm install @pyviz/jupyterlab_pyviz
VisPy (
π₯
28 Β·
β
2.6K) - High-performance interactive 2D/3D data visualization library. BSD-3
-
GitHub (
π¨βπ» 140 Β·π 540 Β·π¦ 450 Β·π 1.1K - 31% open Β·β±οΈ 27.01.2021):git clone https://github.com/vispy/vispy
-
PyPi (
π₯ 11K / month Β·π¦ 120 Β·β±οΈ 28.11.2020):pip install vispy
-
Conda (
π₯ 120K Β·β±οΈ 13.01.2021):conda install -c conda-forge vispy
-
NPM (
π₯ 130 / month Β·β±οΈ 15.03.2020):npm install vispy
datashader (
π₯
28 Β·
β
2.4K) - Quickly and accurately render even the largest data. BSD-3
-
GitHub (
π¨βπ» 43 Β·π 310 Β·π¦ 570 Β·π 460 - 31% open Β·β±οΈ 17.01.2021):git clone https://github.com/holoviz/datashader
-
PyPi (
π₯ 8.8K / month Β·π¦ 70 Β·β±οΈ 07.01.2021):pip install datashader
-
Conda (
π₯ 140K Β·β±οΈ 08.01.2021):conda install -c conda-forge datashader
missingno (
π₯
27 Β·
β
2.6K) - Missing data visualization module for Python. MIT
-
GitHub (
π¨βπ» 15 Β·π 330 Β·π¦ 2.8K Β·π 100 - 14% open Β·β±οΈ 28.12.2020):git clone https://github.com/ResidentMario/missingno
-
PyPi (
π₯ 110K / month Β·π¦ 76 Β·β±οΈ 29.06.2018):pip install missingno
-
Conda (
π₯ 72K Β·β±οΈ 15.02.2020):conda install -c conda-forge missingno
data-validation (
π₯
27 Β·
β
510) - Library for exploring and validating machine learning.. Apache-2
Perspective (
π₯
26 Β·
β
3.2K) - Streaming pivot visualization via WebAssembly. Apache-2
-
GitHub (
π¨βπ» 62 Β·π 350 Β·π¦ 170 Β·π 380 - 18% open Β·β±οΈ 01.02.2021):git clone https://github.com/finos/perspective
-
PyPi (
π₯ 400 / month Β·π¦ 4 Β·β±οΈ 14.01.2021):pip install perspective-python
-
NPM (
π₯ 1.5K / month Β·β±οΈ 08.01.2021):npm install @finos/perspective-jupyterlab
PyVista (
π₯
26 Β·
β
680) - 3D plotting and mesh analysis through a streamlined interface for the.. MIT
-
GitHub (
π¨βπ» 53 Β·π 140 Β·π₯ 46 Β·π¦ 250 Β·π 400 - 32% open Β·β±οΈ 04.02.2021):git clone https://github.com/pyvista/pyvista
-
PyPi (
π₯ 7.7K / month Β·π¦ 26 Β·β±οΈ 04.02.2021):pip install pyvista
-
Conda (
π₯ 58K Β·β±οΈ 10.12.2020):conda install -c conda-forge pyvista
HyperTools (
π₯
25 Β·
β
1.6K) - A Python toolbox for gaining geometric insights into high-dimensional.. MIT
hvPlot (
π₯
25 Β·
β
340) - A high-level plotting API for pandas, dask, xarray, and networkx built on.. BSD-3
Facets Overview (
π₯
24 Β·
β
6.5K) - Visualizations for machine learning datasets. Apache-2
Chartify (
π₯
24 Β·
β
2.8K) - Python library that makes it easy for data scientists to create.. Apache-2
pythreejs (
π₯
24 Β·
β
700) - A Jupyter - Three.js bridge. BSD-3
-
GitHub (
π¨βπ» 24 Β·π 160 Β·π¦ 15 Β·π 200 - 30% open Β·β±οΈ 09.10.2020):git clone https://github.com/jupyter-widgets/pythreejs
-
PyPi (
π₯ 5.1K / month Β·π¦ 13 Β·β±οΈ 09.10.2020):pip install pythreejs
-
Conda (
π₯ 260K Β·β±οΈ 12.10.2020):conda install -c conda-forge pythreejs
-
NPM (
π₯ 5.9K / month Β·π¦ 8 Β·β±οΈ 19.03.2020):npm install jupyter-threejs
Multicore-TSNE (
π₯
23 Β·
β
1.5K) - Parallel t-SNE implementation with Python and Torch.. BSD-3
-
GitHub (
π¨βπ» 15 Β·π 190 Β·π¦ 210 Β·π 53 - 62% open Β·β±οΈ 19.08.2020):git clone https://github.com/DmitryUlyanov/Multicore-TSNE
-
PyPi (
π₯ 2K / month Β·π¦ 14 Β·β±οΈ 08.11.2017):pip install MulticoreTSNE
-
Conda (
π₯ 6K Β·β±οΈ 12.11.2018):conda install -c conda-forge multicore-tsne
openTSNE (
π₯
23 Β·
β
750) - Extensible, parallel implementations of t-SNE. BSD-3
-
GitHub (
π¨βπ» 10 Β·π 82 Β·π¦ 180 Β·π 73 - 6% open Β·β±οΈ 08.01.2021):git clone https://github.com/pavlin-policar/openTSNE
-
PyPi (
π₯ 8.4K / month Β·π¦ 4 Β·β±οΈ 08.01.2021):pip install opentsne
-
Conda (
π₯ 75K Β·β±οΈ 08.01.2021):conda install -c conda-forge opentsne
D-Tale (
π₯
22 Β·
β
2K) - Visualizer for pandas data structures. βοΈLGPL-2.1
Pandas-Bokeh (
π₯
22 Β·
β
610) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT
python-ternary (
π₯
22 Β·
β
390) - Ternary plotting library for python with matplotlib. MIT
-
GitHub (
π¨βπ» 25 Β·π 110 Β·π₯ 14 Β·π¦ 55 Β·π 100 - 23% open Β·β±οΈ 05.01.2021):git clone https://github.com/marcharper/python-ternary
-
PyPi (
π₯ 670 / month Β·π¦ 10 Β·β±οΈ 10.05.2020):pip install python-ternary
-
Conda (
π₯ 48K Β·β±οΈ 10.05.2020):conda install -c conda-forge python-ternary
Sweetviz (
π₯
19 Β·
β
1.2K) - Visualize and compare datasets, target values and associations, with one.. MIT
animatplot (
π₯
19 Β·
β
350) - A python package for animating plots built on matplotlib. MIT
AutoViz (
π₯
19 Β·
β
300) - Automatically Visualize any dataset, any size with a single line of.. Apache-2
Show 6 hidden projects...
- plotnine (
π₯ 27 Β·β 2.6K) - A grammar of graphics for Python.βοΈGPL-2.0
- PDPbox (
π₯ 23 Β·β 520 Β·π ) - python partial dependence plot toolbox.MIT
- pivottablejs (
π₯ 19 Β·β 420 Β·π ) - Dragndrop Pivot Tables and Charts for Jupyter/IPython..MIT
- ivis (
π₯ 18 Β·β 220) - Dimensionality reduction in very large datasets using Siamese..βοΈGPL-2.0
- pdvega (
π₯ 16 Β·β 340 Β·π ) - Interactive plotting for Pandas using Vega-Lite.MIT
- nptsne (
π₯ 14 Β·β 24) - nptsne is a numpy compatible python binary package that offers a number..Apache-2
Text Data & NLP
Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.
spaCy (
π₯
37 Β·
β
19K) - Industrial-strength Natural Language Processing (NLP) in Python. MIT
-
GitHub (
π¨βπ» 560 Β·π 3.2K Β·π₯ 2.9K Β·π¦ 21K Β·π 4.3K - 1% open Β·β±οΈ 04.02.2021):git clone https://github.com/explosion/spaCy
-
PyPi (
π₯ 670K / month Β·π¦ 3.1K Β·β±οΈ 02.02.2021):pip install spacy
-
Conda (
π₯ 1.4M Β·β±οΈ 04.02.2021):conda install -c conda-forge spacy
transformers (
π₯
36 Β·
β
40K) - Transformers: State-of-the-art Natural Language.. Apache-2
-
GitHub (
π¨βπ» 770 Β·π 9.8K Β·π₯ 1.3K Β·π¦ 7.4K Β·π 5.9K - 9% open Β·β±οΈ 04.02.2021):git clone https://github.com/huggingface/transformers
-
PyPi (
π₯ 530K / month Β·π¦ 130 Β·β±οΈ 21.01.2021):pip install transformers
-
Conda (
π₯ 18K Β·β±οΈ 21.01.2021):conda install -c conda-forge transformers
gensim (
π₯
35 Β·
β
12K) - Topic Modelling for Humans. βοΈLGPL-2.1
-
GitHub (
π¨βπ» 390 Β·π 3.9K Β·π₯ 3K Β·π¦ 21K Β·π 1.6K - 21% open Β·β±οΈ 31.01.2021):git clone https://github.com/RaRe-Technologies/gensim
-
PyPi (
π₯ 4.7M / month Β·π¦ 4.7K Β·β±οΈ 15.11.2020):pip install gensim
-
Conda (
π₯ 590K Β·β±οΈ 14.05.2020):conda install -c conda-forge gensim
nltk (
π₯
34 Β·
β
9.6K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2
sentencepiece (
π₯
31 Β·
β
4.8K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2
-
GitHub (
π¨βπ» 48 Β·π 630 Β·π₯ 10K Β·π¦ 5.7K Β·π 410 - 5% open Β·β±οΈ 12.01.2021):git clone https://github.com/google/sentencepiece
-
PyPi (
π₯ 700K / month Β·π¦ 240 Β·β±οΈ 10.01.2021):pip install sentencepiece
-
Conda (
π₯ 26K Β·β±οΈ 08.01.2021):conda install -c conda-forge sentencepiece
fastText (
π₯
30 Β·
β
22K Β·
π€
) - Library for fast text representation and classification. MIT
-
GitHub (
π¨βπ» 58 Β·π 4.3K Β·π¦ 1.5K Β·π 1K - 42% open Β·β±οΈ 18.07.2020):git clone https://github.com/facebookresearch/fastText
-
PyPi (
π₯ 91K / month Β·π¦ 190 Β·β±οΈ 28.04.2020):pip install fasttext
-
Conda (
π₯ 17K Β·β±οΈ 12.10.2020):conda install -c conda-forge fasttext
fairseq (
π₯
30 Β·
β
11K) - Facebook AI Research Sequence-to-Sequence Toolkit written in Python. MIT
ChatterBot (
π₯
30 Β·
β
11K) - ChatterBot is a machine learning, conversational dialog engine for.. BSD-3
snowballstemmer (
π₯
30 Β·
β
470) - Snowball compiler and stemming algorithms. BSD-3
-
GitHub (
π¨βπ» 25 Β·π 130 Β·π¦ 44K Β·π 59 - 28% open Β·β±οΈ 02.02.2021):git clone https://github.com/snowballstem/snowball
-
PyPi (
π₯ 1.9M / month Β·π¦ 13K Β·β±οΈ 21.01.2021):pip install snowballstemmer
-
Conda (
π₯ 2M Β·β±οΈ 21.01.2021):conda install -c conda-forge snowballstemmer
TextBlob (
π₯
29 Β·
β
7.5K) - Simple, Pythonic, text processing--Sentiment analysis, part-of-speech.. MIT
-
GitHub (
π¨βπ» 33 Β·π 950 Β·π₯ 88 Β·π¦ 10K Β·π 220 - 32% open Β·β±οΈ 11.01.2021):git clone https://github.com/sloria/TextBlob
-
PyPi (
π₯ 190K / month Β·π¦ 2.5K Β·β±οΈ 24.02.2019):pip install textblob
-
Conda (
π₯ 110K Β·β±οΈ 24.02.2019):conda install -c conda-forge textblob
stanza (
π₯
28 Β·
β
5.2K) - Official Stanford NLP Python Library for Many Human Languages. Apache-2
Tokenizers (
π₯
28 Β·
β
4.2K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2
sentence-transformers (28 · ⭐ 4.1K) - Sentence Embeddings with BERT & XLNet. Apache-2
Dedupe (
π₯
28 Β·
β
2.9K) - A python library for accurate and scalable fuzzy matching, record.. MIT
phonenumbers (
π₯
28 Β·
β
2.6K) - Python port of Google's libphonenumber. Apache-2
-
GitHub (
π¨βπ» 22 Β·π 330 Β·π 120 - 2% open Β·β±οΈ 27.01.2021):git clone https://github.com/daviddrysdale/python-phonenumbers
-
PyPi (
π₯ 550K / month Β·π¦ 2.3K Β·β±οΈ 27.01.2021):pip install phonenumbers
-
Conda (
π₯ 370K Β·β±οΈ 04.08.2019):conda install -c conda-forge phonenumbers
inflect (
π₯
28 Β·
β
480) - Correctly generate plurals, ordinals, indefinite articles; convert numbers.. MIT
Rasa (
π₯
27 Β·
β
11K) - Open source machine learning framework to automate text- and voice-.. Apache-2
DeepPavlov (
π₯
26 Β·
β
5K) - An open source library for deep learning end-to-end dialog.. Apache-2
ftfy (
π₯
26 Β·
β
2.9K Β·
π€
) - Fixes mojibake and other glitches in Unicode text, after the fact. MIT
-
GitHub (
π¨βπ» 17 Β·π 100 Β·π¦ 2.7K Β·π 110 - 14% open Β·β±οΈ 17.07.2020):git clone https://github.com/LuminosoInsight/python-ftfy
-
PyPi (
π₯ 240K / month Β·π¦ 760 Β·β±οΈ 20.07.2020):pip install ftfy
-
Conda (
π₯ 94K Β·β±οΈ 20.01.2021):conda install -c conda-forge ftfy
GluonNLP (
π₯
26 Β·
β
2.2K) - Toolkit that enables easy text preprocessing, datasets loading.. Apache-2
TextDistance (
π₯
26 Β·
β
1.9K) - Compute distance between sequences. 30+ algorithms, pure python.. MIT
TensorFlow Text (
π₯
26 Β·
β
690) - Making text a first-class citizen in TensorFlow. Apache-2
jellyfish (
π₯
25 Β·
β
1.4K) - a python library for doing approximate and phonetic matching of.. BSD-2
-
GitHub (
π¨βπ» 20 Β·π 120 Β·π¦ 2.1K Β·π 95 - 9% open Β·β±οΈ 30.12.2020):git clone https://github.com/jamesturk/jellyfish
-
PyPi (
π₯ 600K / month Β·π¦ 650 Β·β±οΈ 21.05.2020):pip install jellyfish
-
Conda (
π₯ 110K Β·β±οΈ 08.01.2021):conda install -c conda-forge jellyfish
pyahocorasick (
π₯
25 Β·
β
570) - Python module (C extension and plain python) implementing Aho-.. BSD-3
-
GitHub (
π¨βπ» 20 Β·π 88 Β·π¦ 490 Β·π 98 - 32% open Β·β±οΈ 26.01.2021):git clone https://github.com/WojciechMula/pyahocorasick
-
PyPi (
π₯ 79K / month Β·π¦ 100 Β·β±οΈ 26.01.2021):pip install pyahocorasick
-
Conda (
π₯ 110K Β·β±οΈ 13.10.2020):conda install -c conda-forge pyahocorasick
ParlAI (
π₯
24 Β·
β
7K) - A framework for training and evaluating AI models on a variety of.. MIT
textgenrnn (
π₯
24 Β·
β
4.2K Β·
π€
) - Easily train your own text-generating neural network of any.. MIT
T5 (
π₯
24 Β·
β
3.2K) - Code for the paper Exploring the Limits of Transfer Learning with a.. Apache-2
vaderSentiment (
π₯
24 Β·
β
2.8K Β·
π€
) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary.. MIT
fastNLP (
π₯
24 Β·
β
2K) - fastNLP: A Modularized and Extensible NLP Framework. Currently still.. Apache-2
pytorch-nlp (
π₯
24 Β·
β
1.9K) - Basic Utilities for PyTorch Natural Language Processing (NLP). BSD-3
PyTextRank (
π₯
24 Β·
β
1.4K) - Python implementation of TextRank for phrase extraction and.. MIT
haystack (
π₯
24 Β·
β
1.3K) - Transformers at scale for question answering & neural search. Using.. Apache-2
spacy-transformers (
π₯
24 Β·
β
890) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT (spacy related project)
Ciphey (
π₯
23 Β·
β
6.2K) - Automatically decrypt encryptions without knowing the key or cipher,.. MIT
-
GitHub (
π¨βπ» 38 Β·π 350 Β·π 220 - 21% open Β·β±οΈ 19.01.2021):git clone https://github.com/Ciphey/Ciphey
-
PyPi (
π₯ 2K / month Β·β±οΈ 02.12.2020):pip install ciphey
-
Docker Hub (
π₯ 7.7K Β·β 2 Β·β±οΈ 17.12.2020):docker pull remnux/ciphey
flashtext (
π₯
23 Β·
β
4.6K Β·
π€
) - Extract Keywords from sentence or Replace keywords in sentences. MIT
Snips NLU (
π₯
23 Β·
β
3.4K Β·
π€
) - Snips Python library to extract meaning from text. Apache-2
Sumy (
π₯
23 Β·
β
2.5K) - Module for automatic summarization of text documents and HTML pages. Apache-2
neuralcoref (
π₯
23 Β·
β
2.2K) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
-
GitHub (
π¨βπ» 20 Β·π 380 Β·π₯ 170 Β·π¦ 280 Β·π 260 - 16% open Β·β±οΈ 29.12.2020):git clone https://github.com/huggingface/neuralcoref
-
PyPi (
π₯ 2.2K / month Β·π¦ 18 Β·β±οΈ 08.04.2019):pip install neuralcoref
-
Conda (
π₯ 5.8K Β·β±οΈ 21.02.2020):conda install -c conda-forge neuralcoref
sense2vec (
π₯
23 Β·
β
1.2K Β·
π€
) - Contextually-keyed word vectors. MIT
-
GitHub (
π¨βπ» 14 Β·π 200 Β·π₯ 13K Β·π¦ 49 Β·π 94 - 17% open Β·β±οΈ 29.05.2020):git clone https://github.com/explosion/sense2vec
-
PyPi (
π₯ 1.9K / month Β·π¦ 6 Β·β±οΈ 22.11.2019):pip install sense2vec
-
Conda (
π₯ 15K Β·β±οΈ 16.03.2020):conda install -c conda-forge sense2vec
SciSpacy (
π₯
23 Β·
β
800) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2
pySBD (
π₯
23 Β·
β
270) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
scattertext (
π₯
22 Β·
β
1.5K) - Beautiful visualizations of how language differs among document.. Apache-2
-
GitHub (
π¨βπ» 10 Β·π 200 Β·π¦ 150 Β·π 73 - 23% open Β·β±οΈ 18.01.2021):git clone https://github.com/JasonKessler/scattertext
-
PyPi (
π₯ 1.1K / month Β·π¦ 8 Β·β±οΈ 18.01.2021):pip install scattertext
-
Conda (
π₯ 43K Β·β±οΈ 18.01.2021):conda install -c conda-forge scattertext
DeepMatcher (
π₯
21 Β·
β
3.4K Β·
π€
) - Python package for performing Entity and Text Matching using.. BSD-3
NLP Architect (
π₯
21 Β·
β
2.6K) - A model library for exploring state-of-the-art deep learning.. Apache-2
Texar (
π₯
21 Β·
β
2.1K Β·
π€
) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
FARM (
π₯
21 Β·
β
1.1K) - Fast & easy transfer learning for NLP. Harvesting language models.. Apache-2
gpt-2-simple (
π₯
20 Β·
β
2.5K Β·
π€
) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
Texthero (
π₯
20 Β·
β
2.1K) - Text preprocessing, representation and visualization from zero to hero. MIT
DELTA (
π₯
20 Β·
β
1.4K) - DELTA is a deep learning based natural language and speech.. Apache-2
-
GitHub (
π¨βπ» 41 Β·π 270 Β·π 77 - 11% open Β·β±οΈ 17.12.2020):git clone https://github.com/Delta-ML/delta
-
PyPi (
π₯ 7 / month Β·β±οΈ 27.03.2020):pip install delta-nlp
-
Docker Hub (
π₯ 12K Β·β±οΈ 03.02.2021):docker pull zh794390558/delta
Sockeye (
π₯
20 Β·
β
990) - Sequence-to-sequence framework with a focus on Neural Machine.. Apache-2
YouTokenToMe (
π₯
20 Β·
β
710) - Unsupervised text tokenizer focused on computational efficiency. MIT
Kashgari (
π₯
19 Β·
β
2K) - Kashgari is a production-level NLP Transfer learning framework.. Apache-2
VizSeq (
π₯
15 Β·
β
310) - An Analysis Toolkit for Natural Language Generation (Translation,.. MIT
NeuralQA (
π₯
15 Β·
β
180) - NeuralQA: A Usable Library for Question Answering on Large Datasets with.. MIT
OpenNRE (
π₯
14 Β·
β
3K) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
-
GitHub (
π¨βπ» 9 Β·π 860 Β·π 300 - 6% open Β·β±οΈ 24.11.2020):git clone https://github.com/thunlp/OpenNRE
TransferNLP (
π₯
14 Β·
β
280 Β·
π€
) - NLP library designed for reproducible experimentation.. MIT
textvec (
π₯
14 Β·
β
160) - Text vectorization tool to outperform TFIDF for classification tasks. MIT
Show 9 hidden projects...
- fuzzywuzzy (
π₯ 29 Β·β 7.8K Β·π€ ) - Fuzzy String Matching in Python.βοΈGPL-2.0
- langid (
π₯ 26 Β·β 1.7K Β·π ) - Stand-alone language identification system.BSD-3
- polyglot (
π₯ 24 Β·β 1.8K) - Multilingual text (NLP) processing toolkit.βοΈGPL-3.0
- anaGo (
π₯ 22 Β·β 1.4K Β·π ) - Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition,..MIT
- MatchZoo (
π₯ 20 Β·β 3.3K Β·π ) - Facilitating the design, comparison and sharing of deep..Apache-2
- stop-words (
π₯ 20 Β·β 120 Β·π ) - Get list of common stop words in various languages in Python.BSD-3
- pyfasttext (
π₯ 19 Β·β 230 Β·π ) - Yet another Python binding for fastText.βοΈGPL-3.0
- NeuroNER (
π₯ 17 Β·β 1.5K Β·π ) - Named-entity recognition using neural networks. Easy-to-use and..MIT
- ONNX-T5 (
π₯ 15 Β·β 140 Β·π£ ) - Summarization, translation, sentiment-analysis, text-generation..Apache-2
Image Data
Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.
Pillow (
π₯
39 Β·
β
8.2K) - The friendly PIL fork (Python Imaging Library). βοΈPIL
-
GitHub (
π¨βπ» 340 Β·π 1.6K Β·π¦ 400K Β·π 2.1K - 11% open Β·β±οΈ 03.02.2021):git clone https://github.com/python-pillow/Pillow
-
PyPi (
π₯ 9.6M / month Β·π¦ 110K Β·β±οΈ 02.01.2021):pip install Pillow
-
Conda (
π₯ 7M Β·β±οΈ 11.01.2021):conda install -c conda-forge pillow
scikit-image (
π₯
36 Β·
β
4.2K) - Image processing in Python. BSD-2
-
GitHub (
π¨βπ» 480 Β·π 1.7K Β·π¦ 60K Β·π 2.1K - 30% open Β·β±οΈ 02.02.2021):git clone https://github.com/scikit-image/scikit-image
-
PyPi (
π₯ 1.3M / month Β·π¦ 15K Β·β±οΈ 23.12.2020):pip install scikit-image
-
Conda (
π₯ 2.1M Β·β±οΈ 21.01.2021):conda install -c conda-forge scikit-image
torchvision (
π₯
35 Β·
β
8.3K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3
-
GitHub (
π¨βπ» 360 Β·π 4.3K Β·π¦ 41K Β·π 1.5K - 29% open Β·β±οΈ 04.02.2021):git clone https://github.com/pytorch/vision
-
PyPi (
π₯ 420K / month Β·π¦ 4.6K Β·β±οΈ 10.12.2020):pip install torchvision
-
Conda (
π₯ 34K Β·β±οΈ 14.10.2018):conda install -c conda-forge torchvision
opencv-python (
π₯
30 Β·
β
1.7K) - Automated CI toolchain to produce precompiled opencv-python,.. MIT
Face Recognition (
π₯
29 Β·
β
38K) - The world's simplest facial recognition api for Python.. MIT
Albumentations (
π₯
28 Β·
β
7.2K) - Fast image augmentation library and easy to use wrapper.. MIT
-
GitHub (
π¨βπ» 73 Β·π 920 Β·π¦ 2.7K Β·π 410 - 42% open Β·β±οΈ 03.02.2021):git clone https://github.com/albumentations-team/albumentations
-
PyPi (
π₯ 45K / month Β·π¦ 130 Β·β±οΈ 29.11.2020):pip install albumentations
-
Conda (
π₯ 15K Β·β±οΈ 29.11.2020):conda install -c conda-forge albumentations
Kornia (
π₯
28 Β·
β
3.6K) - Open Source Differentiable Computer Vision Library for PyTorch. Apache-2
imutils (
π₯
28 Β·
β
3.5K) - A series of convenience functions to make basic image processing.. MIT
ImageHash (
π₯
28 Β·
β
1.8K) - A Python Perceptual Image Hashing Module. BSD-2
-
GitHub (
π¨βπ» 17 Β·π 250 Β·π¦ 2K Β·π 87 - 19% open Β·β±οΈ 03.01.2021):git clone https://github.com/JohannesBuchner/imagehash
-
PyPi (
π₯ 210K / month Β·π¦ 530 Β·β±οΈ 19.11.2020):pip install ImageHash
-
Conda (
π₯ 100K Β·β±οΈ 19.11.2020):conda install -c conda-forge imagehash
PyTorch Image Models (
π₯
27 Β·
β
6.9K) - PyTorch image models, scripts, pretrained weights --.. Apache-2
-
GitHub (
π¨βπ» 26 Β·π 980 Β·π₯ 220K Β·π¦ 280 Β·π 240 - 9% open Β·β±οΈ 01.02.2021):git clone https://github.com/rwightman/pytorch-image-models
imageai (
π₯
27 Β·
β
5.8K) - A python library built to empower developers to build applications and.. MIT
detectron2 (
π₯
26 Β·
β
15K) - Detectron2 is FAIR's next-generation platform for object.. Apache-2
InsightFace (
π₯
26 Β·
β
8.5K) - Face Analysis Project on MXNet. MIT
MMDetection (
π₯
25 Β·
β
14K) - OpenMMLab Detection Toolbox and Benchmark. Apache-2
-
GitHub (
π¨βπ» 180 Β·π 4.7K Β·π¦ 25 Β·π 3.4K - 10% open Β·β±οΈ 01.02.2021):git clone https://github.com/open-mmlab/mmdetection
PyTorch3D (
π₯
25 Β·
β
4.3K) - PyTorch3D is FAIR's library of reusable components for deep.. MIT
mtcnn (
π₯
25 Β·
β
1.4K) - MTCNN face detection implementation for TensorFlow, as a PIP package. MIT
Augmentor (
π₯
24 Β·
β
4.3K Β·
π€
) - Image augmentation library in Python for machine learning. MIT
facenet-pytorch (
π₯
24 Β·
β
1.8K) - Pretrained Pytorch face detection (MTCNN) and recognition.. MIT
Face Alignment (
π₯
23 Β·
β
4.7K) - 2D and 3D face alignment library built using PyTorch. BSD-3
segmentation_models (
π₯
23 Β·
β
2.9K Β·
π€
) - Segmentation models with pretrained backbones. Keras.. MIT
vidgear (
π₯
22 Β·
β
1.6K) - High-performance cross-platform Video Processing Python framework.. Apache-2
CellProfiler (
π₯
22 Β·
β
540) - An open-source application for biological image analysis. BSD-3
Image Deduplicator (
π₯
21 Β·
β
3.4K) - Finding duplicate images made easy!. Apache-2
Image Super-Resolution (
π₯
21 Β·
β
2.5K) - Super-scale your images and run experiments with.. Apache-2
-
GitHub (
π¨βπ» 9 Β·π 480 Β·π¦ 40 Β·π 150 - 34% open Β·β±οΈ 11.11.2020):git clone https://github.com/idealo/image-super-resolution
-
PyPi (
π₯ 1.7K / month Β·π¦ 4 Β·β±οΈ 08.01.2020):pip install ISR
-
Docker Hub (
π₯ 120 Β·β±οΈ 01.04.2019):docker pull idealo/image-super-resolution-gpu
tensorflow-graphics (
π₯
21 Β·
β
2.4K) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2
MMF (
π₯
20 Β·
β
4.1K) - A modular framework for vision & language multimodal research from.. BSD-3
image-match (
π₯
20 Β·
β
2.5K) - Quickly search over billions of images. Apache-2
Classy Vision (
π₯
20 Β·
β
1.1K) - An end-to-end PyTorch framework for image and video.. MIT
Torch Points 3D (
π₯
20 Β·
β
1K) - Pytorch framework for doing deep learning on point clouds. BSD-3
Caer (
π₯
19 Β·
β
420 Β·
π£
) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT
vit-pytorch (
π₯
18 Β·
β
2.5K Β·
π£
) - Implementation of Vision Transformer, a simple way to.. MIT
Norfair (
π₯
18 Β·
β
880) - Lightweight Python library for adding real-time 2D object tracking to.. BSD-3
PaddleDetection (
π₯
17 Β·
β
2.3K) - Object detection and instance segmentation toolkit.. Apache-2
-
GitHub (
π¨βπ» 44 Β·π 640 Β·π 1.2K - 28% open Β·β±οΈ 02.02.2021):git clone https://github.com/PaddlePaddle/PaddleDetection
lightly (
π₯
15 Β·
β
410 Β·
π£
) - A python library for self-supervised learning on images. MIT
DE⫶TR (
π₯
14 Β·
β
6.1K) - End-to-End Object Detection with Transformers. Apache-2
-
GitHub (
π¨βπ» 19 Β·π 910 Β·π 280 - 21% open Β·β±οΈ 15.11.2020):git clone https://github.com/facebookresearch/detr
PySlowFast (
π₯
14 Β·
β
3.3K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2
-
GitHub (
π¨βπ» 19 Β·π 630 Β·π¦ 2 Β·π 350 - 46% open Β·β±οΈ 25.01.2021):git clone https://github.com/facebookresearch/SlowFast
pycls (
π₯
14 Β·
β
1.4K) - Codebase for Image Classification Research, written in PyTorch. MIT
-
GitHub (
π¨βπ» 9 Β·π 150 Β·π¦ 2 Β·π 55 - 20% open Β·β±οΈ 14.01.2021):git clone https://github.com/facebookresearch/pycls
Show 4 hidden projects...
- glfw (
π₯ 29 Β·β 7.2K) - A multi-platform library for OpenGL, OpenGL ES, Vulkan, window and input.βοΈZlib
- chainercv (
π₯ 25 Β·β 1.4K Β·π ) - ChainerCV: a Library for Deep Learning in Computer Vision.MIT
- Pillow-SIMD (
π₯ 22 Β·β 1.5K Β·π€ ) - The friendly PIL fork.βοΈPIL
- Luminoth (
π₯ 21 Β·β 2.3K Β·π ) - Deep Learning toolkit for Computer Vision.BSD-3
Graph Data
Libraries for graph processing, clustering, embedding, and machine learning tasks.
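As a rough illustration of the kind of graph processing these libraries cover, the sketch below finds a shortest path with breadth-first search over a plain adjacency dictionary. It uses only the standard library. The `shortest_path` helper and the toy graph are illustrative assumptions, not the API of any project listed here.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def shortest_path(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Return one shortest path (fewest edges) from source to target, or None."""
    if source == target:
        return [source]
    # parents doubles as the visited set and lets us rebuild the path.
    parents: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == target:
                # Walk back from the target to the source via parent links.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical undirected toy graph, stored as symmetric adjacency lists.
toy_graph: Graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
    "z": [],
}

print(shortest_path(toy_graph, "a", "e"))
```

Dedicated graph libraries add weighted paths, richer data structures, and far better performance on large graphs; this sketch only shows the underlying idea.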
networkx (
π₯
36 Β·
β
8.6K) - Network Analysis in Python. BSD-3
-
GitHub (
π¨βπ» 490 Β·π 2.2K Β·π₯ 51 Β·π¦ 66K Β·π 2.6K - 10% open Β·β±οΈ 03.02.2021):git clone https://github.com/networkx/networkx
-
PyPi (
π₯ 4M / month Β·π¦ 21K Β·β±οΈ 22.08.2020):pip install networkx
-
Conda (
π₯ 3M Β·β±οΈ 23.08.2020):conda install -c conda-forge networkx
PyTorch Geometric (
π₯
28 Β·
β
10K) - Geometric Deep Learning Extension Library for PyTorch. MIT
dgl (
π₯
27 Β·
β
6.6K) - Python package built to ease deep learning on graph, on top of existing.. Apache-2
StellarGraph (
π₯
25 Β·
β
1.8K) - StellarGraph - Machine Learning on Graphs. Apache-2
ogb (
π₯
22 Β·
β
730) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT
torch-cluster (
π₯
21 Β·
β
330) - PyTorch Extension Library of Optimized Graph Cluster.. MIT
graph-nets (
π₯
19 Β·
β
4.8K) - Build Graph Nets in Tensorflow. Apache-2
PyTorch-BigGraph (
π₯
19 Β·
β
2.7K) - Generate embeddings from large-scale graph-structured.. BSD-3
AmpliGraph (
π₯
19 Β·
β
1.4K) - Python library for Representation Learning on Knowledge.. Apache-2
PyKEEN (
π₯
18 Β·
β
300) - A Python library for learning and evaluating knowledge graph embeddings. MIT
Paddle Graph Learning (
π₯
17 Β·
β
890 Β·
π
) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2
pytorch_geometric_temporal (
π₯
16 Β·
β
340) - A Temporal Extension Library for PyTorch Geometric. MIT
GraphEmbedding (
π₯
15 Β·
β
1.8K) - Implementation and experiments of graph embedding algorithms. MIT
-
GitHub (
π¨βπ» 6 Β·π 550 Β·π¦ 7 Β·π 40 - 67% open Β·β±οΈ 18.10.2020):git clone https://github.com/shenweichen/GraphEmbedding
AutoGL (
π₯
14 Β·
β
580 Β·
π£
) - An autoML framework & toolkit for machine learning on graphs. MIT
OpenKE (
π₯
13 Β·
β
2.4K Β·
π€
) - An Open-Source Package for Knowledge Embedding (KE). MIT
-
GitHub (
π¨βπ» 10 Β·π 740 Β·π 280 - 18% open Β·β±οΈ 08.04.2020):git clone https://github.com/thunlp/OpenKE
GraphVite (
π₯
13 Β·
β
840) - GraphVite: A General and High-performance Graph Embedding System. Apache-2
Show 8 hidden projects...
- igraph (
π₯ 27 Β·β 770) - Python interface for igraph.βοΈGPL-2.0
- pygal (
π₯ 26 Β·β 2.3K) - PYthon svg GrAph plotting Library.βοΈLGPL-3.0
- Karate Club (
π₯ 21 Β·β 1.1K) - Karate Club: An API Oriented Open-source Python Framework for..βοΈGPL-3.0
- DeepWalk (
π₯ 19 Β·β 2.2K Β·π€ ) - DeepWalk - Deep Learning for Graphs.βοΈGPL-3.0
- Sematch (
π₯ 16 Β·β 340 Β·π ) - semantic similarity framework for knowledge graph.Apache-2
- pyRDF2Vec (
π₯ 15 Β·β 81) - Python Implementation and Extension of RDF2Vec.MIT
- GraphSAGE (
π₯ 14 Β·β 2.1K Β·π ) - Representation learning on large graphs using stochastic..MIT
- OpenNE (
π₯ 14 Β·β 1.4K Β·π ) - An Open-Source Package for Network Embedding (NE).MIT
Audio Data
Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.
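To make "audio analysis" a little more concrete, here is a minimal standard-library sketch that synthesizes a sine tone and computes two common frame-level features: root-mean-square energy and zero-crossing rate. The function names, sample rate, and frequencies are assumptions chosen for illustration; none of them come from the libraries in this category.

```python
import math
from typing import List


def sine_wave(frequency_hz: float, sample_rate: int, duration_s: float) -> List[float]:
    """Generate a mono sine tone with amplitude 1.0."""
    n_samples = int(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def rms(samples: List[float]) -> float:
    """Root-mean-square energy of the signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    return crossings / (len(samples) - 1)


low_tone = sine_wave(440.0, sample_rate=8000, duration_s=1.0)
high_tone = sine_wave(880.0, sample_rate=8000, duration_s=1.0)

# A higher-pitched tone crosses zero more often per sample.
print(zero_crossing_rate(low_tone), zero_crossing_rate(high_tone))
print(rms(low_tone))
```

The zero-crossing rate of a pure tone is roughly twice its frequency divided by the sample rate, which is why it is sometimes used as a cheap pitch or noisiness indicator.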
DeepSpeech (
π₯
31 Β·
β
16K Β·
π
) - DeepSpeech is an open source embedded (offline, on-.. MPL-2.0
Magenta (
π₯
29 Β·
β
16K) - Magenta: Music and Art Generation with Machine Intelligence. Apache-2
torchaudio (
π₯
28 Β·
β
1.2K) - Data manipulation and transformation for audio signal.. BSD-2
audioread (
π₯
27 Β·
β
360) - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.. MIT
-
GitHub (
π¨βπ» 20 Β·π 83 Β·π¦ 4.1K Β·π 75 - 41% open Β·β±οΈ 20.10.2020):git clone https://github.com/beetbox/audioread
-
PyPi (
π₯ 190K / month Β·π¦ 590 Β·β±οΈ 20.10.2020):pip install audioread
-
Conda (
π₯ 200K Β·β±οΈ 08.12.2020):conda install -c conda-forge audioread
pyAudioAnalysis (
π₯
25 Β·
β
3.7K) - Python Audio Analysis Library: Feature Extraction,.. Apache-2
python-soundfile (
π₯
25 Β·
β
360) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3
python_speech_features (
π₯
24 Β·
β
1.8K) - This library provides common speech features for ASR.. MIT
tinytag (
π₯
22 Β·
β
430) - Read music meta data and length of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and.. MIT
TTS (
π₯
20 Β·
β
3.2K Β·
π
) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
-
GitHub (
π¨βπ» 50 Β·π 670 Β·π₯ 39 Β·π 450 - 6% open Β·β±οΈ 01.02.2021):git clone https://github.com/mozilla/TTS
Show 4 hidden projects...
- SpeechRecognition (
π₯ 30 Β·β 5.4K Β·π ) - Speech recognition module for Python, supporting..BSD-3
- aubio (
π₯ 26 Β·β 2K) - a library for audio and music analysis.βοΈGPL-3.0
- Essentia (
π₯ 22 Β·β 1.7K) - C++ library for audio and music analysis, description and..βοΈAGPL-3.0
- Madmom (
π₯ 20 Β·β 710 Β·π ) - Python audio and music signal processing library.BSD-3
Geospatial Data
Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.
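A small example of the spatial arithmetic such libraries handle: the haversine formula for great-circle distance between two latitude/longitude points. This is a sketch under a spherical-Earth assumption (mean radius 6371 km), written with the standard library only. The `haversine_km` name and the sample coordinates are illustrative, and real geodesy packages use more accurate ellipsoidal models.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the Earth is not a perfect sphere


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates, used only as sample input.
berlin = (52.52, 13.405)
paris = (48.8566, 2.3522)
print(round(haversine_km(*berlin, *paris), 1), "km")
```

For short distances or high-precision work, an ellipsoidal method or a projected coordinate system is usually the better choice.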
pydeck (
π₯
33 Β·
β
8.4K) - WebGL2 powered geospatial visualization layers. MIT
-
GitHub (
π¨βπ» 150 Β·π 1.5K Β·π¦ 1.4K Β·π 2K - 4% open Β·β±οΈ 04.02.2021):git clone https://github.com/visgl/deck.gl
-
PyPi (
π₯ 82K / month Β·π¦ 2 Β·β±οΈ 26.10.2020):pip install pydeck
-
Conda (
π₯ 13K Β·β±οΈ 26.10.2020):conda install -c conda-forge pydeck
-
NPM (
π₯ 180K / month Β·π¦ 560 Β·β±οΈ 04.02.2021):npm install deck.gl
folium (
π₯
32 Β·
β
5.1K) - Python Data. Leaflet.js Maps. MIT
-
GitHub (
π¨βπ» 120 Β·π 1.9K Β·π¦ 8.3K Β·π 840 - 17% open Β·β±οΈ 18.01.2021):git clone https://github.com/python-visualization/folium
-
PyPi (
π₯ 160K / month Β·π¦ 970 Β·β±οΈ 18.01.2021):pip install folium
-
Conda (
π₯ 320K Β·β±οΈ 06.01.2021):conda install -c conda-forge folium
GeoPandas (
π₯
31 Β·
β
2.5K) - Python tools for geographic data. BSD-3
-
GitHub (
π¨βπ» 130 Β·π 530 Β·π₯ 880 Β·π¦ 7.1K Β·π 970 - 30% open Β·β±οΈ 30.01.2021):git clone https://github.com/geopandas/geopandas
-
PyPi (
π₯ 340K / month Β·π¦ 1.2K Β·β±οΈ 25.01.2021):pip install geopandas
-
Conda (
π₯ 840K Β·β±οΈ 25.01.2021):conda install -c conda-forge geopandas
Rasterio (
π₯
30 Β·
β
1.4K) - Rasterio reads and writes geospatial raster datasets. BSD-3
-
GitHub (
π¨βπ» 110 Β·π 390 Β·π₯ 700 Β·π¦ 2.7K Β·π 1.3K - 10% open Β·β±οΈ 02.02.2021):git clone https://github.com/mapbox/rasterio
-
PyPi (
π₯ 120K / month Β·π¦ 850 Β·β±οΈ 25.01.2021):pip install rasterio
-
Conda (
π₯ 860K Β·β±οΈ 25.01.2021):conda install -c conda-forge rasterio
pyproj (
π₯
29 Β·
β
570) - Python interface to PROJ (cartographic projections and coordinate.. MIT
ipyleaflet (
π₯
27 Β·
β
1.1K) - A Jupyter - Leaflet.js bridge. MIT
-
GitHub (
π¨βπ» 63 Β·π 280 Β·π¦ 680 Β·π 380 - 34% open Β·β±οΈ 22.01.2021):git clone https://github.com/jupyter-widgets/ipyleaflet
-
PyPi (
π₯ 11K / month Β·π¦ 98 Β·β±οΈ 05.01.2021):pip install ipyleaflet
-
Conda (
π₯ 590K Β·β±οΈ 16.01.2021):conda install -c conda-forge ipyleaflet
-
NPM (
π₯ 150K / month Β·π¦ 2 Β·β±οΈ 05.01.2021):npm install jupyter-leaflet
ArcGIS API (
π₯
25 Β·
β
950) - Documentation and samples for ArcGIS API for Python. Apache-2
-
GitHub (
π¨βπ» 61 Β·π 700 Β·π 310 - 35% open Β·β±οΈ 26.01.2021):git clone https://github.com/Esri/arcgis-python-api
-
PyPi (
π₯ 12K / month Β·π¦ 20 Β·β±οΈ 27.01.2021):pip install arcgis
-
Docker Hub (
π₯ 3.7K Β·β 29 Β·β±οΈ 06.03.2020):docker pull esridocker/arcgis-api-python-notebook
pymap3d (
π₯
21 Β·
β
170) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2
EarthPy (
π₯
20 Β·
β
220 Β·
π
) - A package built to support working with spatial data using open.. BSD-3
Show 7 hidden projects...
- Geocoder (
π₯ 29 Β·β 1.3K Β·π ) - Python Geocoder.MIT
- Cartopy (
π₯ 27 Β·β 1.4K) - A cartographic Python library with Matplotlib support.βοΈLGPL-3.0
- Satpy (
π₯ 25 Β·β 670) - Python package for earth-observing satellite data processing.βοΈGPL-3.0
- gmaps (
π₯ 21 Β·β 700 Β·π ) - Google maps for Jupyter notebooks.BSD-3
- Sentinelsat (
π₯ 21 Β·β 560) - Search and download Copernicus Sentinel satellite images.βοΈGPL-3.0
- Mapbox GL (
π₯ 20 Β·β 560 Β·π ) - Use Mapbox GL JS to visualize data in a Python Jupyter notebook.MIT
- geoplotlib (
π₯ 19 Β·β 890 Β·π ) - python toolbox for visualizing geographical data and making maps.MIT
Financial Data
Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.
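Two of the simplest quantities in risk analytics are period-over-period returns and maximum drawdown. The sketch below computes both from a list of prices using only the standard library. The function names and the price series are made up for illustration and do not mirror any listed package's API.

```python
from typing import List


def simple_returns(prices: List[float]) -> List[float]:
    """Fractional change between consecutive prices."""
    return [cur / prev - 1 for prev, cur in zip(prices, prices[1:])]


def max_drawdown(prices: List[float]) -> float:
    """Largest peak-to-trough decline, as a non-positive fraction of the peak."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)          # highest price seen so far
        worst = min(worst, price / peak - 1)
    return worst


prices = [100.0, 120.0, 90.0, 110.0]   # hypothetical closing prices
print(simple_returns(prices))
print(max_drawdown(prices))            # -0.25: the fall from 120 to 90
```

Production libraries typically work on pandas Series, handle missing data, and annualize metrics; those concerns are left out here on purpose.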
yfinance (
π₯
29 Β·
β
4.1K) - Yahoo! Finance market data downloader (+faster Pandas Datareader). Apache-2
Alpha Vantage (
π₯
27 Β·
β
3.1K) - A python wrapper for Alpha Vantage API for financial data. MIT
empyrical (
π₯
25 Β·
β
710) - Common financial risk and performance metrics. Used by zipline and.. Apache-2
-
GitHub (
π¨βπ» 22 Β·π 220 Β·π¦ 510 Β·π 53 - 50% open Β·β±οΈ 14.10.2020):git clone https://github.com/quantopian/empyrical
-
PyPi (
π₯ 19K / month Β·π¦ 220 Β·β±οΈ 13.10.2020):pip install empyrical
-
Conda (
π₯ 9.4K Β·β±οΈ 14.10.2020):conda install -c conda-forge empyrical
Alphalens (
π₯
24 Β·
β
1.7K Β·
π€
) - Performance analysis of predictive (alpha) stock factors. Apache-2
-
GitHub (
π¨βπ» 25 Β·π 650 Β·π¦ 350 Β·π 180 - 20% open Β·β±οΈ 27.04.2020):git clone https://github.com/quantopian/alphalens
-
PyPi (
π₯ 1.4K / month Β·π¦ 14 Β·β±οΈ 27.04.2020):pip install alphalens
-
Conda (
π₯ 10K Β·β±οΈ 16.05.2020):conda install -c conda-forge alphalens
stockstats (
π₯
24 Β·
β
710) - Supply a wrapper ``StockDataFrame`` based on the.. BSD-3
Enigma Catalyst (
π₯
23 Β·
β
2K) - An Algorithmic Trading Library for Crypto-Assets in Python. Apache-2
TensorTrade (
π₯
21 Β·
β
2.8K) - An open source reinforcement learning framework for training,.. Apache-2
Qlib (
π₯
20 Β·
β
4.2K Β·
π£
) - Qlib is an AI-oriented quantitative investment platform, which aims.. MIT
finmarketpy (
π₯
20 Β·
β
2.5K) - Python library for backtesting trading strategies & analyzing.. Apache-2
tf-quant-finance (
π₯
19 Β·
β
2.4K) - High-performance TensorFlow library for quantitative.. Apache-2
Crypto Signals (
π₯
18 Β·
β
2.5K) - Github.com/CryptoSignal - #1 Quant Trading & Technical Analysis.. MIT
-
GitHub (
π¨βπ» 25 Β·π 690 Β·π 230 - 16% open Β·β±οΈ 03.09.2020):git clone https://github.com/CryptoSignal/crypto-signal
-
Docker Hub (
π₯ 41K Β·β 8 Β·β±οΈ 03.09.2020):docker pull shadowreaver/crypto-signal
Show 6 hidden projects...
- backtrader (
π₯ 26 Β·β 5.6K Β·π€ ) - Python Backtesting library for trading strategies.βοΈGPL-3.0
- PyAlgoTrade (
π₯ 23 Β·β 3.2K Β·π ) - Python Algorithmic Trading Library.Apache-2
- arch (
π₯ 23 Β·β 640) - ARCH models in Python.βοΈNCSA
- FinTA (
π₯ 22 Β·β 820) - Common financial technical indicators implemented in Pandas.βοΈLGPL-3.0
- Backtesting.py (
π₯ 17 Β·β 970) - Backtest trading strategies in Python.βοΈAGPL-3.0
- surpriver (
π₯ 11 Β·β 1.1K Β·π£ ) - Find big moving stocks before they move using machine..βοΈGPL-3.0
Time Series Data
Libraries for forecasting, anomaly detection, feature extraction, and machine learning on time-series and sequential data.
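As one example of rule-based anomaly detection on a time series, the following sketch flags points whose z-score against a trailing window exceeds a threshold. It relies only on the `statistics` module. The window size, threshold, function name, and sample data are all assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def rolling_zscore_anomalies(
    series: Sequence[float], window: int = 5, threshold: float = 3.0
) -> List[int]:
    """Return indices whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A flat window gives no scale to compare against; skip it.
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


readings = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]  # hypothetical sensor data
print(rolling_zscore_anomalies(readings))  # [6], the spike to 50
```

Note that the spike inflates the window's standard deviation for the next few points, which is one reason practical detectors often use robust statistics such as the median instead.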
Prophet (
π₯
29 Β·
β
12K) - Tool for producing high quality forecasts for time series data that has.. MIT
pmdarima (
π₯
26 Β·
β
810) - A statistical library designed to fill the void in Python's time series.. MIT
Streamz (
π₯
24 Β·
β
900) - Real-time stream processing for python. BSD-3
-
GitHub (
π¨βπ» 38 Β·π 110 Β·π¦ 190 Β·π 220 - 41% open Β·β±οΈ 14.01.2021):git clone https://github.com/python-streamz/streamz
-
PyPi (
π₯ 2.1K / month Β·π¦ 16 Β·β±οΈ 02.11.2020):pip install streamz
-
Conda (
π₯ 110K Β·β±οΈ 15.01.2021):conda install -c conda-forge streamz
Darts (
π₯
22 Β·
β
710) - A python library for easy manipulation and forecasting of time series. Apache-2
-
GitHub (
π¨βπ» 24 Β·π 91 Β·π¦ 12 Β·π 64 - 32% open Β·β±οΈ 03.02.2021):git clone https://github.com/unit8co/darts
-
PyPi (
π₯ 2.9K / month Β·β±οΈ 03.02.2021):pip install u8darts
-
Docker Hub (
π₯ 99 Β·β±οΈ 03.02.2021):docker pull unit8/darts
STUMPY (
π₯
20 Β·
β
1.7K) - STUMPY is a powerful and scalable Python library for computing a Matrix.. BSD-3
pytorch-forecasting (
π₯
19 Β·
β
700) - Time series forecasting with PyTorch. MIT
matrixprofile-ts (
π₯
19 Β·
β
610 Β·
π€
) - A Python library for detecting patterns and anomalies.. Apache-2
Auto TS (
π₯
18 Β·
β
160) - Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost.. Apache-2
ADTK (
π₯
17 Β·
β
590 Β·
π€
) - A Python toolkit for rule-based/unsupervised anomaly detection in time.. MPL-2.0
tick (
π₯
17 Β·
β
320 Β·
π€
) - Module for statistical learning, with a particular emphasis on time-.. BSD-3
Show 3 hidden projects...
Medical Data
Libraries for processing and analyzing medical data such as MRIs, EEGs, genomic data, and other medical imaging formats.
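Survival analysis is one recurring task in this category. The sketch below is a minimal Kaplan-Meier estimator in plain Python, assuming right-censored data given as durations plus a flag saying whether the event was observed. The `kaplan_meier` function and the sample cohort are illustrative only; a dedicated survival library should be used for confidence intervals and real studies.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], observed: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with an observed event."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)  # events plus censored
        if events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort: follow-up time and whether the event was observed.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
for time, prob in kaplan_meier(durations, observed):
    print(time, round(prob, 3))
```

Subjects censored at time `t` are counted as still at risk at `t`, which is the usual convention.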
Lifelines (
π₯
29 Β·
β
1.5K) - Survival analysis in Python. MIT
-
GitHub (
π¨βπ» 90 Β·π 410 Β·π¦ 490 Β·π 780 - 24% open Β·β±οΈ 22.01.2021):git clone https://github.com/CamDavidsonPilon/lifelines
-
PyPi (
π₯ 85K / month Β·π¦ 130 Β·β±οΈ 22.01.2021):pip install lifelines
-
Conda (
π₯ 120K Β·β±οΈ 10.12.2020):conda install -c conda-forge lifelines
NiBabel (
π₯
29 Β·
β
390) - Python package to access a cacophony of neuro-imaging file formats. MIT
MNE (
π₯
27 Β·
β
1.5K) - MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python. BSD-3
DIPY (
π₯
27 Β·
β
390) - DIPY is the paragon 3D/4D+ imaging library in Python. Contains generic.. BSD-3
DeepVariant (
π₯
21 Β·
β
2.2K) - DeepVariant is an analysis pipeline that uses a deep neural.. BSD-3
NiftyNet (
π₯
21 Β·
β
1.3K Β·
π€
) - [unmaintained] An open-source convolutional neural.. Apache-2
Brainiak (
π₯
19 Β·
β
230) - Brain Imaging Analysis Kit. Apache-2
-
GitHub (
π¨βπ» 32 Β·π 110 Β·π¦ 12 Β·π 180 - 35% open Β·β±οΈ 15.10.2020):git clone https://github.com/brainiak/brainiak
-
PyPi (
π₯ 93 / month Β·π¦ 1 Β·β±οΈ 15.10.2020):pip install brainiak
-
Docker Hub (
π₯ 470 Β·β 1 Β·β±οΈ 15.10.2020):docker pull brainiak/brainiak
Medical Detection Toolkit (
π₯
12 Β·
β
900 Β·
π€
) - The Medical Detection Toolkit contains 2D + 3D.. Apache-2
-
GitHub (
π¨βπ» 3 Β·π 230 Β·π 110 - 24% open Β·β±οΈ 18.04.2020):git clone https://github.com/MIC-DKFZ/medicaldetectiontoolkit
MedicalNet (
π₯
11 Β·
β
1.1K) - Many studies have shown that the performance on deep learning is.. MIT
-
GitHub (
π¨βπ» 1 Β·π 280 Β·π 57 - 75% open Β·β±οΈ 27.08.2020):git clone https://github.com/Tencent/MedicalNet
Show 5 hidden projects...
- MedPy (
π₯ 20 Β·β 310 Β·π€ ) - Medical image processing in Python.βοΈGPL-3.0
- NIPY (
π₯ 20 Β·β 290) - Neuroimaging in Python FMRI analysis package.βοΈDSDP
- DLTK (
π₯ 19 Β·β 1.2K Β·π ) - Deep Learning Toolkit for Medical Image Analysis.Apache-2
- MedicalTorch (
π₯ 15 Β·β 710 Β·π ) - A medical imaging framework for Pytorch.Apache-2
- DeepNeuro (
π₯ 14 Β·β 99 Β·π€ ) - A deep learning python package for neuroimaging data. Made by:.MIT
Optical Character Recognition
Libraries for optical character recognition (OCR) and text extraction from images or videos.
Tesseract (
π₯
30 Β·
β
3.4K) - Python-tesseract is an optical character recognition (OCR) tool.. Apache-2
EasyOCR (
π₯
27 Β·
β
10K) - Ready-to-use OCR with 80+ supported languages and all popular writing.. Apache-2
OCRmyPDF (
π₯
26 Β·
β
3.8K) - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them.. MPL-2.0
PaddleOCR (
π₯
24 Β·
β
11K) - Awesome multilingual OCR toolkits based on PaddlePaddle.. Apache-2
attention-ocr (
π₯
20 Β·
β
830) - A Tensorflow model for text recognition (CNN + seq2seq with.. MIT
keras-ocr (
π₯
20 Β·
β
760) - A packaged and flexible version of the CRAFT text detector and.. MIT
doc2text (
π₯
19 Β·
β
1.2K) - Detect text blocks and OCR poorly scanned PDFs in bulk. Python module.. MIT
Mozart (
π₯
10 Β·
β
220 Β·
π£
) - An optical music recognition (OMR) system. Converts sheet.. Apache-2
-
GitHub (
π¨βπ» 5 Β·π 35 Β·π 2 - 50% open Β·β±οΈ 14.01.2021):git clone https://github.com/aashrafh/Mozart
Show 1 hidden project...
- pdftabextract (
π₯ 20 Β·β 1.9K Β·π ) - A set of tools for extracting tables from PDF files..Apache-2
Data Containers & Structures
General-purpose data containers & structures as well as utilities & extensions for pandas.
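Besides dataframes and arrays, this category includes probabilistic structures. As a hedged illustration of one of them, the sketch below estimates Jaccard similarity between two sets with a simple MinHash signature built on `hashlib`. The helper names, the use of salted SHA-1 as the hash family, and the choice of 128 hash functions are assumptions; this is not how any particular library implements it.

```python
import hashlib
from typing import Iterable, List


def _hash(seed: int, token: str) -> int:
    """Seeded 64-bit hash; each seed stands in for one random permutation."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens: Iterable[str], num_perm: int = 128) -> List[int]:
    """Keep the minimum hash value per seed. `tokens` must be non-empty."""
    items = list(tokens)
    return [min(_hash(seed, token) for token in items) for seed in range(num_perm)]


def estimate_jaccard(sig_a: List[int], sig_b: List[int]) -> float:
    """Fraction of signature positions on which the two sets agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


set_a = {str(i) for i in range(0, 100)}
set_b = {str(i) for i in range(50, 150)}   # true Jaccard similarity is 50/150

estimate = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))
print(estimate)
```

The signature has a fixed size regardless of how large the sets are, which is what makes the technique attractive for near-duplicate detection at scale.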
pandas (
π₯
42 Β·
β
28K) - Flexible and powerful data analysis / manipulation library for.. BSD-3
-
GitHub (
π¨βπ» 2.6K Β·π 12K Β·π₯ 95K Β·π¦ 380K Β·π 20K - 18% open Β·β±οΈ 04.02.2021):git clone https://github.com/pandas-dev/pandas
-
PyPi (
π₯ 17M / month Β·π¦ 77K Β·β±οΈ 20.01.2021):pip install pandas
-
Conda (
π₯ 14M Β·β±οΈ 20.01.2021):conda install -c conda-forge pandas
numpy (
π₯
42 Β·
β
16K) - The fundamental package for scientific computing with Python. BSD-3
-
GitHub (
π¨βπ» 1.2K Β·π 5.2K Β·π₯ 300K Β·π¦ 620K Β·π 9.5K - 23% open Β·β±οΈ 03.02.2021):git clone https://github.com/numpy/numpy
-
PyPi (
π₯ 27M / month Β·π¦ 170K Β·β±οΈ 30.01.2021):pip install numpy
-
Conda (
π₯ 16M Β·β±οΈ 02.02.2021):conda install -c conda-forge numpy
h5py (
π₯
36 Β·
β
1.5K) - HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5.. BSD-3
Arrow (
π₯
35 Β·
β
7.1K) - Apache Arrow is a cross-language development platform for in-memory.. Apache-2
numexpr (
π₯
31 Β·
β
1.5K) - Fast numerical array expression evaluator for Python, NumPy, PyTables,.. MIT
TinyDB (
π₯
29 Β·
β
4K) - TinyDB is a lightweight document oriented database optimized for your.. MIT
Koalas (
π₯
29 Β·
β
2.6K) - Koalas: pandas API on Apache Spark. Apache-2
-
GitHub (
π¨βπ» 47 Β·π 300 Β·π₯ 1K Β·π¦ 73 Β·π 520 - 17% open Β·β±οΈ 03.02.2021):git clone https://github.com/databricks/koalas
-
PyPi (
π₯ 550K / month Β·π¦ 2 Β·β±οΈ 22.01.2021):pip install koalas
-
Conda (
π₯ 76K Β·β±οΈ 22.01.2021):conda install -c conda-forge koalas
Bottleneck (
π₯
29 Β·
β
580) - Fast NumPy array functions written in C. BSD-2
-
GitHub (
π¨βπ» 21 Β·π 65 Β·π¦ 19K Β·π 200 - 12% open Β·β±οΈ 24.01.2021):git clone https://github.com/pydata/bottleneck
-
PyPi (
π₯ 120K / month Β·π¦ 2.9K Β·β±οΈ 21.02.2020):pip install Bottleneck
-
Conda (
π₯ 1.4M Β·β±οΈ 21.01.2021):conda install -c conda-forge bottleneck
Modin (
π₯
28 Β·
β
5.7K) - Modin: Speed up your Pandas workflows by changing a single line of.. Apache-2
datasketch (
π₯
27 Β·
β
1.4K) - MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog,.. MIT
zarr (
π₯
27 Β·
β
630) - An implementation of chunked, compressed, N-dimensional arrays for Python. MIT
-
GitHub (
π¨βπ» 35 Β·π 110 Β·π¦ 540 Β·π 410 - 43% open Β·β±οΈ 27.01.2021):git clone https://github.com/zarr-developers/zarr-python
-
PyPi (
π₯ 83K / month Β·π¦ 72 Β·β±οΈ 02.12.2020):pip install zarr
-
Conda (
π₯ 550K Β·β±οΈ 03.12.2020):conda install -c conda-forge zarr
Arctic (
π₯
24 Β·
β
2.2K) - Arctic is a high performance datastore for numeric data. βοΈLGPL-2.1
-
GitHub (
π¨βπ» 71 Β·π 440 Β·π₯ 92 Β·π¦ 100 Β·π 510 - 17% open Β·β±οΈ 26.01.2021):git clone https://github.com/man-group/arctic
-
PyPi (
π₯ 1.4K / month Β·π¦ 42 Β·β±οΈ 01.12.2020):pip install arctic
-
Conda (
π₯ 13K Β·β±οΈ 16.12.2019):conda install -c conda-forge arctic
Vaex (
π₯
22 Β·
β
5.7K) - Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data.. MIT
Pandaral·lel (
π₯
22 Β·
β
1.4K) - A simple and efficient tool to parallelize Pandas.. BSD-3
datatable (
π₯
20 Β·
β
1.1K) - A Python package for manipulating 2-dimensional tabular data.. MPL-2.0
StaticFrame (
π₯
20 Β·
β
210) - The StaticFrame library defines the Series and Frame, immutable data.. MIT
-
GitHub (
π¨βπ» 15 Β·π 20 Β·π¦ 6 Β·π 280 - 11% open Β·β±οΈ 04.02.2021):git clone https://github.com/InvestmentSystems/static-frame
-
PyPi (
π₯ 640 / month Β·β±οΈ 25.01.2021):pip install static-frame
-
Conda (
π₯ 62K Β·β±οΈ 26.01.2021):conda install -c conda-forge static-frame
Bounter (
π₯
18 Β·
β
890) - Efficient Counter that uses a limited (bounded) amount of memory.. MIT
PandaPy (
π₯
13 Β·
β
470 Β·
π
) - PandaPy has the speed of NumPy and the usability of Pandas 10x to.. MIT
Show 5 hidden projects...
- Blaze (
π₯ 28 Β·β 2.9K Β·π ) - NumPy and Pandas interface to Big Data.BSD-3
- sklearn-pandas (
π₯ 28 Β·β 2.3K) - Pandas integration with sklearn.βοΈZlib
- pandasql (
π₯ 22 Β·β 940 Β·π ) - sqldf for pandas.MIT
- pickleDB (
π₯ 21 Β·β 540 Β·π ) - pickleDB is an open source key-value store using Python's json..BSD-3
- Pandas Summary (
π₯ 21 Β·β 360 Β·π ) - An extension to pandas dataframes describe function.MIT
Data Loading & Extraction
Libraries for loading, collecting, and extracting data from a variety of data sources and formats.
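A common loading pattern is to stream rows from a tabular source instead of reading everything into memory. The following standard-library sketch wraps `csv.DictReader` in a generator and validates the header first. The `stream_rows` helper, the column names, and the sample data are assumptions made for this example.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def stream_rows(
    text_stream: Iterable[str], required: List[str]
) -> Iterator[Dict[str, str]]:
    """Yield CSV rows one at a time after checking required columns exist."""
    reader = csv.DictReader(text_stream)
    missing = [name for name in required if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        yield row


# An in-memory stream stands in for a file or network response.
SAMPLE = "city,population\nBerlin,3645000\nParis,2161000\n"

total = sum(
    int(row["population"])
    for row in stream_rows(io.StringIO(SAMPLE), ["city", "population"])
)
print(total)
```

Because `stream_rows` is a generator, the header check runs when iteration starts, and only one row is held in memory at a time.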
TensorFlow Datasets (
π₯
31 Β·
β
2.7K) - TFDS is a collection of datasets ready to use with.. Apache-2
python-magic (
π₯
31 Β·
β
1.8K) - A python wrapper for libmagic. MIT
-
GitHub (
π¨βπ» 49 Β·π 220 Β·π¦ 12K Β·π 160 - 18% open Β·β±οΈ 16.01.2021):git clone https://github.com/ahupp/python-magic
-
PyPi (
π₯ 1M / month Β·π¦ 5.1K Β·β±οΈ 16.01.2021):pip install python-magic
-
Conda (
π₯ 75K Β·β±οΈ 24.12.2020):conda install -c conda-forge python-magic
xmltodict (
π₯
30 Β·
β
4.3K Β·
π€
) - Python module that makes working with XML feel like you are.. MIT
-
GitHub (
π¨βπ» 41 Β·π 400 Β·π¦ 21K Β·π 200 - 31% open Β·β±οΈ 26.04.2020):git clone https://github.com/martinblech/xmltodict
-
PyPi (
π₯ 2.9M / month Β·π¦ 8.2K Β·β±οΈ 11.02.2019):pip install xmltodict
-
Conda (
π₯ 640K Β·β±οΈ 11.02.2019):conda install -c conda-forge xmltodict
smart-open (
π₯
30 Β·
β
1.9K) - Utils for streaming large files (S3, HDFS, gzip, bz2...). MIT
Datasets (
π₯
29 Β·
β
6.7K) - The largest hub of ready-to-use NLP datasets for ML models with.. Apache-2
pandas-datareader (
π₯
29 Β·
β
1.8K) - Extract data from a wide range of Internet sources.. BSD-3
-
GitHub (
π¨βπ» 77 Β·π 500 Β·π¦ 7.7K Β·π 440 - 16% open Β·β±οΈ 31.12.2020):git clone https://github.com/pydata/pandas-datareader
-
PyPi (
π₯ 83K / month Β·π¦ 1.4K Β·β±οΈ 10.07.2020):pip install pandas-datareader
-
Conda (
π₯ 91K Β·β±οΈ 20.11.2019):conda install -c conda-forge pandas-datareader
csvkit (
π₯
28 Β·
β
4.5K) - A suite of utilities for converting to and working with CSV, the king of.. MIT
snorkel (
π₯
26 Β·
β
4.4K) - A system for quickly generating training data with weak supervision. Apache-2
-
GitHub (
π¨βπ» 62 Β·π 720 Β·π₯ 500 Β·π¦ 66 Β·π 940 - 3% open Β·β±οΈ 05.09.2020):git clone https://github.com/snorkel-team/snorkel
-
PyPi (
π₯ 55K / month Β·π¦ 4 Β·β±οΈ 07.04.2020):pip install snorkel
-
Conda (
π₯ 15K Β·β±οΈ 10.04.2020):conda install -c conda-forge snorkel
tabulator-py (
π₯
25 Β·
β
200) - Python library for reading and writing tabular data via streams. MIT
Intake (
π₯
24 Β·
β
520) - Intake is a lightweight package for finding, investigating, loading and.. BSD-2
SDV (
π₯
21 Β·
β
320) - Synthetic Data Generation for tabular, relational and time series data. MIT
Show 8 hidden projects...
- PDFMiner (
π₯ 27 Β·β 4.5K Β·π ) - Python PDF Parser (Not actively maintained). Check out pdfminer.six.MIT
- textract (
π₯ 26 Β·β 2.9K Β·π ) - extract text from any document. no muss. no fuss.MIT
- Singer (
π₯ 24 Β·β 690) - Standard for moving data between databases, web APIs, files, queues,..βοΈAGPL-3.0
- Camelot (
π₯ 23 Β·β 3K Β·π ) - Camelot: PDF Table Extraction for Humans.MIT
- messytables (
π₯ 23 Β·β 360 Β·π ) - Tools for parsing messy tabular data. This is now superseded by..MIT
- pyexcel-xlsx (
π₯ 22 Β·β 85) - A wrapper library to read, manipulate and write data in xlsx and..BSD-3
- openpyxl (
π₯ 22 Β·β 17) - A Python library to read/write Excel 2010 xlsx/xlsm files.MIT
- rows (
π₯ 20 Β·β 740) - A common, beautiful interface to tabular data, no matter the format.βοΈLGPL-3.0
Web Scraping & Crawling
Libraries for web scraping, crawling, downloading, and mining.
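To show the core step of a crawler, the sketch below extracts and resolves links from an HTML string using only `html.parser` and `urllib.parse`. It performs no network requests. The `LinkExtractor` class, the `extract_links` helper, and the example URLs are illustrative assumptions.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> List[str]:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


PAGE = '<a href="/docs">Docs</a> <a href="https://example.org/x">X</a> <a>no link</a>'
print(extract_links(PAGE, "https://example.com/start"))
```

A real crawler would add fetching, politeness delays, robots.txt handling, and deduplication on top of this step.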
Data Pipelines & Streaming
Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
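At their core, most workflow tools run tasks in an order that respects declared dependencies. The sketch below shows that idea with `graphlib.TopologicalSorter` from the standard library (Python 3.9 or newer). The `run_pipeline` function, the task names, and the way upstream results are passed as positional arguments are all assumptions for illustration, not the design of any listed framework.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple


def run_pipeline(
    tasks: Dict[str, Callable[..., Any]],
    dependencies: Dict[str, Tuple[str, ...]],
) -> Tuple[List[str], Dict[str, Any]]:
    """Run each task after its dependencies, feeding it their results."""
    # The sorter expects a mapping of node -> predecessors.
    order = list(TopologicalSorter(dependencies).static_order())
    results: Dict[str, Any] = {}
    for name in order:
        upstream = [results[dep] for dep in dependencies.get(name, ())]
        results[name] = tasks[name](*upstream)
    return order, results


# Hypothetical three-step batch job.
tasks = {
    "extract": lambda: [3, 1, 2],
    "clean": lambda rows: sorted(rows),
    "report": lambda rows: sum(rows),
}
dependencies = {
    "extract": (),
    "clean": ("extract",),
    "report": ("clean",),
}

order, results = run_pipeline(tasks, dependencies)
print(order, results["report"])
```

`TopologicalSorter` raises `graphlib.CycleError` when the dependencies contain a cycle, which is a useful early check before any task runs.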
Celery (
π₯
37 Β·
β
17K) - Asynchronous task queue/job queue based on distributed message passing. BSD-3
Airflow (
π₯
35 Β·
β
21K) - Platform to programmatically author, schedule, and monitor workflows. Apache-2
-
GitHub (
π¨βπ» 1.7K Β·π 7.9K Β·π₯ 85K Β·π 2.9K - 33% open Β·β±οΈ 04.02.2021):git clone https://github.com/apache/airflow
-
PyPi (
π₯ 440K / month Β·π¦ 290 Β·β±οΈ 14.12.2020):pip install apache-airflow
-
Conda (
π₯ 270K Β·β±οΈ 26.11.2020):conda install -c conda-forge airflow
-
Docker Hub (
π₯ 5.8M Β·β 210 Β·β±οΈ 04.02.2021):docker pull apache/airflow
luigi (
π₯
33 Β·
β
14K) - Luigi is a Python module that helps you build complex pipelines of batch.. Apache-2
Beam (
π₯
32 Β·
β
4.6K) - Unified programming model to define and execute data processing.. Apache-2
dbt (
π₯
29 Β·
β
2.5K) - dbt (data build tool) enables data analysts and engineers to transform.. Apache-2
Kedro (
π₯
28 Β·
β
3.4K) - A Python framework for creating reproducible, maintainable and modular.. Apache-2
Dagster (
π₯
27 Β·
β
2.9K) - A data orchestrator for machine learning, analytics, and ETL. Apache-2
PyFunctional (
π₯
26 Β·
β
1.8K) - Python library for creating data pipelines with chain functional.. MIT
TFX (
π₯
26 Β·
β
1.3K) - TFX is an end-to-end platform for deploying production ML pipelines. Apache-2
streamparse (
π₯
25 Β·
β
1.4K) - Run Python in Apache Storm topologies. Pythonic API, CLI.. Apache-2
Great Expectations (
π₯
24 Β·
β
3.6K) - Always know what to expect from your data. Apache-2
Optimus (
π₯
23 Β·
β
970) - Agile Data Preparation Workflows made easy with dask, cudf,.. Apache-2
Hub (
π₯
23 Β·
β
860) - Fastest unstructured dataset management for TensorFlow/PyTorch... MPL-2.0
pysparkling (
π₯
23 Β·
β
230) - A pure Python implementation of Apache Spark's RDD and DStream.. MIT
mrq (
π₯
20 Β·
β
830) - Mr. Queue - A distributed worker task queue in Python using Redis & gevent. MIT
spark-deep-learning (
π₯
18 Β·
β
1.8K) - Deep Learning Pipelines for Apache Spark. Apache-2
-
GitHub (
π¨βπ» 15 Β·π 440 Β·π¦ 16 Β·π 100 - 74% open Β·β±οΈ 20.01.2021):git clone https://github.com/databricks/spark-deep-learning
Databolt Flow (
π₯
18 Β·
β
900) - Python library for building highly effective data science workflows. MIT
Mara Pipelines (
π₯
17 Β·
β
1.6K) - A lightweight opinionated ETL framework, halfway between plain.. MIT
BatchFlow (
π₯
17 Β·
β
150) - BatchFlow helps you conveniently work with random or sequential.. Apache-2
zenml (
π₯
14 Β·
β
620 Β·
π£
) - ZenML: Bring Zen to your ML with reproducible pipelines. Apache-2
Show 2 hidden projects...
Distributed Machine Learning
Libraries that provide capabilities to distribute and parallelize machine learning tasks across large-scale compute infrastructure.
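The basic pattern behind data-parallel training is to compute gradients on separate data shards and then average them. The sketch below simulates that on one machine with threads and a one-parameter linear model. The function names, the toy data, and the learning rate are assumptions; real frameworks distribute the shards across processes or machines and use collective communication instead of a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Shard = List[Tuple[float, float]]


def mse_gradient(weight: float, shard: Shard) -> float:
    """Gradient of mean squared error for the model y = weight * x on one shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


def data_parallel_gradient(weight: float, shards: List[Shard]) -> float:
    """Compute per-shard gradients concurrently, then average them."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        grads = list(pool.map(lambda shard: mse_gradient(weight, shard), shards))
    # With equal-sized shards this equals the full-batch gradient.
    return sum(grads) / len(grads)


data: Shard = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # y = 2x
shards = [data[:2], data[2:]]

weight = 1.0
for _ in range(20):
    weight -= 0.05 * data_parallel_gradient(weight, shards)
print(weight)
```

If shards differ in size, the average should be weighted by shard length to match the full-batch gradient.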
dask.distributed (
π₯
34 Β·
β
1.1K) - A distributed task scheduler for Dask. BSD-3
-
GitHub (
π¨βπ» 230 Β·π 510 Β·π¦ 17K Β·π 2K - 36% open Β·β±οΈ 04.02.2021):git clone https://github.com/dask/distributed
-
PyPi (
π₯ 670K / month Β·π¦ 1.8K Β·β±οΈ 22.01.2021):pip install distributed
-
Conda (
π₯ 3.6M Β·β±οΈ 23.01.2021):conda install -c conda-forge distributed
Ray (
π₯
33 Β·
β
15K) - An open source framework that provides a simple, universal API for.. Apache-2
horovod (
π₯
30 Β·
β
11K) - Distributed training framework for TensorFlow, Keras, PyTorch, and.. Apache-2
ipyparallel (
π₯
28 Β·
β
1.9K) - Interactive Parallel Computing in Python. BSD-3
-
GitHub (
π¨βπ» 94 Β·π 730 Β·π¦ 1.4K Β·π 250 - 57% open Β·β±οΈ 24.08.2020):git clone https://github.com/ipython/ipyparallel
-
PyPi (
π₯ 35K / month Β·π¦ 490 Β·β±οΈ 05.05.2020):pip install ipyparallel
-
Conda (
π₯ 370K Β·β±οΈ 22.01.2021):conda install -c conda-forge ipyparallel
BigDL (
π₯
25 Β·
β
3.7K) - BigDL: Distributed Deep Learning Framework for Apache Spark. Apache-2
-
GitHub (
π¨βπ» 71 Β·π 900 Β·π¦ 22 Β·π 910 - 20% open Β·β±οΈ 21.01.2021):git clone https://github.com/intel-analytics/BigDL
-
PyPi (
π₯ 1.8K / month Β·π¦ 6 Β·β±οΈ 29.12.2020):pip install bigdl
-
Maven (
β±οΈ 05.12.2020):<dependency> <groupId>com.intel.analytics.bigdl</groupId> <artifactId>bigdl-SPARK_2.4</artifactId> <version>[VERSION]</version> </dependency>
TensorFlowOnSpark (
π₯
24 Β·
β
3.6K) - TensorFlowOnSpark brings TensorFlow programs to.. Apache-2
petastorm (
π₯
24 Β·
β
1.1K) - Petastorm library enables single machine or distributed training.. Apache-2
DeepSpeed (
π₯
23 Β·
β
4.2K) - DeepSpeed is a deep learning optimization library that makes.. MIT
-
GitHub (
π¨βπ» 42 Β·π 390 Β·π¦ 9 Β·π 300 - 45% open Β·β±οΈ 02.02.2021):git clone https://github.com/microsoft/DeepSpeed
-
PyPi (
π₯ 1.8K / month Β·β±οΈ 02.02.2021):pip install deepspeed
-
Docker Hub (
π₯ 7K Β·β 2 Β·β±οΈ 20.11.2020):docker pull deepspeed/deepspeed
analytics-zoo (
π₯
22 Β·
β
2.2K) - Distributed Tensorflow, Keras and PyTorch on Apache.. Apache-2
FairScale (
π₯
20 Β·
β
750) - PyTorch extensions for high performance and large scale training. BSD-3
Apache Singa (
π₯
19 Β·
β
2.2K) - A distributed deep learning platform. Apache-2
-
GitHub (
π¨βπ» 70 Β·π 610 Β·π 72 - 55% open Β·β±οΈ 15.01.2021):git clone https://github.com/apache/singa
-
Conda (
π₯ 260 Β·β±οΈ 20.01.2021):conda install -c nusdbsystem singa
-
Docker Hub (
π₯ 160 Β·β 2 Β·β±οΈ 04.06.2019):docker pull apache/singa
BytePS (
π₯
18 Β·
β
2.7K) - A high performance and generic framework for distributed DNN training. Apache-2
-
GitHub (
π¨βπ» 17 Β·π 360 Β·π 210 - 34% open Β·β±οΈ 03.02.2021):git clone https://github.com/bytedance/byteps
-
PyPi (
π₯ 72 / month Β·β±οΈ 04.11.2020):pip install byteps
-
Docker Hub (
π₯ 1K Β·β±οΈ 03.03.2020):docker pull bytepsimage/tensorflow
somoclu (
π₯
17 Β·
β
220 Β·
π€
) - Massively parallel self-organizing maps: accelerate training on.. MIT
Hivemind (
π₯
15 Β·
β
650) - Decentralized deep learning in PyTorch. Built to train models on.. MIT
Show 3 hidden projects...
- DEAP (
π₯ 28 Β·β 4.1K) - Distributed Evolutionary Algorithms in Python.βοΈLGPL-3.0
- TensorFrames (
π₯ 18 Β·β 770 Β·π ) - [DEPRECATED] Tensorflow wrapper for DataFrames on..Apache-2
- LazyCluster (
π₯ 12 Β·β 33) - Distributed machine learning made simple.Apache-2
Hyperparameter Optimization & AutoML
Libraries for hyperparameter optimization, AutoML, and neural architecture search.
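For orientation, here is a minimal sketch of random search, the simplest baseline that hyperparameter optimization tools improve upon. The `objective` function, the parameter names `lr` and `depth`, and their ranges are assumptions made up for this example. In practice the objective would train a model and return a validation loss.

```python
import random


def objective(params):
    """Toy validation loss with its minimum at lr=0.1, depth=5 (hypothetical)."""
    return (params["lr"] - 0.1) ** 2 + 0.01 * (params["depth"] - 5) ** 2


def random_search(n_trials, seed=0):
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Sample each hyperparameter independently from its search range.
        params = {"lr": rng.uniform(0.001, 1.0), "depth": rng.randint(1, 10)}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(n_trials=200)
print(best_params, best_loss)
```

Bayesian and evolutionary methods replace the independent sampling step with a strategy that uses the losses observed so far.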
Hyperopt (
π₯
30 Β·
β
5.4K) - Distributed Asynchronous Hyperparameter Optimization in Python. BSD-3
-
GitHub (
π¨βπ» 84 Β·π 860 Β·π¦ 2.6K Β·π 550 - 59% open Β·β±οΈ 24.12.2020):git clone https://github.com/hyperopt/hyperopt
-
PyPi (
π₯ 310K / month Β·π¦ 500 Β·β±οΈ 07.10.2020):pip install hyperopt
-
Conda (
π₯ 170K Β·β±οΈ 14.10.2020):conda install -c conda-forge hyperopt
scikit-optimize (
π₯
30 Β·
β
2K) - Sequential model-based optimization with a `scipy.optimize`.. BSD-3
-
GitHub (
π¨βπ» 68 Β·π 370 Β·π¦ 1.3K Β·π 520 - 31% open Β·β±οΈ 31.12.2020):git clone https://github.com/scikit-optimize/scikit-optimize
-
PyPi (
π₯ 300K / month Β·π¦ 160 Β·β±οΈ 04.09.2020):pip install scikit-optimize
-
Conda (
π₯ 200K Β·β±οΈ 04.09.2020):conda install -c conda-forge scikit-optimize
Keras Tuner (
π₯
28 Β·
β
2.2K) - Hyperparameter tuning for humans. Apache-2
NNI (
π₯
27 Β·
β
9K) - An open source AutoML toolkit to automate the machine learning lifecycle, including.. MIT
featuretools (
π₯
27 Β·
β
5.4K) - An open source python library for automated feature engineering. BSD-3
-
GitHub (
π¨βπ» 49 Β·π 700 Β·π¦ 670 Β·π 510 - 21% open Β·β±οΈ 01.02.2021):git clone https://github.com/alteryx/featuretools
-
PyPi (
π₯ 44K / month Β·π¦ 70 Β·β±οΈ 29.01.2021):pip install featuretools
-
Conda (
π₯ 44K Β·β±οΈ 01.02.2021):conda install -c conda-forge featuretools
auto-sklearn (
π₯
26 Β·
β
5.2K) - Automated Machine Learning with scikit-learn. BSD-3
Bayesian Optimization (
π₯
26 Β·
β
4.8K) - A Python implementation of global optimization with.. MIT
nevergrad (
π₯
25 Β·
β
2.8K) - A Python toolbox for performing gradient-free optimization. MIT
-
GitHub (
π¨βπ» 42 Β·π 270 Β·π¦ 120 Β·π 200 - 40% open Β·β±οΈ 28.01.2021):git clone https://github.com/facebookresearch/nevergrad
-
PyPi (
π₯ 7.3K / month Β·π¦ 14 Β·β±οΈ 28.01.2021):pip install nevergrad
-
Conda (
π₯ 5.9K Β·β±οΈ 14.12.2020):conda install -c conda-forge nevergrad
AdaNet (
π₯
21 Β·
β
3.2K Β·
π€
) - Fast and flexible AutoML with learning guarantees. Apache-2
Neuraxle (
π₯
21 Β·
β
360) - A Sklearn-like Framework for Hyperparameter Tuning and AutoML in.. Apache-2
mljar-supervised (
π₯
20 Β·
β
760) - Automates Machine Learning Pipeline with Feature Engineering.. MIT
Test Tube (
π₯
20 Β·
β
650 Β·
π€
) - Python library to easily log experiments and parallelize.. MIT
Auto ViML (
π₯
20 Β·
β
200) - Automatically Build Multiple ML Models with a Single Line of Code... Apache-2
lazypredict (
π₯
19 Β·
β
360) - Lazy Predict helps build a lot of basic models without much code.. MIT
Dragonfly (
π₯
17 Β·
β
560 Β·
π€
) - An open source python library for scalable Bayesian optimisation. MIT
HyperparameterHunter (
π₯
16 Β·
β
640) - Easy hyperparameter optimization and automatic result.. MIT
Auto Tune Models (
π₯
16 Β·
β
500 Β·
π€
) - Auto Tune Models - A multi-tenant, multi-data system for.. MIT
AlphaPy (
π₯
15 Β·
β
540) - Automated Machine Learning [AutoML] with Python, scikit-learn, Keras,.. Apache-2
Parfit (
π₯
15 Β·
β
200 Β·
π€
) - A package for parallelizing the fit and flexibly scoring of.. MIT
ENAS (
π₯
14 Β·
β
2.4K Β·
π€
) - PyTorch implementation of Efficient Neural Architecture Search via.. Apache-2
-
GitHub (
π¨βπ» 6 Β·π 450 Β·π 44 - 84% open Β·β±οΈ 16.06.2020):git clone https://github.com/carpedm20/ENAS-pytorch
Devol (
π₯
11 Β·
β
920 Β·
π€
) - Genetic neural architecture search with Keras. MIT
-
GitHub (
π¨βπ» 18 Β·π 110 Β·π 27 - 25% open Β·β±οΈ 05.07.2020):git clone https://github.com/joeddav/devol
Show 13 hidden projects...
- TPOT (
π₯ 29 Β·β 7.8K) - A Python Automated Machine Learning tool that optimizes machine..βοΈLGPL-3.0
- auto_ml (
π₯ 21 Β·β 1.5K Β·π ) - [UNMAINTAINED] Automated machine learning for analytics & production.MIT
- MLBox (
π₯ 20 Β·β 1.2K Β·π ) - MLBox is a powerful Automated Machine Learning python library.βοΈBSD-1-Clause
- HpBandSter (
π₯ 20 Β·β 440 Β·π ) - a distributed Hyperband implementation on Steroids.BSD-3
- Advisor (
π₯ 17 Β·β 1.3K Β·π ) - Open-source implementation of Google Vizier for hyper parameters..Apache-2
- sklearn-deap (
π₯ 17 Β·β 620 Β·π ) - Use evolutionary algorithms instead of gridsearch in..MIT
- Sherpa (
π₯ 17 Β·β 280) - Hyperparameter optimization that enables researchers to experiment,..βοΈGPL-3.0
- automl-gs (
π₯ 16 Β·β 1.7K Β·π ) - Provide an input CSV and a target field to predict, generate a..MIT
- Xcessiv (
π₯ 16 Β·β 1.3K Β·π ) - A web-based application for quick, scalable, and automated..Apache-2
- Auptimizer (
π₯ 13 Β·β 160) - An automatic ML model optimization tool.βοΈGPL-3.0
- Hypermax (
π₯ 13 Β·β 94) - Better, faster hyper-parameter optimization.BSD-3
- featurewiz (
π₯ 12 Β·β 21 Β·π£ ) - Use advanced feature engineering strategies and select the..Apache-2
- Hypertunity (
π₯ 10 Β·β 120 Β·π ) - A toolset for black-box hyperparameter optimisation.Apache-2
Reinforcement Learning
Libraries for building and evaluating reinforcement learning & agent-based systems.
OpenAI Gym (
π₯
35 Β·
β
23K) - A toolkit for developing and comparing reinforcement learning.. MIT
TensorLayer (
π₯
27 Β·
β
6.5K) - Deep Learning and Reinforcement Learning Library for.. Apache-2
Dopamine (
π₯
26 Β·
β
9.3K) - Dopamine is a research framework for fast prototyping of.. Apache-2
TensorForce (
π₯
25 Β·
β
2.9K) - Tensorforce: a TensorFlow library for applied.. Apache-2
ViZDoom (
π₯
25 Β·
β
1.2K) - Doom-based AI Research Platform for Reinforcement Learning from Raw.. MIT
Stable Baselines (
π₯
24 Β·
β
2.9K) - A fork of OpenAI Baselines, implementations of reinforcement.. MIT
ChainerRL (
π₯
23 Β·
β
920) - ChainerRL is a deep reinforcement learning library built on top of.. MIT
PARL (
π₯
21 Β·
β
1.8K) - A high-performance distributed training framework for Reinforcement.. Apache-2
RLax (
π₯
17 Β·
β
550) - A library of reinforcement learning building blocks in JAX. Apache-2
jax
ReAgent (
π₯
16 Β·
β
2.8K) - A platform for Reasoning systems (Reinforcement Learning,.. BSD-3
-
GitHub (
π¨βπ» 86 Β·π 380 Β·π 94 - 21% open Β·β±οΈ 02.02.2021):git clone https://github.com/facebookresearch/ReAgent
Show 3 hidden projects...
- baselines (
π₯ 28 Β·β 11K Β·π ) - OpenAI Baselines: high-quality implementations of reinforcement..MIT
- keras-rl (
π₯ 24 Β·β 4.9K Β·π ) - Deep Reinforcement Learning for Keras.MIT
- DeepMind Lab (
π₯ 17 Β·β 6.4K) - A customisable 3D platform for agent-based AI research.βοΈGPL-2.0
Recommender Systems
Libraries for building and evaluating recommendation systems.
implicit (
π₯
28 Β·
β
2.2K) - Fast Python Collaborative Filtering for Implicit Feedback Datasets. MIT
lightfm (
π₯
27 Β·
β
3.5K) - A Python implementation of LightFM, a hybrid recommendation algorithm. Apache-2
scikit-surprise (
π₯
26 Β·
β
4.7K) - A Python scikit for building and analyzing recommender.. BSD-3
-
GitHub (
π¨βπ» 38 Β·π 820 Β·π¦ 890 Β·π 320 - 10% open Β·β±οΈ 05.08.2020):git clone https://github.com/NicolasHug/Surprise
-
PyPi (
π₯ 25K / month Β·π¦ 24 Β·β±οΈ 19.07.2020):pip install scikit-surprise
-
Conda (
π₯ 150K Β·β±οΈ 13.10.2020):conda install -c conda-forge scikit-surprise
TF Ranking (
π₯
22 Β·
β
2K) - Learning to Rank in TensorFlow. Apache-2
Recommenders (
π₯
21 Β·
β
9.1K) - Best Practices on Recommendation Systems. MIT
-
GitHub (
π¨βπ» 90 Β·π 1.6K Β·π¦ 1 Β·π 540 - 19% open Β·β±οΈ 04.02.2021):git clone https://github.com/microsoft/recommenders
tensorrec (
π₯
20 Β·
β
1.1K Β·
π€
) - A TensorFlow recommendation algorithm and framework in.. Apache-2
TF Recommenders (
π₯
19 Β·
β
710) - TensorFlow Recommenders is a library for building.. Apache-2
recmetrics (
π₯
19 Β·
β
230) - A library of metrics for evaluating recommender systems. MIT
Case Recommender (
π₯
16 Β·
β
320 Β·
π€
) - Case Recommender: A Flexible and Extensible Python.. MIT
OpenRec (
π₯
14 Β·
β
350 Β·
π€
) - OpenRec is an open-source and modular library for neural network-.. Apache-2
Privacy Machine Learning
Libraries for encrypted and privacy-preserving machine learning using methods like federated learning & differential privacy.
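To make the differential privacy idea concrete, the sketch below applies the Laplace mechanism to a counting query using only the standard library. The `ages` data and the `private_count` helper are hypothetical and are not taken from any of the listed projects.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    true_count = sum(1 for v in values if predicate(v))
    # A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 29]  # hypothetical records
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
print(noisy)
```

A smaller `epsilon` gives stronger privacy and a noisier answer.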
TensorFlow Privacy (
π₯
21 Β·
β
1.3K) - Library for training machine learning models with.. Apache-2
FATE (
π₯
20 Β·
β
2.7K) - An Industrial Grade Federated Learning Framework. Apache-2
-
GitHub (
π¨βπ» 50 Β·π 800 Β·π 800 - 34% open Β·β±οΈ 17.01.2021):git clone https://github.com/FederatedAI/FATE
TFEncrypted (
π₯
20 Β·
β
820) - A Framework for Encrypted Machine Learning in TensorFlow. Apache-2
Workflow & Experiment Tracking
Libraries to organize, track, and visualize machine learning experiments.
Tensorboard (
π₯
36 Β·
β
5.1K) - TensorFlow's Visualization Toolkit. Apache-2
-
GitHub (
π¨βπ» 250 Β·π 1.4K Β·π¦ 53K Β·π 1.5K - 36% open Β·β±οΈ 04.02.2021):git clone https://github.com/tensorflow/tensorboard
-
PyPi (
π₯ 3.7M / month Β·π¦ 3.6K Β·β±οΈ 12.11.2020):pip install tensorboard
-
Conda (
π₯ 1.6M Β·β±οΈ 15.01.2021):conda install -c conda-forge tensorboard
wandb client (
π₯
30 Β·
β
2.7K) - A tool for visualizing and tracking your machine learning.. MIT
SageMaker SDK (
π₯
30 Β·
β
1.3K) - A library for training and deploying machine learning.. Apache-2
sacred (
π₯
29 Β·
β
3.3K) - Sacred is a tool to help you configure, organize, log and reproduce.. MIT
snakemake (
π₯
29 Β·
β
840) - This is the development home of the workflow management system.. MIT
-
GitHub (
π¨βπ» 200 Β·π 190 Β·π¦ 740 Β·π 530 - 60% open Β·β±οΈ 03.02.2021):git clone https://github.com/snakemake/snakemake
-
PyPi (
π₯ 6.2K / month Β·π¦ 290 Β·β±οΈ 15.01.2021):pip install snakemake
-
Conda (
π₯ 270K Β·β±οΈ 20.01.2021):conda install -c bioconda snakemake
AzureML SDK (
π₯
28 Β·
β
2.1K) - Python notebooks with ML and deep learning examples with Azure.. MIT
tensorboardX (
π₯
27 Β·
β
6.8K Β·
π€
) - tensorboard for pytorch (and chainer, mxnet, numpy, ...). MIT
-
GitHub (
π¨βπ» 64 Β·π 780 Β·π₯ 290 Β·π¦ 10K Β·π 410 - 17% open Β·β±οΈ 05.07.2020):git clone https://github.com/lanpa/tensorboardX
-
PyPi (
π₯ 210K / month Β·π¦ 1.3K Β·β±οΈ 31.12.2019):pip install tensorboardX
-
Conda (
π₯ 270K Β·β±οΈ 06.07.2020):conda install -c conda-forge tensorboardx
Metaflow (
π₯
26 Β·
β
4K) - Build and manage real-life data science projects with ease. Apache-2
ClearML (
π₯
24 Β·
β
2.2K) - ClearML - Auto-Magical Suite of tools to streamline your ML.. Apache-2
-
GitHub (
π¨βπ» 25 Β·π 320 Β·π₯ 260 Β·π¦ 11 Β·π 260 - 21% open Β·β±οΈ 04.02.2021):git clone https://github.com/allegroai/clearml
-
PyPi (
π₯ 1.7K / month Β·β±οΈ 04.02.2021):pip install clearml
-
Docker Hub (
π₯ 30K Β·β±οΈ 05.10.2020):docker pull allegroai/trains
livelossplot (
π₯
24 Β·
β
1K) - Live training loss plot in Jupyter Notebook for Keras, PyTorch.. MIT
ml-metadata (
π₯
24 Β·
β
280) - For recording and retrieving metadata associated with ML.. Apache-2
TensorWatch (
π₯
23 Β·
β
3K) - Debugging, monitoring and visualization for Python Machine Learning.. MIT
knockknock (
π₯
22 Β·
β
2K Β·
π€
) - Knock Knock: Get notified when your training ends with only two.. MIT
-
GitHub (
π¨βπ» 18 Β·π 160 Β·π¦ 140 Β·π 33 - 36% open Β·β±οΈ 16.03.2020):git clone https://github.com/huggingface/knockknock
-
PyPi (
π₯ 1.5K / month Β·π¦ 3 Β·β±οΈ 16.03.2020):pip install knockknock
-
Conda (
π₯ 5.4K Β·β±οΈ 17.03.2020):conda install -c conda-forge knockknock
lore (
π₯
21 Β·
β
1.5K Β·
π€
) - Lore makes machine learning approachable for Software Engineers and.. MIT
Labml (
π₯
20 Β·
β
390) - Monitor PyTorch & TensorFlow model training from your mobile phone. MIT
hiddenlayer (
π₯
19 Β·
β
1.4K Β·
π€
) - Neural network graphs and training metrics for.. MIT
aim (
π₯
15 Β·
β
840) - Aim: a super-easy way to record, search and compare 1000s of ML training.. Apache-2
Show 7 hidden projects...
- TensorBoard Logger (
π₯ 19 Β·β 610 Β·π ) - Log TensorBoard events without touching TensorFlow.MIT
- MXBoard (
π₯ 19 Β·β 330 Β·π ) - Logging MXNet data for visualization in TensorBoard.Apache-2
- SKLL (
π₯ 16 Β·β 520) - SciKit-Learn Laboratory (SKLL) makes it easy to run machine..βοΈBSD-1-Clause
- datmo (
π₯ 16 Β·β 330 Β·π ) - Open source production model management tool for data scientists.MIT
- steppy (
π₯ 16 Β·β 130 Β·π ) - Lightweight, Python library for fast and reproducible experimentation.MIT
- ModelChimp (
π₯ 14 Β·β 120) - Experiment tracking for machine and deep learning projects.BSD-2
- traintool (
π₯ 9 Β·β 9 Β·π£ ) - Train off-the-shelf machine learning models in one..Apache-2
Model Serialization & Conversion
Libraries to serialize models to files, convert between a variety of model formats, and optimize models for deployment.
onnx (
π₯
33 Β·
β
9.7K) - Open standard for machine learning interoperability. Apache-2
-
GitHub (
π¨βπ» 190 Β·π 1.8K Β·π₯ 10K Β·π¦ 2.5K Β·π 1.4K - 34% open Β·β±οΈ 04.02.2021):git clone https://github.com/onnx/onnx
-
PyPi (
π₯ 240K / month Β·π¦ 300 Β·β±οΈ 29.01.2021):pip install onnx
-
Conda (
π₯ 190K Β·β±οΈ 03.02.2021):conda install -c conda-forge onnx
Core ML Tools (
π₯
26 Β·
β
2.1K) - Core ML tools contain supporting tools for Core ML model.. BSD-3
TorchServe (
π₯
24 Β·
β
1.6K) - Model Serving on PyTorch. Apache-2
-
GitHub (
π¨βπ» 61 Β·π 230 Β·π₯ 190 Β·π¦ 27 Β·π 530 - 24% open Β·β±οΈ 13.01.2021):git clone https://github.com/pytorch/serve
-
PyPi (
π₯ 3.6K / month Β·β±οΈ 17.12.2020):pip install torchserve
-
Conda (
π₯ 6.4K Β·β±οΈ 17.12.2020):conda install -c pytorch torchserve
-
Docker Hub (
π₯ 51K Β·β 3 Β·β±οΈ 18.12.2020):docker pull pytorch/torchserve
mmdnn (
π₯
22 Β·
β
5.2K) - MMdnn is a set of tools to help users inter-operate among different deep.. MIT
m2cgen (
π₯
22 Β·
β
1.8K) - Transform ML models into native code (Java, C, Python, Go, JavaScript,.. MIT
Hummingbird (
π₯
20 Β·
β
2.2K) - Hummingbird compiles trained ML models into tensor computation for.. MIT
pytorch2keras (
π₯
18 Β·
β
660 Β·
π€
) - PyTorch to Keras model converter. MIT
Show 2 hidden projects...
- Larq Compute Engine (
π₯ 17 Β·β 130) - Highly optimized inference engine for Binarized..Apache-2
- sklearn-porter (
π₯ 16 Β·β 960 Β·π ) - Transpile trained scikit-learn estimators to C, Java,..MIT
Model Interpretability
Libraries to visualize, explain, debug, evaluate, and interpret machine learning models.
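One model-agnostic technique that appears in this category is permutation importance: shuffle one feature column and measure how much the error grows. The sketch below is a plain-Python illustration under made-up assumptions. The `model` function and the toy data are hypothetical, and the listed libraries may implement the idea differently.

```python
import random


def mse(model, X, y):
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, seed=0):
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        # Shuffle column j only, leaving all other features untouched.
        column = [row[j] for row in X]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        importances.append(mse(model, X_perm, y) - baseline)
    return importances


def model(row):
    """Hypothetical fitted model that only uses the first feature."""
    return 3.0 * row[0]


X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [3.0 * row[0] for row in X]
importances = permutation_importance(model, X, y)
print(importances)
```

The second feature is ignored by the model, so shuffling it leaves the error unchanged and its importance is zero.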
shap (
π₯
33 Β·
β
12K) - A game theoretic approach to explain the output of any machine learning model. MIT
Lime (
π₯
29 Β·
β
8.4K) - Lime: Explaining the predictions of any machine learning classifier. BSD-2
pyLDAvis (
π₯
27 Β·
β
1.4K) - Python library for interactive topic model visualization. Port of.. BSD-3
Model Analysis (
π₯
27 Β·
β
1K) - Model analysis tools for TensorFlow. Apache-2
InterpretML (
π₯
26 Β·
β
3.4K) - Fit interpretable models. Explain blackbox machine learning. MIT
yellowbrick (
π₯
26 Β·
β
3.1K) - Visual analysis and diagnostic tools to facilitate machine.. Apache-2
dtreeviz (
π₯
25 Β·
β
1.4K) - A python library for decision tree visualization and model interpretation. MIT
arviz (
π₯
25 Β·
β
940) - Exploratory analysis of Bayesian models with Python. Apache-2
-
GitHub (
π¨βπ» 64 Β·π 180 Β·π₯ 98 Β·π¦ 700 Β·π 520 - 20% open Β·β±οΈ 29.01.2021):git clone https://github.com/arviz-devs/arviz
-
PyPi (
π₯ 89K / month Β·π¦ 36 Β·β±οΈ 17.01.2021):pip install arviz
-
Conda (
π₯ 180K Β·β±οΈ 18.01.2021):conda install -c conda-forge arviz
Lucid (
π₯
24 Β·
β
4K) - A collection of infrastructure and tools for research in neural.. Apache-2
DoWhy (
π₯
24 Β·
β
2.6K) - DoWhy is a Python library for causal inference that supports explicit.. MIT
Fairness 360 (
π₯
24 Β·
β
1.2K) - A comprehensive set of fairness metrics for datasets and.. Apache-2
Alibi (
π₯
22 Β·
β
880) - Algorithms for monitoring and explaining machine learning models. Apache-2
Explainability 360 (
π₯
22 Β·
β
760) - Interpretability and explainability of data and machine.. Apache-2
TreeInterpreter (
π₯
22 Β·
β
640) - Package for interpreting scikit-learn's decision tree.. BSD-3
random-forest-importances (
π₯
22 Β·
β
410) - Code to compute permutation and drop-column.. MIT
iNNvestigate (
π₯
21 Β·
β
760) - A toolbox to iNNvestigate neural networks' predictions!. BSD-2
checklist (
π₯
20 Β·
β
1.2K) - Beyond Accuracy: Behavioral Testing of NLP models with CheckList. MIT
tf-explain (
π₯
20 Β·
β
760) - Interpretability Methods for tf.keras models with TensorFlow 2.x. MIT
sklearn-evaluation (
π₯
20 Β·
β
290) - Machine learning model evaluation made easy: plots,.. MIT
What-If Tool (
π₯
19 Β·
β
430) - Source code/webpage/demos for the What-If Tool. Apache-2
explainerdashboard (
π₯
19 Β·
β
300) - Quickly build Explainable AI dashboards that show the inner.. MIT
fairness-indicators (
π₯
18 Β·
β
180) - Tensorflow's Fairness Evaluation and Visualization.. Apache-2
LIT (
π₯
17 Β·
β
2.4K) - The Language Interpretability Tool: Interactively analyze NLP models for.. Apache-2
ExplainX.ai (
π₯
17 Β·
β
180) - Explainable AI framework for data scientists. Explain & debug any.. MIT
model-card-toolkit (
π₯
16 Β·
β
160) - a tool that leverages rich metadata and lineage.. Apache-2
FlashTorch (
π₯
15 Β·
β
530 Β·
π€
) - Visualization toolkit for neural networks in PyTorch! Demo --. MIT
Show 8 hidden projects...
- eli5 (
π₯ 27 Β·β 2.3K Β·π ) - A library for debugging/inspecting machine learning classifiers and..MIT
- scikit-plot (
π₯ 23 Β·β 2K Β·π ) - An intuitive library to add plotting functionality to scikit-..MIT
- Skater (
π₯ 19 Β·β 970 Β·π€ ) - Python Library for Model Interpretation/Explanations.βοΈUPL-1.0
- DALEX (
π₯ 18 Β·β 760) - moDel Agnostic Language for Exploration and eXplanation.βοΈGPL-3.0
- XAI (
π₯ 16 Β·β 570 Β·π ) - XAI - An eXplainability toolbox for machine learning.MIT
- imodels (
π₯ 16 Β·β 150) - Interpretable ML package for concise, transparent, and accurate predictive..MIT
- contextual-ai (
π₯ 15 Β·β 66) - Contextual AI adds explainability to different stages of..Apache-2
- Attribution Priors (
π₯ 12 Β·β 74) - Tools for training explainable models using..MIT
Vector Similarity Search (ANN)
Libraries for Approximate Nearest Neighbor Search and Vector Indexing/Similarity Search.
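For context, the sketch below shows the exact brute-force search that approximate nearest neighbor libraries are designed to speed up. It scores every stored vector against the query with cosine similarity. The vectors are made-up toy data.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_neighbors(query, vectors, k=2):
    # Score every vector: O(n * d) per query, which ANN indexes try to avoid.
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
top = nearest_neighbors([1.0, 0.05], vectors, k=2)
print(top)
```

Index-based libraries trade a small loss in recall for query times that grow much more slowly than the collection size.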
Faiss (
π₯
29 Β·
β
12K) - A library for efficient similarity search and clustering of dense vectors. MIT
-
GitHub (
π¨βπ» 75 Β·π 2.1K Β·π¦ 300 Β·π 1.4K - 7% open Β·β±οΈ 03.02.2021):git clone https://github.com/facebookresearch/faiss
-
Conda (
π₯ 24K Β·β±οΈ 12.12.2020):conda install -c conda-forge faiss
Annoy (
π₯
29 Β·
β
8.1K) - Approximate Nearest Neighbors in C++/Python optimized for memory usage.. Apache-2
NMSLIB (
π₯
28 Β·
β
2.3K) - Non-Metric Space Library (NMSLIB): An efficient similarity search.. Apache-2
hnswlib (
π₯
26 Β·
β
1.3K Β·
π
) - Header-only C++/python library for fast approximate nearest.. Apache-2
Milvus (
π₯
25 Β·
β
5K) - An open source embedding vector similarity search engine powered by.. Apache-2
-
GitHub (
π¨βπ» 140 Β·π 790 Β·π 2.1K - 10% open Β·β±οΈ 04.02.2021):git clone https://github.com/milvus-io/milvus
-
PyPi (
π₯ 5.9K / month Β·π¦ 6 Β·β±οΈ 21.01.2021):pip install pymilvus
-
Docker Hub (
π₯ 240K Β·β 9 Β·β±οΈ 06.01.2021):docker pull milvusdb/milvus
PyNNDescent (
π₯
23 Β·
β
370 Β·
π
) - A Python nearest neighbor descent for approximate nearest.. BSD-2
-
GitHub (
π¨βπ» 10 Β·π 44 Β·π¦ 180 Β·π 57 - 47% open Β·β±οΈ 29.01.2021):git clone https://github.com/lmcinnes/pynndescent
-
PyPi (
π₯ 46K / month Β·π¦ 3 Β·β±οΈ 19.11.2020):pip install pynndescent
-
Conda (
π₯ 32K Β·β±οΈ 19.11.2020):conda install -c conda-forge pynndescent
Magnitude (
π₯
21 Β·
β
1.4K Β·
π€
) - A fast, efficient universal vector embedding utility package. MIT
NGT (
π₯
19 Β·
β
620) - Nearest Neighbor Search with Neighborhood Graph and Tree for High-.. Apache-2
N2 (
π₯
19 Β·
β
450) - TOROS N2 - lightweight approximate Nearest Neighbor library which runs fast.. Apache-2
Show 2 hidden projects...
Probabilistics & Statistics
Libraries providing capabilities for probabilistic programming/reasoning, Bayesian inference, Gaussian processes, or statistics.
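As a small worked example of Bayesian inference, the sketch below performs the closed-form Beta-Binomial update. The prior and the observed counts are made up for illustration. Probabilistic programming libraries handle models where no such closed form exists.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    return alpha / (alpha + beta)


# Uniform Beta(1, 1) prior, then 7 successes and 3 failures are observed.
post_alpha, post_beta = beta_binomial_update(1.0, 1.0, successes=7, failures=3)
posterior_mean = beta_mean(post_alpha, post_beta)
print(post_alpha, post_beta, posterior_mean)
```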
PyMC3 (
π₯
32 Β·
β
5.5K) - Probabilistic Programming in Python: Bayesian Modeling and.. Apache-2
-
GitHub (
π¨βπ» 300 Β·π 1.3K Β·π₯ 150 Β·π¦ 2K Β·π 2.1K - 6% open Β·β±οΈ 03.02.2021):git clone https://github.com/pymc-devs/pymc3
-
PyPi (
π₯ 89K / month Β·π¦ 290 Β·β±οΈ 21.01.2021):pip install pymc3
-
Conda (
π₯ 240K Β·β±οΈ 21.01.2021):conda install -c conda-forge pymc3
tensorflow-probability (
π₯
31 Β·
β
3.2K) - Probabilistic reasoning and statistical analysis in.. Apache-2
-
GitHub (
π¨βπ» 400 Β·π 850 Β·π¦ 1 Β·π 950 - 45% open Β·β±οΈ 04.02.2021):git clone https://github.com/tensorflow/probability
-
PyPi (
π₯ 270K / month Β·π¦ 250 Β·β±οΈ 29.12.2020):pip install tensorflow-probability
-
Conda (
π₯ 30K Β·β±οΈ 13.03.2020):conda install -c conda-forge tensorflow-probability
Pyro (
π₯
28 Β·
β
6.7K) - Deep universal probabilistic programming with Python and PyTorch. Apache-2
GPyTorch (
π₯
28 Β·
β
2.3K) - A highly efficient and modular implementation of Gaussian Processes.. MIT
filterpy (
π₯
27 Β·
β
1.7K) - Python Kalman filtering and optimal estimation library. Implements.. MIT
pomegranate (
π₯
26 Β·
β
2.6K) - Fast, flexible and easy to use probabilistic modelling in Python. MIT
-
GitHub (
π¨βπ» 61 Β·π 470 Β·π¦ 390 Β·π 580 - 6% open Β·β±οΈ 09.01.2021):git clone https://github.com/jmschrei/pomegranate
-
PyPi (
π₯ 15K / month Β·π¦ 56 Β·β±οΈ 09.01.2021):pip install pomegranate
-
Conda (
π₯ 44K Β·β±οΈ 01.11.2020):conda install -c conda-forge pomegranate
pgmpy (
π₯
25 Β·
β
1.7K) - Python Library for learning (Structure and Parameter) and inference.. MIT
PyStan (
π₯
24 Β·
β
910 Β·
π
) - Temporary home for PyStan version 3. Documentation: https://pystan-.. ISC
SALib (
π₯
24 Β·
β
430) - Sensitivity Analysis Library in Python (Numpy). Contains Sobol, Morris,.. MIT
scikit-posthocs (
π₯
20 Β·
β
170) - Pairwise Multiple Comparisons (Post Hoc) Tests in Python. MIT
Baal (
π₯
17 Β·
β
320) - Using approximate Bayesian posteriors in deep nets for active learning. Apache-2
Orbit (
π₯
17 Β·
β
320) - Bayesian forecasting with object-oriented design and probabilistic.. Apache-2
Show 4 hidden projects...
- patsy (
π₯ 27 Β·β 740 Β·π ) - Describing statistical models in Python using symbolic formulas.BSD-2
- Edward (
π₯ 24 Β·β 4.6K Β·π ) - A probabilistic programming language in TensorFlow. Deep..Apache-2
- pingouin (
π₯ 22 Β·β 630) - Statistical package in Python based on Pandas.βοΈGPL-3.0
- ZhuSuan (
π₯ 14 Β·β 2K Β·π ) - A probabilistic programming library for Bayesian deep learning,..MIT
Adversarial Robustness
Libraries for testing the robustness of machine learning models against attacks with adversarial/malicious examples.
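To show what an adversarial example looks like in the simplest setting, the sketch below applies a fast-gradient-sign style perturbation to a linear classifier in plain Python. The weights, input, and loss choice (`-label * score`) are assumptions for this example, not the method of any specific listed library.

```python
def sign(v):
    return (v > 0) - (v < 0)


def score(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def fgsm_linear(x, weights, label, epsilon):
    # With loss = -label * (w . x), the gradient w.r.t. x_i is -label * w_i.
    # Step each feature by epsilon in the direction that increases the loss.
    return [xi + epsilon * sign(-label * w) for xi, w in zip(x, weights)]


weights = [2.0, -1.0]  # hypothetical trained weights
x = [1.0, 1.0]         # input classified as positive (score > 0)
x_adv = fgsm_linear(x, weights, label=1, epsilon=0.5)
print(score(weights, x), score(weights, x_adv))
```

Each feature moves by at most `epsilon`, yet the sign of the score flips, which is the kind of failure these toolkits help to find and measure.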
CleverHans (
π₯
27 Β·
β
4.9K) - An adversarial example library for constructing attacks,.. MIT
Foolbox (
π₯
25 Β·
β
1.8K) - A Python toolbox to create adversarial examples that fool neural networks.. MIT
ART (
π₯
23 Β·
β
2K) - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning.. MIT
TextAttack (
π₯
23 Β·
β
1.2K) - TextAttack is a Python framework for adversarial attacks, data.. MIT
robustness (
π₯
18 Β·
β
460) - A library for experimenting with, training and evaluating neural.. MIT
AdvBox (
π₯
15 Β·
β
1K) - Advbox is a toolbox to generate adversarial examples that fool neural.. Apache-2
Show 2 hidden projects...
- advertorch (
π₯ 17 Β·β 800 Β·π€ ) - A Toolbox for Adversarial Robustness Research.βοΈGPL-3.0
- Adversary (
π₯ 13 Β·β 340 Β·π ) - Tool to generate adversarial text examples and test machine..MIT
GPU Utilities
Libraries that require and make use of CUDA/GPU system capabilities to optimize data handling and machine learning tasks.
CuPy (
π₯
30 Β·
β
4.8K) - A NumPy-compatible array library accelerated by CUDA. MIT
-
GitHub (
π¨βπ» 250 Β·π 440 Β·π₯ 7.1K Β·π¦ 660 Β·π 1.2K - 28% open Β·β±οΈ 04.02.2021):git clone https://github.com/cupy/cupy
-
PyPi (
π₯ 8.8K / month Β·π¦ 190 Β·β±οΈ 28.01.2021):pip install cupy
-
Conda (
π₯ 360K Β·β±οΈ 02.02.2021):conda install -c conda-forge cupy
-
Docker Hub (
π₯ 48K Β·β 6 Β·β±οΈ 04.02.2021):docker pull cupy/cupy
gpustat (
π₯
26 Β·
β
2.2K) - A simple command-line utility for querying and monitoring GPU status. MIT
Apex (
π₯
23 Β·
β
5K) - A PyTorch Extension: Tools for easy mixed precision and distributed.. BSD-3
scikit-cuda (
π₯
21 Β·
β
800 Β·
π€
) - Python interface to GPU-powered libraries. BSD-3
py3nvml (
π₯
21 Β·
β
160 Β·
π€
) - Python 3 Bindings for NVML library. Get NVIDIA GPU status inside.. BSD-3
DALI (
π₯
20 Β·
β
3.1K) - A library containing both highly optimized building blocks and an.. Apache-2
-
GitHub (
π¨βπ» 55 Β·π 370 Β·π 790 - 23% open Β·β±οΈ 04.02.2021):git clone https://github.com/NVIDIA/DALI
BlazingSQL (
π₯
17 Β·
β
1.4K) - BlazingSQL is a lightweight, GPU accelerated, SQL engine for.. Apache-2
SpeedTorch (
π₯
16 Β·
β
610 Β·
π€
) - Library for faster pinned CPU - GPU transfer in PyTorch. MIT
Vulkan Kompute (
π₯
16 Β·
β
300) - General purpose GPU compute framework for cross vendor.. Apache-2
cuSignal (
π₯
15 Β·
β
440) - GPU accelerated signal processing. Apache-2
-
GitHub (
π¨βπ» 28 Β·π 60 Β·π 97 - 15% open Β·β±οΈ 03.02.2021):git clone https://github.com/rapidsai/cusignal
Show 3 hidden projects...
- GPUtil (
π₯ 22 Β·β 670 Β·π ) - A Python module for getting the GPU status from NVIDIA GPUs using..MIT
- nvidia-ml-py3 (
π₯ 17 Β·β 61 Β·π ) - Python 3 Bindings for the NVIDIA Management Library.BSD-3
- ipyexperiments (
π₯ 16 Β·β 120) - jupyter/ipython experiment containers for GPU and..Apache-2
Tensorflow Utilities
Libraries that extend TensorFlow with additional capabilities.
tensor2tensor (
π₯
32 Β·
β
11K) - Library of deep learning models and datasets designed to.. Apache-2
tensorflow-hub (
π₯
32 Β·
β
2.7K) - A library for transfer learning by reusing parts of.. Apache-2
-
GitHub (
π¨βπ» 67 Β·π 1.4K Β·π¦ 5.3K Β·π 550 - 8% open Β·β±οΈ 28.01.2021):git clone https://github.com/tensorflow/hub
-
PyPi (
π₯ 580K / month Β·π¦ 310 Β·β±οΈ 06.01.2021):pip install tensorflow-hub
-
Conda (
π₯ 49K Β·β±οΈ 24.08.2020):conda install -c conda-forge tensorflow-hub
TF Addons (
π₯
30 Β·
β
1.2K) - Useful extra functionality for TensorFlow 2.x maintained by.. Apache-2
TensorFlow Transform (
π₯
29 Β·
β
850) - Input pipeline framework. Apache-2
TF Model Optimization (
π₯
26 Β·
β
970) - A toolkit to optimize ML models for deployment for.. Apache-2
efficientnet (
π₯
25 Β·
β
1.7K) - Implementation of EfficientNet model. Keras and.. Apache-2
TensorFlow I/O (
π₯
25 Β·
β
410) - Dataset, streaming, and file system extensions.. Apache-2
TensorFlow Cloud (
π₯
23 Β·
β
220) - The TensorFlow Cloud repository provides APIs that.. Apache-2
Neural Structured Learning (
π₯
21 Β·
β
770) - Training neural models with structured signals. Apache-2
TensorNets (
π₯
19 Β·
β
980) - High level network definitions with pre-trained weights in.. MIT
tffm (
π₯
18 Β·
β
760 Β·
π€
) - TensorFlow implementation of an arbitrary order Factorization Machine. MIT
Saliency (
π₯
17 Β·
β
630) - TensorFlow implementation for SmoothGrad, Grad-CAM, Guided.. Apache-2
TF Compression (
π₯
17 Β·
β
430) - Data compression in TensorFlow. Apache-2
Sklearn Utilities
Libraries that extend scikit-learn with additional capabilities.
MLxtend (
π₯
30 Β·
β
3.3K) - A library of extension and helper modules for Python's data.. BSD-3
imbalanced-learn (
π₯
29 Β·
β
5K) - A Python Package to Tackle the Curse of Imbalanced.. MIT
-
GitHub (
π¨βπ» 51 Β·π 1.1K Β·π¦ 4K Β·π 450 - 10% open Β·β±οΈ 03.11.2020):git clone https://github.com/scikit-learn-contrib/imbalanced-learn
-
PyPi (
π₯ 440K / month Β·π¦ 280 Β·β±οΈ 09.06.2020):pip install imbalanced-learn
-
Conda (
π₯ 110K Β·β±οΈ 14.06.2020):conda install -c conda-forge imbalanced-learn
category_encoders (
π₯
24 Β·
β
1.6K Β·
π€
) - A library of sklearn compatible categorical variable.. BSD-3
-
GitHub (
π¨βπ» 35 Β·π 290 Β·π 200 - 32% open Β·β±οΈ 31.07.2020):git clone https://github.com/scikit-learn-contrib/category_encoders
-
PyPi (
π₯ 140K / month Β·π¦ 23 Β·β±οΈ 14.10.2018):pip install category_encoders
-
Conda (
π₯ 93K Β·β±οΈ 29.04.2020):conda install -c conda-forge category_encoders
sklearn-contrib-lightning (
π₯
23 Β·
β
1.4K) - Large-scale linear classification, regression and.. BSD-3
-
GitHub (
π¨βπ» 16 Β·π 190 Β·π¦ 73 Β·π 85 - 57% open Β·β±οΈ 04.01.2021):git clone https://github.com/scikit-learn-contrib/lightning
-
PyPi (
π₯ 320 / month Β·π¦ 5 Β·β±οΈ 16.12.2020):pip install sklearn-contrib-lightning
-
Conda (
π₯ 130K Β·β±οΈ 20.12.2020):conda install -c conda-forge sklearn-contrib-lightning
fancyimpute (
π₯
23 Β·
β
920) - Multivariate imputation and matrix completion algorithms.. Apache-2
combo (
π₯
23 Β·
β
470) - (AAAI' 20) A Python Toolbox for Machine Learning Model.. BSD-2
xgboost
scikit-opt (
π₯
22 Β·
β
1.9K) - Genetic Algorithm, Particle Swarm Optimization, Simulated.. MIT
scikit-lego (
π₯
22 Β·
β
400) - Extra blocks for scikit-learn pipelines. MIT
iterative-stratification (
π₯
19 Β·
β
500) - scikit-learn cross validators for iterative.. BSD-3
scikit-tda (
π₯
19 Β·
β
260) - Topological Data Analysis for Python. MIT
DESlib (
π₯
17 Β·
β
300) - A Python library for dynamic classifier and ensemble selection. BSD-3
Show 5 hidden projects...
- sklearn-crfsuite (
π₯ 24 Β·β 360 Β·π ) - scikit-learn inspired API for CRFsuite.MIT
- scikit-multilearn (
π₯ 22 Β·β 630 Β·π ) - A scikit-learn based module for multi-label et. al...BSD-2
- skope-rules (
π₯ 20 Β·β 350) - machine learning with logical rules in Python.βοΈBSD-1-Clause
- celer (
π₯ 16 Β·β 110) - Fast solver for L1-type problems: Lasso, sparse Logistic regression,..BSD-3
- dabl (
?? 16 Β·β 66 Β·π€ ) - Data Analysis Baseline Library.BSD-3
Pytorch Utilities
Libraries that extend PyTorch with additional capabilities.
pretrainedmodels (
π₯
27 Β·
β
7.7K Β·
π€
) - Pretrained ConvNets for pytorch: NASNet, ResNeXt,.. BSD-3
pytorch-optimizer (
π₯
25 Β·
β
1.7K) - torch-optimizer -- collection of optimizers for.. Apache-2
torchdiffeq (
π₯
24 Β·
β
3.3K) - Differentiable ODE solvers with full GPU support and.. MIT
pytorch-summary (
π₯
24 Β·
β
2.9K) - Model summary in PyTorch similar to `model.summary()` in.. MIT
PML (
π₯
24 Β·
β
2.7K) - The easiest way to use deep metric learning in your application. Modular,.. MIT
-
GitHub (
π¨βπ» 12 Β·π 360 Β·π¦ 50 Β·π 210 - 16% open Β·β±οΈ 04.02.2021):git clone https://github.com/KevinMusgrave/pytorch-metric-learning
-
PyPi (
π₯ 2.4K / month Β·β±οΈ 04.02.2021):pip install pytorch-metric-learning
-
Conda (
π₯ 1.2K Β·β±οΈ 12.01.2021):conda install -c metric-learning pytorch-metric-learning
SRU (24 · ⭐ 1.9K) - Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755). MIT
EfficientNet-PyTorch (23 · ⭐ 5.4K) - A PyTorch implementation of EfficientNet. Apache-2
PyTorch Sparse (21 · ⭐ 350) - PyTorch Extension Library of Optimized Autograd Sparse.. MIT
reformer-pytorch (20 · ⭐ 1.3K) - Reformer, the efficient Transformer, in Pytorch. MIT
EfficientNets (20 · ⭐ 1.2K) - Pretrained EfficientNet, EfficientNet-Lite, MixNet,.. Apache-2
Torchmeta (20 · ⭐ 1.2K) - A collection of extensions and data-loaders for few-shot learning.. MIT
torch-scatter (20 · ⭐ 590) - PyTorch Extension Library of Optimized Scatter Operations. MIT
Pytorch Toolbelt (19 · ⭐ 890) - PyTorch extensions for fast R&D prototyping and Kaggle.. MIT
Higher (18 · ⭐ 1K) - higher is a pytorch library allowing users to obtain higher order.. Apache-2
Tensor Sensor (16 · ⭐ 520 · 🐣) - The goal of this library is to generate more helpful.. MIT
Performer Pytorch (16 · ⭐ 490 · 🐣) - An implementation of Performer, a linear attention-.. MIT
tinygrad (15 · ⭐ 4.1K · 🐣) - You like pytorch? You like micrograd? You love tinygrad! MIT
- GitHub (👨‍💻 43 · 🔀 440 · 📋 77 - 18% open · ⏱️ 31.01.2021): git clone https://github.com/geohot/tinygrad
Lambda Networks (15 · ⭐ 1.3K · 🐣) - Implementation of LambdaNetworks, a new approach to.. MIT
torchsde (15 · ⭐ 640) - Differentiable SDE solvers with GPU support and efficient.. Apache-2
- GitHub (👨‍💻 4 · 🔀 55 · 📋 32 - 18% open · ⏱️ 05.01.2021): git clone https://github.com/google-research/torchsde
Tez (14 · ⭐ 480 · 🐣) - Tez is a super-simple and lightweight Trainer for PyTorch. It.. Apache-2
Pywick (14 · ⭐ 310) - High-level batteries-included neural network training library for.. MIT
micrograd (13 · ⭐ 1.6K · 💤) - A tiny scalar-valued autograd engine and a neural net library.. MIT
Torch-Struct (13 · ⭐ 880) - Fast, general, and tested differentiable structured prediction.. MIT
- GitHub (👨‍💻 12 · 🔀 68 · 📋 35 - 37% open · ⏱️ 20.01.2021): git clone https://github.com/harvardnlp/pytorch-struct
Show 3 hidden projects...
- AdaBound (19 · ⭐ 2.8K · 💀) - An optimizer that trains as fast as Adam and as good as SGD. Apache-2
- Poutyne (19 · ⭐ 440) - A simplified framework and utilities for PyTorch. ❗️LGPL-3.0
- Antialiased CNNs (17 · ⭐ 1.3K) - pip install antialiased-cnns to improve stability and.. ❗️CC BY-NC-SA 4.0
Database Clients
Libraries for connecting to, operating, and querying databases.
Others
scipy (40 · ⭐ 7.9K) - Ecosystem of open-source software for mathematics, science, and engineering. BSD-3
- GitHub (👨‍💻 1.1K · 🔀 3.5K · 📥 310K · 📦 300K · 📋 7.3K - 21% open · ⏱️ 03.02.2021): git clone https://github.com/scipy/scipy
- PyPi (📥 10M / month · 📦 87K · ⏱️ 31.12.2020): pip install scipy
- Conda (📥 12M · ⏱️ 12.01.2021): conda install -c conda-forge scipy
SymPy (36 · ⭐ 7.8K) - A computer algebra system written in pure Python. BSD-3
- GitHub (👨‍💻 1K · 🔀 3.3K · 📥 410K · 📦 29K · 📋 11K - 36% open · ⏱️ 03.02.2021): git clone https://github.com/sympy/sympy
- PyPi (📥 440K / month · 📦 6.4K · ⏱️ 12.12.2020): pip install sympy
- Conda (📥 1.3M · ⏱️ 08.01.2021): conda install -c conda-forge sympy
PyOD (28 · ⭐ 4.1K) - (JMLR'19) A Python Toolbox for Scalable Outlier Detection (Anomaly.. BSD-2
hdbscan (28 · ⭐ 1.8K) - A high performance implementation of HDBSCAN clustering. BSD-3
- GitHub (👨‍💻 66 · 🔀 340 · 📦 720 · 📋 370 - 60% open · ⏱️ 03.02.2021): git clone https://github.com/scikit-learn-contrib/hdbscan
- PyPi (📥 71K / month · 📦 120 · ⏱️ 03.02.2021): pip install hdbscan
- Conda (📥 530K · ⏱️ 04.02.2021): conda install -c conda-forge hdbscan
Keras-Preprocessing (28 · ⭐ 910) - Utilities for working with image data, text data, and.. MIT
- GitHub (👨‍💻 49 · 🔀 390 · 📋 190 - 48% open · ⏱️ 21.01.2021): git clone https://github.com/keras-team/keras-preprocessing
- PyPi (📥 2.5M / month · 📦 2.7K · ⏱️ 14.05.2020): pip install keras-preprocessing
- Conda (📥 800K · ⏱️ 15.01.2021): conda install -c conda-forge keras-preprocessing
Cython BLIS (28 · ⭐ 160) - Fast matrix-multiplication as a self-contained Python library no.. BSD-3
- GitHub (👨‍💻 9 · 🔀 22 · 📦 8.2K · 📋 21 - 28% open · ⏱️ 07.12.2020): git clone https://github.com/explosion/cython-blis
- PyPi (📥 600K / month · 📦 390 · ⏱️ 07.12.2020): pip install blis
- Conda (📥 400K · ⏱️ 31.01.2021): conda install -c conda-forge cython-blis
Datasette (27 · ⭐ 4.6K) - An open source multi-tool for exploring and publishing data. Apache-2
agate (26 · ⭐ 1K · 💤) - A Python data analysis library that is optimized for humans instead of.. MIT
pyclustering (26 · ⭐ 780) - pyclustering is a Python, C++ data mining library. BSD-3
- GitHub (👨‍💻 26 · 🔀 180 · 📥 280 · 📦 170 · 📋 640 - 8% open · ⏱️ 03.12.2020): git clone https://github.com/annoviko/pyclustering
- PyPi (📥 18K / month · 📦 36 · ⏱️ 25.11.2020): pip install pyclustering
- Conda (📥 13K · ⏱️ 25.01.2021): conda install -c conda-forge pyclustering
causalml (25 · ⭐ 1.6K) - Uplift modeling and causal inference with machine learning.. Apache-2
Pythran (25 · ⭐ 1.5K) - Ahead of Time compiler for numeric kernels. BSD-3
- GitHub (👨‍💻 48 · 🔀 130 · 📦 48 · 📋 640 - 15% open · ⏱️ 02.02.2021): git clone https://github.com/serge-sans-paille/pythran
- PyPi (📥 3.5K / month · 📦 13 · ⏱️ 11.12.2020): pip install pythran
- Conda (📥 120K · ⏱️ 27.01.2021): conda install -c conda-forge pythran
DeepChem (24 · ⭐ 2.7K) - Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry,.. MIT
TabPy (24 · ⭐ 1K) - Execute Python code on the fly and display results in Tableau visualizations. MIT
kmodes (24 · ⭐ 810) - Python implementations of the k-modes and k-prototypes clustering.. MIT
PennyLane (24 · ⭐ 730) - PennyLane is a cross-platform Python library for differentiable.. Apache-2
pyjanitor (24 · ⭐ 620) - Clean APIs for data cleaning. Python implementation of R package Janitor. MIT
datalad (24 · ⭐ 220) - Keep code, data, containers under control with git and git-annex. MIT
cleanlab (22 · ⭐ 1.5K) - The standard package for machine learning with noisy labels and finding.. MIT
metric-learn (22 · ⭐ 1.1K) - Metric learning algorithms in Python. MIT
AstroML (22 · ⭐ 720) - Machine learning, statistics, and data mining for astronomy and.. BSD-2
SUOD (22 · ⭐ 240) - (MLSys' 21) An Acceleration System for Large-scale Unsupervised.. BSD-2
Mars (21 · ⭐ 2K) - Mars is a tensor-based unified framework for large-scale data computation.. Apache-2
StreamAlert (20 · ⭐ 2.5K) - StreamAlert is a serverless, realtime data analysis framework.. Apache-2
- GitHub (👨‍💻 30 · 🔀 280 · 📋 340 - 26% open · ⏱️ 05.10.2020): git clone https://github.com/airbnb/streamalert
alibi-detect (20 · ⭐ 550) - Algorithms for outlier and adversarial instance detection,.. Apache-2
rrcf (20 · ⭐ 280 · 💤) - Implementation of the Robust Random Cut Forest algorithm for anomaly.. MIT
gplearn (19 · ⭐ 900 · 💤) - Genetic Programming in Python, with a scikit-learn inspired API. BSD-3
baikal (19 · ⭐ 570) - A graph-based functional API for building complex scikit-learn pipelines. BSD-3
Feature Engine (19 · ⭐ 440) - Feature engineering package with sklearn like functionality. BSD-3
- GitHub (👨‍💻 20 · 🔀 130 · 📋 99 - 26% open · ⏱️ 23.01.2021): git clone https://github.com/solegalli/feature_engine
- PyPi (📥 9.6K / month · 📦 2 · ⏱️ 23.01.2021): pip install feature_engine
- Conda (📥 1.4K · ⏱️ 25.01.2021): conda install -c conda-forge feature_engine
scikit-rebate (19 · ⭐ 310 · 💤) - A scikit-learn-compatible Python implementation of.. MIT
apricot (18 · ⭐ 300) - apricot implements submodular optimization for the purpose of selecting.. MIT
River (17 · ⭐ 1.4K) - Online machine learning in Python. BSD-3
- GitHub (👨‍💻 58 · 🔀 170 · 📋 280 - 13% open · ⏱️ 01.02.2021): git clone https://github.com/online-ml/river
traingenerator (10 · ⭐ 920 · 🐣) - A web app to generate template code for machine learning. MIT
- GitHub (👨‍💻 3 · 🔀 120 · 📋 12 - 75% open · ⏱️ 20.01.2021): git clone https://github.com/jrieke/traingenerator
Show 6 hidden projects...
- Autograd (29 · ⭐ 5.1K · 💀) - Efficiently computes derivatives of numpy code. MIT
- pysc2 (23 · ⭐ 7.2K · 💀) - StarCraft II Learning Environment. Apache-2
- minisom (22 · ⭐ 770) - MiniSom is a minimalistic implementation of the Self Organizing.. ❗️CC-BY-3.0
- impyute (20 · ⭐ 270 · 💀) - Data imputations library to preprocess datasets with missing data. MIT
- vecstack (18 · ⭐ 580 · 💀) - Python package for stacking (machine learning technique). MIT
- pandas-ml (17 · ⭐ 270 · 💀) - pandas, scikit-learn, xgboost and seaborn integration. BSD-3
Related Resources
- Papers With Code: Discover ML papers, code, and evaluation tables.
- Sotabench: Discover & compare open-source ML models.
- Google Dataset Search: Dataset search engine by Google.
- Dataset List: List of the biggest ML datasets from across the web.
- Awesome Public Datasets: A topic-centric list of open datasets.
- Best-of lists: Discover other best-of lists with awesome open-source projects on all kinds of topics.
- best-of-python-dev: A ranked list of awesome python developer tools and libraries.
- best-of-web-python: A ranked list of awesome python libraries for web development.
Contribution
Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:
- Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
- Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.
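As a rough illustration of the second option, a new entry in projects.yaml might look like the sketch below. The field names (`name`, `github_id`, `pypi_id`, `conda_id`, `category`) and every value are assumptions made for illustration only; check the contribution guidelines and the existing entries in projects.yaml for the keys and category identifiers this list actually uses.

```yaml
# Hypothetical entry: field names and values are illustrative assumptions,
# not copied from the real projects.yaml.
- name: my-project                  # display name shown in the list
  github_id: my-org/my-project      # GitHub repository as owner/repo
  pypi_id: my-project               # optional: package name on PyPI
  conda_id: conda-forge/my-project  # optional: conda channel and package
  category: others                  # hypothetical category identifier
```

Copying an existing entry from the same category and changing its values is likely the safest starting point, since the metadata shown in the list is collected automatically from these identifiers.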
If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.
For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.