1223 Repositories
Python performance-analysis Libraries
Docker Container wallstreetbets-sentiment-analysis
Docker Container wallstreetbets-sentiment-analysis A docker container using restful endpoints exposed on port 5000 "/analyze" to gather sentiment anal
TickerRain is an open-source web app that stores and analysis Reddit posts in a transparent and semi-interactive manner.
TickerRain is an open-source web app that stores and analysis Reddit posts in a transparent and semi-interactive manner
CPU inference engine that delivers unprecedented performance for sparse models
The DeepSparse Engine is a CPU runtime that delivers unprecedented performance by taking advantage of natural sparsity within neural networks to reduce compute required as well as accelerate memory bound workloads. It is focused on model deployment and scaling machine learning pipelines, fitting seamlessly into your existing deployments as an inference backend.
Code for the paper A Theoretical Analysis of the Repetition Problem in Text Generation
A Theoretical Analysis of the Repetition Problem in Text Generation This repository share the code for the paper "A Theoretical Analysis of the Repeti
Summarization, translation, sentiment-analysis, text-generation and more at blazing speed using a T5 version implemented in ONNX.
Summarization, translation, Q&A, text generation and more at blazing speed using a T5 version implemented in ONNX. This package is still in alpha stag
Text vectorization tool to outperform TFIDF for classification tasks
WHAT: Supervised text vectorization tool Textvec is a text vectorization tool, with the aim to implement all the "classic" text vectorization NLP meth
An Analysis Toolkit for Natural Language Generation (Translation, Captioning, Summarization, etc.)
VizSeq is a Python toolkit for visual analysis on text generation tasks like machine translation, summarization, image captioning, speech translation
Textpipe: clean and extract metadata from text
textpipe: clean and extract metadata from text textpipe is a Python package for converting raw text in to clean, readable text and extracting metadata
Beautiful visualizations of how language differs among document types.
Scattertext 0.1.0.0 A tool for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot. Points corresponding t
State of the Art Natural Language Processing
Spark NLP: State of the Art Natural Language Processing Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provide
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
VADER-Sentiment-Analysis VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifica
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It
Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js
pivottablejs: the Python module Drag’n’drop Pivot Tables and Charts for Jupyter/IPython Notebook, care of PivotTable.js Installation pip install pivot
A grammar of graphics for Python
plotnine Latest Release License DOI Build Status Coverage Documentation plotnine is an implementation of a grammar of graphics in Python, it is based
Visualize and compare datasets, target values and associations, with one line of code.
In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat
3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
PyVista Deployment Build Status Metrics Citation License Community 3D plotting and mesh analysis through a streamlined interface for the Visualization
Missing data visualization module for Python.
missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha
Uniform Manifold Approximation and Projection
UMAP Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, bu
Create HTML profiling reports from pandas DataFrame objects
Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this
A toolkit for making real world machine learning and data analysis applications in C++
dlib C++ library Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real worl
fklearn: Functional Machine Learning
fklearn: Functional Machine Learning fklearn uses functional programming principles to make it easier to solve real problems with Machine Learning. Th
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
scikit-learn: machine learning in Python
scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started
Scalene: a high-performance, high-precision CPU and memory profiler for Python
scalene: a high-performance CPU and memory profiler for Python by Emery Berger 中文版本 (Chinese version) About Scalene % pip install -U scalene Scalen
🚴 Call stack profiler for Python. Shows you why your code is slow!
pyinstrument Pyinstrument is a Python profiler. A profiler is a tool to help you 'optimize' your code - make it faster. It sounds obvious, but to get
Yet Another Python Profiler, but this time thread&coroutine&greenlet aware.
Yappi Yet Another Python Profiler, but this time thread&coroutine&greenlet aware. Highlights Fast: Yappi is fast. It is completely written in C and lo
Sampling profiler for Python programs
py-spy: Sampling profiler for Python programs py-spy is a sampling profiler for Python programs. It lets you visualize what your Python program is spe
Create standalone executables from Python scripts, with the same performance and is cross-platform.
About cx_Freeze cx_Freeze creates standalone executables from Python scripts, with the same performance, is cross-platform and should work on any plat
Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, and 3.9. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
Nuitka User Manual Contents Overview Usage Requirements Command Line Installation License Tutorial Setup and build on Windows Setup Install Python Ins
py.test fixture for benchmarking code
Overview docs tests package A pytest fixture for benchmarking code. It will group the tests into rounds that are calibrated to the chosen timer. See c
McCabe complexity checker for Python
McCabe complexity checker Ned's script to check McCabe complexity. This module provides a plugin for flake8, the Python code checker. Installation You
Various code metrics for Python code
Radon Radon is a Python tool that computes various metrics from the source code. Radon can compute: McCabe's complexity, i.e. cyclomatic complexity ra
Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure.
Dlint Dlint is a tool for encouraging best coding practices and helping ensure Python code is secure. The most important thing I have done as a progra
A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications
This project is no longer maintained March 2020 Update: Please go see the amazing Pysa tutorial that should get you up to speed finding security vulne
Bandit is a tool designed to find common security issues in Python code.
A security linter from PyCQA Free software: Apache license Documentation: https://bandit.readthedocs.io/en/latest/ Source: https://github.com/PyCQA/ba
Awesome autocompletion, static analysis and refactoring library for python
Jedi - an awesome autocompletion, static analysis and refactoring library for Python Jedi is a static analysis tool for Python that is typically used
A static-analysis bot for Github
Imhotep, the peaceful builder. What is it? Imhotep is a tool which will comment on commits coming into your repository and check for syntactic errors
Custom Python linting through AST expressions
bellybutton bellybutton is a customizable, easy-to-configure linting engine for Python. What is this good for? Tools like pylint and flake8 provide, o
Automated security testing using bandit and flake8.
flake8-bandit Automated security testing built right into your workflow! You already use flake8 to lint all your code for errors, ensure docstrings ar
coala provides a unified command-line interface for linting and fixing all your code, regardless of the programming languages you use.
"Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." ― John F. Woods coala provides a
Pylint plugin for improving code analysis for when using Django
pylint-django About pylint-django is a Pylint plugin for improving code analysis when analysing code using Django. It is also used by the Prospector t
A static type analyzer for Python code
pytype - 🦆 ✔ Pytype checks and infers types for your Python code - without requiring type annotations. Pytype can: Lint plain Python code, flagging c
Performant type-checking for python.
Pyre is a performant type checker for Python compliant with PEP 484. Pyre can analyze codebases with millions of lines of code incrementally – providi
It's not just a linter that annoys you!
README for Pylint - https://pylint.pycqa.org/ Professional support for pylint is available as part of the Tidelift Subscription. Tidelift gives softwa
The official GitHub mirror of https://gitlab.com/pycqa/flake8
Flake8 Flake8 is a wrapper around these tools: PyFlakes pycodestyle Ned Batchelder's McCabe script Flake8 runs all the tools by launching the single f
PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
PyPika - Python Query Builder Abstract What is PyPika? PyPika is a Python API for building SQL queries. The motivation behind PyPika is to provide a s
Python ASN.1 library with a focus on performance and a pythonic API
asn1crypto A fast, pure Python library for parsing and serializing ASN.1 structures. Features Why Another Python ASN.1 Library? Related Crypto Librari
Better directory iterator and faster os.walk(), now in the Python 3.5 stdlib
scandir, a better directory iterator and faster os.walk() scandir() is a directory iteration function like os.listdir(), except that instead of return
🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)
srsly: Modern high-performance serialization utilities for Python This package bundles some of the best Python serialization libraries into one standa
Full-text multi-table search application for Django. Easy to install and use, with good performance.
django-watson django-watson is a fast multi-model full-text search plugin for Django. It is easy to install and use, and provides high quality search
No effort, no worry, maximum performance.
Django Cachalot Caches your Django ORM queries and automatically invalidates them. Documentation: http://django-cachalot.readthedocs.io Table of Conte
a flask profiler which watches endpoint calls and tries to make some analysis.
Flask-profiler version: 1.8 Flask-profiler measures endpoints defined in your flask application; and provides you fine-grained report through a web in
GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
GoAccess What is it? GoAccess is an open source real-time web log analyzer and interactive viewer that runs in a terminal on *nix systems or through y
Automatically monitor the evolving performance of Flask/Python web services.
Flask Monitoring Dashboard A dashboard for automatic monitoring of Flask web-services. Key Features • How to use • Live Demo • Feedback • Documentatio
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
Load and performance benchmark tool
Yandex Tank Yandextank has been moved to Python 3. Latest stable release for Python 2 here. Yandex.Tank is an extensible open source load testing tool
Scalable user load testing tool written in Python
Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst
This is a database of 180.000+ symbols containing Equities, ETFs, Funds, Indices, Futures, Options, Currencies, Cryptocurrencies and Money Markets.
Finance Database As a private investor, the sheer amount of information that can be found on the internet is rather daunting.
Meinheld is a high performance asynchronous WSGI Web Server (based on picoev)
What's this This is a high performance python wsgi web server. And Meinheld is a WSGI compliant web server. (PEP333 and PEP3333 supported) You can als
The no-nonsense, minimalist REST and app backend framework for Python developers, with a focus on reliability, correctness, and performance at scale.
The Falcon Web Framework Falcon is a reliable, high-performance Python web framework for building large-scale app backends and microservices. It encou
APT-Hunter is Threat Hunting tool for windows event logs
APT-Hunter is Threat Hunting tool for windows event logs which made by purple team mindset to provide detect APT movements hidden in the sea of windows event logs to decrease the time to uncover suspicious activity
This program goes thru reddit, finds the most mentioned tickers and uses Vader SentimentIntensityAnalyzer to calculate the ticker compound value.
This program goes thru reddit, finds the most mentioned tickers and uses Vader SentimentIntensityAnalyzer to calculate the ticker compound value.
Various capabilities for static malware analysis.
Malchive The malchive serves as a compendium for a variety of capabilities mainly pertaining to malware analysis, such as scripts supporting day to da
PyCG: Practical Python Call Graphs
PyCG - Practical Python Call Graphs PyCG generates call graphs for Python code using static analysis. It efficiently supports Higher order functions T
A fast PostgreSQL Database Client Library for Python/asyncio.
asyncpg -- A fast PostgreSQL Database Client Library for Python/asyncio asyncpg is a database interface library designed specifically for PostgreSQL a
An API Client package to access the APIs for NBA.com
nba_api An API Client package to access the APIs for NBA.com Development Version: v1.1.9 nba_api is an API Client for www.nba.com. This package is mea
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a
Backtest 1000s of minute-by-minute trading algorithms for training AI with automated pricing data from: IEX, Tradier and FinViz. Datasets and trading performance automatically published to S3 for building AI training datasets for teaching DNNs how to trade. Runs on Kubernetes and docker-compose. 150 million trading history rows generated from +5000 algorithms. Heads up: Yahoo's Finance API was disabled on 2019-01-03 https://developer.yahoo.com/yql/
Stock Analysis Engine Build and tune investment algorithms for use with artificial intelligence (deep neural networks) with a distributed stack for ru
High-performance cross-platform Video Processing Python framework powerpacked with unique trailblazing features :fire:
Releases | Gears | Documentation | Installation | License VidGear is a High-Performance Video Processing Python Library that provides an easy-to-use,
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Scalable user load testing tool written in Python
Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst
Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes in a variety of languages.
Mimesis - Fake Data Generator Description Mimesis is a high-performance fake data generator for Python, which provides data for a variety of purposes
Scalable user load testing tool written in Python
Locust Locust is an easy to use, scriptable and scalable performance testing tool. You define the behaviour of your users in regular Python code, inst
Statsmodels: statistical modeling and econometrics in Python
About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano
PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in
FastAPI framework, high performance, easy to learn, fast to code, ready for production
FastAPI framework, high performance, easy to learn, fast to code, ready for production Documentation: https://fastapi.tiangolo.com Source Code: https:
The no-nonsense, minimalist REST and app backend framework for Python developers, with a focus on reliability, correctness, and performance at scale.
The Falcon Web Framework Falcon is a reliable, high-performance Python web framework for building large-scale app backends and microservices. It encou
Multi-class confusion matrix library in Python
Table of contents Overview Installation Usage Document Try PyCM in Your Browser Issues & Bug Reports Todo Outputs Dependencies Contribution References
A high performance implementation of HDBSCAN clustering. http://hdbscan.readthedocs.io/en/latest/
HDBSCAN Now a part of scikit-learn-contrib HDBSCAN - Hierarchical Density-Based Spatial Clustering of Applications with Noise. Performs DBSCAN over va
Tools for exploratory data analysis in Python
Dora Exploratory data analysis toolkit for Python. Contents Summary Setup Usage Reading Data & Configuration Cleaning Feature Selection & Extraction V
Apache Superset is a Data Visualization and Data Exploration Platform
Superset A modern, enterprise-ready business intelligence web application. Why Superset? | Supported Databases | Installation and Configuration | Rele
Simple, realtime visualization of neural network training performance.
pastalog Simple, realtime visualization server for training neural networks. Use with Lasagne, Keras, Tensorflow, Torch, Theano, and basically everyth
The Python ensemble sampling toolkit for affine-invariant MCMC
emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense
Python bindings to the dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser)
Frog for Python This is a Python binding to the Natural Language Processing suite Frog. Frog is intended for Dutch and performs part-of-speech tagging
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par
Create UIs for prototyping your machine learning model in 3 minutes
Note: We just launched Hosted, where anyone can upload their interface for permanent hosting. Check it out! Welcome to Gradio Quickly create customiza
A unified framework for machine learning with time series
Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible
StellarGraph - Machine Learning on Graphs
StellarGraph Machine Learning Library StellarGraph is a Python library for machine learning on graphs and networks. Table of Contents Introduction Get
ML-Ensemble – high performance ensemble learning
A Python library for high performance ensemble learning ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framew
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2020 Links Doc