180 Repositories
Python mlops-slurm-kubernetes Libraries
Kubediff: a tool for Kubernetes to show differences between running state and version controlled configuration.
Kubediff: a tool for Kubernetes to show differences between running state and version controlled configuration.
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
ClearML - Auto-Magical Suite of tools to streamline your ML workflow Experiment Manager, MLOps and Data-Management ClearML Formerly known as Allegro T
Project 4 Cloud DevOps Nanodegree
Project Overview In this project, you will apply the skills you have acquired in this course to operationalize a Machine Learning Microservice API. Yo
Custom SLURM wrapper scripts to make finding job histories and system resource usage more easily accessible
SLURM Wrappers Executables job-history A simple wrapper for grabbing data for completed and running jobs. nodes-busy Developed for the HPC systems at
Hera is a Python framework for constructing and submitting Argo Workflows.
Hera is an Argo Workflows Python SDK. Hera aims to make workflow construction and submission easy and accessible to everyone! Hera abstracts away workflow setup details while still maintaining a consistent vocabulary with Argo Workflows.
A collection of online resources to help you on your Tech journey.
Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.
Model Serving Made Easy BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. Supports multi
Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects
Metaflow Metaflow is a human-friendly Python/R library that helps scientists and engineers build and manage real-life data science projects. Metaflow
CLI tool to computes CO2 emissions of HPC computations following green-algorithms.org methodology
gqueue gqueue is a CLI (command line interface) tool that computes carbon footprint of HPC computations on clusters running slurm. It follows the meth
End to End toy example of MLOps
churn_model MLOps Toy Example End to End You might find below links useful Connect VSCode to Git MLFlow Port Heroku App Project Organization ├── LICEN
ML for NLP and Computer Vision.
Sparrow is our open-source ML product. It runs on Skipper MLOps infrastructure.
Bodywork deploys machine learning projects developed in Python, to Kubernetes.
Bodywork deploys machine learning projects developed in Python, to Kubernetes. It helps you to: serve models as microservices execute batch jobs run r
MagTape is a Policy-as-Code tool for Kubernetes that allows for evaluating Kubernetes resources against a set of defined policies to inform and enforce best practice configurations.
MagTape is a Policy-as-Code tool for Kubernetes that allows for evaluating Kubernetes resources against a set of defined policies to inform and enforce best practice configurations. MagTape includes variable policy enforcement, notifications, and targeted metrics.
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
Caboto, the Kubernetes semantic analysis tool
Caboto Caboto, the Kubernetes semantic analysis toolkit. It contains a lightweight Python library for semantic analysis of plain Kubernetes manifests
A wrapper for slurm especially on Taiwania2 (HPC CLI)A wrapper for slurm especially on Taiwania2 (HPC CLI)
TWCC-slurm-wrapper A wrapper for slurm especially on Taiwania2 (HPC CLI). For Taiwania2 (HPC CLI) usage, please refer to here. (中文) How to Install? gi
Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.
SDK: Overview of the Kubeflow pipelines service Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on
MLOps will help you to understand how to build a Continuous Integration and Continuous Delivery pipeline for an ML/AI project.
page_type languages products description sample python azure azure-machine-learning-service azure-devops Code which demonstrates how to set up and ope
The command line interface for Gradient - Gradient is an an end-to-end MLOps platform
Gradient CLI Get started: Create Account • Install CLI • Tutorials • Docs Resources: Website • Blog • Support • Contact Sales Gradient is an an end-to
In this project I played with mlflow, streamlit and fastapi to create a training and prediction app on digits
Fastapi + MLflow + streamlit Setup env. I hope I covered all. pip install -r requirements.txt Start app Go in the root dir and run these Streamlit str
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.
chitra What is chitra? chitra (चित्र) is a multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and M
Wafer Fault Detection using MlOps Integration
Wafer Fault Detection using MlOps Integration This is an end to end machine learning project with MlOps integration for predicting the quality of wafe
sysctl/sysfs settings on a fly for Kubernetes Cluster. No restarts are required for clusters and nodes.
SysBindings Daemon Little toolkit for control the sysctl/sysfs bindings on Kubernetes Cluster on the fly and without unnecessary restarts of cluster o
Tangram makes it easy for programmers to train, deploy, and monitor machine learning models.
Tangram Website | Discord Tangram makes it easy for programmers to train, deploy, and monitor machine learning models. Run tangram train to train a mo
MONAI Deploy App SDK offers a framework and associated tools to design, develop and verify AI-driven applications in the healthcare imaging domain.
MONAI Deploy App SDK offers a framework and associated tools to design, develop and verify AI-driven applications in the healthcare imaging domain.
African language Speech Recognition - Speech-to-Text
Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l
A project based example of Data pipelines, ML workflow management, API endpoints and Monitoring.
MLOps template with examples for Data pipelines, ML workflow management, API development and Monitoring.
Graphsignal is a machine learning model monitoring platform.
Graphsignal is a machine learning model monitoring platform. It helps ML engineers, MLOps teams and data scientists to quickly address issues with data and models as well as proactively analyze model performance and availability.
This is a public repo where code samples are stored for the book Practical MLOps.
[Book-2021] Practical MLOps O'Reilly Book
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.
Chartreuse: Automated Alembic migrations within kubernetes
Chartreuse: Automated Alembic SQL schema migrations within kubernetes "How to automate management of Alembic database schema migration at scale using
Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️
Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️
A Tools that help Data Scientists and ML engineers train and deploy ML models.
Domino Research This repo contains projects under active development by the Domino R&D team. We build tools that help Data Scientists and ML engineers
A zero-dependency Python library for getting the Kubernetes token of a AWS EKS cluster
tokeks A zero-dependency Python library for getting the Kubernetes token of a AWS EKS cluster. No AWS CLI, third-party client or library (boto3, botoc
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions
Learn how to responsibly deliver value with ML.
Made With ML Applied ML · MLOps · Production Join 30K+ developers in learning how to responsibly deliver value with ML. 🔥 Among the top MLOps reposit
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code
A Python framework for creating reproducible, maintainable and modular data science code.
✨Rubrix is a production-ready Python framework for exploring, annotating, and managing data in NLP projects.
✨A Python framework to explore, label, and monitor data for NLP projects
A real-time tech course finder, created using Elasticsearch, Python, React+Redux, Docker, and Kubernetes.
A real-time tech course finder, created using Elasticsearch, Python, React+Redux, Docker, and Kubernetes.
Slack bot for monitoring your Metaflow flows!
Metaflowbot - Slack Bot for your Metaflow flows! Metaflowbot makes it fun and easy to monitor your Metaflow runs, past and present. Imagine starting a
Evidently helps analyze machine learning models during validation or production monitoring
Evidently helps analyze machine learning models during validation or production monitoring. The tool generates interactive visual reports and JSON profiles from pandas DataFrame or csv files. Currently 6 reports are available.
Rubrix is a free and open-source tool for exploring and iterating on data for artificial intelligence projects.
Open-source tool for exploring, labeling, and monitoring data for AI projects
The MLOps platform for innovators 🚀
DS2.ai is an integrated AI operation solution that supports all stages from custom AI development to deployment. It is an AI-specialized platform service that collects data, builds a training dataset through data labeling, and enables automatic development of artificial intelligence and easy deployment and operation.
🛠 All-in-one web-based IDE specialized for machine learning and data science.
All-in-one web-based development environment for machine learning Getting Started • Features & Screenshots • Support • Report a Bug • FAQ • Known Issu
Another Autoscaler is a Kubernetes controller that automatically starts, stops, or restarts pods from a deployment at a specified time using a cron annotation.
Another Autoscaler Another Autoscaler is a Kubernetes controller that automatically starts, stops, or restarts pods from a deployment at a specified t
OpenDILab RL Kubernetes Custom Resource and Operator Lib
DI Orchestrator DI Orchestrator is designed to manage DI (Decision Intelligence) jobs using Kubernetes Custom Resource and Operator. Prerequisites A w
Another Scheduler is a Kubernetes controller that automatically starts, stops, or restarts pods from a deployment at a specified time using a cron annotation.
Another Scheduler Another Scheduler is a Kubernetes controller that automatically starts, stops, or restarts pods from a deployment at a specified tim
Run Oracle on Kubernetes with El Carro
El Carro is a new project that offers a way to run Oracle databases in Kubernetes as a portable, open source, community driven, no vendor lock-in container orchestration system. El Carro provides a powerful declarative API for comprehensive and consistent configuration and deployment as well as for real-time operations and monitoring.
In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
End to End Automatic Speech Recognition In this repository, I have developed an end to end Automatic speech recognition project. I have developed the
⏳ Tempo: The MLOps Software Development Kit
Tempo provides a unified interface to multiple MLOps projects that enable data scientists to deploy and productionise machine learning systems.
A template repository for submitting a job to the Slurm Cluster installed at the DISI - University of Bologna
Cluster di HPC con GPU per esperimenti di calcolo (draft version 1.0) Per poter utilizzare il cluster il primo passo è abilitare l'account istituziona
A comprehensive reference for all topics related to building and maintaining microservices
This pandect (πανδέκτης is Ancient Greek for encyclopedia) was created to help you find and understand almost anything related to Microservices that i
Inferoxy is a service for quick deploying and using dockerized Computer Vision models.
Inferoxy is a service for quick deploying and using dockerized Computer Vision models. It's a core of EORA's Computer Vision platform Vision Hub that runs on top of AWS EKS.
MySQL Operator for Kubernetes
MySQL Operator for Kubernetes The MYSQL Operator for Kubernetes is an Operator for Kubernetes managing MySQL InnoDB Cluster setups inside a Kubernetes
A Simple script to hunt unused Kubernetes resources.
K8SPurger A Simple script to hunt unused Kubernetes resources. Release History Release 0.3 Added Ingress Added Services Account Adding RoleBindding Re
A declarative Kubeflow Management Tool inspired by Terraform
🍭 KRSH is Alpha version, so many bugs can be reported. If you find a bug, please write an Issue and grow the project together! A declarative Kubeflow
DataOps framework for Machine Learning projects.
Noronha DataOps Noronha is a Python framework designed to help you orchestrate and manage ML projects life-cycle. It hosts Machine Learning models ins
Sample Dockerized flask app deployed on Kubernetes on Azure using AKS
Sample Dockerized flask app deployed on Kubernetes on Azure using AKS
Graphsignal Logger
Graphsignal Logger Overview Graphsignal is an observability platform for monitoring and troubleshooting production machine learning applications. It h
framework providing automatic constructions of vulnerable infrastructures
中文 | English 1 Introduction Metarget = meta- + target, a framework providing automatic constructions of vulnerable infrastructures, used to deploy sim
Diffgram - Supervised Learning Data Platform
Data Annotation, Data Labeling, Annotation Tooling, Training Data for Machine Learning
A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes.
OMNI A very lightweight monitoring system for Raspberry Pi clusters running Kubernetes. Why? When I finished my Kubernetes cluster using a few Raspber
Policy and data administration, distribution, and real-time updates on top of Open Policy Agent
⚡ OPAL ⚡ Open Policy Administration Layer OPAL is an administration layer for Open Policy Agent (OPA), detecting changes to both policy and policy dat
Python 3.6+ toolbox for submitting jobs to Slurm
Submit it! What is submitit? Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster. It basically wraps
Toy example of an applied ML pipeline for me to experiment with MLOps tools.
Toy Machine Learning Pipeline Table of Contents About Getting Started ML task description and evaluation procedure Dataset description Repository stru
Machine Learning Platform for Kubernetes
Reproduce, Automate, Scale your data science. Welcome to Polyaxon, a platform for building, training, and monitoring large scale deep learning applica
Analytics service that is part of iter8. Robust analytics and control to unleash cloud-native continuous experimentation.
iter8-analytics iter8 enables statistically robust continuous experimentation of microservices in your CI/CD pipelines. For in-depth information about
A Kubernetes operator that creates UptimeRobot monitors for your ingresses
This operator automatically creates uptime monitors at UptimeRobot for your Kubernetes Ingress resources. This allows you to easily integrate uptime monitoring of your services into your Kubernetes deployments.
Testinfra test your infrastructures
Testinfra test your infrastructure Latest documentation: https://testinfra.readthedocs.io/en/latest About With Testinfra you can write unit tests in P
Official Python client library for kubernetes
Kubernetes Python Client Python client for the kubernetes API. Installation From source: git clone --recursive https://github.com/kubernetes-client/py
Deploy a ML inference service on a budget in less than 10 lines of code.
BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.
A Blazing fast Security Auditing tool for Kubernetes
A Blazing fast Security Auditing tool for kubernetes!! Basic Overview Kubestriker performs numerous in depth checks on kubernetes infra to identify th
This repository contains code examples and documentation for learning how applications can be developed with Kubernetes
BigBitBus KAT Components Click on the diagram to enlarge, or follow this link for detailed documentation Introduction Welcome to the BigBitBus Kuberne
Backtest 1000s of minute-by-minute trading algorithms for training AI with automated pricing data from: IEX, Tradier and FinViz. Datasets and trading performance automatically published to S3 for building AI training datasets for teaching DNNs how to trade. Runs on Kubernetes and docker-compose. 150 million trading history rows generated from +5000 algorithms. Heads up: Yahoo's Finance API was disabled on 2019-01-03 https://developer.yahoo.com/yql/
Stock Analysis Engine Build and tune investment algorithms for use with artificial intelligence (deep neural networks) with a distributed stack for ru
Determined: Deep Learning Training Platform
Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det
Model Serving Made Easy
The easiest way to build Machine Learning APIs BentoML makes moving trained ML models to production easy: Package models trained with any ML framework
Best Practices on Recommendation Systems
Recommenders What's New (February 4, 2021) We have a new relase Recommenders 2021.2! It comes with lots of bug fixes, optimizations and 3 new algorith
Model serving at scale
Run inference at scale Cortex is an open source platform for large-scale machine learning inference workloads. Workloads Realtime APIs - respond to pr
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Couler What is Couler? Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo
Kubernetes shell: An integrated shell for working with the Kubernetes
kube-shell Kube-shell: An integrated shell for working with the Kubernetes CLI Under the hood kube-shell still calls kubectl. Kube-shell aims to provi