180 Repositories
Python mlops-slurm-kubernetes Libraries
Open-Source CI/CD platform for ML teams. Deliver ML products, better & faster. ⚡️🧑🔧
Deliver ML products, better & faster Giskard is an Open-Source CI/CD platform for ML teams. Inspect ML models visually from your Python notebook 📗 Re
Free MLOps course from DataTalks.Club
MLOps Zoomcamp Our MLOps Zoomcamp course Sign up here: https://airtable.com/shrCb8y6eTbPKwSTL (it's not automated, you will not receive an email immed
Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud
Google Cloud Vertex AI Samples Welcome to the Google Cloud Vertex AI sample repository. Overview The repository contains notebooks and community conte
[Machine Learning Engineer Basic Guide] 부스트캠프 AI Tech - Product Serving 자료
Boostcamp-AI-Tech-Product-Serving 부스트캠프 AI Tech - Product Serving 자료 Repository 구조 part1(MLOps 개론, Model Serving, 머신러닝 프로젝트 라이프 사이클은 별도의 코드가 없으며, part
This project shows how to serve an TF based image classification model as a web service with TFServing, Docker, and Kubernetes(GKE).
Deploying ML models with CPU based TFServing, Docker, and Kubernetes By: Chansung Park and Sayak Paul This project shows how to serve a TensorFlow ima
Detecting silent model failure. NannyML estimates performance with an algorithm called Confidence-based Performance estimation (CBPE), developed by core contributors. It is the only open-source algorithm capable of fully capturing the impact of data drift on performance.
Website • Docs • Community Slack 💡 What is NannyML? NannyML is an open-source python library that allows you to estimate post-deployment model perfor
This project shows how to serve an ONNX-optimized image classification model as a web service with FastAPI, Docker, and Kubernetes.
Deploying ML models with FastAPI, Docker, and Kubernetes By: Sayak Paul and Chansung Park This project shows how to serve an ONNX-optimized image clas
Azure MLOps (v2) solution accelerators.
Azure MLOps (v2) solution accelerator Welcome to the MLOps (v2) solution accelerator repository! This project is intended to serve as the starting poi
Responsible AI Workshop: a series of tutorials & walkthroughs to illustrate how put responsible AI into practice
Responsible AI Workshop Responsible innovation is top of mind. As such, the tech industry as well as a growing number of organizations of all kinds in
Resources for the AMLD 2022 workshop "DevOps on AWS"
MLOPS on AWS | AMLD 2022 This repository contains all the resources necessary to follow along and reproduce the workshop "MLOps on AWS: a Hands-On Tut
A basic instruction for Kubernetes setup and understanding.
A basic instruction for Kubernetes setup and understanding Module ID Module Guide - Install Kubernetes Cluster k8s-install 3 Docker Core Technology mo
Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines
Federal University of Rio Grande do Norte Technology Center Department of Computer Engineering and Automation Machine Learning Based Systems Design Re
Programming assignments and quizzes from all courses within the Machine Learning Engineering for Production (MLOps) specialization offered by deeplearning.ai
Machine Learning Engineering for Production (MLOps) Specialization on Coursera (offered by deeplearning.ai) Programming assignments from all courses i
A simple guide to MLOps through ZenML and its various integrations.
ZenBytes Join our Slack Community and become part of the ZenML family Give the main ZenML repo a GitHub star to show your love ZenBytes is a series of
MLOps pipeline project using Amazon SageMaker Pipelines
This project shows steps to build an end to end MLOps architecture that covers data prep, model training, realtime and batch inference, build model registry, track lineage of artifacts and model drift detection. It utilizes SageMaker Pipelines that offers machine learning (ML) to orchestrate SageMaker jobs and author reproducible ML pipelines.
Connect is a Python Flask project within the cloud-native ecosystem
Connect is a Python Flask project within the cloud-native ecosystem. Second project of Udacity's Cloud Native Nanodegree program, focusing on documenting and architecting a monolith migration to microservices.
A Python function for Slurm, to monitor the GPU information
Gpu-Monitor A Python function for Slurm, where I couldn't use nvidia-smi to monitor the GPU information. whole repo is not finish Installation TODO Mo
Vertex AI: Serverless framework for MLOPs (ESP / ENG)
Vertex AI: Serverless framework for MLOPs (ESP / ENG) Español Qué es esto? Este repo contiene un pipeline end to end diseñado usando el SDK de Kubeflo
Helperpod - A CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster
Helperpod is a CLI tool to run a Kubernetes utility pod with pre-installed tools that can be used for debugging/testing purposes inside a Kubernetes cluster.
🔗 FusiShort is a URL shortener built with Python, Redis, Docker and Kubernetes
This is a playground application created with goal of applying full cycle software development using popular technologies like Python, Redis, Docker and Kubernetes.
Checkmk kube agent - Checkmk Kubernetes Cluster and Node Collectors
Checkmk Kubernetes Cluster and Node Collectors Checkmk cluster and node collecto
Django-Kubernetes - Learn how to deploy a docker-based Django application into a Kubernetes cluster into production on DigitalOcean
Django & Kubernetes Learn how to deploy a production-ready Django application in
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo
This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities
MLOps with Vertex AI This example implements the end-to-end MLOps process using Vertex AI platform and Smart Analytics technology capabilities. The ex
Kube OpenID Connect is an application that can be used to easily enable authentication flows via OIDC for a kubernetes cluster
Kube OpenID Connect is an application that can be used to easily enable authentication flows via OIDC for a kubernetes cluster. Kubernetes supports OpenID Connect Tokens as a way to identify users who access the cluster. Kube OpenID Connect helps users with it's plugin to authenticate an get kubectl config.
The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it inside a loop of Design, Model Development and Operations.
MLOps The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it insid
Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)
Using DVC with PyCaret & FastAPI (Demo) This repo contains all the resources for my demo explaining how to use DVC along with other interesting tools
Google Kubernetes Engine (GKE) with a Snyk Kubernetes controller installed/configured for Snyk App
Google Kubernetes Engine (GKE) with a Snyk Kubernetes controller installed/configured for Snyk App This example provisions a Google Kubernetes Engine
This library provides an abstraction to perform Model Versioning using Weight & Biases.
Description This library provides an abstraction to perform Model Versioning using Weight & Biases. Features Version a new trained model Promote a mod
My self-hosting infrastructure, fully automated from empty disk to operating services
Khue's Homelab Current status: ALPHA This project utilizes Infrastructure as Code to automate provisioning, operating, and updating self-hosted servic
Utilitaire de contrôle de Kubernetes
Utilitaire de contrôle de Kubernetes ** What is this ??? ** Every time we use a word in English our manager tells us to use the French translation of
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.
The goal of the exercises below is to evaluate the candidate knowledge and problem solving expertise regarding the main development focuses for the iFood ML Platform team: MLOps and Feature Store development.
A simple tcpdump sidecar injector to demonstrate Kubernetes's Mutating Webhook
k8s-tcpdump-webhook A simple tcpdump sidecar injector to demonstrate Kubernetes's Mutating Webhook Build and Deploy Build docker image; docker build -
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi
Play Wordle from any Kubernetes cluster.
wordle-operator 🟩 ⬛ 🟩 🟨 ⬛ Play Wordle from any Kubernetes cluster. Using the power of CustomResourceDefinitions and Kubernetes Operators, now you c
Generate FastAPI projects for high performance applications
Generate FastAPI projects for high performance applications. Based on MVC architectural pattern, WSGI + ASGI. Includes tests, pipeline, base utilities, Helm chart, and script for bootstrapping local Minikube with high available Redis cluster.
This is your launchpad that comes with a variety of applications waiting to run on your kubernetes cluster with a single click
This is your launchpad that comes with a variety of applications waiting to run on your kubernetes cluster with a single click.
End-to-end MLOps pipeline of a BERT model for emotion classification.
image source EmoBERT-MLOps The goal of this repository is to build an end-to-end MLOps pipeline based on the MLOps course from Made with ML, but this
A simple website-based resource monitor for slurm system.
Slurm Web A simple website-based resource monitor for slurm system. Screenshot Required python packages flask, colored, humanize, humanfriendly, beart
Always know what to expect from your data.
Great Expectations Always know what to expect from your data. Introduction Great Expectations helps data teams eliminate pipeline debt, through data t
MLops tools review for execution on multiple cluster types: slurm, kubernetes, dask...
MLops tools review focused on execution using multiple cluster types: slurm, kubernetes, dask...
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour
Gordo Building thousands of models with timeseries data to monitor systems. Table of content About Examples Install Uninstall Developer manual How to
Kube kombu - Running kombu consumers with support of liveness probe for kubernetes
Setup and Running Kombu consumers Steps: Install python 3.9 or greater on your s
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort
Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your personal computer!
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your machine! Motivation Would
MicroK8s is a small, fast, single-package Kubernetes for developers, IoT and edge.
MicroK8s The smallest, fastest Kubernetes Single-package fully conformant lightweight Kubernetes that works on 42 flavours of Linux. Perfect for: Deve
Ansible for DevOps examples.
Ansible for DevOps Examples This repository contains Ansible examples developed to support different sections of Ansible for DevOps, a book on Ansible
Basic repository showing how to use Hydra + Hydra launchers on SLURM cluster
Slurm-Hydra-Submitit This repository is a minimal working example on how to: setup Hydra setup batch of slurm jobs on top of Hydra via submitit-launch
Autoscaling volumes for Kubernetes (with the help of Prometheus)
Kubernetes Volume Autoscaler (with Prometheus) This repository contains a service that automatically increases the size of a Persistent Volume Claim i
Checkov is a static code analysis tool for infrastructure-as-code.
Checkov - Prevent cloud misconfigurations during build-time for Terraform, Cloudformation, Kubernetes, Serverless framework and other infrastructure-as-code-languages with Checkov by Bridgecrew.
Emissary - open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
Emissary-ingress Emissary-Ingress is an open-source Kubernetes-native API Gateway + Layer 7 load balancer + Kubernetes Ingress built on Envoy Proxy. E
Flask app to predict daily radiation from the time series of Solcast from Islamabad, Pakistan
Solar-radiation-ISB-MLOps - Flask app to predict daily radiation from the time series of Solcast from Islamabad, Pakistan.
PyTorch Lightning + Hydra. A feature-rich template for rapid, scalable and reproducible ML experimentation with best practices. ⚡🔥⚡
Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re
whylogs: A Data and Machine Learning Logging Standard
whylogs: A Data and Machine Learning Logging Standard whylogs is an open source standard for data and ML logging whylogs logging agent is the easiest
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.
The Ultimate PyTorch Source-Build Template Translations: 한국어 TL;DR PyTorch built from source can be x4 faster than a naïve PyTorch install. This repos
Learning and experimenting with Kubernetes
Kubernetes Experiments This repository contains code that I'm using to learn and experiment with Kubernetes. 1. Environment setup minikube kubectl doc
cookiecutter template for web API with python
Python project template for Web API with cookiecutter What's this This provides the project template including minimum test/lint/typechecking package
easy_sbatch - Batch submitting Slurm jobs with script templates
easy_sbatch - Batch submitting Slurm jobs with script templates
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size. The hub data layout enables rapid transformations and streaming of data while training models at scale. Hub is used by Google, Waymo, Red Cross, Oxford University, and Omdena.
Modern, privacy-friendly, and detailed web analytics that works without cookies or JS.
Modern, privacy-friendly, and cookie-free web analytics. Getting started » Screenshots • Features • Office Hours Motivation There are a lot of web ana
A lightweight tool to get an AI Infrastructure Stack up in minutes not days.
K3ai will take care of setup K8s for You, deploy the AI tool of your choice and even run your code on it.
Project Faros is a reference implimentation of Red Hat OpenShift 4 on small footprint, bare-metal clusters.
Project Faros Project Faros is a reference implimentation of Red Hat OpenShift 4 on small footprint, bare-metal clusters. The project includes referen
Neptune client library - integrate your Python scripts with Neptune
Lightweight experiment tracking tool for AI/ML individuals and teams. Fits any workflow. Neptune is a lightweight experiment logging/tracking tool tha
Execute shell command lines in parallel on Slurm, S(on) of Grid Engine (SGE), PBS/Torque clusters
qbatch Execute shell command lines in parallel on Slurm, S(on) of Grid Engine (SGE), PBS/Torque clusters qbatch is a tool for executing commands in pa
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
A set of demo of deploying a Machine Learning Model in production using various methods
Machine Learning Model in Production This git is for those who have concern about serving your machine learning model to production. Overview The tuto
generate HPC scheduler systems jobs input scripts and submit these scripts to HPC systems and poke until they finish
DPDispatcher DPDispatcher is a python package used to generate HPC(High Performance Computing) scheduler systems (Slurm/PBS/LSF/dpcloudserver) jobs in
Feature Store for Machine Learning
Overview Feast is an open source feature store for machine learning. Feast is the fastest path to productionizing analytic data for model training and
Copy a Kubernetes pod and run commands in its environment
copypod Utility for copying a running Kubernetes pod so you can run commands in a copy of its environment, without worrying about it the pod potential
mypy plugin to type check Kubernetes resources
kubernetes-typed mypy plugin to dynamically define types for Kubernetes objects. Features Type checking for Custom Resources Type checking forkubernet
MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine Learning work with thousands of other users.
The collaboration platform for Machine Learning MLReef is an open source ML-Ops platform that helps you collaborate, reproduce and share your Machine
🦍 The Cloud-Native API Gateway
Kong or Kong API Gateway is a cloud-native, platform-agnostic, scalable API Gateway distinguished for its high performance and extensibility via plugi
Tyk Open Source API Gateway written in Go, supporting REST, GraphQL, TCP and gRPC protocols
Tyk API Gateway Tyk is an open source Enterprise API Gateway, supporting REST, GraphQL, TCP and gRPC protocols. Tyk Gateway is provided ‘Batteries-inc
Pulumi - Developer-First Infrastructure as Code. Your Cloud, Your Language, Your Way 🚀
Pulumi's Infrastructure as Code SDK is the easiest way to create and deploy cloud software that use containers, serverless functions, hosted services,
Open MLOps - A Production-focused Open-Source Machine Learning Framework
Open MLOps - A Production-focused Open-Source Machine Learning Framework Open MLOps is a set of open-source tools carefully chosen to ease user experi
Machine Learning automation and tracking
The Open-Source MLOps Orchestration Framework MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-lea
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale. It has been battle-tested at Lyft, Spotify, Freenome, and others and is truly open-source.
Flyte Flyte is a workflow automation platform for complex, mission-critical data, and ML processes at scale Home Page · Quick Start · Documentation ·
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
Seldon Core: Blazing Fast, Industry-Ready ML An open source platform to deploy your machine learning models on Kubernetes at massive scale. Overview S
Reproducible Data Science at Scale!
Pachyderm: The Data Foundation for Machine Learning Pachyderm provides the data layer that allows machine learning teams to productionize and scale th
The Fuzzy Labs guide to the universe of open source MLOps
Open Source MLOps This is the Fuzzy Labs guide to the universe of free and open source MLOps tools. Contents What is MLOps, anyway? Data version contr
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
Machine Learning Toolkit for Kubernetes
Kubeflow the cloud-native platform for machine learning operations - pipelines, training and deployment. Documentation Please refer to the official do
Wiremind Kubernetes helper
Wiremind Kubernetes helper This Python library is a high-level set of Kubernetes Helpers allowing either to manage individual standard Kubernetes cont
Deploy your apps on any Cloud provider in just a few seconds
The simplest way to deploy your apps in the Cloud Deploy your apps on any Cloud providers in just a few seconds ⚡ Qovery Engine is an open-source abst
Toolkit for developing and maintaining ML models
modelkit Python framework for production ML systems. modelkit is a minimalist yet powerful MLOps library for Python, built for people who want to depl
Quick & dirty controller to schedule Kubernetes Jobs later (once)
K8s Jobber Operator Quickly implemented Kubernetes controller to enable scheduling of Jobs at a later time. Usage: To schedule a Job later, Set .spec.
A charmed operator for running PGbouncer on kubernetes.
operator-template Description TODO: Describe your charm in a few paragraphs of Markdown Usage TODO: Provide high-level usage, such as required config
A kedro-plugin to serve Kedro Pipelines as API
General informations Software repository Latest release Total downloads Pypi Code health Branch Tests Coverage Links Documentation Deployment Activity
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Website • Docs • Twitter • Join Slack Community What is Label Studio? Label Studio is an open source data labeling tool. It lets you label data types
Rancher Kubernetes API compatible with RKE, RKE2 and maybe others?
kctl Rancher Kubernetes API compatible with RKE, RKE2 and maybe others? Documentation is WIP. Quickstart pip install --upgrade kctl Usage from lazycls
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
ClearML - Auto-Magical Suite of tools to streamline your ML workflow Experiment Manager, MLOps and Data-Management ClearML Formerly known as Allegro T
Hubble - Network, Service & Security Observability for Kubernetes using eBPF
Network, Service & Security Observability for Kubernetes What is Hubble? Getting Started Features Service Dependency Graph Metrics & Monitoring Flow V
Demo scripts for the Kubernetes Security Webinar
Kubernetes Security Webinar [in Russian] YouTube video (October 13, 2021) Authors: Artem Yushkovsky (LinkedIn, GitHub) Maxim Mosharov @ Whitespots.io
CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system
CinnaMon is a Python library which offers a number of tools to detect, explain, and correct data drift in a machine learning system
A collection of online resources to help you on your Tech journey.
Everything Tech Resources & Projects About The Project Coming from an engineering background and looking to up skill yourself on a new field can be di
A DSL for data-driven computational pipelines
"Dataflow variables are spectacularly expressive in concurrent programming" Henri E. Bal , Jennifer G. Steiner , Andrew S. Tanenbaum Quick overview Ne
Blazingly-fast :rocket:, rock-solid, local application development :arrow_right: with Kubernetes.
Gefyra Gefyra gives Kubernetes-("cloud-native")-developers a completely new way of writing and testing their applications. Over are the times of custo
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...
Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...