Python pipelines-yaml Libraries (126 repositories)
Run containerized, rootless applications with podman
Why? Restrict the scope of file system access, run any application without root privileges, and create usable "Desktop applications" to integrate into your …
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.
Optimum Transformers Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX Runtime. Installation…
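A hedged sketch of how such a pipeline is typically used; the optimum_transformers import and the use_onnx flag are assumptions based on the project's description and may differ from the actual API:

    # Hypothetical usage sketch: assumes the package exposes a transformers-style
    # pipeline() factory with an ONNX Runtime backend; verify against the README.
    from optimum_transformers import pipeline

    nlp = pipeline("sentiment-analysis", use_onnx=True)
    print(nlp("Accelerated inference on CPU is great!"))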
Open-source data observability for modern data teams
Use cases: monitor your data warehouse in minutes; data anomaly monitoring as dbt tests; data lineage made simple, reliable, and automated; dbt …
Repository for DCA0305, an undergraduate course about Machine Learning Workflows and Pipelines
Federal University of Rio Grande do Norte, Technology Center, Department of Computer Engineering and Automation. Machine Learning Based Systems Design …
A simple guide to MLOps through ZenML and its various integrations.
ZenBytes Join our Slack Community and become part of the ZenML family. Give the main ZenML repo a GitHub star to show your love. ZenBytes is a series of …
A way to store images in YAML.
YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installation…
PyQt6 configuration in YAML format, providing the simplest possible script.
PyamlQt (ぴゃむるきゅーと) PyQt6 configuration in YAML format, providing the simplest possible script. Requirements: yaml, PyQt6 (PyQt5). Installation: pip install PyamlQt…
Fast and customizable reconnaissance workflow tool based on a simple YAML-based DSL.
Fast and customizable reconnaissance workflow tool based on a simple YAML-based DSL, with support for notifications and distributed workload of that work…
🎡 Build Python wheels for all the platforms on CI with minimal configuration.
cibuildwheel Documentation Python wheels are great. Building them across Mac, Linux, Windows, on multiple versions of Python, is not. cibuildwheel is
Yamale (ya·ma·lē) - A schema and validator for YAML.
Yamale (ya·ma·lē) ⚠️ Ensure that your schema definitions come from internal or trusted sources. Yamale does not protect against intentionally malicious…
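A minimal sketch of validating a YAML document against a Yamale schema; the file names are placeholders:

    # Validate data.yaml against schema.yaml with Yamale (pip install yamale).
    import yamale

    # schema.yaml might contain lines like:  name: str()  and  age: int(max=120)
    schema = yamale.make_schema("schema.yaml")
    data = yamale.make_data("data.yaml")
    yamale.validate(schema, data)   # raises an error if validation fails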
MLOps pipeline project using Amazon SageMaker Pipelines
This project shows the steps to build an end-to-end MLOps architecture covering data preparation, model training, real-time and batch inference, a model registry, artifact lineage tracking, and model drift detection. It uses SageMaker Pipelines, which lets you orchestrate SageMaker jobs and author reproducible ML pipelines.
Vertex AI: Serverless framework for MLOps (ESP / ENG)
Vertex AI: Serverless framework for MLOps (ESP / ENG). What is this? This repo contains an end-to-end pipeline designed using the Kubeflow SDK…
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube. Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeon…
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines
spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines. spaCy-wrap is a minimal library intended for wrapping fine-tuned transformers from …
Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies
An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks. Novel and high-performance medical image classification…
A beginner’s guide to train and deploy machine learning pipelines in Python using PyCaret
This project builds an insurance bill prediction model, which was subsequently deployed on the Heroku PaaS.
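A short sketch of the PyCaret workflow such a project typically follows, using PyCaret's bundled insurance dataset; the exact steps and column names in the repo may differ:

    # Train and save a regression pipeline with PyCaret (pip install pycaret).
    from pycaret.datasets import get_data
    from pycaret.regression import setup, compare_models, finalize_model, save_model

    data = get_data("insurance")                        # sample insurance-charges data
    setup(data=data, target="charges", session_id=123)  # prepare the experiment
    best = compare_models()                             # train and rank candidate models
    save_model(finalize_model(best), "insurance_pipeline")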
A tiny Configuration File Parser for Python Projects
A tiny Configuration File Parser for Python Projects. Currently working on JSON Config Files only.
Python YAML Environment (ymlenv) by Problem Fighter Library
In the name of God, the Most Gracious, the Most Merciful. PF-PY-YMLEnv Documentation Install and update using pip: pip install -U PF-PY-YMLEnv Please
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines
Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy
PHOTONAI is a high-level Python API for designing and optimizing machine learning pipelines.
PHOTONAI is a high-level Python API for designing and optimizing machine learning pipelines. We've created a system in which you can easily select and …
Advanced raster and geometry manipulations
buzzard In a nutshell, the buzzard library provides powerful abstractions for manipulating images and geometries together that come from different kinds of …
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
The RAP community of practice includes all analysts and data scientists who are interested in adopting the working practices included in reproducible analytical pipelines (RAP) at NHS Digital.
An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour
Gordo Building thousands of models with timeseries data to monitor systems. Table of contents: About, Examples, Install, Uninstall, Developer manual, How to…
Medical appointments No-Show classifier
Medical Appointments No-shows Why do 20% of patients miss their scheduled appointments? A person makes a doctor appointment, receives all the instructions…
Dag-bakery - DAG Bakery enables defining Airflow DAGs via YAML.
DAG Bakery - WIP 🔧 dag-bakery aims to simplify our DAG development by removing all the boilerplate and duplicated code when defining multiple DAG …
Codeflare - Scale complex AI/ML pipelines anywhere
Scale complex AI/ML pipelines anywhere CodeFlare is a framework to simplify the integration, scaling and acceleration of complex multi-step analytics
Cleaning-utils - a collection of small Python functions and classes which make cleaning pipelines shorter and easier
cleaning-utils is a collection of small Python functions…
Create Python API documentation in Markdown format.
Pydoc-Markdown Pydoc-Markdown is a tool and library to create Python API documentation in Markdown format based on lib2to3, allowing it to parse your
YAML metadata extension for Python-Markdown
YAML metadata extension for Python-Markdown This extension adds YAML metadata handling to Markdown with all YAML features. As in the original, metadata…
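A hedged sketch of parsing YAML front matter with Python-Markdown; the extension name "full_yaml_metadata" and the Meta attribute are assumptions about this particular project and should be checked against its README:

    # Assumed usage of a YAML-metadata extension for Python-Markdown.
    import markdown

    md = markdown.Markdown(extensions=["full_yaml_metadata"])  # extension name assumed
    html = md.convert("---\ntitle: Pipelines\ntags: [python, yaml]\n---\n# Hello\n")
    print(md.Meta)   # parsed front matter as a Python dict (attribute name assumed)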
whylogs: A Data and Machine Learning Logging Standard
whylogs: A Data and Machine Learning Logging Standard whylogs is an open source standard for data and ML logging whylogs logging agent is the easiest
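A minimal sketch of profiling a DataFrame with the whylogs v1-style API:

    # Profile a pandas DataFrame with whylogs (pip install whylogs).
    import pandas as pd
    import whylogs as why

    df = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "label": [0, 1, 0]})
    results = why.log(df)              # build a statistical profile of the data
    print(results.view().to_pandas())  # inspect per-column metrics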
PyTorch Implementation for Deep Metric Learning Pipelines
Easily Extendable Basic Deep Metric Learning Pipeline. Karsten Roth ([email protected]), Biagio Brattoli ([email protected]). When using this…
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size. The hub data layout enables rapid transformations and streaming of data while training models at scale. Hub is used by Google, Waymo, Red Cross, Oxford University, and Omdena.
A python-based static site generator for setting up a CV/Resume site
ezcv A python-based static site generator for setting up a CV/Resume site. Table of contents: What does ezcv do?, Features & Roadmap, Why should I use ezcv…
SQLAlchemy seeder that supports nested relationships.
sqlalchemyseed SQLAlchemy seeder that supports nested relationships. Supported file types: json, yaml, csv. Installation: default installation is pip install …
pypyr task-runner cli & api for automation pipelines.
pypyr task-runner cli & api for automation pipelines. Automate anything by combining commands, different scripts in different languages & applications into one pipeline process.
synchronize projects via yaml/json manifest. built on libvcs
vcspull - synchronize your repos. built on libvcs Manage your commonly used repos from YAML / JSON manifest(s). Compare to myrepos. Great if you use t
A python library for parsing multiple types of config files, envvars & command line arguments that takes the headache out of setting app configurations.
parse_it A python library for parsing multiple types of config files, envvars and command line arguments that takes the headache out of setting app configurations…
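A hedged sketch based on the project's stated purpose; the ParseIt class and read_configuration_variable method are taken on trust and should be verified against the documentation:

    # Assumed parse_it usage: look up "db_host" across CLI args, envvars and
    # config files, falling back to a default.
    from parse_it import ParseIt

    parser = ParseIt()
    db_host = parser.read_configuration_variable("db_host", default_value="localhost")
    print(db_host)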
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines.
The DL Streamer Pipeline Zoo is a catalog of optimized media and media analytics pipelines. It includes tools for downloading pipelines and their dependencies and tools for measuring their performance.
Spline is a tool that is capable of running locally as well as part of well known pipelines like Jenkins (Jenkinsfile), Travis CI (.travis.yml) or similar ones.
Welcome to spline - the pipeline tool. Important note: since changing jobs I haven't had the chance to continue working on this project. My main new project…
YAML-formatted plain-text file based models for Flask backed by Flask-SQLAlchemy
Flask-FileAlchemy Flask-FileAlchemy is a Flask extension that lets you use Markdown or YAML formatted plain-text files as the main data store for your
A simple way to build declarative and distributed data pipelines with Python.
unipipeline A simple way to build declarative and distributed data pipelines. Why you should use it: declarative strict config, scaffolding, fully typed…
Python library for creating data pipelines with chain functional programming
PyFunctional Features PyFunctional makes creating data pipelines easy by using chained functional operators. Here are a few examples of what it can do
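A short example of the chained-operator style PyFunctional is built around:

    # Chain functional operators over a sequence
    # (pip install pyfunctional; imported as "functional").
    from functional import seq

    total = (seq(1, 2, 3, 4)
             .map(lambda x: x * 2)         # [2, 4, 6, 8]
             .filter(lambda x: x > 4)      # [6, 8]
             .reduce(lambda a, b: a + b))  # 14
    print(total)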
Yet another serialization library on top of dataclasses, inspired by serde-rs.
pyserde Yet another serialization library on top of dataclasses, inspired by serde-rs. Guide | API Docs | Examples. Overview: declare a class with pyserde…
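A small sketch of declaring a class with pyserde and round-tripping it through JSON; the top-level @serde decorator import reflects recent releases and is an assumption for older ones:

    # Dataclass (de)serialization with pyserde.
    from dataclasses import dataclass
    from serde import serde
    from serde.json import to_json, from_json

    @serde
    @dataclass
    class Job:
        name: str
        retries: int

    s = to_json(Job(name="train", retries=3))
    print(from_json(Job, s))   # Job(name='train', retries=3)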
Apt2sbom python package generates SPDX or YAML files
Welcome to apt2sbom This package contains a library and a CLI tool to convert an Ubuntu software package inventory to a software bill of materials. You …
Pydantic model generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
datamodel-code-generator This code generator creates pydantic models from an OpenAPI file and other sources. Help: see the documentation for more details. Supported…
A C-like hardware description language (HDL) adding high level synthesis(HLS)-like automatic pipelining as a language construct/compiler feature.
[PipelineC ASCII art banner]
dict subclass with keylist/keypath support, normalized I/O operations (base64, csv, ini, json, pickle, plist, query-string, toml, xml, yaml) and many utilities.
python-benedict python-benedict is a dict subclass with keylist/keypath support, I/O shortcuts (base64, csv, ini, json, pickle, plist, query-string, toml, xml, yaml) and many utilities…
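A quick look at the keypath access and YAML I/O described above:

    # Keypath access and YAML serialization with python-benedict
    # (pip install "python-benedict[yaml]").
    from benedict import benedict

    d = benedict()
    d["pipeline.steps.train.retries"] = 3              # keypath assignment
    print(d["pipeline"]["steps"]["train"]["retries"])  # 3
    print(d.to_yaml())                                 # dump the nested dict as YAML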
Automates the fixing of problems reported by yamllint by parsing its output
yamlfixer yamlfixer automates the fixing of problems reported by yamllint by parsing its output. Usage: this software automatically fixes some errors …
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.
ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.
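A minimal sketch of ZenML's step/pipeline syntax; the top-level "from zenml import pipeline, step" import matches recent releases and may differ in older ones:

    # Define and run a tiny ZenML pipeline.
    from zenml import pipeline, step

    @step
    def load_data() -> list:
        return [1, 2, 3]

    @step
    def train(data: list) -> float:
        return sum(data) / len(data)

    @pipeline
    def training_pipeline():
        train(load_data())

    if __name__ == "__main__":
        training_pipeline()   # executes on the active ZenML stack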
Data pipelines for both TensorFlow and PyTorch!
rapidnlp-datasets Data pipelines for both TensorFlow and PyTorch! If you want to load public datasets, try tensorflow/datasets or huggingface/datasets.
Upgini : data search library for your machine learning pipelines
Automated data search library for your machine learning pipelines → find & deliver relevant external data & features to boost ML accuracy 📈
Media Replay Engine (MRE) is a framework to build automated video clipping and replay (highlight) generation pipelines for live and video-on-demand content.
Media Replay Engine (MRE) is a framework for building automated video clipping and replay (highlight) generation pipelines using AWS services for live
A kedro-plugin to serve Kedro Pipelines as API
Primitives for machine learning and data science.
An Open Source Project from the Data to AI Lab at MIT. MLPrimitives: pipelines and primitives for machine learning and data science. Documentation: …
A dictionary that can be flattened and re-inflated
deflatable-dict A dictionary that can be flattened and re-inflated. Particularly useful if you're interacting with yaml, for example. Installation: with…
Code for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators"
Query Variation Generators This repository contains the code and annotation data for the ECIR'22 paper "Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators".
Data pipelines built with polars
valves Warning: the project is very much a work in progress. Valves is a collection of functions for your data .pipe()-lines. This project aims to host…
Scikit-Learn useful pre-defined Pipelines Hub
Scikit-Pipes Scikit-Learn useful pre-defined Pipelines Hub. Usage: install scikit-pipes. It's advised to install sklearn-genetic using a virtual env, in…
Deep Learning Pipelines for Apache Spark
Deep Learning Pipelines for Apache Spark The repo only contains HorovodRunner code for local CI and API docs. To use HorovodRunner for distributed training…
A DSL for data-driven computational pipelines
"Dataflow variables are spectacularly expressive in concurrent programming" Henri E. Bal , Jennifer G. Steiner , Andrew S. Tanenbaum Quick overview Ne
Yaml - Loggers are like print() statements
Upgrade your print statements. Loggers are like print() statements, except they also include loads of other metadata: timestamp, msg (same as print!), args…
Pypeln is a simple yet powerful Python library for creating concurrent data pipelines.
Pypeln Pypeln (pronounced as "pypeline") is a simple yet powerful Python library for creating concurrent data pipelines. Main Features Simple: Pypeln
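A small example of a process-based Pypeln stage:

    # Run a function concurrently over an iterable with Pypeln.
    import time
    import pypeln as pl

    def slow_square(x):
        time.sleep(0.1)          # simulate slow work
        return x * x

    if __name__ == "__main__":
        stage = pl.process.map(slow_square, range(10), workers=4)
        print(list(stage))       # consuming the stage runs the pipeline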
What if home automation was homoiconic? Just transformations of data? No more YAML!
radiale what if home-automation was also homoiconic? The upper or proximal row contains three bones, to which Gegenbaur has applied the terms radiale,
Building house price data pipelines with Apache Beam and Spark on GCP
This project contains the process of building a web crawler to extract raw house price data and creating ETL pipelines using Google Cloud Platform services.
An orchestration platform for the development, production, and observation of data assets.
Dagster An orchestration platform for the development, production, and observation of data assets. Dagster lets you define jobs in terms of the data flow…
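A compact example of Dagster's @op/@job API:

    # Two ops wired into a Dagster job.
    from dagster import job, op

    @op
    def extract():
        return [1, 2, 3]

    @op
    def load(numbers):
        print(f"loaded {len(numbers)} records")

    @job
    def etl():
        load(extract())

    if __name__ == "__main__":
        etl.execute_in_process()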
CiteURL is an extensible tool that parses legal citations and makes links to websites where you can read the cited language for free.
CiteURL is an extensible tool that parses legal citations and makes links to websites where you can read the cited language for free. It can be used to…
Simple dataclasses configuration management for Python with hocon/json/yaml/properties/env-vars/dict support.
Simple dataclasses configuration management for Python with hocon/json/yaml/properties/env-vars/dict support, based on the awesome and lightweight pyhocon parsing library.
A YAML validator for Programming Historian lessons.
phyaml A simple YAML validator for Programming Historian lessons. USAGE: python3 ph-lesson-yaml-validator.py lesson.md. The script automatically detects…
This tool parses log data and allows you to define analysis pipelines for anomaly detection.
logdata-anomaly-miner This tool parses log data and allows you to define analysis pipelines for anomaly detection. It was designed to run the analysis with…
Streamz helps you build pipelines to manage continuous streams of data
Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on.
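A tiny Streamz pipeline that maps over a live stream and sinks results to print:

    # Build and feed a simple stream with Streamz.
    from streamz import Stream

    source = Stream()
    source.map(lambda x: x * 2).sink(print)

    for i in range(3):
        source.emit(i)   # prints 0, 2, 4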
Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on Kubernetes simple, portable, and scalable.
SDK: Overview of the Kubeflow pipelines service Kubeflow is a machine learning (ML) toolkit that is dedicated to making deployments of ML workflows on
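A hedged sketch using the KFP SDK v2-style decorators; the component and pipeline names here are illustrative:

    # Compile a one-step Kubeflow pipeline to YAML with the KFP SDK.
    from kfp import dsl, compiler

    @dsl.component
    def say_hello(name: str) -> str:
        return f"Hello, {name}!"

    @dsl.pipeline(name="hello-pipeline")
    def hello_pipeline(name: str = "Kubeflow"):
        say_hello(name=name)

    compiler.Compiler().compile(hello_pipeline, "hello_pipeline.yaml")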
A super simple terminal command shortener 🐟
pcmd A super simple terminal command shortener 🐟 Source code: https://github.com/j0fiN/pcmd Documentation: https://j0fin.github.io/pcmd About: During…
AI pipelines for Nvidia Jetson Platform
Jetson Multicamera Pipelines Easy-to-use realtime CV/AI pipelines for Nvidia Jetson Platform. This project builds a typical multi-camera pipeline, i.e.…
PipeChain is a utility library for creating functional pipelines.
PipeChain Motivation: PipeChain is a utility library for creating functional pipelines. Let's start with a motivating example. We have a list of Australian…
Python Fstab Generator is a small Python script to write and generate /etc/fstab files based on a YAML file on Unix-like systems.
PyFstab Generator PyFstab Generator is a small Python script to write and generate /etc/fstab files based on a YAML file on Unix-like systems. NOTE: …
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
🎵 A repository for manually annotating files to create labeled acoustic datasets for machine learning.
Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.
How The New York Times can increase engagement on Facebook. Using machine learning to understand characteristics of news content that garners "high" Facebook…
QSIprep: Preprocessing and analysis of q-space images
QSIprep: Preprocessing and analysis of q-space images. Full documentation at https://qsiprep.readthedocs.io. About: qsiprep configures pipelines for processing…
ETL flow framework based on Yaml configs in Python
ETL framework based on Yaml configs in Python. A lightweight framework for creating data streams, with streams set up through configuration in a Yaml file.
Automating DAG creation using Jinja and YAML
Automating the creation of Airflow DAGs using Jinja and YAML. Repo architecture: folders per business context (e.g. Marketing, Analytics, HR, etc.)…
A project based example of Data pipelines, ML workflow management, API endpoints and Monitoring.
MLOps template with examples for Data pipelines, ML workflow management, API development and Monitoring.
Multi-Branch CI/CD Pipeline using CDK Pipelines.
Using AWS CDK Pipelines and AWS Lambda for multi-branch pipeline management and infrastructure deployment. This project shows how to use the AWS CDK Pipelines…
A simple tool that updates the pubspec.yaml file of a Flutter project without altering the structure of your file.
A simple tool that updates the pubspec.yaml file of a Flutter project without altering the structure of your file.
Pydantic-ish YAML configuration management.
Pydantic-ish YAML configuration management.
Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️
Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️
Data pipelines, 2021.
This repo illustrates a simple process for automating data transformation and modeling through a pipeline built with Luigi. Main stack: …
Elementary is an open-source data reliability framework for modern data teams. The first module of the framework is data lineage.
Data lineage made simple, reliable, and automated. Effortlessly track the flow of data, understand dependencies and analyze impact. Features: …
A machine learning library for spiking neural networks. Supports training with both torch and jax pipelines, and deployment to neuromorphic hardware.
Rockpool Rockpool is a Python package for developing signal processing applications with spiking neural networks. Rockpool allows you to build network
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code
A Python framework for creating reproducible, maintainable and modular data science code.
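A hedged sketch of composing a Kedro pipeline from plain Python functions; the dataset names are placeholders that would normally be defined in the data catalog:

    # Wire two functions into a Kedro pipeline.
    from kedro.pipeline import node, pipeline

    def clean(raw):
        return [r for r in raw if r is not None]

    def train(cleaned):
        return {"n_samples": len(cleaned)}

    data_pipeline = pipeline([
        node(clean, inputs="raw_data", outputs="clean_data"),
        node(train, inputs="clean_data", outputs="model_metrics"),
    ])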
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
txtai executes machine-learning workflows to transform data and build AI-powered semantic search applications.
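A hedged sketch of txtai's embeddings index; the model path is just an example and is downloaded on first use:

    # Build a small semantic search index with txtai.
    from txtai.embeddings import Embeddings

    embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})
    docs = ["run ML workflows", "bake a cake", "build a search index"]
    embeddings.index([(i, text, None) for i, text in enumerate(docs)])

    print(embeddings.search("machine learning pipelines", 1))  # [(id, score)]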
Framework for creating efficient data processing pipelines
Aqueduct Framework for creating efficient data processing pipelines. Contact: feel free to ask questions in Telegram t.me/avito-ml. Key features: …
CPOST is a CLI tool to assist with the proper sizing of Clara Deploy pipelines
CPOST (Clara Pipeline Operator Sizing Tool): a tool to measure resource usage of Clara Platform pipeline operators. CPOST is a tool that will help you run…
This solution helps you deploy Data Lake Infrastructure on AWS using CDK Pipelines.
CDK Pipelines for Data Lake Infrastructure Deployment This solution helps you deploy data lake infrastructure on AWS using CDK Pipelines. This is based…
Automated Integration Testing and Live Documentation for your API
Automated Integration Testing and Live Documentation for your API
🤗 Push your spaCy pipelines to the Hugging Face Hub
spacy-huggingface-hub: Push your spaCy pipelines to the Hugging Face Hub This package provides a CLI command for uploading any trained spaCy pipeline
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather than invoking the Python interpreter, Tuplex generates optimized LLVM bytecode for the given pipeline and input data set.
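A short sketch of the Spark-like API the description refers to; Tuplex compiles the lambdas below to native code before running them:

    # A minimal Tuplex job (requires a platform where Tuplex wheels are available).
    from tuplex import Context

    c = Context()
    result = (c.parallelize([1, 2, 3, 4])
               .map(lambda x: x * 2)
               .filter(lambda x: x > 4)
               .collect())
    print(result)   # [6, 8]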
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.
deep_autoviml Build keras pipelines and models in a single line of code! Table of contents: Motivation, How it works, Technology, Install, Usage, API, Image…
Lumen provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple yaml specification
The Lumen project provides a framework for visual analytics, which allows users to build data-driven dashboards from a simple YAML specification.
Orchest is a browser based IDE for Data Science.
Orchest is a browser based IDE for Data Science. It integrates your favorite Data Science tools out of the box, so you don’t have to. The application is easy to use and can run on your laptop as well as on a large scale cloud cluster.