35 Repositories
Python airflow-dags Libraries
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Streamify A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! Description Objective The project will stre
Free Data Engineering course!
Data Engineering Zoomcamp Register in DataTalks.Club's Slack Join the #course-data-engineering channel The videos are published to DataTalks.Club's Yo
Yet another Airflow plugin that exposes CLI commands as a RESTful API; supports Airflow v2.X.
Documentation in Chinese Airflow Extended API Plugin Airflow Extended API, which exports Airflow CLI commands as a RESTful API to extend the capabilities of the official Airflow API
A reproduction repo for a scheduling bug in Airflow 2.2.3
Airflow ETL With EKS EFS Sagemaker
Airflow ETL with EKS, EFS & SageMaker (in development) Solution diagram Imp…
Compute execution plan: a DAG representation of the work that you want to get done. Individual nodes of the DAG can be simple Python or shell tasks, complex deeply nested parallel branches, or embedded DAGs themselves.
Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In
Pizza Orders Data Pipeline use case solved with SQL, Sqoop, HDFS, Hive and Airflow.
PizzaOrders_DataPipeline Tony owns a new pizza shop. He knew that pizza alone was not going to help him get seed funding to expand
dag-bakery - DAG Bakery lets you define Airflow DAGs via YAML.
DAG Bakery - WIP 🔧 dag-bakery aims to simplify our DAG development by removing all the boilerplate and duplicated code when defining multiple DAG cro
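The listing does not show dag-bakery's actual YAML schema, so the snippet below is only a minimal sketch of the pattern it describes: a YAML spec (hypothetical format, not the project's real one) parsed into an Airflow DAG, with task IDs and dependencies taken from the file.

```python
# Minimal sketch of building an Airflow DAG from a YAML spec.
# The spec format below is hypothetical, not dag-bakery's actual schema.
import yaml
import pendulum
from airflow import DAG
from airflow.operators.bash import BashOperator

SPEC = """
dag_id: example_from_yaml
schedule: "@daily"
tasks:
  - id: extract
    bash_command: echo extract
  - id: load
    bash_command: echo load
    upstream: [extract]
"""

spec = yaml.safe_load(SPEC)

with DAG(
    dag_id=spec["dag_id"],
    schedule_interval=spec["schedule"],
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    # Create one operator per task entry, then wire up dependencies.
    tasks = {
        t["id"]: BashOperator(task_id=t["id"], bash_command=t["bash_command"])
        for t in spec["tasks"]
    }
    for t in spec["tasks"]:
        for upstream_id in t.get("upstream", []):
            tasks[upstream_id] >> tasks[t["id"]]
```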
Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.
A lightweight tool to get an AI infrastructure stack up in minutes, not days.
K3ai takes care of setting up K8s for you, deploys the AI tool of your choice, and can even run your code on it.
A testbed for AI system quality management
qunomon Description A testbed for testing and managing AI system quality. Demo Sorry, no public server is deployed for the alpha version. Requirement Ins
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Airflow Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are define
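As a concrete illustration of "workflows as code", here is a minimal Airflow 2.x DAG using the standard API; the task names and commands are placeholders, not anything from a specific repo in this list.

```python
# A minimal Airflow 2.x DAG: two placeholder tasks run in sequence once per day.
import pendulum
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def transform():
    print("transforming data")


with DAG(
    dag_id="example_etl",
    schedule_interval="@daily",
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting")
    load = PythonOperator(task_id="transform_and_load", python_callable=transform)

    extract >> load  # extract must finish before transform_and_load starts
```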
Keeper for Ricochet Protocol, implemented with Apache Airflow
Ricochet Keeper This repository contains Apache Airflow DAGs for executing keeper operations for Ricochet Exchange. Usage You will need to run this us
Repository for studying Airflow
airflow-101 Repository for studying Airflow. Docker setup created based on the tutorial. Example API using the pokeapi. To run it, clone the repo and run the confi…
Crypto Stats and Tweets Data Pipeline using Airflow
Crypto Stats and Tweets Data Pipeline using Airflow Introduction Project Overview This project was brought about through Udacity's nanodegree program.
Sample code for Harry's Airflow online training course
Sample code for Harry's Airflow online training course You can find the videos on YouTube or Bilibili. I am working on adding the following: the slide p
Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution
Sample-specific Bayesian Networks A framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample or per-patient re
An Airflow operator to call the main function from the dbt-core Python package
airflow-dbt-python An Airflow operator to call the main function from the dbt-core Python package Motivation Airflow running in a managed environment
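As a rough sketch of how such an operator is typically used inside a DAG: the import path, operator name, and arguments below are assumptions based on the project's description, so check the repo's README for the actual interface.

```python
# Sketch only: operator/module names and arguments are assumed, not verified against the repo.
import pendulum
from airflow import DAG
from airflow_dbt_python.operators.dbt import DbtRunOperator  # assumed import path

with DAG(
    dag_id="dbt_daily_run",
    schedule_interval="@daily",
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    dbt_run = DbtRunOperator(
        task_id="dbt_run",
        project_dir="/opt/dbt/my_project",   # hypothetical paths
        profiles_dir="/opt/dbt/profiles",
        select=["my_model"],                 # assumed argument name
    )
```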
A practice project on Airflow: building a virtual env, installing Airflow, and constructing data pipelines (DAGs)
airflow-test A practice project on Airflow: building a VirtualBox env and setting up Airflow on it; installing Airflow using a Python virtual en
A tutorial presenting several practical examples of how to build DAGs in Apache Airflow
Apache Airflow - Python Brasil 2021 This tutorial presents several practical examples of how to build DAGs in Apache Airflow. Background Apache Airfl
Shows you how to integrate Zeppelin with Airflow
Introduction This repository shows you how to integrate Zeppelin with Airflow. The philosophy behind the integration is to make the transition f
Automating DAG creation using Jinja and YAML
Automating DAG creation in Airflow using Jinja and YAML Repo architecture: folders per business context (e.g. Marketing, Analytics, HR, etc.)
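Neither the template nor the YAML layout is shown in this listing, so the snippet below is only a generic sketch of the pattern: a per-DAG YAML config rendered through a Jinja template to produce a DAG .py file. File names and fields are hypothetical.

```python
# Generic sketch of YAML + Jinja DAG generation; config fields and paths are hypothetical.
import yaml
from jinja2 import Template

DAG_TEMPLATE = Template(
    '''
from airflow import DAG
from airflow.operators.bash import BashOperator
import pendulum

with DAG(
    dag_id="{{ dag_id }}",
    schedule_interval="{{ schedule }}",
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
{% for task in tasks %}
    BashOperator(task_id="{{ task.id }}", bash_command="{{ task.command }}")
{% endfor %}
'''
)

config = yaml.safe_load(
    """
dag_id: marketing_daily
schedule: "@daily"
tasks:
  - id: refresh_campaigns
    command: echo refresh
"""
)

# Write the rendered DAG file into the (assumed) dags/ folder.
with open(f"dags/{config['dag_id']}.py", "w") as f:
    f.write(DAG_TEMPLATE.render(**config))
```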
Write maintainable, production-ready pipelines using Jupyter or your favorite text editor. Develop locally, deploy to the cloud. ☁️
Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.
Airflow on Docker in EC2 + GitLab's CI/CD Personal project for a simple data pipeline using Airflow. Airflow is installed inside a Docker container,
Backtest framework based on DAGs
MultitaskQueue A simple framework based on three composed concepts: Task: a task is the smallest unit of execution, or simply a node in the DAG; ev
Airflow Operator for running Soda SQL scans
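The dedicated operator's interface isn't shown in this listing. As a rough sketch under stated assumptions, a Soda SQL scan can be triggered from a DAG with a plain BashOperator wrapping the soda CLI; the warehouse and table file paths below are placeholders, and the repo's operator presumably wraps the same scan step with richer failure handling.

```python
# Rough sketch: run a Soda SQL scan from Airflow with a plain BashOperator.
# warehouse.yml / tables/orders.yml are placeholder paths.
import pendulum
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="soda_scan_orders",
    schedule_interval="@daily",
    start_date=pendulum.datetime(2022, 1, 1, tz="UTC"),
    catchup=False,
) as dag:
    scan_orders = BashOperator(
        task_id="scan_orders",
        # A non-zero exit code on failed tests fails the task and stops the pipeline.
        bash_command="soda scan warehouse.yml tables/orders.yml",
    )
```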
The elegance of Airflow + the power of AWS
Orkestra The elegance of Airflow + the power of AWS
Apache Liminal is an end-to-end platform for data engineers & scientists, allowing them to build, train and deploy machine learning models in a robust and agile way
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a domain-specific language for building ML workflows on top of Apache Airflow.
Cloud-native data onboarding architecture for the Google Cloud Public Datasets program
Public Datasets Pipelines Cloud-native, data pipeline architecture for onboarding datasets to the Google Cloud Public Datasets Program. Overview Requi
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
ETL Pipeline with Airflow, Spark, s3, MongoDB and Amazon Redshift
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Viewflow Viewflow is a framework built on the top of Airflow that enables data scientists to create materialized views. It allows data scientists to f
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Airflow Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows. When workflows are define
Zero-configuration Airflow plugin that lets you manage your DAG files.
simple-dag-editor SimpleDagEditor is a zero-configuration plugin for Apache Airflow. It provides a file management interface that points to your dag_fol
Soda SQL Data testing, monitoring and profiling for SQL-accessible data.
Soda SQL Data testing, monitoring and profiling for SQL-accessible data. What does Soda SQL do? Soda SQL allows you to stop your pipeline when bad dat
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.
Couler What is Couler? Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo