276 Repositories
Python spark-kafka-integration Libraries
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Streamify A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! Description Objective The project will stre
Free Data Engineering course!
Data Engineering Zoomcamp Register in DataTalks.Club's Slack Join the #course-data-engineering channel The videos are published to DataTalks.Club's Yo
Nest Protect integration for Home Assistant. This will allow you to integrate your smoke, heat, co and occupancy status real-time in HA.
Nest Protect integration for Home Assistant Custom component for Home Assistant to interact with Nest Protect devices via an undocumented and unoffici
Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in the form of Jupyter Notebooks.
Databricks Certification Spark Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along
Containerized Demo of Apache Spark MLlib on a Data Lakehouse (2022)
Spark-DeltaLake-Demo Reliable, Scalable Machine Learning (2022) This project was completed in an attempt to become better acquainted with the latest b
Django + Next.js integration
Django Next.js Django + Next.js integration From a comment on StackOverflow: Run 2 ports on the same server. One for django (public facing) and one fo
Esse é o meu primeiro repo tratando de fim a fim, uma pipeline de dados abertos do governo brasileiro relacionado a compras de contrato e cronogramas anuais com spark, em pyspark e SQL!
Olá! Esse é o meu primeiro repo tratando de fim a fim, uma pipeline de dados abertos do governo brasileiro relacionado a compras de contrato e cronogr
ipyvizzu - Jupyter notebook integration of Vizzu
ipyvizzu - Jupyter notebook integration of Vizzu. Tutorial · Examples · Repository About The Project ipyvizzu is the Jupyter Notebook integration of V
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
This repository provides a library for efficient training of masked language models (MLM), built with fairseq. We fork fairseq to give researchers mor
Pytest-rich - Pytest + rich integration (proof of concept)
pytest-rich Leverage rich for richer test session output. This plugin is not pub
Bcc2telegraf: An integration that sends ebpf-based bcc histogram metrics to telegraf daemon
bcc2telegraf bcc2telegraf is an integration that sends ebpf-based bcc histogram
Python-kafka-reset-consumergroup-offset-example - Python Kafka reset consumergroup offset example
Python Kafka reset consumergroup offset example This is a simple example of how
🏅 Top 5% in 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지
AI_SPARK_CHALLENG_Object_Detection 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지 🏅 Top 5% in mAP(0.75) (443명 중 13등, mAP: 0.98116) 대회 설명 Edge 환경에서의 가축 Object Dete
Cado Response Integration with Amazon GuardDuty using AWS Lambda
Cado Response Integration with Amazon GuardDuty using AWS Lambda This repository contains a simple example where: An alert is triggered by GuardDuty T
Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions"
Graph Convolution Simulator (GCS) Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions" Requirements: PyTor
A data structure that extends pyspark.sql.DataFrame with metadata information.
MetaFrame A data structure that extends pyspark.sql.DataFrame with metadata info
Ha-rpi gpio - Home Assistant Raspberry Pi GPIO Integration
Home Assistant Raspberry Pi GPIO custom integration This is a spin-off from the
Integration between the awesome window manager and the firefox web browser.
Integration between the awesome window manager and the firefox web browser.
PySpark Structured Streaming ROS Kafka ApacheSpark Cassandra
PySpark-Structured-Streaming-ROS-Kafka-ApacheSpark-Cassandra The purpose of this project is to demonstrate a structured streaming pipeline with Apache
The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it inside a loop of Design, Model Development and Operations.
MLOps The MLOps is the process of continuous integration and continuous delivery of Machine Learning artifacts as a software product, keeping it insid
Google Kubernetes Engine (GKE) with a Snyk Kubernetes controller installed/configured for Snyk App
Google Kubernetes Engine (GKE) with a Snyk Kubernetes controller installed/configured for Snyk App This example provisions a Google Kubernetes Engine
Skykettle ha - Redmond SkyKettle integration for Home Assistant
Redmond SkyKettle integration for Home Assistant This integration allows to cont
Python Wrapper for handling payment requests through the Daraja MPESA API
Python Daraja Description Python Wrapper for handling payment requests through the Daraja MPESA API Contribution Refer to the CONTRIBUTING GUIDE. Usag
Event-driven-model-serving - Unified API of Apache Kafka and Google PubSub
event-driven-model-serving Unified API of Apache Kafka and Google PubSub 1. Proj
A simple healthcheck wrapper to monitor Kafka.
kafka-healthcheck A simple healthcheck wrapper to monitor Kafka. Kafka Healthcheck is a simple server that provides a singular API endpoint to determi
SynapseML - an open source library to simplify the creation of scalable machine learning pipelines
Synapse Machine Learning SynapseML (previously MMLSpark) is an open source library to simplify the creation of scalable machine learning pipelines. Sy
Create (templateable) cameras that display qr codes in homeassistant
QRCam This custom component creates cameras displaying qrcodes. The QRCodes can be static or generated from templates. If you use a template as conten
Designed and coded a password manager in Python with Arduino integration
Designed and coded a password manager in Python with Arduino integration. The Program uses a master user to login, and stores account data such as usernames and passwords to the master user. While logging into the program with the master user the Arduino was used as a two-factor authentication key. The program detects a connection to the Arduino and checks if certain parameters are met before completing the login procedure.
This repo provides the base code for pytorch-lightning and weight and biases simultaneous integration.
Write your model faster with pytorch-lightning-wadb-code-backbone This repository provides the base code for pytorch-lightning and weight and biases s
Strawberry-django-plus - Enhanced Strawberry GraphQL integration with Django
strawberry-django-plus Enhanced Strawberry integration with Django. Built on top
Projet d'integration SRI 3A ROS
projet-integration-sri-2021-2022 Projet d'intégration ROS SRI 2021 2022 Organization: Planification de tâches Perception Saisie: Cédérick Mouliets Sim
macOS development environment setup: Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process.
dev-setup Motivation Setting up a new developer machine can be an ad-hoc, manual, and time-consuming process. dev-setup aims to simplify the process w
A Time Series Library for Apache Spark
Flint: A Time Series Library for Apache Spark The ability to analyze time series data at scale is critical for the success of finance and IoT applicat
iTerm2 Shell integration for Xonsh shell.
iTerm2 Shell Integration iTerm2 Shell integration for Xonsh shell. Installation To install use pip: xpip install xontrib-iterm2 # or: xpip install -U
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Spark Python Notebooks This is a collection of IPython notebook/Jupyter notebooks intended to train the reader on different Apache Spark concepts, fro
Distributed deep learning on Hadoop and Spark clusters.
Note: we're lovingly marking this project as Archived since we're no longer supporting it. You are welcome to read the code and fork your own version
API for SpeechAnalytics integration with FreePBX/Asterisk
freepbx_speechanalytics_api API for SpeechAnalytics integration with FreePBX/Asterisk Скопировать файл settings.py.sample в settings.py и отредактиров
BigDL - Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems
Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.
Kent - Fake Sentry server for local development, debugging, and integration testing
Kent is a service for debugging and integration testing Sentry.
pyDuinoCoin is a simple python integration for the DuinoCoin REST API, that allows developers to communicate with DuinoCoin Master Server
PyDuinoCoin PyDuinoCoin is a simple python integration for the DuinoCoin REST API, that allows developers to communicate with DuinoCoin Main Server. I
Kafka Connect JDBC Docker Image.
kafka-connect-jdbc This is a dockerized version of the Confluent JDBC database connector. Usage This image is running the connect-standalone command w
Stream-Kafka-ELK-Stack - Weather data streaming using Apache Kafka and Elastic Stack.
Streaming Data Pipeline - Kafka + ELK Stack Streaming weather data using Apache Kafka and Elastic Stack. Data source: https://openweathermap.org/api O
Stacs-ci - A set of modules to enable integration of STACS with commonly used CI / CD systems
Static Token And Credential Scanner CI Integrations What is it? STACS is a YARA
Spark-movie-lens - An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
A scalable on-line movie recommender using Spark and Flask This Apache Spark tutorial will guide you step-by-step into how to use the MovieLens datase
Use Jupyter Notebooks to demonstrate how to build a Recommender with Apache Spark & Elasticsearch
Recommendation engines are one of the most well known, widely used and highest value use cases for applying machine learning. Despite this, while there are many resources available for the basics of training a recommendation model, there are relatively few that explain how to actually deploy these models to create a large-scale recommender system.
X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana
X-news - Pipeline data use scrapy, kafka, spark streaming, spark ML and elasticsearch, Kibana
Bigdata - This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster
Scrapy Cluster This Scrapy project uses Redis and Kafka to create a distributed
TwitterMusicBot - A Twitter bot with Spotify integration.
A Twitter Music Bot 🤖 🎵 🎶 I created this project to learn more about APIs, so it only works for student purposes. Initially, delving into the Spoti
A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
TennisBusinessIntelligenceProject - A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.
City-seeds - A random generator of cultural characteristics intended to spark ideas and help draw threads
City Seeds This is a random generator of cultural characteristics intended to sp
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
An distributed automation framework.
Automation Kit Repository Welcome to the Automation Kit repository! Note: This package is progressing quickly but is not yet ready for full production
Python library for ODE integration via Taylor's method and LLVM
heyoka.py Modern Taylor's method via just-in-time compilation Explore the docs » Report bug · Request feature · Discuss The heyókȟa [...] is a kind of
This mini project showcase how to build and debug Apache Spark application using Python
Spark app can't be debugged using normal procedure. This mini project showcase how to build and debug Apache Spark application using Python programming language. There are also options to run Spark application on Spark container
NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions
NeoDTI NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions (Bioinformatics).
Sleep As Android integration for Home Assistant
Sleep As Android custom integration This integration will allow you to get events from your SleepAsAndroid application in a form of the sensor states
A Sample App to Demonstrate React Native and FastAPI Integration
React Native - Service Integration with FastAPI Backend. A Sample App to Demonstrate React Native and FastAPI Integration UI Based on NativeBase toolk
Simple integration between FastAPI and cloud authentication services (AWS Cognito, Auth0, Firebase Authentication).
FastAPI Cloud Auth fastapi-cloudauth standardizes and simplifies the integration between FastAPI and cloud authentication services (AWS Cognito, Auth0
Adds integration of the Chameleon template language to FastAPI.
fastapi-chameleon Adds integration of the Chameleon template language to FastAPI. If you are interested in Jinja instead, see the sister project: gith
Adds integration of the Jinja template language to FastAPI.
fastapi-jinja Adds integration of the Jinja template language to FastAPI. This is inspired and based off fastapi-chamelon by Mike Kennedy. Check that
Pandas and Spark DataFrame comparison for humans
DataComPy DataComPy is a package to compare two Pandas DataFrames. Originally started to be something of a replacement for SAS's PROC COMPARE for Pand
pypyr task-runner cli & api for automation pipelines.
pypyr task-runner cli & api for automation pipelines. Automate anything by combining commands, different scripts in different languages & applications into one pipeline process.
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
PatZilla is a modular patent information research platform and data integration toolkit with a modern user interface and access to multiple data sources.
Changelog CI is a GitHub Action that enables a project to automatically generate changelogs
What is Changelog CI? Changelog CI is a GitHub Action that enables a project to automatically generate changelogs. Changelog CI can be triggered on pu
Fetching tweets and integrating it with Kafka and PySpark
KafkaPySpark Zookeeper bin/zookeeper-server-start.sh config/zookeeper.properties Kafka Server bin/kafka-server-start.sh config/server.properties Kafka
Home Assistant Hilo Integration via HACS
BETA This is a beta release. There will be some bugs, issues, etc. Please bear with us and open issues in the repo. Hilo Hilo integration for Home Ass
This library is a location of the LegacyLogger for PyTorch Lightning.
neptune-contrib Documentation See neptune-contrib documentation site Installation Get prerequisites python versions 3.5.6/3.6 are supported Install li
Aqara Camera G3 integration for Home Assistant
Aqara Camera G3 integration for Home Assistant ATTENTION: The component only works after enabled telnet. Only supportd stream. Not support still image
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Redash is designed to enable anyone, regardless of the level of technical sophistication, to harness the power of data big and small. SQL users levera
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
Basic implementation of Razorpay payment gateway 💳 with Django
Razorpay Payment Integration in Django 💥 In this project Razorpay payment gateway 💳 is integrated with Django by breaking down the whole process int
Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"
Locally-Shifted-Attention-With-Early-Global-Integration Pretrained models You can download all the models from here. Training Imagenet python -m torch
Keycloak integration for Python FastAPI
FastAPI Keycloak Integration Documentation Introduction Welcome to fastapi-keycloak. This projects goal is to ease the integration of Keycloak (OpenID
Solcast Integration for Home Assistant
Solcast Solar Home Assistant(https://www.home-assistant.io/) Component This custom component integrates the Solcast API into Home Assistant. Modified
Watson-Assistant with integration capabilities
Watson-Assistant-Integration Watson-Assistant with integration capabilities "main.py" should be deployed as Cloud Function (Action) on IBM Cloud. For
Implementation of K-Nearest Neighbors Algorithm Using PySpark
KNN With Spark Implementation of KNN using PySpark. The KNN was used on two separate datasets (https://archive.ics.uci.edu/ml/datasets/iris and https:
A Habitica Integration with Github Workflows.
Habitica-Workflow A Habitica Integration with Github Workflows. How To Use? Fork (and Star) this repository. Set environment variable in Settings - S
I have baked a custom integration to control Eufy Security Cameras and access RTSP and P2P stream if possible.
I have baked a custom integration to control Eufy Security Cameras and access RTSP (real time streaming protocol) and P2P (peer to peer) stream if pos
Sanic integration with Webargs
webargs-sanic Sanic integration with Webargs. Parsing and validating request arguments: headers, arguments, cookies, files, json, etc. IMPORTANT: From
Travis CI testing a Dockerfile based on Palantir's remix of Apache Cassandra, testing IaC, and testing integration health of Debian
Testing Palantir's remix of Apache Cassandra with Snyk & Travis CI This repository is to show Travis CI testing a Dockerfile based on Palantir's remix
A script that publishes power usage data of iDrac enabled servers to an MQTT broker for integration into automation and power monitoring systems
iDracPowerMonitorMQTT This script publishes iDrac power draw data for iDrac 6 enabled servers to an MQTT broker. This can be used to integrate the pow
Apache (Py)Spark type annotations (stub files).
PySpark Stubs A collection of the Apache Spark stub files. These files were generated by stubgen and manually edited to include accurate type hints. T
A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.
Realtime Financial Market Data Visualization and Analysis Introduction This repo shows my project about real-time stock data pipeline. All the code is
Component for deep integration LedFx from Home Assistant.
LedFX for Home Assistant Component for deep integration LedFx from Home Assistant. Table of Contents FAQ Install Config Performance FAQ Q. What versio
Pyspark project that able to do joins on the spark data frames.
SPARK JOINS This project is to perform inner, all outer joins and semi joins. create_df.py: load_data.py : helps to put data into Spark data frames. d
Graphene MongoEngine integration
Graphene-Mongo A Mongoengine integration for Graphene. Installation For installing graphene-mongo, just run this command in your shell pip install gra
Simple integration of Flask and WTForms, including CSRF, file upload and Recaptcha integration.
Flask-WTF Simple integration of Flask and WTForms, including CSRF, file upload, and reCAPTCHA. Links Documentation: https://flask-wtf.readthedocs.io/
flask extension for integration with the awesome pydantic package
Flask-Pydantic Flask extension for integration of the awesome pydantic package with Flask. Installation python3 -m pip install Flask-Pydantic Basics U
Pyramid configuration with celery integration. Allows you to use pyramid .ini files to configure celery and have your pyramid configuration inside celery tasks.
Getting Started Include pyramid_celery either by setting your includes in your .ini, or by calling config.include('pyramid_celery'): pyramid.includes
A PowSyBl and Python integration based on GraalVM native image
PyPowSyBl The PyPowSyBl project gives access PowSyBl Java framework to Python developers. This Python integration relies on GraalVM to compile Java co
A simple Monte Carlo simulation using Python and matplotlib library
Monte Carlo python simulation Install linux dependencies sudo apt update sudo apt install build-essential \ software-properties-commo
Delta Sharing: An Open Protocol for Secure Data Sharing
Delta Sharing: An Open Protocol for Secure Data Sharing Delta Sharing is an open protocol for secure real-time exchange of large datasets, which enabl
HACS gives you a powerful UI to handle downloads of all your custom needs.
HACS (Home Assistant Community Store) Manage (Install, track, upgrade) and discover custom elements for Home Assistant directly from the UI. What? HAC
Simple integration of Flask and WTForms, including CSRF, file upload and Recaptcha integration.
Flask-WTF Simple integration of Flask and WTForms, including CSRF, file upload, and reCAPTCHA. Links Documentation: https://flask-wtf.readthedocs.io/
Flask extension that provides integration with Azure Storage
Flask-Azure-Storage A Flask extension that provides integration with Azure Storage Table of Contents Flask-Azure-Storage Install Usage Examples Create
Automate issue discovery for your projects against Lightning nightly and releases.
Automated Testing for Lightning EcoSystem Projects Automate issue discovery for your projects against Lightning nightly and releases. You get CPUs, Mu
Github integration with Telegram
The Telegram bot myGit is your GiHub assistant. In your conversations with your team, you can simply insert the information about the projects you are working at.
Unofficial calendar integration with Gradescope
Gradescope-Calendar This script scrapes your Gradescope account for courses and assignment details. Assignment details currently can be transferred to
The Spark Challenge Student Check-In/Out Tracking Script
The Spark Challenge Student Check-In/Out Tracking Script This Python Script uses the Student ID Database to match the entries with the ID Card Swipe a
Reading streams of Twitter data, save them to Kafka, then process with Kafka Stream API and Spark Streaming
Using Streaming Twitter Data with Kafka and Spark Reading streams of Twitter data, publishing them to Kafka topic, process message using Kafka Stream