Stream-Kafka-ELK-Stack - Weather data streaming using Apache Kafka and Elastic Stack.

Overview

Streaming Data Pipeline - Kafka + ELK Stack

Streaming weather data using Apache Kafka and Elastic Stack.

Data source: https://openweathermap.org/api

Objectives: Develop a streaming data pipeline that extracts weather data from the OpenWeather API using Apache Kafka, Logstash, Elasticsearch, and Kibana (Kafka + ELK Stack).

To summarize, Python was used to develop a Kafka producer that requests weather data from the OpenWeather API every minute and sends it as a message to Apache Kafka. Logstash, acting as a Kafka consumer, reads the data and stores it in Elasticsearch. Kibana then uses the data in Elasticsearch to render the dashboard.
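
A minimal sketch of such a producer, assuming the requests and kafka-python packages, a broker on localhost:9092, a topic named weather, and an [openweather] section in weather_api_key.ini (these names are illustrative, not taken from the repository):

    import configparser
    import json
    import time

    import requests
    from kafka import KafkaProducer

    # Read the API key from weather_api_key.ini
    # (the [openweather] section and api_key field are assumptions).
    config = configparser.ConfigParser()
    config.read("weather_api_key.ini")
    api_key = config["openweather"]["api_key"]

    # Serialize each weather reading as JSON before sending it to Kafka.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    URL = "https://api.openweathermap.org/data/2.5/weather"

    while True:
        # Fetch the current weather for a sample city and publish it
        # to the "weather" topic (city and topic name are illustrative).
        response = requests.get(URL, params={"q": "London", "appid": api_key})
        response.raise_for_status()
        producer.send("weather", value=response.json())
        producer.flush()
        time.sleep(60)  # one request per minute, as described above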

Kibana Weather Dashboard

[Kibana dashboard screenshot]

Steps:

  • bash elk/start_elastic_docker.sh
  • bash kafka/start_kafka_docker.sh
  • Create a topic using Kafka Manager: localhost:9000 (or create it programmatically, as in the sketch below)
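
If you prefer to create the topic in code rather than through the Kafka Manager UI, a kafka-python sketch along these lines would also work (broker address and topic name are assumptions):

    from kafka.admin import KafkaAdminClient, NewTopic

    # Connect to the broker started by start_kafka_docker.sh
    # (localhost:9092 is an assumption about the exposed port).
    admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

    # "weather" is an illustrative topic name; one partition is enough
    # for a single-producer, single-consumer pipeline like this one.
    admin.create_topics([NewTopic(name="weather", num_partitions=1, replication_factor=1)])
    admin.close()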

Logstash must be installed locally.

  • $LOGSTASH_HOME/bin/logstash -f $LOGSTASH_HOME/config/pipeline.conf
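
For reference, a minimal pipeline.conf wiring Kafka to Elasticsearch could look like this; the topic and index names are assumptions and must match the topic you created and the index pattern you define in Kibana later:

    # Consume JSON messages from the Kafka topic fed by the producer.
    input {
      kafka {
        bootstrap_servers => "localhost:9092"
        topics => ["weather"]
        codec => "json"
      }
    }

    # Write each event into Elasticsearch; the index name must match
    # the index pattern created in Kibana.
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "weather"
      }
    }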

Before running the Kafka producer, the API key must be set inside the weather_api_key.ini file.
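
A weather_api_key.ini shaped like the following would match the producer sketch above; the section and key names are hypothetical, so use whatever the script actually expects:

    [openweather]
    api_key = YOUR_API_KEY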

  • python3 weather_kfk_producer.py
  • Access Kibana: localhost:5601
  • Create an index pattern: it must match the index name defined in pipeline.conf
  • Develop your dashboard.