Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detailed blog post published on Towards Data Science.

Overview

time-series-kafka-demo

img

Mock stream producer for time series data using Kafka.

I walk through this tutorial and others here on GitHub and on my Medium blog. Here is a friend link for open access to the article on Towards Data Science: Make a mock “real-time” data stream with Python and Kafka. I'll always add friend links on my GitHub tutorials for free Medium access if you don't have a paid Medium membership (referral link).

If you find any of this useful, I always appreciate contributions to my Saturday morning fancy coffee fund!

This repo demos how to convert a csv file of timestamped data into a real-time stream useful for testing streaming analytics. An example input file with random time series data and a script for generating the file are included in the data directory.

The producer and consumer Python scripts use Confluent's Kafka client for Python, which is installed in the Docker image built with the accompanying Dockerfile, if you choose to use it.

Requires Docker and Docker Compose.

Usage

Clone repo and cd into directory.

git clone https://github.com/mtpatter/time-series-kafka-demo.git
cd time-series-kafka-demo

Start the Kafka broker

docker compose up --build

Build a Docker image (optionally, for the producer and consumer)

From the main root directory:

docker build -t "kafkacsv" .

If you want to use Docker for the python scripts, this should now work:

docker run -it --rm kafkacsv python bin/sendStream.py -h

Start a consumer

To start a consumer for printing all messages in real-time from the stream "my-stream":

python bin/processStream.py my-stream

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/processStream.py my-stream

Produce a time series stream

Send time series from data/data.csv to topic “my-stream”, and speed it up by a factor of 10.

python bin/sendStream.py data/data.csv my-stream --speed 10

or with Docker:

docker run -it --rm \
      -v $PWD:/home \
      --network=host \
      kafkacsv python bin/sendStream.py data/data.csv my-stream --speed 10

Shut down and clean up

Stop the consumer with Return and Ctrl+C.

Shutdown Kafka broker system:

docker compose down
You might also like...
This is a tool to make easier brawl stars modding using csv manipulation

Brawler Maker : Modding Tool for Brawl Stars This is a tool to make easier brawl stars modding using csv manipulation if you want to support me, just

A simple XLSX/CSV reader - to dictionary converter
A simple XLSX/CSV reader - to dictionary converter

sheet2dict A simple XLSX/CSV reader - to dictionary converter Installing To install the package from pip, first run: python3 -m pip install --no-cache

Generating a report CSV and send it to an email - Python / Django Rest Framework
Generating a report CSV and send it to an email - Python / Django Rest Framework

Generating a report in CSV format and sending it to a email How to start project. Create a folder in your machine Create a virtual environment python3

Compare two CSV files for differences. Colorize the differences and align the columns.
Compare two CSV files for differences. Colorize the differences and align the columns.

pretty-csv-diff Compare two CSV files for differences. Colorize the differences and align the columns. Command-Line Example Command-Line Usage usage:

script to calculate total GPA out of 4, based on input gpa.csv

gpa_calculator script to calculate total GPA out of 4 based on input gpa.csv to use, create a total.csv file containing only one integer showing the t

A website for courses of Major Computer Science, NKU

A website for courses of Major Computer Science, NKU

Generate a single PDF file from MkDocs repository.

PDF Generate Plugin for MkDocs This plugin will generate a single PDF file from your MkDocs repository. This plugin is inspired by MkDocs PDF Export P

MkDocs plugin for setting revision date from git per markdown file

mkdocs-git-revision-date-plugin MkDocs plugin that displays the last revision date of the current page of the documentation based on Git. The revision

MkDocs Plugin allowing your visitors to *File Print Save as PDF* the entire site.

mkdocs-print-site-plugin MkDocs plugin that adds a page to your site combining all pages, allowing your site visitors to File Print Save as PDF th

Owner
Maria Patterson
Civic techie, Astro PhD, open science architect, and inclusive AI advocate.
Maria Patterson
step by step guide for beginners for getting started with open source

Step-by-Step Guide for beginners for getting started with Open-Source Here The Contribution Begins ?? If you are a beginner then this repository is fo

Arpit Jain 66 Jan 3, 2023
More detailed upload statistics for Nicotine+

More Upload Statistics A small plugin for Nicotine+ 3.1+ to create more detailed upload statistics. ⚠ No data previous to enabling this plugin will be

Nick 1 Dec 17, 2021
Minimal reproducible example for `mkdocstrings` Python handler issue

Minimal reproducible example for `mkdocstrings` Python handler issue

Hayden Richards 0 Feb 17, 2022
A tutorial for people to run synthetic data replica's from source healthcare datasets

Synthetic-Data-Replica-for-Healthcare Description What is this? A tailored hands-on tutorial showing how to use Python to create synthetic data replic

null 11 Mar 22, 2022
Quick tutorial on orchest.io that shows how to build multiple deep learning models on your data with a single line of code using python

Deep AutoViML Pipeline for orchest.io Quickstart Build Deep Learning models with a single line of code: deep_autoviml Deep AutoViML helps you build te

Ram Seshadri 6 Oct 2, 2022
This tutorial will guide you through the process of self-hosting Polygon

Hosting guide This tutorial will guide you through the process of self-hosting Polygon Before starting Make sure you have the following tools installe

Polygon 2 Jan 31, 2022
Repository for learning Python (Python Tutorial)

Repository for learning Python (Python Tutorial) Languages and Tools ?? Overview ?? Repository for learning Python (Python Tutorial) Languages and Too

Swiftman 2 Aug 22, 2022
Tutorial for STARKs with supporting code in python

stark-anatomy STARK tutorial with supporting code in python Outline: introduction overview of STARKs basic tools -- algebra and polynomials FRI low de

null 121 Jan 3, 2023
A simple tutorial to get you started with Discord and it's Python API

Hello there Feel free to fork and star, open issues if there are typos or you have a doubt. I decided to make this post because as a newbie I never fo

Sachit 1 Nov 1, 2021
The tutorial is a collection of many other resources and my own notes

Why we need CTC? ---> looking back on history 1.1. About CRNN 1.2. from Cross Entropy Loss to CTC Loss Details about CTC 2.1. intuition: forward algor

手写AI 7 Sep 19, 2022