Source files for the data lake demo video using the AWS TICKIT database

Overview

Data Lake Demo

Source code for video demonstration detailed in the post, Building a Simple Data Lake on AWS . Build a simple data lake on AWS using a combination of services, including Amazon MWAA, AWS Glue Data Catalog, AWS Glue Crawlers, AWS Glue Jobs, AWS Glue Studio, Amazon Athena, and Amazon S3.

Architecture

Architecture

TICKIT Sample Database

Amazon Redshift TICKIT Sample Database

TICKIT Tables

  • tickit.saas.category
  • tickit.saas.event
  • tickit.saas.venue
  • tickit.crm.users
  • tickit.date
  • tickit.listing
  • tickit.sales

Naming Conventions

+-------------+--------------------------------------------------------------------+
| Prefix      | Description                                                        |
+-------------+--------------------------------------------------------------------+
| _source     | Data Source metadata only (org. call _raw in video)                |
| _raw        | Raw/Bronze data from data sources (org. call _converted in video)  |
| _refined    | Refined/Silver data - raw data with initial ELT/cleansing applied  |
| _aggregated | Gold/Aggregated data - aggregated/joined refined data              |
+-------------+--------------------------------------------------------------------+

AWS CLI Commands

There were two small changes made to the source code, as compared to the video demonstration, to help clarify the flow of data in the demonstration. The prefix for the (7) data source AWS Glue Data Catalog table’s prefix was switched from raw_ from source_. Also, the (7) Raw/Bronze AWS Glue Data Catalog table’s prefix was switched from converted_ to raw_. The final data flow is 1) source_, 2) raw_, 3) refined_, and 4) agg_ (aggregated).

DATA_LAKE_BUCKET="your-data-lake-bucket"

aws s3 rm "s3://${DATA_LAKE_BUCKET}/tickit/" --recursive

aws glue delete-database --name tickit_demo

aws glue create-database \
  --database-input '{"Name": "tickit_demo", "Description": "Track sales activity for the fictional TICKIT web site"}'

aws glue get-tables \
  --database-name tickit_demo \
  --query "TableList[].Name" \
  --output table

aws glue start-crawler --name tickit_postgresql
aws glue start-crawler --name tickit_mysql
aws glue start-crawler --name tickit_mssql

aws glue get-tables \
  --database-name tickit_demo \
  --query "TableList[].Name" \
  --expression "source_*"  \
  --output table

aws glue start-job-run --job-name tickit_public_category_raw
aws glue start-job-run --job-name tickit_public_date_raw
aws glue start-job-run --job-name tickit_public_event_raw
aws glue start-job-run --job-name tickit_public_listing_raw
aws glue start-job-run --job-name tickit_public_sales_raw
aws glue start-job-run --job-name tickit_public_users_raw
aws glue start-job-run --job-name tickit_public_venue_raw

aws glue start-job-run --job-name tickit_public_category_refine
aws glue start-job-run --job-name tickit_public_date_refine
aws glue start-job-run --job-name tickit_public_event_refine
aws glue start-job-run --job-name tickit_public_listing_refine
aws glue start-job-run --job-name tickit_public_sales_refine
aws glue start-job-run --job-name tickit_public_users_refine
aws glue start-job-run --job-name tickit_public_venue_refine

aws glue get-tables \
  --database-name tickit_demo \
  --query "TableList[].Name" \
  --output table

aws s3api list-objects-v2 \
  --bucket ${DATA_LAKE_BUCKET} \
  --prefix "tickit/" \
  --query "Contents[].Key" \
  --output table
You might also like...
Takes a video as an input and creates a video which is suitable to upload on Youtube Shorts and Tik Tok (1080x1920 resolution).

Shorts-Tik-Tok-Creator Takes a video as an input and creates a video which is suitable to upload on Youtube Shorts and Tik Tok (1080x1920 resolution).

Turn any live video stream or locally stored video into a dataset of interesting samples for ML training, or any other type of analysis.
Turn any live video stream or locally stored video into a dataset of interesting samples for ML training, or any other type of analysis.

Sieve Video Data Collection Example Find samples that are interesting within hours of raw video, for free and completely automatically using Sieve API

Video-to-GIF-Converter - A small code snippet that can be used to convert any video to a gif

Video to GIF Converter Project Description: This is a small code snippet that ca

Video-stream - A telegram video stream bot repo
Video-stream - A telegram video stream bot repo

This is a Telegram Video stream Bot. Binary Tech 💫 Features stream videos downl

TkVideoplayer - This is a simple library to play video files in tkinter.
TkVideoplayer - This is a simple library to play video files in tkinter.

TkVideoplayer - This is a simple library to play video files in tkinter.

A Simple Telegram Bot By @Tellybots to add Subtitle Files in Video
A Simple Telegram Bot By @Tellybots to add Subtitle Files in Video

Video-subtitle-merger A Simple Telegram Bot By @Tellybots to add Subtitle Files in Video Features Force Sub Button Added Soon Support Media Type Such

Python script for extracting audio from video files and creating Mel spectrograms

video2spectrogram About This package is meant to automate the process of extracting audio files from videos and saving the plots computed from these a

Convert Video Files To Text And Audio

Video-To-Text Convert Video Files To Text And Audio Convert To Audio 1: open dvtt folder in cmd 2: run this command in cmd = main.py Audio Convert To

Wonkey - an open source programming language for the creation of cross-platform video games
Wonkey - an open source programming language for the creation of cross-platform video games

Wonkey Programming Language Wonkey is an open source programming language for the creation of cross-platform video games, highly inspired by the “Blit

Comments
  • upgrade click version

    upgrade click version

    When running the run_local script, got the following error:

    ~/Projects/airflow/cicd/devopsworld-airflow/demos/demo-02-localdev/git-precommit-demo
    \n⌛ Starting SQLFluff tests...
    ~/Projects/airflow/cicd/devopsworld-airflow/demos/demo-02-localdev/git-precommit-demo/dags ~/Projects/airflow/cicd/devopsworld-airflow/demos/demo-02-localdev/git-precommit-demo
    Traceback (most recent call last):
    ..
    ..
        from .core import ParameterSource
    ImportError: cannot import name 'ParameterSource' from 'click.core' (/Users/ricsue/opt/anaconda3/lib/python3.8/site-packages/click/core.py)
    
    opened by 094459 1
  • Missing dependency in requirements_local

    Missing dependency in requirements_local

    When running the local test script as part of the pre-commit hooks, I needed to install typing-extensions==4.3.0 as I was getting the following error:

    dags/__init__.py::BLACK SKIPPED (could not import 'black': cannot import name 'TypeGuard' from 'typing_extensions'...) [  6%]
    dags/data_lake__01_clean_and_prep_demo.py::BLACK 
    ..
    ..
    INTERNALERROR>     for name, fixture in item.funcargs.items():
    INTERNALERROR> AttributeError: 'BlackItem' object has no attribute 'funcargs'
    
    opened by 094459 1
  • workaround to achieve multi-tenancy using ci/cd pipeline

    workaround to achieve multi-tenancy using ci/cd pipeline

    If users from multiple aws accounts have DAG in a single mwaa environment, It is important to restrict the users based on what aws resources they can access based on DAG level. @garystafford could we do this by adding a validation step in the ci/cd pipeline to check if the DAG policies are met by the users who write the DAGs?

    opened by subinjp 1
Owner
Gary A. Stafford
AWS Senior Solutions Architect | AWS Certified Professional | Cloud | Data | Containers | Serverless | DevOps | Polyglot Developer
Gary A. Stafford
Extracting frames from video and create video using frames

Extracting frames from video and create video using frames This program uses opencv library to extract the frames from video and create video from ext

null 1 Nov 19, 2021
Terminal-Video-Player - A program that can display video in the terminal using ascii characters

Terminal-Video-Player - A program that can display video in the terminal using ascii characters

null 15 Nov 10, 2022
This plugin generates json files used by deovr allowing you to play 2d and 3d video's using the player

deovr-plugin This plugin generates json files used by deovr allowing you to play 2d and 3d video's using the player. Deovr looks for an index file /de

null 10 Sep 29, 2022
Streamlink is a CLI utility which pipes video streams from various services into a video player

Streamlink is a CLI utility which pipes video streams from various services into a video player

null 8.2k Dec 26, 2022
MoviePy is a Python library for video editing, can read and write all the most common audio and video formats

MoviePy is a Python library for video editing: cutting, concatenations, title insertions, video compositing (a.k.a. non-linear editing), video processing, and creation of custom effects. See the gallery for some examples of use.

null 10k Jan 8, 2023
A youtube video link or id to video thumbnail python package.

Youtube-Video-Thumbnail A youtube video link or id to video thumbnail python package. Made with Python3

Fayas Noushad 10 Oct 21, 2022
Text2Video's purpose is to help people create videos quickly and easily by simply typing out the video’s script and a description of images to include in the video.

Text2Video Text2Video's purpose is to help people create videos quickly and easily by simply typing out the video’s script and a description of images

Josh Chen 19 Nov 22, 2022
Filtering user-generated video content(SberZvukTechDays)Filtering user-generated video content(SberZvukTechDays)

Filtering user-generated video content(SberZvukTechDays) Table of contents General info Team members Technologies Setup Result General info This is a

Roman 6 Apr 6, 2022
Telegram Video Chat Video Streaming bot 🇱🇰

?? Get SESSION_NAME from below: Pyrogram ?? Preview ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Re

DOOZY YEZ 5 Jun 26, 2022
Play Video & Music on Telegram Group Video Chat

?? DEMONGIRL ?? ʜᴇʟʟᴏ ❤️ ???? Join us ᴠɪᴅᴇᴏ sᴛʀᴇᴀᴍ ɪs ᴀɴ ᴀᴅᴠᴀɴᴄᴇᴅ ᴛᴇʟᴇʀᴀᴍ ʙᴏᴛ ᴛʜᴀᴛ's ᴀʟʟᴏᴡ ʏᴏᴜ ᴛᴏ ᴘʟᴀʏ ᴠɪᴅᴇᴏ & ᴍᴜsɪᴄ ᴏɴ ᴛᴇʟᴇɢʀᴀᴍ ɢʀᴏᴜᴘ ᴠɪᴅᴇᴏ ᴄʜᴀᴛ ?? ɢ

Jonathan 5 Dec 31, 2021