Build a better understanding of your data in PostgreSQL.

Mark Litwintschik

Last update: Aug 30, 2022

Related tags

Third-party APIs Wrappers business-intelligence

Overview

Data Fluent for PostgreSQL

Build a better understanding of your data in PostgreSQL.

The following shows an example report generated by this tool. It gives the numbers of rows, columns, bytes as well as human-friendly size counts for each table within a given PostgreSQL database.

The following shows the row count for every column that represents a date grouped by year and month.

Installation

On Ubuntu 20:

$ wget -qO- \
    https://www.postgresql.org/media/keys/ACCC4CF8.asc \
        | sudo apt-key add -
$ echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" \
    | sudo tee /etc/apt/sources.list.d/pgdg.list

$ sudo apt update
$ sudo apt install \
    git \
    python3-pip \
    python3-virtualenv \
    postgresql-13 \
    postgresql-client-13 \
    postgresql-contrib

On macOS:

$ brew install \
    git \
    postgresql \
    virtualenv

Then, regardless of platform, setup a virtual environment and install the following Python-based dependencies.

$ virtualenv ~/.fluency
$ source ~/.fluency/bin/activate
$ python3 -m pip install \
    csvkit \
    humanfriendly \
    ipython \
    openpyxl \
    Pandas \
    psycopg2-binary \
    typer \
    xlsxwriter

Then clone this repo as it contains the datafluent_pg.py script:

$ git clone https://github.com/marklit/datafluent_pg.git ~/datafluent_pg
$ cd ~/datafluent_pg

Example Analysis

Clone fivethirtyeight's data repo. It has a large number of CSV-formatted datasets.

$ git clone https://github.com/fivethirtyeight/data.git ~/538data

Make sure you can access a PostgreSQL database on your machine. Here I'm creating an intel database for the mark user on my Ubuntu 20 machine.

$ sudo -u postgres \
    bash -c "psql -c \"CREATE USER mark
                       WITH PASSWORD 'test'
                       SUPERUSER;\""

With PostgreSQL access setup, create a database called intel.

$ createdb intel

I'll import one of the datasets within fivethirtyeight's repo. Note, because the dates within this dataset are not formatted in YYYY-MM-DD format, I needed to override the format so that the MM/DD/YYYY format would be read properly.

$ csvsql --db postgresql:///intel \
         --insert ~/538data/congress-generic-ballot/generic_topline_historical.csv \
         --datetime-format="%m/%d/%Y"

I'll run the Excel Report Generator:

$ python datafluent_pg.py

This will result in a fluency.xlsx file being produced with two worksheets: Metrics and Time Distributions.

If you need to override any parameters, please refer to the documentation:

$ python datafluent_pg.py --help

Usage: datafluent_pg.py [OPTIONS]

Options:
  --pg-dns TEXT                   [default: postgresql://localhost:5432/intel]
  --output TEXT                   [default: fluency.xlsx]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.

You might also like...

A multipurpose bot designed to make Discord better for everyone, written in Python.

Hadum A multipurpose bot that makes Discord better for everyone Features A Fully Functional Moderation component: manage your staff, members and permi

1 Jan 25, 2022

A better rename and convert bot with upload mode option and Auto detection

2 Nov 9, 2021

Design and build a wrapper for the Open Weather API current weather data service

Design and build a wrapper for the Open Weather API current weather data service that returns a city's temperature, with caching, also allowing for the temperature of the latest queried cities that are still validly cached to be retrieved.

1 Jun 27, 2022

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards.

2 Jan 20, 2022

This is a walkthrough about understanding the #BoF machine present in the #OSCP exam.

Buffer Overflow methodology Introduction These are 7 simple python scripts and a methodology to ease (not automate !) the exploitation. Each script ta

53 Dec 8, 2022

You can share your Chegg account for answers using this bot with your friends without getting your account blocked/flagged

Chegg-Answer-Bot You can share your Chegg account for answers using this bot with your friends without getting your account blocked/flagged Reuirement

27 Dec 24, 2022

A python script fetches all your starred repositories from your GitHub account and clones them to your server so you will never lose important resources

27 Oct 1, 2022

A powerful bot to copy your google drive data to your team drive

⚛️ Clonebot - Heroku version ⚡ CloneBot is a telegram bot that allows you to copy folder/team drive to team drives. One of the main advantage of this

269 Dec 23, 2022

Utilizing the freqtrade high-frequency cryptocurrency trading framework to build and optimize trading strategies. The bot runs nonstop on a Rasberry Pi.

Freqtrade Strategy Repository Please test all scripts and dry run them before using them in live mode Contact me on discord if you have any questions!

90 Jan 1, 2023

Build a better understanding of your data in PostgreSQL.

Related tags

Overview

Data Fluent for PostgreSQL

Installation

Example Analysis

You might also like...

A multipurpose bot designed to make Discord better for everyone, written in Python.

A better rename and convert bot with upload mode option and Auto detection

Design and build a wrapper for the Open Weather API current weather data service

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards

This is a walkthrough about understanding the #BoF machine present in the #OSCP exam.

You can share your Chegg account for answers using this bot with your friends without getting your account blocked/flagged

A python script fetches all your starred repositories from your GitHub account and clones them to your server so you will never lose important resources

A powerful bot to copy your google drive data to your team drive

Utilizing the freqtrade high-frequency cryptocurrency trading framework to build and optimize trading strategies. The bot runs nonstop on a Rasberry Pi.

Owner

Mark Litwintschik

Build better AWS infrastructure

A simple bot that lives in your Telegram group, logging messages to a Postgresql database and serving statistical tables and plots to users as Telegram messages.

This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database.

iso6.9 is a Discord bot written in Python and is used to make your Discord experience better

Project template for using aws-cdk, Chalice and React in concert, including RDS Postgresql and AWS Cognito

📈 A Discord bot for displaying the download stats of a repository made with Python, the Hikari API and PostgreSQL.

An open source development framework to help you build data workflows and modern data architecture on AWS.

Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.

Git Plan - a better workflow for git

A discord Server Bot made with Python, This bot helps people feel better by inspiring them with motivational quotes or by responding with a great message, also the users of the server can create custom messages by telling the bot with Commands.