Build a better understanding of your data in PostgreSQL.

Overview

Data Fluent for PostgreSQL

license

Build a better understanding of your data in PostgreSQL.

The following shows an example report generated by this tool. It gives the numbers of rows, columns, bytes as well as human-friendly size counts for each table within a given PostgreSQL database.

The Metrics Report

The following shows the row count for every column that represents a date grouped by year and month.

The Time Distribution Report

Installation

On Ubuntu 20:

$ wget -qO- \
    https://www.postgresql.org/media/keys/ACCC4CF8.asc \
        | sudo apt-key add -
$ echo "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main" \
    | sudo tee /etc/apt/sources.list.d/pgdg.list

$ sudo apt update
$ sudo apt install \
    git \
    python3-pip \
    python3-virtualenv \
    postgresql-13 \
    postgresql-client-13 \
    postgresql-contrib

On macOS:

$ brew install \
    git \
    postgresql \
    virtualenv

Then, regardless of platform, setup a virtual environment and install the following Python-based dependencies.

$ virtualenv ~/.fluency
$ source ~/.fluency/bin/activate
$ python3 -m pip install \
    csvkit \
    humanfriendly \
    ipython \
    openpyxl \
    Pandas \
    psycopg2-binary \
    typer \
    xlsxwriter

Then clone this repo as it contains the datafluent_pg.py script:

$ git clone https://github.com/marklit/datafluent_pg.git ~/datafluent_pg
$ cd ~/datafluent_pg

Example Analysis

Clone fivethirtyeight's data repo. It has a large number of CSV-formatted datasets.

$ git clone https://github.com/fivethirtyeight/data.git ~/538data

Make sure you can access a PostgreSQL database on your machine. Here I'm creating an intel database for the mark user on my Ubuntu 20 machine.

$ sudo -u postgres \
    bash -c "psql -c \"CREATE USER mark
                       WITH PASSWORD 'test'
                       SUPERUSER;\""

With PostgreSQL access setup, create a database called intel.

$ createdb intel

I'll import one of the datasets within fivethirtyeight's repo. Note, because the dates within this dataset are not formatted in YYYY-MM-DD format, I needed to override the format so that the MM/DD/YYYY format would be read properly.

$ csvsql --db postgresql:///intel \
         --insert ~/538data/congress-generic-ballot/generic_topline_historical.csv \
         --datetime-format="%m/%d/%Y"

I'll run the Excel Report Generator:

$ python datafluent_pg.py

This will result in a fluency.xlsx file being produced with two worksheets: Metrics and Time Distributions.

If you need to override any parameters, please refer to the documentation:

$ python datafluent_pg.py --help
Usage: datafluent_pg.py [OPTIONS]

Options:
  --pg-dns TEXT                   [default: postgresql://localhost:5432/intel]
  --output TEXT                   [default: fluency.xlsx]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.

  --help                          Show this message and exit.
You might also like...
A multipurpose bot designed to make Discord better for everyone, written in Python.
A multipurpose bot designed to make Discord better for everyone, written in Python.

Hadum A multipurpose bot that makes Discord better for everyone Features A Fully Functional Moderation component: manage your staff, members and permi

A better rename and convert bot with upload mode option and Auto detection

A better rename and convert bot with upload mode option and Auto detection

Design and build a wrapper for the Open Weather API current weather data service

Design and build a wrapper for the Open Weather API current weather data service that returns a city's temperature, with caching, also allowing for the temperature of the latest queried cities that are still validly cached to be retrieved.

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards
The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards

The scope of this project will be to build a data ware house on Google Cloud Platform that will help answer common business questions as well as powering dashboards.

This is a walkthrough about understanding the #BoF machine present in the #OSCP exam.

Buffer Overflow methodology Introduction These are 7 simple python scripts and a methodology to ease (not automate !) the exploitation. Each script ta

You can share your Chegg account for answers using this bot with your friends without getting your account blocked/flagged

Chegg-Answer-Bot You can share your Chegg account for answers using this bot with your friends without getting your account blocked/flagged Reuirement

A python script fetches all your starred repositories from your GitHub account and clones them to your server so you will never lose important resources
A python script fetches all your starred repositories from your GitHub account and clones them to your server so you will never lose important resources

A python script fetches all your starred repositories from your GitHub account and clones them to your server so you will never lose important resources

A powerful bot to copy your google drive data to your team drive
A powerful bot to copy your google drive data to your team drive

⚛️ Clonebot - Heroku version ⚡ CloneBot is a telegram bot that allows you to copy folder/team drive to team drives. One of the main advantage of this

Utilizing the freqtrade high-frequency cryptocurrency trading framework to build and optimize trading strategies. The bot runs nonstop on a Rasberry Pi.
Utilizing the freqtrade high-frequency cryptocurrency trading framework to build and optimize trading strategies. The bot runs nonstop on a Rasberry Pi.

Freqtrade Strategy Repository Please test all scripts and dry run them before using them in live mode Contact me on discord if you have any questions!

Build better AWS infrastructure

Sceptre About Sceptre is a tool to drive AWS CloudFormation. It automates the mundane, repetitive and error-prone tasks, enabling you to concentrate o

sceptre 1.4k Jan 4, 2023
A simple bot that lives in your Telegram group, logging messages to a Postgresql database and serving statistical tables and plots to users as Telegram messages.

telegram-stats-bot Telegram-stats-bot is a simple bot that lives in your Telegram group, logging messages to a Postgresql database and serving statist

null 22 Dec 26, 2022
This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database.

This is a scalable system that reads messages from public Telegram channels using Telethon and stores the data in a PostgreSQL database. Its original intention is to monitor cryptocurrency related channels, but it can be configured to read any Telegram data that is accessible through the API.

Greg 3 Jun 7, 2022
iso6.9 is a Discord bot written in Python and is used to make your Discord experience better

iso6.9-2.6stable (debloated) iso.bot is originally made by notsniped#4573. This is a remix of iso.bot by αrchιshα#5518. iso6.9 is a Discord bot writte

Kamilla Youver 2 Jun 10, 2022
Project template for using aws-cdk, Chalice and React in concert, including RDS Postgresql and AWS Cognito

What is This? This repository is an opinonated project template for using aws-cdk, Chalice and React in concert. Where aws-cdk and Chalice are in Pyth

Rasmus Jones 4 Nov 7, 2022
📈 A Discord bot for displaying the download stats of a repository made with Python, the Hikari API and PostgreSQL.

?? axyl-stats axyl-stats is a Discord bot made with Python (with the Hikari API wrapper) and PostgreSQL, used as a download counter for a GitHub repo.

Angelo-F 2 May 14, 2022
An open source development framework to help you build data workflows and modern data architecture on AWS.

AWS DataOps Development Kit (DDK) The AWS DataOps Development Kit is an open source development framework for customers that build data workflows and

Amazon Web Services - Labs 111 Dec 23, 2022
Official Python client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Python apps.

MonkeyLearn API for Python Official Python client for the MonkeyLearn API. Build and run machine learning models for language processing from your Pyt

MonkeyLearn 157 Nov 22, 2022
Git Plan - a better workflow for git

git plan A better workflow for git. Git plan inverts the git workflow so that you can write your commit message first, before you start writing code.

Rory Byrne 178 Dec 11, 2022
A discord Server Bot made with Python, This bot helps people feel better by inspiring them with motivational quotes or by responding with a great message, also the users of the server can create custom messages by telling the bot with Commands.

A discord Server Bot made with Python, This bot helps people feel better by inspiring them with motivational quotes or by responding with a great message, also the users of the server can create custom messages by telling the bot with Commands.

Aran 1 Oct 13, 2021