Bigdata Simulation Library Of Dream By Sandman Books

Related tags

Data Analysis SADMAN
Overview

BIGDATA SIMULATION LIBRARY OF DREAM BY SANDMAN BOOKS

=================

Solution Architecture

delta

Description


In the realm of Dreaming, its ruler SANDMAN, DREAM has a certain hobby; books. In his castle there is a Library in which they are kept, among other things, stories conceived by their authors but never written in our reality; Lucien, the person responsible for his organization, needs some help. Many people dream of published books, sales markets, stories that, in their reality, they would never imagine conceiving. And this voluminous data needs to be worked on. In order not to get lost in the information, Lucien receives all his dreams in a Non-Relational bank, MONGO. And he needs this to be organized in a relational way, that is, each author in his proper place. For that he pulled our dream and saw this Architecture where data arrives in MONGO undergo a transformation process in the STAGIN area and are populated in MYSQL. In its population, we split two final tables. One in its raw state, for complete queries, and another with metrics that informs the number of dreamers, their books and the total number of files. In this way, data is more organized, undergoing deduplication and consolidation processes.

Glossary of Data


Fields Type Description
_id long undescore ID
kind string type of book or text file
title string title of book
subtitle string subtitle of book
author array one or more authors who can dream of stories
publisher string publisher or not dreamed of by the author
publishedDate string year of published
edition string which edition does the book belong to
sample string sample of books
type string ISBN code
identifier string isbn identification number
pageCount integer number of pages
capCount integer number of chapters
wordCount integer number of words
categories string literary genre
original_price double original price
current_prefix string country currency prefix
current_sufix string country currency name
barcode string barcode
dreaming_date string the day you had the dream

image


delta

Start the Project


To run the project, you need to install the dependencies located in the "dependencies" folder and in the root of the project, run the shell_script "run_script.sh".

Sample of Payload MONGO


mongo

{
        "_id" : ObjectId("61b1fe6944dd42158674af31"),
        "kind" : "books#volume",
        "volumeInfo" : {
                "title" : "STORE VISIT",
                "Subtitle" : "GLASS IT HAIR MEMBER KEY ALMOST QUALITY. MARKET ALREADY AIR STILL ARTICLE. DECADE DECADE MEASURE PRESENT HUMAN MORNING. BIG BLOOD ECONOMIC FRONT SUCCESS AGO THEM. EVERY SON TROUBLE SIMPLE.",
                "author" : [
                        "PETER RODRIGUEZ",
                        "KELLY TORRES"
                ]
        },
        "publisher" : "FALL AWAY ABOUT INDEPENDENT",
        "publishedDate" : "1994",
        "edition" : "7º EDITION",
        "sample" : "...onto sport room audience. page dinner hundred. week statement should watch she even ball.\nour able tv break defense seek baby. employee last around music produce reach tv..",
        "industryIdentifiers" : [
                {
                        "type" : "ISBN_10",
                        "identifier" : "1-55027-208-X"
                },
                {
                        "type" : "ISBN_10",
                        "identifier" : "0-405-30324-6"
                }
        ],
        "pageCount" : 796,
        "wordCount" : 83331,
        "capCount" : 14,
        "categories" : [
                "NOVEL"
        ],
        "saleInfo" : {
                "original_price" : 78,
                "current_prefix" : "LAK",
                "current_sufix" : "Lao kip",
                "barcode" : "6747254889534"
        }
}

Sample of Payload in MYSQL


library

_id  |kind        |title                                                           |subtitle                                                                                                                                                                                                                                                       |author                                       |publisher                               |publishedDate|edition   |sample                                                                                                                                                                                                     |type   |identifier       |pageCount|wordCount|capCount|categories               |original_price|current_prefix|current_sufix              |barcode      |dreaming_date|
-----+------------+----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------+----------------------------------------+-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+-----------------+---------+---------+--------+-------------------------+--------------+--------------+---------------------------+-------------+-------------+
  670|csv#volume  |GOOD DETERMINE OF                                               |DON'T HAVE                                                                                                                                                                                                                                                     |KAREN ODOM                                   |DARK WAR INDEPENDENT                    |1982         |9º EDITION|...involve star apply later including truth. next while nor worry staff economic.¶condition region write college. return half offer. popular could direction above fish..                                  |ISBN_10|978-1-4340-7508-6|      479|    64333|      11|EPISTOLARY NOVEL         |          97.0|BHD           |Bahraini dinar             |3894426059691|     20211209|

resume

metric          |value|
----------------+-----+
Total of Dreamns|71395|
Total of Books  |59154|
Total of Data   |78000|
You might also like...
pyETT: Python library for Eleven VR Table Tennis data
pyETT: Python library for Eleven VR Table Tennis data

pyETT: Python library for Eleven VR Table Tennis data Documentation Documentation for pyETT is located at https://pyett.readthedocs.io/. Installation

DaDRA (day-druh) is a Python library for Data-Driven Reachability Analysis.
DaDRA (day-druh) is a Python library for Data-Driven Reachability Analysis.

DaDRA (day-druh) is a Python library for Data-Driven Reachability Analysis. The main goal of the package is to accelerate the process of computing estimates of forward reachable sets for nonlinear dynamical systems.

A crude Hy handle on Pandas library

Quickstart Hyenas is a curde Hy handle written on top of Pandas API to allow for more elegant access to data-scientist's powerhouse that is Pandas. In

Lale is a Python library for semi-automated data science.
Lale is a Python library for semi-automated data science.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion.

statDistros is a Python library for dealing with various statistical distributions

StatisticalDistributions statDistros statDistros is a Python library for dealing with various statistical distributions. Now it provides various stati

PipeChain is a utility library for creating functional pipelines.

PipeChain Motivation PipeChain is a utility library for creating functional pipelines. Let's start with a motivating example. We have a list of Austra

EOD Historical Data Python Library (Unofficial)

EOD Historical Data Python Library (Unofficial) https://eodhistoricaldata.com Installation python3 -m pip install eodhistoricaldata Note Demo API key

vartests is a Python library to perform some statistic tests to evaluate Value at Risk (VaR) Models

vartests is a Python library to perform some statistic tests to evaluate Value at Risk (VaR) Models, such as: T-test: verify if mean of distribution i

A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms

MatrixProfile MatrixProfile is a Python 3 library, brought to you by the Matrix Profile Foundation, for mining time series data. The Matrix Profile is

Owner
Maycon Cypriano
DATA ENGINEER | DATA SCIENCE | DATA PYTHON | DATA DRIVEN |
Maycon Cypriano
Top 50 best selling books on amazon

It's a dashboard that shows the detailed information about each book in the top 50 best selling books on amazon over the last ten years

Nahla Tarek 1 Nov 18, 2021
Tools for the analysis, simulation, and presentation of Lorentz TEM data.

ltempy ltempy is a set of tools for Lorentz TEM data analysis, simulation, and presentation. Features Single Image Transport of Intensity Equation (SI

McMorran Lab 1 Dec 26, 2022
Zipline, a Pythonic Algorithmic Trading Library

Zipline is a Pythonic algorithmic trading library. It is an event-driven system for backtesting. Zipline is currently used in production as the backte

Quantopian, Inc. 15.7k Jan 7, 2023
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Dec 25, 2022
Sensitivity Analysis Library in Python (Numpy). Contains Sobol, Morris, Fractional Factorial and FAST methods.

Sensitivity Analysis Library (SALib) Python implementations of commonly used sensitivity analysis methods. Useful in systems modeling to calculate the

SALib 663 Jan 5, 2023
A probabilistic programming library for Bayesian deep learning, generative models, based on Tensorflow

ZhuSuan is a Python probabilistic programming library for Bayesian deep learning, which conjoins the complimentary advantages of Bayesian methods and

Tsinghua Machine Learning Group 2.2k Dec 28, 2022
TextDescriptives - A Python library for calculating a large variety of statistics from text

A Python library for calculating a large variety of statistics from text(s) using spaCy v.3 pipeline components and extensions. TextDescriptives can be used to calculate several descriptive statistics, readability metrics, and metrics related to dependency distance.

null 150 Dec 30, 2022
A library to create multi-page Streamlit applications with ease.

A library to create multi-page Streamlit applications with ease.

Jackson Storm 107 Jan 4, 2023
HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets

HyperSpy is an open source Python library for the interactive analysis of multidimensional datasets that can be described as multidimensional arrays o

HyperSpy 411 Dec 27, 2022