✨ Real-life Data Analysis and Model Training Workshop by Global AI Hub.

Overview

🎓 Data Analysis and Model Training Course by Global AI Hub

Syllabus:

Day 1

  • What is Data?

  • Multimedia

  • Structured and Unstructured Data

  • Data Types

  • Data Visualization

    • What is Visualization?
    • Tufte's 6 Principles
    • Visualization Types
      • Line Plot
      • Scatter Plot
      • Bar Plot
      • Histogram
      • Pie Charts
      • Heatmap
      • Box Plot
      • What is a Quartile? How is it Calculated?
      • Joint Plot
      • KDE (Kernel Density Estimate)
  • Statistics

    • Descriptive Statistics Concepts
    • The Concept of Skewness
    • Correlation and Correlation Matrix
    • Simpson's Paradox
    • Anscombe's Quartet
    • Data Distribution and Hypothesis Testing
  • Data Distribution

    • Data and Distribution
    • Gaussian (Normal) Distribution
    • t-Distribution
    • Degrees of Freedom
    • Bernoulli Distribution
    • Exponential Distribution
  • Application

    • Pandas Revision
    • Introduction to Data Preprocessing with Pandas
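
As a small illustration of the quartile and box plot topics listed for Day 1, the sketch below computes quartiles and the usual 1.5 × IQR whisker fences with Python's standard library. It is not taken from the workshop material: the helper names `quartiles` and `iqr_fences`, the sample data, and the choice of the "inclusive" interpolation method are assumptions made for this example.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using linear interpolation over the sorted data."""
    # method="inclusive" treats the data as the whole population, so the
    # minimum and maximum sit at the 0th and 100th percentiles.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences a box plot commonly uses for whiskers."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1  # interquartile range: spread of the middle 50% of the data
    return q1 - k * iqr, q3 + k * iqr


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample
    print(quartiles(data))
    print(iqr_fences(data))
```

Other quartile conventions exist (for example the "exclusive" method, or the several interpolation options in NumPy and pandas), so results on small samples can differ slightly between tools.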

Day 2

  • Hypothesis Tests

    • Basic Hypothesis testing
    • P value
    • T test
    • Z test
    • Chi-Square Test
    • Errors in Hypothesis Testing
  • Data Cleaning

    • The 68-95-99.7 Rule and 3 Sigma
    • Outlier, Missing and Duplicate Data and their Detection
    • Z-Score
    • Handling missing values
    • Null vs NaN
    • Pandas Functions for missing values
    • Dimensionality Reduction
    • PCA (Principal Component Analysis)
    • Collinearity (Multicollinearity)
  • Data Transformation

    • Data Conversion Techniques
      • round
      • Scaling
      • Label Encoding
      • One Hot Encoding
      • Stack
      • melt
      • Sort
      • Feature Engineering
  • Data Augmentation

    • Aggregation Functions
  • Application

    • Data Visualization with Seaborn
    • Data Preprocessing with Pandas
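
To make the Day 2 items on the 3 sigma rule and Z-Score concrete, here is a minimal sketch of z-score based outlier detection using only Python's standard library. It is an illustration, not the workshop's own code: the helper names `z_scores` and `find_outliers`, the sample data, and the default threshold of 3 are assumptions, and the population standard deviation is used for simplicity.

```python
import statistics


def z_scores(values):
    """Return the z-score of each value: (value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # All values are identical, so no value deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def find_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


if __name__ == "__main__":
    readings = [10] * 19 + [100]  # hypothetical data with one extreme value
    print(find_outliers(readings))
```

The threshold of 3 follows the 68-95-99.7 rule, which assumes roughly normal data. On very small samples a single extreme value inflates the standard deviation enough that it may never cross the threshold, so the IQR fences are a common alternative.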

Day 3

  • ML Review

    • What is Machine Learning?
    • Supervised Learning
    • Unsupervised Learning
    • Errors That May Be Encountered in Model Training
    • Tools Used in Data Analysis and Machine Learning
    • End-to-End Machine Learning Project Steps
  • Application

    • Training An End-to-End ML Model with a Real Dataset

Certification

A certificate is awarded on completion of the course.
