demir.ai Dataset Operations

Overview

With this application, you can delete or fill empty values (nan/null) before passing your dataset to machine learning algorithms. You can also view visual and numerical information about the dataset and get more detail about its attributes.

The application is written in Python, with the Flask framework on the backend and HTML on the frontend. Pandas is used to navigate the dataset. All numerical operations on the dataset were written by me without ready-made functions, and the plots were drawn from scratch by me using OpenCV.

Before running the application, install the required packages with the following command.

pip3 install -r requirements.txt

Launch the web application with the following command, then open http://localhost:5000/ in your browser to use it.

python3 main.py

With this web application, you can delete rows or columns that contain empty values (nan/null), or fill those empty values in three different ways.

  • Null value (nan) operations you can do on your dataset with demir.ai Dataset Operations:

    • Column-based deletion of null data (nan/null)
    • Row-based deletion of null data (nan/null)
    • Filling in blank data by mean, median and mode

The web application also gives you visual and numerical results that describe your dataset in detail.

  • Information you can learn about your dataset with demir.ai Dataset Operations:

    • Mean of columns
    • Median of columns
    • Mode of columns
    • Frequency of columns
    • Interquartile range value (IQR) of columns
    • Outliers of columns
    • Five number summary of columns
    • Box Chart of columns
    • Variance and standard deviation of columns

Null value (nan/null) operations

  • Column-based deletion of null data (nan/null): The number of nulls is counted for each column and converted to a percentage of the column's rows. If this percentage is greater than the percentage the user enters, the column is deleted.

  • Row-based deletion of null data (nan/null): The number of nulls is counted for each row. If this count is greater than the number the user enters, the row is deleted.

  • Filling in blank data by mean, median and mode:

    • Mean: The non-blank values of the column are summed and divided by the number of non-blank values. The resulting mean replaces the empty values.

    • Median: The median is calculated from the non-blank values of the column, and this median replaces the empty values.

    • Mode: The mode is calculated from the non-blank values of the column, and this mode replaces the empty values.
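
The three null operations above can be sketched in plain Python as follows. This is only an illustration and not the application's actual code: the function names, the dict-of-lists table layout, and the tie-breaking rule for the mode (the first most frequent value wins) are assumptions made for the example.

```python
import math


def is_null(value):
    """Treat None and float NaN as empty values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_null_columns(table, max_null_percent):
    """Keep only columns whose null percentage does not exceed the threshold.

    `table` is a hypothetical layout: a dict mapping column name to a list of values.
    """
    kept = {}
    for name, values in table.items():
        null_count = sum(1 for v in values if is_null(v))
        null_percent = 100.0 * null_count / len(values) if values else 0.0
        if null_percent <= max_null_percent:
            kept[name] = list(values)
    return kept


def drop_null_rows(table, max_null_count):
    """Keep only rows whose null count does not exceed the threshold."""
    names = list(table)
    row_total = len(table[names[0]]) if names else 0
    kept = {name: [] for name in names}
    for i in range(row_total):
        null_count = sum(1 for name in names if is_null(table[name][i]))
        if null_count <= max_null_count:
            for name in names:
                kept[name].append(table[name][i])
    return kept


def fill_nulls(values, strategy):
    """Replace empty values in one column with its mean, median, or mode."""
    present = [v for v in values if not is_null(v)]
    if not present:
        return list(values)  # nothing to compute a fill value from
    if strategy == "mean":
        fill = sum(present) / len(present)
    elif strategy == "median":
        ordered = sorted(present)
        mid = len(ordered) // 2
        fill = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    elif strategy == "mode":
        counts = {}
        for v in present:
            counts[v] = counts.get(v, 0) + 1
        fill = max(counts, key=counts.get)  # assumption: first most frequent value wins ties
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if is_null(v) else v for v in values]
```

For example, with a table whose column "b" is 75% empty, `drop_null_columns(table, 50)` removes "b", while `drop_null_rows(table, 1)` removes every row that has two or more empty cells.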

Information you can learn about your dataset

  • Mean of columns: The mean is calculated for each column separately and the column mean information is presented to the user.

  • Median of columns: The median is calculated for each column separately and the column median information is presented to the user.

  • Mode of columns: The mode is calculated for each column separately and the column mode information is presented to the user.

  • Frequency of columns: The frequency of each value is calculated for each column and presented to the user. The frequencies are also visualized as a bar plot drawn from scratch with OpenCV.

  • Interquartile range value (IQR) of columns: The Q1 and Q3 values are found for each column, then the IQR of the column is calculated as Q3 - Q1 and presented to the user.

  • Outliers of columns: A value in the column is called an outlier if it is less than (Q1 - 1.5 * IQR) or greater than (Q3 + 1.5 * IQR). The outliers are presented to the user.

  • Five number summary of columns: Minimum, Q1, median, Q3 and Maximum values are calculated and presented to the user.

  • Box Chart of columns: After the minimum, Q1, median, Q3 and maximum values are found for each column, a box chart is drawn from scratch with OpenCV and presented to the user.

  • Variance and standard deviation of columns: The variance and standard deviation for each column are calculated and presented to the user.
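
The column statistics listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the application's actual code: the function names are hypothetical, the quartiles use the median-of-halves convention (the middle value is excluded from both halves when the count is odd), and the variance defaults to the population form. The source does not say which conventions the application uses.

```python
import math


def median(ordered):
    """Median of an already sorted, non-empty list."""
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Minimum, Q1, median, Q3 and maximum; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Assumption: skip the middle value when the count is odd.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return {
        "min": ordered[0],
        "q1": median(lower),
        "median": median(ordered),
        "q3": median(upper),
        "max": ordered[-1],
    }


def iqr(values):
    """Interquartile range, Q3 - Q1."""
    summary = five_number_summary(values)
    return summary["q3"] - summary["q1"]


def outliers(values):
    """Values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR."""
    summary = five_number_summary(values)
    spread = summary["q3"] - summary["q1"]
    low = summary["q1"] - 1.5 * spread
    high = summary["q3"] + 1.5 * spread
    return [v for v in values if v < low or v > high]


def frequency(values):
    """Count how many times each value occurs."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return counts


def variance(values, sample=False):
    """Population variance by default; sample=True divides by n - 1."""
    mean = sum(values) / len(values)
    squared = sum((v - mean) ** 2 for v in values)
    return squared / (len(values) - 1 if sample else len(values))


def std_dev(values, sample=False):
    """Standard deviation as the square root of the variance."""
    return math.sqrt(variance(values, sample))
```

For the column `[2, 4, 4, 5, 6, 7, 8, 50]`, this sketch gives Q1 = 4.0 and Q3 = 7.5, so the IQR is 3.5, the outlier fences are -1.25 and 12.75, and 50 is reported as the only outlier.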

Application video

demirai.mp4
Owner
Ahmet Furkan DEMIR
Hi, my name is Ahmet Furkan DEMIR. I study computer engineering at Necmettin Erbakan University.