Orbivator AI - Determine which features of the data (measurements) are most important for diagnosing breast cancer, and predict whether breast cancer is present.

anurag kumar singh

Last update: Jan 2, 2022

Orbivator_AI

Breast Cancer Wisconsin (Diagnostic)

GOAL

To determine which features of the data (measurements) are most important for diagnosing breast cancer, and to predict whether breast cancer is present.

DATASET

https://www.kaggle.com/uciml/breast-cancer-wisconsin-data

DESCRIPTION

Breast cancer is the most common cancer among women worldwide. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area.
Hence, each record in the dataset needs to be classified by whether the person has breast cancer or not.
The goal of this project is to analyse the data and build a model that performs this classification.

WHAT I HAVE DONE

-> Imported the libraries

-> Loaded the dataset

Preprocessing of the dataset:

-> Reviewed summary statistics of the data

-> Visualized the data

-> Examined correlations between features

-> Split the dataset

-> Trained the models on the data

-> Models used: Random forest regressor, Logistic regression, Decision Trees

-> Evaluated the models

-> Predicted the output for new data using the model with the highest accuracy
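The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn's bundled copy of the Wisconsin Diagnostic data (`load_breast_cancer`) as a stand-in for the Kaggle CSV, an 80/20 stratified split, fixed random seeds, and default-like hyperparameters, none of which are stated in this README.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wisconsin Diagnostic data replaces the
# Kaggle CSV. In this copy, target 0 = malignant and 1 = benign.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Assumption: 80/20 stratified split with a fixed seed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The three model families named in this README (hyperparameters are assumed).
models = {
    "Random forest regressor": RandomForestRegressor(n_estimators=100, random_state=0),
    "Logistic regression": LogisticRegression(max_iter=10000),
    "Decision Trees": DecisionTreeClassifier(random_state=0),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # The regressor outputs continuous values, so threshold at 0.5 to get a
    # class label; classifier outputs (0 or 1) pass through unchanged.
    predictions = (model.predict(X_test) >= 0.5).astype(int)
    accuracies[name] = 100 * accuracy_score(y_test, predictions)

# Pick the model with the highest test accuracy.
best_name = max(accuracies, key=accuracies.get)
for name, value in accuracies.items():
    print(f"{name}: {value:.2f}")
print("Selected model:", best_name)
```

The numbers this sketch prints depend on the split and seeds chosen here, so they will not necessarily match the accuracies reported in this README.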

MODELS USED

Random forest regressor:

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
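Since the stated goal includes finding the most important measurements, one possible way to read them off a fitted forest is through its impurity-based `feature_importances_`. The sketch below is an assumption about how this could be done, not the project's code; it uses scikit-learn's bundled dataset, 200 trees, and a fixed seed, all chosen for illustration.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestRegressor

# Assumption: bundled copy of the dataset instead of the Kaggle CSV.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# Fit a forest on the 0/1 target (hyperparameters are illustrative).
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances, one per feature, normalized to sum to 1.
importances = pd.Series(forest.feature_importances_, index=X.columns)
top_features = importances.sort_values(ascending=False).head(5)
print(top_features)
```

Impurity-based importances tend to be spread across strongly correlated features, so the ranking is best treated as a rough guide.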

Logistic regression:

Logistic regression is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary). Like all regression analyses, the logistic regression is a predictive analysis. Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
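As a small illustration of the relationship described above, the following sketch fits a logistic regression and then recomputes the predicted probability by hand from the learned coefficients using the logistic (sigmoid) function. Feature scaling with `StandardScaler`, the `max_iter` value, and the use of scikit-learn's bundled dataset are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: bundled copy of the dataset (target 0 = malignant, 1 = benign).
X, y = load_breast_cancer(return_X_y=True)

# Scale the features, then fit a binary logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X, y)

# Recompute P(class 1) by hand for the first three rows:
# z = w . x + b, then p = 1 / (1 + exp(-z)).
scaler = clf.named_steps["standardscaler"]
logreg = clf.named_steps["logisticregression"]
z = scaler.transform(X[:3]) @ logreg.coef_[0] + logreg.intercept_[0]
manual_proba = 1.0 / (1.0 + np.exp(-z))

print(manual_proba)
print(clf.predict_proba(X[:3])[:, 1])
```

Both printed arrays should agree, which shows that the model is a linear function of the inputs passed through a sigmoid.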

Decision Trees:

Decision Trees are a type of Supervised Machine Learning (that is, the training data specifies both the input and the corresponding output) where the data is repeatedly split according to a certain parameter.
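To make the idea of repeated splitting concrete, here is a minimal sketch that fits a shallow tree and prints its split rules as text. The depth limit of 2, the seed, and the use of scikit-learn's bundled dataset are assumptions for readability; the project's own tree settings are not given in this README.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: bundled copy of the dataset instead of the Kaggle CSV.
data = load_breast_cancer()

# A deliberately shallow tree so the printed rules stay short.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# Each indented line is one split: a feature compared against a threshold.
feature_names = [str(name) for name in data.feature_names]
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Each branch in the output tests a single measurement against a threshold, and the leaves give the predicted class.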

LIBRARIES NEEDED

pandas
matplotlib
seaborn
sklearn

ACCURACIES

Random forest regressor: 79.44695652173913
Logistic regression: 63.29670329670329
Decision Trees: 89.47368421052632
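This README does not say how each figure was computed. One point worth checking: in scikit-learn, a regressor's `score` method returns the R² coefficient rather than classification accuracy, so if the random forest figure came from `score`, it would not be directly comparable with the other two. The sketch below, under assumed split and hyperparameters and using the bundled dataset, shows both quantities side by side.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split

# Assumptions: bundled dataset, 80/20 stratified split, fixed seeds.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X_train, y_train)
raw = reg.predict(X_test)  # continuous values between 0 and 1

# score() on a regressor is R^2, not accuracy.
r2_percent = 100 * reg.score(X_test, y_test)

# Accuracy requires class labels, so threshold the raw output at 0.5.
labels = (raw >= 0.5).astype(int)
accuracy_percent = 100 * accuracy_score(y_test, labels)

print(f"R^2 x 100:  {r2_percent:.2f}")
print(f"Accuracy %: {accuracy_percent:.2f}")
```

Reporting the same metric on the same test split for all three models makes the comparison fair.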

CONCLUSION

I downloaded the dataset from Kaggle, loaded the required libraries, pre-processed the data, split it, built the models, tested their accuracies, and finalized the model based on accuracy.
I used three models to train on the data, starting with Random forest regressor, then Logistic regression, and after that Decision Trees. I finalized Decision Trees, which has the highest accuracy.
Decision Trees is used to determine which features of the data (measurements) are most important for diagnosing breast cancer and to predict whether breast cancer is present, with an accuracy over 89%.

Anurag kumar Singh, Jeesica Pearson, Eric Edward, Nitin kumar, Aditi singh

GitHub: https://github.com/anurag-bit/Orbivator
