Mortgage-loan-prediction - Show how to perform advanced Analytics and Machine Learning in Python using a full complement of PyData utilities

Joachim

Last update: Dec 26, 2021

Related tags

Data Analysis python machine-learning machine data-analysis data-preprocessing feature-engineering random-forest-classifier

Overview

MORTGAGE LOAN ACQUISITION REQUIREMENT

This entire project encompasses both Data Analysis and Machine Learning. It was carefully structured and compiled for easy understanding.

Installation:

To run this notebook, you can download Anaconda from the Anaconda site, which has almost all dependencies pre-installed. Feel free to use any environment of your choice.

Dependencies:

Personal project | Mortgage loan eligibility prediction

The Home Mortgage Disclosure Act (HMDA) requires many financial institutions to maintain, report, and publicly disclose information about mortgages. These public data are important because:

- they help show whether lenders are serving the housing needs of their communities.
- they help authorities identify and root out predatory lending practices.
- they give public officials information that helps them make decisions and policies.
- they shed light on lending patterns that could be discriminatory, e.g. a reported increase in mortgage borrowing by Black and Hispanic applicants as of 1993.

On my Kaggle site My Homepage.

Goal for this Notebook:

Show how to perform advanced analytics and machine learning in Python using a full complement of PyData utilities. It is aimed at those looking to get into the field of data science, and at those already in the field who want to work through a real-world project in Python.

This Notebook will teach the following:

Data Handling

Importing Data with Pandas
Cleaning Data
Exploring Data through Visualizations with Matplotlib
Doing predictive Analysis with various Machine Learning Algorithms
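The importing and cleaning steps listed above could look roughly like the sketch below. It is not taken from the notebook: the inline CSV, the column names (`loan_amount`, `applicant_income`, `property_type`, `accepted`) and the helper `load_and_clean` are all hypothetical stand-ins, and median imputation is only one of several reasonable cleaning choices.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real mortgage CSV; column names are assumptions.
RAW_CSV = """loan_amount,applicant_income,property_type,accepted
120,55,1,1
120,55,1,1
300,,2,0
85,40,1,1
,70,3,0
"""


def load_and_clean(csv_source):
    """Read a CSV, drop exact duplicate rows, and fill numeric gaps with the column median."""
    df = pd.read_csv(csv_source)
    # Remove rows that are exact copies of an earlier row.
    df = df.drop_duplicates().reset_index(drop=True)
    # Impute missing numeric values with each column's median.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    return df


clean_df = load_and_clean(io.StringIO(RAW_CSV))
print(clean_df.describe())
```

With a real file, `load_and_clean("path/to/data.csv")` would work the same way, since `pd.read_csv` accepts both paths and file-like objects.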

Data Analysis/Machine Learning

Supervised machine learning techniques:

- RandomForestClassifier
- StratifiedKFold (5 folds)
- etc.
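As a rough illustration of how a RandomForestClassifier can be combined with a 5-fold StratifiedKFold, here is a minimal sketch using scikit-learn. The data is synthetic (generated with `make_classification`), and the hyperparameters such as `n_estimators=100` are assumptions rather than the notebook's actual settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

# Synthetic binary-classification data standing in for the mortgage features.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Stratified folds keep the class ratio roughly equal in every split.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in skf.split(X, y):
    # Train a fresh model on each training fold (hyperparameters are assumptions).
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict(X[test_idx])
    fold_scores.append(accuracy_score(y[test_idx], preds))

print("per-fold accuracy:", np.round(fold_scores, 3))
print("mean accuracy:", round(float(np.mean(fold_scores)), 3))
```

Stratification matters most when approved and denied applications are imbalanced, because a plain split could leave one fold with very few examples of the minority class.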

Evaluation of the Analysis

K-fold cross-validation to evaluate results locally
Output the results from the IPython Notebook to Kaggle
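The two evaluation steps above can be sketched as follows: score the model locally with cross-validation, then write predictions in a submission-style CSV. The data is synthetic, and the submission layout (`row_id`, `accepted`) is a hypothetical format, not the one the actual Kaggle competition requires.

```python
import io

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic data; the hold-out part plays the role of an unlabeled test set.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=1
)
X_train, X_holdout, y_train, _ = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

model = RandomForestClassifier(n_estimators=100, random_state=1)

# Local evaluation: with a classifier and an integer cv, scikit-learn uses stratified folds.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Fit on all training data, then predict the hold-out rows.
model.fit(X_train, y_train)
submission = pd.DataFrame(
    {"row_id": range(len(X_holdout)), "accepted": model.predict(X_holdout)}
)

# Written to an in-memory buffer here; a file path would be used for a real upload.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
```

Reporting the spread across folds alongside the mean gives a sense of how stable an accuracy figure is before anything is submitted.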

Results obtained

Derived expert insights to give professional recommendations to borrowers
Predicted applicant loan approval with 74% accuracy

