Coursera Machine Learning - Python code

Overview

Coursera Machine Learning

This repository contains Python implementations of certain exercises from the Machine Learning course by Andrew Ng.

For a number of assignments in the course you are instructed to create complete, stand-alone Octave/MATLAB implementations of certain algorithms (Linear and Logistic Regression, for example). The rest of the assignments depend on additional code provided by the course authors. For most of the code in this repository I have instead used existing Python implementations such as scikit-learn.
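
For example, Exercise 1's univariate linear regression can be reproduced in a few lines with scikit-learn. The sketch below is illustrative rather than a quote from the notebooks, and it assumes the course data file is available as data/ex1data1.txt:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Exercise 1 data: column 0 is city population (in 10,000s),
    # column 1 is food-truck profit (in $10,000s).
    data = np.loadtxt('data/ex1data1.txt', delimiter=',')
    X, y = data[:, :1], data[:, 1]

    # Ordinary least squares stands in for the hand-rolled gradient descent.
    model = LinearRegression().fit(X, y)
    print(model.intercept_, model.coef_)   # theta_0 and theta_1
    print(model.predict([[7.0]]))          # predicted profit for a population of 70,000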

Exercise 1 - Linear Regression
Exercise 2 - Logistic Regression
Exercise 3 - Multi-class Classification and Neural Networks
Exercise 4 - Neural Networks Learning
Exercise 5 - Regularized Linear Regression and Bias vs. Variance
Exercise 6 - Support Vector Machines
Exercise 7 - K-means Clustering and Principal Component Analysis
Exercise 8 - Anomaly Detection and Recommender Systems

References:

https://www.coursera.org/learn/machine-learning/home/welcome

Comments
  • No `theta_0` parameter used in Exercise 2 Regularized Logistic Regression

    Hi. I am currently taking Andrew Ng's Machine Learning class and am using your answers as guidance. I've noticed something in your Exercise 2 (the regularized one).

    You've extracted your data from ex2data2.txt like this:

    data2 = loaddata('data/ex2data2.txt', ',')
    y = np.c_[data2[:,2]]
    X = data2[:,0:2]
    

    And I believe you didn't insert the theta_0 term into the first column of X (which is supposed to be a column of ones). Isn't it supposed to be there? In the non-regularized logistic regression exercise, you did insert the theta_0 term, like this:

    X = np.c_[np.ones((data.shape[0],1)), data[:,0:2]]
    

    Am I missing something here?
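
    A minimal sketch of one likely explanation (an assumption, not a quote from the notebook): if the regularized exercise expands the two raw features with the degree-6 polynomial map, scikit-learn's PolynomialFeatures already prepends the column of ones, so X needs no explicit theta_0 column beforehand.

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    # Two made-up (x1, x2) rows standing in for ex2data2.txt.
    X = np.array([[0.05, -0.69],
                  [-0.09, 0.68]])

    # Degree-6 polynomial feature map, as in the regularized exercise.
    XX = PolynomialFeatures(degree=6).fit_transform(X)

    print(XX.shape)   # (2, 28): 28 polynomial terms per row
    print(XX[:, 0])   # the first column is all ones -- the theta_0/intercept term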

    question 
    opened by hilmandayo 1
  • Matrix sum in Neural network's cost function

    Hi Jordi,

    First of all, thanks so much for the notebooks. They really help me follow along with the course. I have one question about notebook 4's nnCostFunction, where J = ... np.sum((np.log(a3.T)*(y_matrix)+np.log(1-a3).T*(1-y_matrix))).

    I think this does matrix multiplication, giving a 10x10 (or n_label x n_label) matrix. Let's call this cost matrix Jc. Jc contains not only how the predicted values for one label differ from their corresponding targets (the diagonal elements), but also how they differ from the targets of the other labels (the off-diagonal elements). For example, the multiplication pairs the column of predicted values in np.log(a3.T) for one label (e.g. k) with all columns of targets.

    Then the code sums all elements of this matrix, which seems to over-count J. Instead of summing all the elements, I think only the diagonal elements are needed.

    Please see the attached picture (img_20170829_155209) for an illustration of this description, which might otherwise be confusing.

    Please let me know if I misunderstood the code.
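
    For reference, a minimal NumPy sketch contrasting the two formulations, with random data standing in for a3 (m x K output activations) and y_matrix (m x K one-hot labels):

    import numpy as np

    rng = np.random.default_rng(0)
    m, K = 5000, 10
    a3 = rng.uniform(0.01, 0.99, size=(m, K))           # stand-in predicted probabilities
    y_matrix = np.eye(K)[rng.integers(0, K, size=m)]    # stand-in one-hot targets

    # Element-wise product, then sum over all m*K entries: the unregularized cost.
    J_elementwise = -np.sum(y_matrix * np.log(a3) + (1 - y_matrix) * np.log(1 - a3)) / m

    # The matrix product gives the K x K cost matrix Jc; only its diagonal (the trace)
    # matches the element-wise sum, so summing every entry over-counts.
    Jc = np.log(a3).T @ y_matrix + np.log(1 - a3).T @ (1 - y_matrix)
    J_trace = -np.trace(Jc) / m

    print(np.isclose(J_elementwise, J_trace))   # True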

    Best regards and thanks again, -Tua

    opened by clumdee 2
  • Regarding regularized NN equation in exercise 4

    I am referring to your exercise 4 code right now to complete mine. I think you've made a mistake in regularizing the NN backpropagation gradient (pardon me if I am wrong). This is the equation:

    D^(l)_ij = (1/m) * Delta^(l)_ij                                for j = 0
    D^(l)_ij = (1/m) * Delta^(l)_ij + (lambda/m) * Theta^(l)_ij    for j >= 1

    And this is your code:

    delta1 = d2.dot(a1) # 25x5000 * 5000x401 = 25x401
    delta2 = d3.T.dot(a2) # 10x5000 *5000x26 = 10x26
        
    theta1_ = np.c_[np.ones((theta1.shape[0],1)),theta1[:,1:]]
    theta2_ = np.c_[np.ones((theta2.shape[0],1)),theta2[:,1:]]
        
    theta1_grad = delta1/m + (theta1_*reg)/m
    theta2_grad = delta2/m + (theta2_*reg)/m
    

    Shouldn't it be

    theta1_ = np.c_[np.zeros((theta1.shape[0],1)),theta1[:,1:]]
    theta2_ = np.c_[np.zeros((theta2.shape[0],1)),theta2[:,1:]]
    

    since we do not want to add anything to theta1_grad's and theta2_grad's first column (the bias)?
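
    A quick check of the difference, using a hypothetical small theta1 in place of the real 25x401 matrix:

    import numpy as np

    reg, m = 1.0, 5000
    theta1 = np.arange(1.0, 7.0).reshape(2, 3)   # hypothetical 2x3 Theta1

    ones_version  = np.c_[np.ones((theta1.shape[0], 1)),  theta1[:, 1:]]
    zeros_version = np.c_[np.zeros((theta1.shape[0], 1)), theta1[:, 1:]]

    print((ones_version  * reg / m)[:, 0])   # nonzero: the bias gradients get shifted by the regularization term
    print((zeros_version * reg / m)[:, 0])   # zeros: the bias column is left unregularized, as the equation requires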

    opened by hilmandayo 1
Owner
Jordi Warmenhoven
Data Visualization, Probabilistic Programming and Statistics Enthusiast | Fairly Bayesian | Mostly Python | Always curious