Coursera Machine Learning - Python code

Overview

Coursera Machine Learning

This repository contains Python implementations of certain exercises from the Machine Learning course by Andrew Ng.

For a number of assignments in the course you are instructed to create complete, stand-alone Octave/MATLAB implementations of certain algorithms (Linear and Logistic Regression, for example). The rest of the assignments depend on additional code provided by the course authors. For most of the code in this repository I have instead used existing Python implementations such as scikit-learn.
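
For example, Exercise 1's univariate linear regression can be reproduced in a few lines with scikit-learn. The sketch below is illustrative rather than a quote from the notebooks, and it assumes the course data file is available as data/ex1data1.txt:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Exercise 1 data: column 0 is city population (in 10,000s),
    # column 1 is food-truck profit (in $10,000s).
    data = np.loadtxt('data/ex1data1.txt', delimiter=',')
    X, y = data[:, :1], data[:, 1]

    # Ordinary least squares stands in for the hand-rolled gradient descent.
    model = LinearRegression().fit(X, y)
    print(model.intercept_, model.coef_)   # theta_0 and theta_1
    print(model.predict([[7.0]]))          # predicted profit for a population of 70,000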

Exercise 1 - Linear Regression
Exercise 2 - Logistic Regression
Exercise 3 - Multi-class Classification and Neural Networks
Exercise 4 - Neural Networks Learning
Exercise 5 - Regularized Linear Regression and Bias vs. Variance
Exercise 6 - Support Vector Machines
Exercise 7 - K-means Clustering and Principal Component Analysis
Exercise 8 - Anomaly Detection and Recommender Systems

References:

https://www.coursera.org/learn/machine-learning/home/welcome

Comments
  • No `theta_0` parameter used in Exercise 2 Regularized Logistic Regression

    Hi. I am currently taking Andrew Ng's Machine Learning class and am using your answers as guidance. I've noticed something in your Exercise 2 (the regularized one).

    You've extracted your data from ex2data2.txt like this:

    data2 = loaddata('data/ex2data2.txt', ',')
    y = np.c_[data2[:,2]]
    X = data2[:,0:2]
    

    And I believe you didn't insert the theta_0 term into the first column of X (which is supposed to be a column of ones). Isn't it supposed to be there? In the non-regularized logistic regression exercise, you did insert the theta_0 term, like this:

    X = np.c_[np.ones((data.shape[0],1)), data[:,0:2]]
    

    Am I missing something here?
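
    A minimal sketch of one likely explanation (an assumption, not a quote from the notebook): if the regularized exercise expands the two raw features with the degree-6 polynomial map, scikit-learn's PolynomialFeatures already prepends the column of ones, so X needs no explicit theta_0 column beforehand.

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    # Two made-up (x1, x2) rows standing in for ex2data2.txt.
    X = np.array([[0.05, -0.69],
                  [-0.09, 0.68]])

    # Degree-6 polynomial feature map, as in the regularized exercise.
    XX = PolynomialFeatures(degree=6).fit_transform(X)

    print(XX.shape)   # (2, 28): 28 polynomial terms per row
    print(XX[:, 0])   # the first column is all ones -- the theta_0/intercept term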

    question 
    opened by hilmandayo 1
  • Matrix sum in Neural network's cost function

    Hi Jordi,

    First of all, thanks so much for the notebooks. They really help me follow along with the course. I have one question about notebook 4's nnCostFunction, where J = ... np.sum((np.log(a3.T)*(y_matrix)+np.log(1-a3).T*(1-y_matrix))).

    I think this does matrix multiplication, giving a 10x10 (or n_label x n_label) matrix. Let's call this cost matrix Jc. Jc contains not only how the predicted values for one label differ from their corresponding targets (the diagonal elements), but also how they differ from the targets of the other labels (the off-diagonal elements). For example, the multiplication pairs the column of predicted values in np.log(a3.T) for one label (e.g. k) with all columns of targets.

    Then the code sums all elements of this matrix, which seems to over-count J. Instead of summing all the elements, I think only the diagonal elements are needed.

    Please see the attached picture (img_20170829_155209) for an illustration of this description, which might otherwise be confusing.

    Please let me know if I misunderstood the code.
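
    For reference, a minimal NumPy sketch contrasting the two formulations, with random data standing in for a3 (m x K output activations) and y_matrix (m x K one-hot labels):

    import numpy as np

    rng = np.random.default_rng(0)
    m, K = 5000, 10
    a3 = rng.uniform(0.01, 0.99, size=(m, K))           # stand-in predicted probabilities
    y_matrix = np.eye(K)[rng.integers(0, K, size=m)]    # stand-in one-hot targets

    # Element-wise product, then sum over all m*K entries: the unregularized cost.
    J_elementwise = -np.sum(y_matrix * np.log(a3) + (1 - y_matrix) * np.log(1 - a3)) / m

    # The matrix product gives the K x K cost matrix Jc; only its diagonal (the trace)
    # matches the element-wise sum, so summing every entry over-counts.
    Jc = np.log(a3).T @ y_matrix + np.log(1 - a3).T @ (1 - y_matrix)
    J_trace = -np.trace(Jc) / m

    print(np.isclose(J_elementwise, J_trace))   # True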

    Best regards and thanks again, -Tua

    opened by clumdee 2
  • Regarding regularized NN equation in exercise 4

    I am referring to your exercise 4 code right now to complete mine. I think you've made a mistake in regularizing the NN backpropagation gradient (pardon me if I am wrong). This is the equation:

    D^(l)_ij = (1/m) * Delta^(l)_ij                                for j = 0
    D^(l)_ij = (1/m) * Delta^(l)_ij + (lambda/m) * Theta^(l)_ij    for j >= 1

    And this is your code:

    delta1 = d2.dot(a1) # 25x5000 * 5000x401 = 25x401
    delta2 = d3.T.dot(a2) # 10x5000 *5000x26 = 10x26
        
    theta1_ = np.c_[np.ones((theta1.shape[0],1)),theta1[:,1:]]
    theta2_ = np.c_[np.ones((theta2.shape[0],1)),theta2[:,1:]]
        
    theta1_grad = delta1/m + (theta1_*reg)/m
    theta2_grad = delta2/m + (theta2_*reg)/m
    

    Shouldn't it be

    theta1_ = np.c_[np.zeros((theta1.shape[0],1)),theta1[:,1:]]
    theta2_ = np.c_[np.zeros((theta2.shape[0],1)),theta2[:,1:]]
    

    since we do not want to add anything to theta1_grad's and theta2_grad's first column (the bias)?
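
    A quick check of the difference, using a hypothetical small theta1 in place of the real 25x401 matrix:

    import numpy as np

    reg, m = 1.0, 5000
    theta1 = np.arange(1.0, 7.0).reshape(2, 3)   # hypothetical 2x3 Theta1

    ones_version  = np.c_[np.ones((theta1.shape[0], 1)),  theta1[:, 1:]]
    zeros_version = np.c_[np.zeros((theta1.shape[0], 1)), theta1[:, 1:]]

    print((ones_version  * reg / m)[:, 0])   # nonzero: the bias gradients get shifted by the regularization term
    print((zeros_version * reg / m)[:, 0])   # zeros: the bias column is left unregularized, as the equation requires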

    opened by hilmandayo 1
Owner
Jordi Warmenhoven
Data Visualization, Probabilistic Programming and Statistics Enthusiast | Fairly Bayesian | Mostly Python | Always curious