Supporting code for the short YouTube series Neural Networks Demystified.

Overview

Neural Networks Demystified

Supporting IPython notebooks for the YouTube series Neural Networks Demystified. I've included formulas, code, and the text of the movies in the IPython notebooks, in addition to raw code in Python scripts.

The IPython notebooks can be downloaded and run locally, or viewed online using nbviewer: http://nbviewer.ipython.org/.

Using the IPython notebook

The IPython/Jupyter notebook is an incredible tool, but it can be a little tricky to set up. I recommend the [anaconda](https://store.continuum.io/cshop/anaconda/) distribution of Python. I've written and tested this code with the anaconda build of Python 2 running on OS X. You will likely get a few warnings about contour plotting - if anyone has a fix for this, feel free to submit a pull request.
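
A minimal way to launch the notebooks with a current install (a sketch; newer IPython/Jupyter releases dropped the `--pylab` command-line flag, as one of the issues below notes):

    # In a terminal, from the repository root:
    #   jupyter notebook
    # Then, in the first cell of a notebook:
    %pylab inline
    # or, equivalently but more explicitly:
    %matplotlib inline
    import numpy as np
    import matplotlib.pyplot as plt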

Comments
  • costFunction in Lecture 7

    Hey everybody,

    If J in the cost function is calculated as given in lecture 7:

     J = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(self.W1**2)+sum(self.W2**2))
    

    J ends up being a vector, which I believe causes the following error:

    File "C:\Program Files\Anaconda\lib\site-packages\scipy\optimize\linesearch.py", line 148, in scalar_search_wolfe1
    alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
    

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

    The problem is solved by adding another sum() to the calculation of J:

        J2 = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(sum(self.W1**2))+sum(self.W2**2))
    

    After that, J is a scalar again.

    Did you miss the sum(), or did I miss something (and end up doing something mathematically incorrect)?
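
    For reference, a minimal sketch (not from the repository) that keeps J scalar by using np.sum, which reduces over every axis of each weight matrix:

        import numpy as np

        def regularizedCost(y, yHat, W1, W2, Lambda, numExamples):
            # np.sum collapses all axes to a scalar, unlike the built-in sum,
            # which only collapses the first axis of a 2-D array.
            return 0.5*np.sum((y - yHat)**2)/numExamples + \
                   (Lambda/2)*(np.sum(W1**2) + np.sum(W2**2))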

    Best regards and thanks in advance!

    opened by menecken 11
  • Something wrong with costfunction

    Hi Stephen, I watched your video about this code, very good by the way, but when I tried to implement it I got this error:

    Traceback (most recent call last):
      in computeNumericalGradient
        numgrad[p] = (loss2 - loss1) / (2*e)
    ValueError: setting an array element with a sequence.

    And when I forked your code, the error persists. Should costFunction return a number, or am I wrong?

    Thanks, Felipe Melo

    opened by felipe-melo 6
  • I think I spotted an error in your excellent tutorial

    Right after the line "We can now replace dyHat/dz3 with f prime of z3," the formula shows "dz3/dW3" on the left side, which should be "dJ/dW2" in my opinion.
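
    For context, the formula in question written out (a reconstruction of the series' notation, so treat the indices as assumptions):

        \frac{\partial J}{\partial W^{(2)}} = (a^{(2)})^{T} \delta^{(3)},
        \qquad
        \delta^{(3)} = -(y - \hat{y}) \odot f'(z^{(3)})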

    Thank you again! Great work.

    opened by menecken 5
  • import statements missing?

    I had to add an import for numpy as np, ... and now I see plot() used without a declaration. Am I missing something obvious? E.g., in Part 2: plot(testInput, sigmoid(testInput), linewidth=2)
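
    For anyone else hitting this: plot() comes from the %pylab inline magic recommended in the README, which injects the numpy and matplotlib.pyplot names. A minimal explicit preamble (a sketch; the sigmoid is inlined here rather than imported from the notebook):

        import numpy as np
        import matplotlib.pyplot as plt

        testInput = np.arange(-6, 6, 0.01)
        plt.plot(testInput, 1/(1 + np.exp(-testInput)), linewidth=2)  # the Part 2 sigmoid curve
        plt.show()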

    opened by danbri 3
  • Fix Part 2 and Part 3 encoding. Convert UTF-8-BOM to UTF-8

    What

    • Fix https://github.com/stephencwelch/Neural-Networks-Demystified/issues/31
    • Fix Part 2, Part 3, and Part 4 encoding: convert UTF-8-BOM to UTF-8 (a conversion sketch follows below)
    • Fix incorrect double-quote escaping in Part 4

    Why

    • https://github.com/stephencwelch/Neural-Networks-Demystified/issues/31
    • Can't open Part 2, Part 3, Part 4
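
    A minimal sketch of the conversion being described (the filename is hypothetical, and this is not the PR's actual script):

        import codecs

        for path in ['Part 2 Forward Propagation.ipynb']:  # hypothetical filename
            with open(path, 'rb') as f:
                data = f.read()
            if data.startswith(codecs.BOM_UTF8):
                with open(path, 'wb') as f:
                    f.write(data[len(codecs.BOM_UTF8):])  # strip the BOM, keep plain UTF-8
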
    opened by comeanother 2
  • Multiplying two matrices??

        delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)

    This line appears in a function (forward) in the code, and I noticed that z2 is a matrix (per your explanation). I am not that proficient in Python, so I may not know what it means, but how do np.dot and * differ? I ask because I discovered an anomaly in my code while trying to implement this with a model where inputLayerSize=1, hiddenLayerSize=2, and outputLayerSize=1. Essentially, it should give me predictions of y for the equation y=2x (without knowing the equation firsthand, of course). But when finding delta2 I am stuck, because W2 transpose is of order 1 x 2, delta3 is of order 1 x 1, and z2 is of order 1 x 2. How do I matrix-multiply delta3.W2.T with sigmoidPrime(z2)? (P.S. I was not implementing it in Python.) Thanks in advance.
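
    A small sketch of the distinction (mine, with made-up numbers): np.dot is matrix multiplication, while * is elementwise multiplication, so the shapes listed above do line up:

        import numpy as np

        delta3     = np.array([[0.5]])        # 1 x 1
        W2T        = np.array([[0.2, 0.7]])   # 1 x 2  (W2 transposed)
        sigPrimeZ2 = np.array([[0.1, 0.3]])   # 1 x 2

        tmp    = np.dot(delta3, W2T)   # matrix product: (1x1)(1x2) -> 1 x 2
        delta2 = tmp * sigPrimeZ2      # elementwise product of two 1x2 arrays -> 1 x 2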

    opened by mind-matrix 2
  • where is predict function?

    Dear Stephen,

    I learned a lot from your videos and Python code. Thank you! However, I could not find a predict function in your code. Is it missing? I think it would be nice to add this function.
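
    For what it's worth, prediction here is just the forward pass, so the requested function can be a one-liner (a sketch, assuming the Neural_Network object from the series):

        def predict(network, X):
            # Prediction is simply a forward pass through the trained weights.
            return network.forward(X)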

    Best,

    Halil Agin.

    opened by halilagin 2
  • Incorrect multiplication

    def forward(self, X):
        #Propagate inputs through the network
        self.z2 = np.dot(X, self.W1)
        self.a2 = self.sigmoid(self.z2)
        self.z3 = np.dot(self.a2, self.W2)
        yHat = self.sigmoid(self.z3)
        return yHat
    

    I think you may want to be doing z2 = np.dot(X, self.W1.T) instead of z2 = np.dot(X, self.W1) given your explanation of weights.
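
    Whether a transpose is needed depends only on the shape convention: with W1 stored as (inputLayerSize, hiddenLayerSize), as in the notebooks, np.dot(X, self.W1) already lines up. A quick shape check (a sketch with illustrative sizes):

        import numpy as np

        X  = np.random.randn(3, 2)   # 3 examples, inputLayerSize = 2
        W1 = np.random.randn(2, 4)   # (inputLayerSize, hiddenLayerSize)

        z2 = np.dot(X, W1)           # (3,2)(2,4) -> (3,4): one row of hidden activity per example
        print(z2.shape)              # (3, 4)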

    opened by deychak 1
  • Adds gitignore file and removes python bytecode files.

    Python bytecode (*.pyc) files should not be included in the repository. This commit removes them and adds a .gitignore file to exclude future bytecode files.

    opened by jasonhamilton 1
  • Regularization parameter typo?

    Hey Stephen, Thanks a LOT for this awesome series! :D I can't imagine the amount of time it must have taken to prepare all this content!

    So, I think I noticed a typo in ln [20] of part 7 ( https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/Part%207%20Overfitting%2C%20Testing%2C%20and%20Regularization.ipynb )

    delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
    #Add gradient of regularization term:
    dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.lambd*self.W2
    
    delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
    #Add gradient of regularization term:
    dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.lambd*self.W1
    

    The regularization parameter is supposed to be self.Lambda but in the above snippet, you use self.lambd - I'm guessing this is a typo?

    Also, I noticed that video 7 does not mention the '/X.shape[0]' term inside the costFunctionPrime function! (https://youtu.be/S4ZUwgesjS8?t=4m58s) Maybe you could add an annotation about the missing term there? BTW, in the video the regularization parameter is referred to as self.Lambda.
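
    For anyone comparing against the video, here is a sketch of the gradients with both the 1/X.shape[0] normalization and the regularization term spelled out (the parameter is renamed to Lambda to match the video; treat this as a reconstruction, not the canonical code):

        def costFunctionPrime(self, X, y):
            self.yHat = self.forward(X)

            delta3 = np.multiply(-(y - self.yHat), self.sigmoidPrime(self.z3))
            # Normalize by the number of examples and add the regularization gradient:
            dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.Lambda*self.W2

            delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
            dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.Lambda*self.W1

            return dJdW1, dJdW2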

    Cheers! Jayanth Krishnan

    opened by Jayanth-Krishnan-Natarajan 1
  • ipython 4.0.0 loses "--pylab inline" functionality recommended in README.md

    $ ipython notebook --pylab inline
    [E 11:11:32.373 NotebookApp] Support for specifying --pylab on the command line has been removed.
    [E 11:11:32.373 NotebookApp] Please use `%pylab inline` or `%matplotlib inline` in the notebook itself.
    danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython --version
    4.0.0
    

    ... this is recommended in https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/README.md, presumably to pull in e.g. numpy as np.

    I believe this setup should also work:

    $ cat ~/.ipython/profile_default/startup/go.py
    import numpy as np
    
    opened by danbri 1
  • 3d modeling

    In Part 7 (and the other 3D plots), fig = plt.figure() with ax = fig.gca(projection='3d') needs to become ax = plt.figure().add_subplot(projection='3d'):

        #3D plot:
        #Uncomment to plot out-of-notebook (you'll be able to rotate)
        #%matplotlib qt
        from mpl_toolkits.mplot3d import Axes3D

        fig = plt.figure()
        ax = fig.gca(projection='3d')

        #Scatter training examples:
        ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha=1, s=30)

        surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
                               cmap=cm.jet, alpha=0.5)

        ax.set_xlabel('Hours Sleep')
        ax.set_ylabel('Hours Study')
        ax.set_zlabel('Test Score')
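
    A minimal sketch of the replacement call (Figure.gca stopped accepting a projection argument in newer Matplotlib releases):

        import matplotlib.pyplot as plt

        fig = plt.figure()
        ax = fig.add_subplot(projection='3d')   # replaces fig.gca(projection='3d')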

    opened by samohtGTO 0
  • basic linear algebra question

    Thank you for a legendary tutorial on the basics of neural nets and gradient descent. I understood the derivation of the gradient (3 applications of the chain rule! whoa!!), but why did you transpose the matrix ∂z3/∂W? At about 4:45 in part 4 you had to multiply the backpropagating error (delta3), which is a 3x1 matrix, by a3 (a 3x3 matrix). You commuted delta3 and a3. But matrix multiplication is not commutative. And you transposed a3 to boot!

    These two operations seem arbitrary. Why are they valid? Why did you simply not take the dot product of delta3 and a3? That way you would get a 3 x 1 matrix:

        delta3 = [S1]        a3 = [a1-1  a1-2  a1-3]
                 [S2]             [a2-1  a2-2  a2-3]
                 [S3]             [a3-1  a3-2  a3-3]

        result = [S1 * (a1-1 + a1-2 + a1-3)]
                 [S2 * (a2-1 + a2-2 + a2-3)]
                 [S3 * (a3-1 + a3-2 + a3-3)]

    And the result was a 3 x 1 matrix - I imagine this is the new gradient?
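
    On the transpose question, the ordering is forced by the shapes rather than arbitrary. A sketch with illustrative sizes (4 examples and 3 hidden units, not the video's exact numbers):

        import numpy as np

        a      = np.random.randn(4, 3)   # activities: (numExamples, hiddenLayerSize)
        delta3 = np.random.randn(4, 1)   # backpropagating error: (numExamples, 1)

        # np.dot(delta3, a) is undefined: the inner dimensions (1 and 4) don't match.
        # Transposing the activities makes the product chain, and it also sums the
        # gradient contributions across examples:
        dJdW2 = np.dot(a.T, delta3)      # (3,4)(4,1) -> (3,1), the shape of W2
        print(dJdW2.shape)               # (3, 1)
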
    opened by bwanaaa 0
  • a probable mistake?

    In "Part 4 Backpropagation.ipynb, the 'variables' table": Dimension of Cost, J, should always be (1,1), instead of '(1, outputLayerSize)', right?

    opened by yileic 0
  • Dimension Error on Training the NN

    Hi,

    I tried to implement the same code, but changed the layers as shown:

        self.inputLayerSize = 3 
        self.outputLayerSize = 5 
        self.hiddenLayerSize = 5
    

    This is because the data set shapes are X = (4162, 3) and Y = (4162,).

    However, after executing T.train(X, Y), I get the following error:


    ValueError                                Traceback (most recent call last)
    in ()
    ----> 1 T.train(X, Y)

    in train(self, X, y)
         26
         27         options = {'maxiter': 200, 'disp' : True}
    ---> 28         _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS', args=(X, y), options=options, callback=self.callbackF)
         29
         30         self.N.setParams(_res.x)

    //anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.pyc in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
        439         return _minimize_cg(fun, x0, args, jac, callback, **options)
        440     elif meth == 'bfgs':
    --> 441         return _minimize_bfgs(fun, x0, args, jac, callback, **options)
        442     elif meth == 'newton-cg':
        443         return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
        845     else:
        846         grad_calls, myfprime = wrap_function(fprime, args)
    --> 847     gfk = myfprime(x0)
        848     k = 0
        849     N = len(x0)

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in function_wrapper(*wrapper_args)
        287     def function_wrapper(*wrapper_args):
        288         ncalls[0] += 1
    --> 289         return function(*(wrapper_args + args))
        290
        291     return ncalls, function_wrapper

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in derivative(self, x, *args)
         69             return self.jac
         70         else:
    ---> 71             self(x, *args)
         72             return self.jac
         73

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in __call__(self, x, *args)
         61     def __call__(self, x, *args):
         62         self.x = numpy.asarray(x).copy()
    ---> 63         fg = self.fun(x, *args)
         64         self.jac = fg[1]
         65         return fg[0]

    in costFunctionWrapper(self, params, X, y)
         10     def costFunctionWrapper(self, params, X, y):
         11         self.N.setParams(params)
    ---> 12         cost = self.N.costFunction(X, y)
         13         grad = self.N.computeGradients(X,y)
         14

    in costFunction(self, X, y)
         29         #Compute cost for given X,y, use weights already stored in class.
         30         self.yHat = self.forward(X)
    ---> 31         J = 0.5*np.sum((y-self.yHat)**2)
         32         return J
         33

    ValueError: operands could not be broadcast together with shapes (4162,) (4162,5)

    Would appreciate the help!
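
    The traceback points at a shape mismatch: with outputLayerSize = 5 the network emits a (4162, 5) yHat, while Y is (4162,). A sketch of the usual fix when there is really one target per example (an assumption about this data set):

        # Keep outputLayerSize = 1 and make Y an explicit column vector,
        # so that y - yHat broadcasts elementwise:
        Y = Y.reshape(-1, 1)   # (4162,) -> (4162, 1)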

    opened by jonmilson 0
Owner

Stephen