Supporting code for the short YouTube series Neural Networks Demystified.

Overview

Neural Networks Demystified

Supporting IPython notebooks for the YouTube series Neural Networks Demystified. I've included formulas, code, and the text of the movies in the IPython notebooks, in addition to raw code in Python scripts.

The IPython notebooks can be downloaded and run locally, or viewed online using nbviewer: http://nbviewer.ipython.org/.

Using the IPython notebook

The IPython/Jupyter notebook is an incredible tool, but it can be a little tricky to set up. I recommend the [anaconda](https://store.continuum.io/cshop/anaconda/) distribution of Python. I've written and tested this code with the anaconda build of Python 2 running on OS X. You will likely get a few warnings about contour plotting - if anyone has a fix for this, feel free to submit a pull request.
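
A minimal way to launch the notebooks with a current install (a sketch; newer IPython/Jupyter releases dropped the `--pylab` command-line flag, as one of the issues below notes):

    # In a terminal, from the repository root:
    #   jupyter notebook
    # Then, in the first cell of a notebook:
    %pylab inline
    # or, equivalently but more explicitly:
    %matplotlib inline
    import numpy as np
    import matplotlib.pyplot as plt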

Comments
  • costFunction in Lecture 7

    Hey everybody,

    If J in the cost function is calculated as given in lecture 7:

     J = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(self.W1**2)+sum(self.W2**2))
    

    J ends up being a vector, which I believe causes the following error:

    File "C:\Program Files\Anaconda\lib\site-packages\scipy\optimize\linesearch.py", line 148, in scalar_search_wolfe1
    alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
    

    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

    The problem is solved by adding another sum() to the calculation of J:

        J2 = 0.5*sum((y-self.yHat)**2)/X.shape[0] + (self.Lambda/2)*(sum(sum(self.W1**2))+sum(self.W2**2))
    

    After that, J is a scalar again.

    Did you miss the sum(), or did I miss something (and end up doing something mathematically incorrect)?
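
    For reference, a minimal sketch (not from the repository) that keeps J scalar by using np.sum, which reduces over every axis of each weight matrix:

        import numpy as np

        def regularizedCost(y, yHat, W1, W2, Lambda, numExamples):
            # np.sum collapses all axes to a scalar, unlike the built-in sum,
            # which only collapses the first axis of a 2-D array.
            return 0.5*np.sum((y - yHat)**2)/numExamples + \
                   (Lambda/2)*(np.sum(W1**2) + np.sum(W2**2))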

    Best regards and thanks in advance!

    opened by menecken 11
  • Something wrong with costfunction

    Hi Stephen, I watched your video about this code, very good by the way, but when I tried to implement it I got this error:

    Traceback (most recent call last):
      in computeNumericalGradient
        numgrad[p] = (loss2 - loss1) / (2*e)
    ValueError: setting an array element with a sequence.

    And when I forked your code, the error persists. Should costFunction return a number, or am I wrong?

    Thanks, Felipe Melo

    opened by felipe-melo 6
  • I think I spotted an error in your excellent tutorial

    Right after the line "We can now replace dyHat/dz3 with f prime of z3," the formula shows "dz3/dW3" on the left side, which should be "dJ/dW2" in my opinion.
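
    For context, the formula in question written out (a reconstruction of the series' notation, so treat the indices as assumptions):

        \frac{\partial J}{\partial W^{(2)}} = (a^{(2)})^{T} \delta^{(3)},
        \qquad
        \delta^{(3)} = -(y - \hat{y}) \odot f'(z^{(3)})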

    Thank you again! Great work.

    opened by menecken 5
  • import statements missing?

    I had to add an import for numpy as np, ... and now I see plot() used without a declaration. Am I missing something obvious? E.g., in Part 2: plot(testInput, sigmoid(testInput), linewidth=2)
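
    For anyone else hitting this: plot() comes from the %pylab inline magic recommended in the README, which injects the numpy and matplotlib.pyplot names. A minimal explicit preamble (a sketch; the sigmoid is inlined here rather than imported from the notebook):

        import numpy as np
        import matplotlib.pyplot as plt

        testInput = np.arange(-6, 6, 0.01)
        plt.plot(testInput, 1/(1 + np.exp(-testInput)), linewidth=2)  # the Part 2 sigmoid curve
        plt.show()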

    opened by danbri 3
  • Fix Part 2 and Part 3 encoding. Convert UTF-8-BOM to UTF-8

    What

    • Fix https://github.com/stephencwelch/Neural-Networks-Demystified/issues/31
    • Fix Part 2, Part 3, and Part 4 encoding: convert UTF-8-BOM to UTF-8 (a conversion sketch follows below)
    • Fix incorrect double-quote escaping in Part 4

    Why

    • https://github.com/stephencwelch/Neural-Networks-Demystified/issues/31
    • Can't open Part 2, Part 3, Part 4
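
    A minimal sketch of the conversion being described (the filename is hypothetical, and this is not the PR's actual script):

        import codecs

        for path in ['Part 2 Forward Propagation.ipynb']:  # hypothetical filename
            with open(path, 'rb') as f:
                data = f.read()
            if data.startswith(codecs.BOM_UTF8):
                with open(path, 'wb') as f:
                    f.write(data[len(codecs.BOM_UTF8):])  # strip the BOM, keep plain UTF-8
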
    opened by comeanother 2
  • Multiplying two matrices??

        delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)

    This line appears in a function (forward) in the code, and I noticed that z2 is a matrix (per your explanation). I am not that proficient in Python, so I may not know what it means, but how do np.dot and * differ? I ask because I discovered an anomaly in my code while trying to implement this with a model where inputLayerSize=1, hiddenLayerSize=2, and outputLayerSize=1. Essentially, it should give me predictions of y for the equation y=2x (without knowing the equation firsthand, of course). But when finding delta2 I am stuck, because W2 transpose is of order 1 x 2, delta3 is of order 1 x 1, and z2 is of order 1 x 2. How do I matrix-multiply delta3.W2.T with sigmoidPrime(z2)? (P.S. I was not implementing it in Python.) Thanks in advance.
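
    A small sketch of the distinction (mine, with made-up numbers): np.dot is matrix multiplication, while * is elementwise multiplication, so the shapes listed above do line up:

        import numpy as np

        delta3     = np.array([[0.5]])        # 1 x 1
        W2T        = np.array([[0.2, 0.7]])   # 1 x 2  (W2 transposed)
        sigPrimeZ2 = np.array([[0.1, 0.3]])   # 1 x 2

        tmp    = np.dot(delta3, W2T)   # matrix product: (1x1)(1x2) -> 1 x 2
        delta2 = tmp * sigPrimeZ2      # elementwise product of two 1x2 arrays -> 1 x 2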

    opened by mind-matrix 2
  • where is predict function?

    Dear Stephen,

    I learned a lot from your videos and Python code. Thank you! However, I could not find a predict function in your code. Is it missing? I think it would be nice to add this function.
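
    For what it's worth, prediction here is just the forward pass, so the requested function can be a one-liner (a sketch, assuming the Neural_Network object from the series):

        def predict(network, X):
            # Prediction is simply a forward pass through the trained weights.
            return network.forward(X)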

    Best,

    Halil Agin.

    opened by halilagin 2
  • Incorrect multiplication

    def forward(self, X):
        #Propagate inputs through the network
        self.z2 = np.dot(X, self.W1)
        self.a2 = self.sigmoid(self.z2)
        self.z3 = np.dot(self.a2, self.W2)
        yHat = self.sigmoid(self.z3)
        return yHat
    

    I think you may want to be doing z2 = np.dot(X, self.W1.T) instead of z2 = np.dot(X, self.W1) given your explanation of weights.
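
    Whether a transpose is needed depends only on the shape convention: with W1 stored as (inputLayerSize, hiddenLayerSize), as in the notebooks, np.dot(X, self.W1) already lines up. A quick shape check (a sketch with illustrative sizes):

        import numpy as np

        X  = np.random.randn(3, 2)   # 3 examples, inputLayerSize = 2
        W1 = np.random.randn(2, 4)   # (inputLayerSize, hiddenLayerSize)

        z2 = np.dot(X, W1)           # (3,2)(2,4) -> (3,4): one row of hidden activity per example
        print(z2.shape)              # (3, 4)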

    opened by deychak 1
  • Adds gitignore file and removes python bytecode files.

    Python bytecode (*.pyc) files should not be included in the repository. This commit removes them and adds a .gitignore file to exclude future bytecode files.

    opened by jasonhamilton 1
  • Regularization parameter typo?

    Hey Stephen, Thanks a LOT for this awesome series! :D I can't imagine the amount of time it must have taken to prepare all this content!

    So, I think I noticed a typo in ln [20] of part 7 ( https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/Part%207%20Overfitting%2C%20Testing%2C%20and%20Regularization.ipynb )

    delta3 = np.multiply(-(y-self.yHat), self.sigmoidPrime(self.z3))
    #Add gradient of regularization term:
    dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.lambd*self.W2
    
    delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
    #Add gradient of regularization term:
    dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.lambd*self.W1
    

    The regularization parameter is supposed to be self.Lambda but in the above snippet, you use self.lambd - I'm guessing this is a typo?

    Also, I noticed that video 7 does not mention the '/X.shape[0]' term inside the costFunctionPrime function! (https://youtu.be/S4ZUwgesjS8?t=4m58s) Maybe you could add an annotation about the missing term there? BTW, in the video the regularization parameter is referred to as self.Lambda.
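
    For anyone comparing against the video, here is a sketch of the gradients with both the 1/X.shape[0] normalization and the regularization term spelled out (the parameter is renamed to Lambda to match the video; treat this as a reconstruction, not the canonical code):

        def costFunctionPrime(self, X, y):
            self.yHat = self.forward(X)

            delta3 = np.multiply(-(y - self.yHat), self.sigmoidPrime(self.z3))
            # Normalize by the number of examples and add the regularization gradient:
            dJdW2 = np.dot(self.a2.T, delta3)/X.shape[0] + self.Lambda*self.W2

            delta2 = np.dot(delta3, self.W2.T)*self.sigmoidPrime(self.z2)
            dJdW1 = np.dot(X.T, delta2)/X.shape[0] + self.Lambda*self.W1

            return dJdW1, dJdW2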

    Cheers! Jayanth Krishnan

    opened by Jayanth-Krishnan-Natarajan 1
  • ipython 4.0.0 loses "--pylab inline" functionality recommended in README.md

    $ ipython notebook --pylab inline
    [E 11:11:32.373 NotebookApp] Support for specifying --pylab on the command line has been removed.
    [E 11:11:32.373 NotebookApp] Please use `%pylab inline` or `%matplotlib inline` in the notebook itself.
    danbri-macbookpro2:Neural-Networks-Demystified danbri$ ipython --version
    4.0.0
    

    ... this is recommended in https://github.com/stephencwelch/Neural-Networks-Demystified/blob/master/README.md, presumably to pull in e.g. numpy as np.

    I believe this setup should also work:

    $ cat ~/.ipython/profile_default/startup/go.py
    import numpy as np
    
    opened by danbri 1
  • 3d modeling

    In Part 7 (and the other 3D plots), fig = plt.figure() with ax = fig.gca(projection='3d') needs to become ax = plt.figure().add_subplot(projection='3d'):

        #3D plot:
        #Uncomment to plot out-of-notebook (you'll be able to rotate)
        #%matplotlib qt
        from mpl_toolkits.mplot3d import Axes3D

        fig = plt.figure()
        ax = fig.gca(projection='3d')

        #Scatter training examples:
        ax.scatter(10*X[:,0], 5*X[:,1], 100*y, c='k', alpha=1, s=30)

        surf = ax.plot_surface(xx, yy, 100*allOutputs.reshape(100, 100),
                               cmap=cm.jet, alpha=0.5)

        ax.set_xlabel('Hours Sleep')
        ax.set_ylabel('Hours Study')
        ax.set_zlabel('Test Score')
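
    A minimal sketch of the replacement call (Figure.gca stopped accepting a projection argument in newer Matplotlib releases):

        import matplotlib.pyplot as plt

        fig = plt.figure()
        ax = fig.add_subplot(projection='3d')   # replaces fig.gca(projection='3d')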

    opened by samohtGTO 0
  • basic linear algebra question

    Thank you for a legendary tutorial on the basics of neural nets and gradient descent. I understood the derivation of the gradient (3 applications of the chain rule! whoa!!), but why did you transpose the matrix ∂z3/∂W? At about 4:45 in part 4 you had to multiply the backpropagating error (delta3), which is a 3x1 matrix, by a3 (a 3x3 matrix). You commuted delta3 and a3. But matrix multiplication is not commutative. And you transposed a3 to boot!

    These two operations seem arbitrary. Why are they valid? Why did you simply not take the dot product of delta3 and a3? That way you would get a 3 x 1 matrix:

        delta3 = [S1]        a3 = [a1-1  a1-2  a1-3]
                 [S2]             [a2-1  a2-2  a2-3]
                 [S3]             [a3-1  a3-2  a3-3]

        result = [S1 * (a1-1 + a1-2 + a1-3)]
                 [S2 * (a2-1 + a2-2 + a2-3)]
                 [S3 * (a3-1 + a3-2 + a3-3)]

    And the result was a 3 x 1 matrix - I imagine this is the new gradient?
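
    On the transpose question, the ordering is forced by the shapes rather than arbitrary. A sketch with illustrative sizes (4 examples and 3 hidden units, not the video's exact numbers):

        import numpy as np

        a      = np.random.randn(4, 3)   # activities: (numExamples, hiddenLayerSize)
        delta3 = np.random.randn(4, 1)   # backpropagating error: (numExamples, 1)

        # np.dot(delta3, a) is undefined: the inner dimensions (1 and 4) don't match.
        # Transposing the activities makes the product chain, and it also sums the
        # gradient contributions across examples:
        dJdW2 = np.dot(a.T, delta3)      # (3,4)(4,1) -> (3,1), the shape of W2
        print(dJdW2.shape)               # (3, 1)
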
    opened by bwanaaa 0
  • a probable mistake?

    In "Part 4 Backpropagation.ipynb, the 'variables' table": Dimension of Cost, J, should always be (1,1), instead of '(1, outputLayerSize)', right?

    opened by yileic 0
  • Dimension Error on Training the NN

    Hi,

    I tried to implement the same code, but changed the layers as shown:

        self.inputLayerSize = 3 
        self.outputLayerSize = 5 
        self.hiddenLayerSize = 5
    

    This is because the data set shapes are X = (4162, 3) and Y = (4162,).

    However, after executing T.train(X, Y), I get the following error:


    ValueError                                Traceback (most recent call last)
    in ()
    ----> 1 T.train(X, Y)

    in train(self, X, y)
         26
         27         options = {'maxiter': 200, 'disp' : True}
    ---> 28         _res = optimize.minimize(self.costFunctionWrapper, params0, jac=True, method='BFGS', args=(X, y), options=options, callback=self.callbackF)
         29
         30         self.N.setParams(_res.x)

    //anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.pyc in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
        439         return _minimize_cg(fun, x0, args, jac, callback, **options)
        440     elif meth == 'bfgs':
    --> 441         return _minimize_bfgs(fun, x0, args, jac, callback, **options)
        442     elif meth == 'newton-cg':
        443         return _minimize_newtoncg(fun, x0, args, jac, hess, hessp, callback,

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in _minimize_bfgs(fun, x0, args, jac, callback, gtol, norm, eps, maxiter, disp, return_all, **unknown_options)
        845     else:
        846         grad_calls, myfprime = wrap_function(fprime, args)
    --> 847     gfk = myfprime(x0)
        848     k = 0
        849     N = len(x0)

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in function_wrapper(*wrapper_args)
        287     def function_wrapper(*wrapper_args):
        288         ncalls[0] += 1
    --> 289         return function(*(wrapper_args + args))
        290
        291     return ncalls, function_wrapper

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in derivative(self, x, *args)
         69             return self.jac
         70         else:
    ---> 71             self(x, *args)
         72             return self.jac
         73

    //anaconda/lib/python2.7/site-packages/scipy/optimize/optimize.pyc in __call__(self, x, *args)
         61     def __call__(self, x, *args):
         62         self.x = numpy.asarray(x).copy()
    ---> 63         fg = self.fun(x, *args)
         64         self.jac = fg[1]
         65         return fg[0]

    in costFunctionWrapper(self, params, X, y)
         10     def costFunctionWrapper(self, params, X, y):
         11         self.N.setParams(params)
    ---> 12         cost = self.N.costFunction(X, y)
         13         grad = self.N.computeGradients(X,y)
         14

    in costFunction(self, X, y)
         29         #Compute cost for given X,y, use weights already stored in class.
         30         self.yHat = self.forward(X)
    ---> 31         J = 0.5*np.sum((y-self.yHat)**2)
         32         return J
         33

    ValueError: operands could not be broadcast together with shapes (4162,) (4162,5)

    Would appreciate the help!
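
    The traceback points at a shape mismatch: with outputLayerSize = 5 the network emits a (4162, 5) yHat, while Y is (4162,). A sketch of the usual fix when there is really one target per example (an assumption about this data set):

        # Keep outputLayerSize = 1 and make Y an explicit column vector,
        # so that y - yHat broadcasts elementwise:
        Y = Y.reshape(-1, 1)   # (4162,) -> (4162, 1)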

    opened by jonmilson 0
Owner

Stephen