Python Machine Learning Jupyter Notebooks (ML website)

Tirthajyoti Sarkar

Last update: Jan 3, 2023

Related tags

Machine Learning flask data-science machine-learning statistics deep-learning neural-network random-forest clustering numpy naive-bayes scikit-learn regression pandas artificial-intelligence pytest classification dimensionality-reduction matplotlib decision-trees k-nearest-neighbours

Overview

Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here)

Also check out these super-useful Repos that I curated

Requirements

Python 3.6+
NumPy (pip install numpy)
Pandas (pip install pandas)
Scikit-learn (pip install scikit-learn)
SciPy (pip install scipy)
Statsmodels (pip install statsmodels)
Matplotlib (pip install matplotlib)
Seaborn (pip install seaborn)
SymPy (pip install sympy)
Flask (pip install flask)
WTForms (pip install wtforms)
TensorFlow (pip install "tensorflow>=1.15")
Keras (pip install keras)
pdpipe (pip install pdpipe)
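
As a quick sanity check before opening the notebooks, the sketch below reports which of the packages listed above can be found in the current environment. The import names (for example `sklearn` for scikit-learn) and the helper name `check_packages` are assumptions made for illustration; they are not part of the repository.

```python
import importlib.util

# Import names for the packages listed above (assumed; scikit-learn imports as "sklearn").
REQUIRED = [
    "numpy", "pandas", "sklearn", "scipy", "statsmodels", "matplotlib",
    "seaborn", "sympy", "flask", "wtforms", "tensorflow", "keras", "pdpipe",
]


def check_packages(names):
    """Return a dict mapping each import name to True if the package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(REQUIRED).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Only the packages a given notebook actually imports need to be present, so a missing entry is not necessarily a problem.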

You can start with this article that I wrote in Heartbeat magazine (on Medium platform):

"Some Essential Hacks and Tricks for Machine Learning with Python"

Essential tutorial-type notebooks on Pandas and Numpy

Jupyter notebooks covering a wide range of functions and operations in NumPy, Pandas, Seaborn, Matplotlib, etc.

Detailed Numpy operations
Detailed Pandas operations
Numpy and Pandas quick basics
Matplotlib and Seaborn quick basics
Advanced Pandas operations
How to read various data sources
PDF reading and table processing demo
How fast are Numpy operations compared to pure Python code? (Read my article on Medium related to this topic)
Fast reading from Numpy using .npy file format (Read my article on Medium on this topic)
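
The last two items can be illustrated with a small sketch: a pure-Python loop next to its vectorized NumPy equivalent, and a round trip through the .npy format with `np.save` and `np.load`. The function names and the array size are illustrative assumptions, not taken from the notebooks, and the measured timings will vary by machine.

```python
import os
import tempfile
import time

import numpy as np


def python_sum_of_squares(values):
    """Pure-Python loop: square each element and accumulate."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def numpy_sum_of_squares(arr):
    """Vectorized equivalent using a single dot product."""
    return float(np.dot(arr, arr))


def roundtrip_npy(arr):
    """Save an array in .npy format to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.npy")
        np.save(path, arr)
        return np.load(path)


if __name__ == "__main__":
    data = np.random.default_rng(0).random(200_000)
    t0 = time.perf_counter()
    python_sum_of_squares(data)
    t1 = time.perf_counter()
    numpy_sum_of_squares(data)
    t2 = time.perf_counter()
    print(f"pure Python loop: {t1 - t0:.4f} s, NumPy: {t2 - t1:.4f} s")
```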

Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms

Regression

Simple linear regression with t-statistic generation

Polynomial regression using scikit-learn pipeline feature (check the article I wrote on Towards Data Science)
Decision trees and Random Forest regression (showing how the Random Forest works as a robust/regularized meta-estimator that resists overfitting)
Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem
Robust linear regression using HuberRegressor from Scikit-learn
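
For the first item in this list, a minimal sketch of simple linear regression with a t-statistic for the slope might look like the following. It uses plain NumPy and the textbook ordinary-least-squares formulas; the function name and the sample data are assumptions for illustration, and the notebook itself may compute these quantities differently.

```python
import numpy as np


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by OLS; return slope, intercept and the slope's t-statistic."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = y - (intercept + slope * x)
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sigma2 = np.sum(residuals ** 2) / (n - 2)
    slope_se = np.sqrt(sigma2 / sxx)
    return slope, intercept, slope / slope_se


if __name__ == "__main__":
    slope, intercept, t_stat = simple_linear_regression(
        [0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.1]
    )
    print(f"slope={slope:.3f} intercept={intercept:.3f} t={t_stat:.1f}")
```

A large absolute t-statistic indicates that the slope is unlikely to be zero given the residual noise.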

Classification

Logistic regression/classification (Here is the Notebook)

k-nearest neighbor classification (Here is the Notebook)
Decision trees and Random Forest Classification (Here is the Notebook)
Support vector machine classification (Here is the Notebook) (check the article I wrote in Towards Data Science on SVM and sorting algorithm)

Naive Bayes classification (Here is the Notebook)

Clustering

K-means clustering (Here is the Notebook)
Affinity propagation (showing its time complexity and the effect of damping factor) (Here is the Notebook)
Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) (Here is the Notebook)
DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which k-means fails to do) (Here is the Notebook)
Hierarchical clustering with Dendrograms showing how to choose optimal number of clusters (Here is the Notebook)
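
To make the k-means item concrete, here is a minimal sketch of Lloyd's algorithm in plain NumPy. It is not the notebook code (which presumably relies on scikit-learn); the function name, the random initialization and the toy data are assumptions for illustration.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster points into k groups with Lloyd's algorithm; return (labels, centers)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Initialize centers with k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    labels, centers = kmeans(data, k=2)
    print(labels, centers)
```

Because every point is assigned to its nearest center, the resulting clusters are always convex regions, which is why k-means struggles with the arbitrarily shaped dense areas that DBSCAN can recover.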

Dimensionality reduction

Principal component analysis
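
A minimal sketch of principal component analysis via the singular value decomposition is shown below. It uses only NumPy; the function name `pca` and its return values are assumptions for illustration rather than the notebook's implementation.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its first n_components principal directions.

    Returns (projected data, component vectors, explained-variance ratios).
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = S ** 2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ components.T, components, ratio[:n_components]


if __name__ == "__main__":
    # Points lying exactly on the line y = 2x: one component captures all variance.
    projected, components, ratio = pca([[0, 0], [1, 2], [2, 4], [3, 6]], n_components=1)
    print(components, ratio)
```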

Deep Learning/Neural Network

Demo notebook illustrating the advantage of a deep neural network on a complex nonlinear function approximation task
Step-by-step building of 1-hidden-layer and 2-hidden-layer dense network using basic TensorFlow methods

Random data generation using symbolic expressions

How to use Sympy package to generate random datasets using symbolic mathematical expressions.
Here is my article on Medium on this topic: Random regression and classification problem generation with symbolic expression
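
The idea can be sketched as follows: parse a symbolic expression with SymPy, turn it into a numerical function with `lambdify`, and evaluate it on random inputs with optional Gaussian noise. The function name, the fixed variable names `x1` and `x2`, the sampling range and the noise model are assumptions for illustration; the notebooks and the article may structure this differently.

```python
import numpy as np
from sympy import lambdify, symbols, sympify


def generate_regression_data(expr_str, n_samples=100, noise_sd=0.1, seed=0):
    """Sample (X, y) where y is the expression in x1 and x2 plus Gaussian noise."""
    x1, x2 = symbols("x1 x2")
    expr = sympify(expr_str)
    # Compile the symbolic expression into a function that accepts NumPy arrays.
    f = lambdify((x1, x2), expr, modules="numpy")
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    y = f(X[:, 0], X[:, 1]) + rng.normal(0.0, noise_sd, size=n_samples)
    return X, y


if __name__ == "__main__":
    X, y = generate_regression_data("x1**2 + sin(x2)", n_samples=5)
    print(X)
    print(y)
```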

Synthetic data generation techniques

Notebooks here

Simple deployment examples (serving ML models on web API)

Serving a linear regression model through a simple HTTP server interface. The user requests predictions by executing a Python script. Uses Flask and Gunicorn.
Serving a recurrent neural network (RNN) through an HTTP webpage, complete with a web form, where users can input parameters and click a button to generate text based on the pre-trained RNN model. Uses Flask, Jinja, Keras/TensorFlow, WTForms.
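
As a rough sketch of the first example, a linear model can be exposed through a single Flask route that accepts JSON and returns a prediction. The route name `/predict`, the payload shape and the hard-coded coefficients are hypothetical stand-ins for a trained model; the code in the Deployment directory may be organized differently.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical coefficients standing in for a trained linear regression model.
COEFFICIENTS = [2.0, -1.0]
INTERCEPT = 0.5


def predict(features):
    """Evaluate the linear model on one feature vector."""
    return INTERCEPT + sum(c * x for c, x in zip(COEFFICIENTS, features))


@app.route("/predict", methods=["POST"])
def predict_endpoint():
    """Expect JSON like {"features": [1.0, 2.0]} and return {"prediction": ...}."""
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    valid = (
        isinstance(features, list)
        and len(features) == len(COEFFICIENTS)
        and all(isinstance(x, (int, float)) for x in features)
    )
    if not valid:
        return jsonify(error="expected 'features' as a list of %d numbers" % len(COEFFICIENTS)), 400
    return jsonify(prediction=predict(features))

# To serve locally, call app.run(port=5000) or run the app under Gunicorn.
```

A client script can then post a feature vector to the endpoint and read the prediction from the JSON response.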

Object-oriented programming with machine learning

Implementing some of the core OOP principles in a machine learning context by building your own Scikit-learn-like estimator, and making it better.

See my articles on Medium on this topic.
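
A minimal sketch of a Scikit-learn-like estimator is shown below. It follows the usual conventions (configuration in `__init__`, learned attributes with a trailing underscore, `fit` returning `self`), but the class name and its details are assumptions for illustration, not the class built in the notebooks.

```python
import numpy as np


class SimpleLinearModel:
    """Least-squares linear model with a scikit-learn-style interface."""

    def __init__(self, fit_intercept=True):
        # Hyperparameters are stored as-is; nothing is learned here.
        self.fit_intercept = fit_intercept

    @staticmethod
    def _as_2d(X):
        X = np.asarray(X, dtype=float)
        return X.reshape(-1, 1) if X.ndim == 1 else X

    def fit(self, X, y):
        X = self._as_2d(X)
        y = np.asarray(y, dtype=float)
        A = np.column_stack([np.ones(len(X)), X]) if self.fit_intercept else X
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        if self.fit_intercept:
            self.intercept_, self.coef_ = float(beta[0]), beta[1:]
        else:
            self.intercept_, self.coef_ = 0.0, beta
        return self  # returning self allows chaining: Model().fit(X, y).predict(X)

    def predict(self, X):
        return self._as_2d(X) @ self.coef_ + self.intercept_

    def score(self, X, y):
        """Coefficient of determination (R^2) on the given data."""
        y = np.asarray(y, dtype=float)
        residual = np.sum((y - self.predict(X)) ** 2)
        total = np.sum((y - y.mean()) ** 2)
        return 1.0 - residual / total


if __name__ == "__main__":
    model = SimpleLinearModel().fit([[0], [1], [2], [3]], [1, 3, 5, 7])
    print(model.coef_, model.intercept_, model.predict([[4]]))
```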

Unit testing ML code with Pytest

Check the files and detailed instructions in the Pytest directory to understand how to write unit tests for machine learning code and models.
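
As a small illustration of the style, the sketch below puts a helper and its tests in one file; Pytest collects any function whose name starts with `test_` and treats a failed `assert` as a test failure. The helper `train_test_split_indices` is hypothetical and is not taken from the Pytest directory.

```python
import numpy as np


def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Hypothetical helper under test: shuffle indices and split them into train/test."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be in (0, 1)")
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]


def test_split_sizes():
    train, test = train_test_split_indices(100, test_fraction=0.2)
    assert len(train) == 80 and len(test) == 20


def test_split_is_disjoint_and_complete():
    train, test = train_test_split_indices(50)
    assert set(train).isdisjoint(test)
    assert sorted(np.concatenate([train, test])) == list(range(50))


def test_split_is_reproducible():
    first = train_test_split_indices(30, seed=1)
    second = train_test_split_indices(30, seed=1)
    assert all(np.array_equal(a, b) for a, b in zip(first, second))


def test_invalid_fraction_raises():
    # With Pytest installed, `with pytest.raises(ValueError):` is the usual idiom.
    try:
        train_test_split_indices(10, test_fraction=1.5)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Saved as a file whose name starts with `test_`, this module can be run with the `pytest` command from the same directory.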

Memory and timing profiling

Profiling data science code and ML models for memory footprint and computing time is a critical but often overlooked area. Here are a couple of Notebooks showing the ideas.
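
A minimal sketch of the idea, using only the standard library, is given below: `time.perf_counter` measures wall-clock time and `tracemalloc` reports the peak memory allocated through Python's allocators while a function runs. The helper names are assumptions for illustration; the notebooks may use other profiling tools.

```python
import time
import tracemalloc


def profile(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def build_squares(n):
    """Example workload: a list of the first n squares."""
    return [i * i for i in range(n)]


if __name__ == "__main__":
    _, seconds, peak = profile(build_squares, 1_000_000)
    print(f"time: {seconds:.3f} s, peak memory: {peak / 1e6:.1f} MB")
```

Tracing allocations slows the code down, so timing and memory measurements are often taken in separate runs when precise numbers matter.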

Comments

Add indications on how to run Jupyter notebooks with Docker in a few minutes
The https://github.com/machine-learning-helpers/docker-python-jupyter project builds a Docker image so that the (your) Jupyter notebooks can be run out-of-the-box on almost any platform in a few minutes.

It gives something like:

Initialization of the Git repository for the Jupyter notebooks:

$ mkdir -p ~/dev/ml
$ cd ~/dev/ml
$ git clone https://github.com/tirthajyoti/PythonMachineLearning.git

Initialization of the Docker image to run those Jupyter notebooks:

$ docker pull artificialintelligence/python-jupyter

Usage:

$ cd ~/dev/ml/PythonMachineLearning
$ docker run -d -p 9000:8888 -v ${PWD}:/notebook -v ${PWD}:/data artificialintelligence/python-jupyter

And then you can open http://localhost:9000 in your browser.

Any modification to the notebooks may be committed to the Git repository (if you are registered as a contributor), and/or submitted as a pull request.

Shut down the running Docker container:

$ docker ps
CONTAINER ID   IMAGE                                   COMMAND                  CREATED         STATUS         PORTS                    NAMES
431b12a93ccf   artificialintelligence/python-jupyter   "/bin/sh -c 'jupyt..."   4 minutes ago   Up 4 minutes   0.0.0.0:9000->8888/tcp   friendly_euclid
$ docker kill 431b12a93ccf

So, all the above could be added to your README.md file.
opened by da115115 2
Using scipy's genetic algorithm for initial parameter estimation in gradient descent

I see you are writing Python code for optimization on GitHub. A general problem for gradient descent and other non-linear algorithms, particularly for more complex equations, is the choice of initial parameters to start the "descent" in error space. Without good starting parameters, the algorithm can stop in a local error minimum. For this reason the authors of scipy have added a genetic algorithm that can be used for initial parameter estimation before gradient descent. The function is named scipy.optimize.differential_evolution.

I have used scipy's Differential Evolution genetic algorithm to determine initial parameters for fitting a double Lorentzian peak equation to Raman spectroscopy of carbon nanotubes and found that the results were excellent. The GitHub project, with a test spectroscopy data file, is:

https://github.com/zunzun/RamanSpectroscopyFit

If you have any questions, please let me know. My background is in nuclear engineering and industrial radiation physics, and I love Python, so I will be glad to help.
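
The approach described in this comment can be sketched as follows: `scipy.optimize.differential_evolution` searches the bounded parameter space globally, and its result is passed as the starting point `p0` to `scipy.optimize.curve_fit` for local refinement. For brevity the sketch fits a single Lorentzian peak rather than the double Lorentzian used in the linked project; the model function, the helper name and the bounds are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution


def lorentzian(x, amplitude, center, width):
    """Single Lorentzian peak (a simplified stand-in for a double-peak model)."""
    return amplitude * width ** 2 / ((x - center) ** 2 + width ** 2)


def fit_with_de_start(x, y, bounds, seed=0):
    """Estimate starting parameters by differential evolution, then refine with curve_fit."""

    def sum_squared_error(params):
        return np.sum((y - lorentzian(x, *params)) ** 2)

    # Global search within the given (min, max) bounds for each parameter.
    result = differential_evolution(sum_squared_error, bounds, seed=seed)
    # Local least-squares refinement starting from the global estimate.
    fitted, _ = curve_fit(lorentzian, x, y, p0=result.x)
    return result.x, fitted


if __name__ == "__main__":
    x = np.linspace(0.0, 10.0, 200)
    y = lorentzian(x, 3.0, 4.0, 0.5)
    start, fitted = fit_with_de_start(x, y, [(0.1, 10.0), (0.0, 10.0), (0.1, 5.0)])
    print("start:", start)
    print("fitted:", fitted)
```

The bounds only need to bracket plausible values, which is usually easier to specify than a single good initial guess.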

opened by zunzun 2
Bump numpy from 1.16.3 to 1.22.0 in /Deployment/rnn_app
Bumps numpy from 1.16.3 to 1.22.0.

Release notes

Sourced from numpy's releases.

v1.22.0

NumPy 1.22.0 Release Notes

NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.

A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across applications such as CuPy and JAX.

NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.

New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.

A new configurable allocator for use by downstream projects.

These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

Expired deprecations

Deprecated numeric style dtype strings have been removed

Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

(gh-19539)

Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

(gh-19615)
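
For code affected by these removals, the replacements named above can be sketched as follows: `numpy.genfromtxt` with `usemask=True` in place of `mafromtxt`, `numpy.genfromtxt` with the default `usemask=False` in place of `ndfromtxt`, and `pickle.loads` in place of `numpy.loads`. The helper names and the sample CSV text are assumptions for illustration.

```python
import io
import pickle

import numpy as np


def load_masked_csv(text):
    """Parse CSV text; empty fields become masked entries (replacement for mafromtxt)."""
    return np.genfromtxt(io.StringIO(text), delimiter=",", usemask=True)


def load_plain_csv(text):
    """Parse CSV text into a regular array; empty fields become nan (replacement for ndfromtxt)."""
    return np.genfromtxt(io.StringIO(text), delimiter=",")


def roundtrip_pickle(arr):
    """Serialize and restore an array with pickle (replacement for numpy.loads)."""
    return pickle.loads(pickle.dumps(arr))


if __name__ == "__main__":
    text = "1,2,3\n4,,6\n"
    print(load_masked_csv(text))
    print(load_plain_csv(text))
```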

... (truncated)

Commits

4adc87d Merge pull request #20685 from charris/prepare-for-1.22.0-release

fd66547 REL: Prepare for the NumPy 1.22.0 release.

125304b wip

c283859 Merge pull request #20682 from charris/backport-20416

5399c03 Merge pull request #20681 from charris/backport-20954

f9c45f8 Merge pull request #20680 from charris/backport-20663

794b36f Update armccompiler.py

d93b14e Update test_public_api.py

7662c07 Update init.py

311ab52 Update armccompiler.py

Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR

@dependabot recreate will recreate this PR, overwriting any edits that have been made to it

@dependabot merge will merge this PR after your CI passes on it

@dependabot squash and merge will squash and merge this PR after your CI passes on it

@dependabot cancel merge will cancel a previously requested merge and block automerging

@dependabot reopen will reopen this PR if it is closed

@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)

@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

@dependabot use these labels will set the current labels as the default for future PRs for this repo and language

@dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language

@dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language

@dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

dependencies
opened by dependabot[bot] 1
Bump numpy from 1.16.3 to 1.22.0 in /Deployment/Linear_regression
Bumps numpy from 1.16.3 to 1.22.0.

dependencies
opened by dependabot[bot] 0
Bump jinja2 from 2.10.1 to 2.11.3 in /Deployment/rnn_app
Bumps jinja2 from 2.10.1 to 2.11.3.

Release notes

Sourced from jinja2's releases.

2.11.3

This contains a fix for a speed issue with the urlize filter. urlize is likely to be called on untrusted user input. For certain inputs some of the regular expressions used to parse the text could take a very long time due to backtracking. As part of the fix, the email matching became slightly stricter. The various speedups apply to urlize in general, not just the specific input cases.

PyPI: https://pypi.org/project/Jinja2/2.11.3/

Changes: https://jinja.palletsprojects.com/en/2.11.x/changelog/#version-2-11-3

2.11.2

Changelog: https://jinja.palletsprojects.com/en/2.11.x/changelog/#version-2-11-2

2.11.1

This fixes an issue in async environment when indexing the result of an attribute lookup, like {{ data.items[1:] }}.

Changes: https://jinja.palletsprojects.com/en/2.11.x/changelog/#version-2-11-1

2.11.0

Changes: https://jinja.palletsprojects.com/en/2.11.x/changelog/#version-2-11-0

Blog: https://palletsprojects.com/blog/jinja-2-11-0-released/

Twitter: https://twitter.com/PalletsTeam/status/1221883554537230336

This is the last version to support Python 2.7 and 3.5. The next version will be Jinja 3.0 and will support Python 3.6 and newer.

2.10.3

Changes: http://jinja.palletsprojects.com/en/2.10.x/changelog/#version-2-10-3

2.10.2

Changes: http://jinja.palletsprojects.com/en/2.10.x/changelog/#version-2-10-2

Changelog

Sourced from jinja2's changelog.

Version 2.11.3

Released 2021-01-31

Improve the speed of the urlize filter by reducing regex backtracking. Email matching requires a word character at the start of the domain part, and only word characters in the TLD. :pr:1343

Version 2.11.2

Released 2020-04-13

Fix a bug that caused callable objects with __getattr__, like :class:~unittest.mock.Mock to be treated as a :func:contextfunction. :issue:1145

Update wordcount filter to trigger :class:Undefined methods by wrapping the input in :func:soft_str. :pr:1160

Fix a hang when displaying tracebacks on Python 32-bit. :issue:1162

Showing an undefined error for an object that raises AttributeError on access doesn't cause a recursion error. :issue:1177

Revert changes to :class:~loaders.PackageLoader from 2.10 which removed the dependency on setuptools and pkg_resources, and added limited support for namespace packages. The changes caused issues when using Pytest. Due to the difficulty in supporting Python 2 and :pep:451 simultaneously, the changes are reverted until 3.0. :pr:1182

Fix line numbers in error messages when newlines are stripped. :pr:1178

The special namespace() assignment object in templates works in async environments. :issue:1180

Fix whitespace being removed before tags in the middle of lines when lstrip_blocks is enabled. :issue:1138

:class:~nativetypes.NativeEnvironment doesn't evaluate intermediate strings during rendering. This prevents early evaluation which could change the value of an expression. :issue:1186

Version 2.11.1

Released 2020-01-30

Fix a bug that prevented looking up a key after an attribute ({{ data.items[1:] }}) in an async template. :issue:1141

... (truncated)

Commits

cf21539 release version 2.11.3

15ef8f0 Merge pull request #1343 from pallets/urlize-speedup

ef658dc speed up urlize matching

eeca0fe Merge pull request #1207 from mhansen/patch-1

2dd7691 Merge pull request #1209 from mhansen/patch-3

4892940 do_dictsort: update example ready to copy/paste

7db7d33 api.rst: bugfix in docs, import PackageLoader

9ec465b fix changelog header

737a4cd release version 2.11.2

179df6b Merge pull request #1190 from pallets/native-eval

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 0
Bump tensorflow from 1.15.2 to 1.15.4 in /Deployment/rnn_app
Bumps tensorflow from 1.15.2 to 1.15.4.

Release notes

Sourced from tensorflow's releases.

TensorFlow 1.15.4

Release 1.15.4

Bug Fixes and Other Changes

Fixes an undefined behavior causing a segfault in tf.raw_ops.Switch (CVE-2020-15190)

Fixes three vulnerabilities in conversion to DLPack format (CVE-2020-15191, CVE-2020-15192, CVE-2020-15193)

Fixes two vulnerabilities in SparseFillEmptyRowsGrad (CVE-2020-15194, CVE-2020-15195)

Fixes an integer truncation vulnerability in code using the work sharder API (CVE-2020-15202)

Fixes a format string vulnerability in tf.strings.as_string (CVE-2020-15203)

Fixes segfault raised by calling session-only ops in eager mode (CVE-2020-15204)

Fixes data leak and potential ASLR violation from tf.raw_ops.StringNGrams (CVE-2020-15205)

Fixes segfaults caused by incomplete SavedModel validation (CVE-2020-15206)

Fixes a data corruption due to a bug in negative indexing support in TFLite (CVE-2020-15207)

Fixes a data corruption due to dimension mismatch in TFLite (CVE-2020-15208)

Fixes several vulnerabilities in TFLite saved model format (CVE-2020-15209, CVE-2020-15210, CVE-2020-15211)

Updates sqlite3 to 3.33.00 to handle CVE-2020-9327, CVE-2020-11655, CVE-2020-11656, CVE-2020-13434, CVE-2020-13435, CVE-2020-13630, CVE-2020-13631, CVE-2020-13871, and CVE-2020-15358.

Fixes #41630 by including max_seq_length in CuDNN descriptor cache key

Pins numpy to 1.18.5 to prevent ABI breakage when compiling code that uses both NumPy and TensorFlow headers.

TensorFlow 1.15.3

Bug Fixes and Other Changes

Updates sqlite3 to 3.31.01 to handle CVE-2019-19880, CVE-2019-19244 and CVE-2019-19645

Updates curl to 7.69.1 to handle CVE-2019-15601

Updates libjpeg-turbo to 2.0.4 to handle CVE-2018-19664, CVE-2018-20330 and CVE-2019-13960

Updates Apache Spark to 2.4.5 to handle CVE-2019-10099, CVE-2018-17190 and CVE-2018-11770

Changelog

Sourced from tensorflow's changelog.

Release 1.15.4

Bug Fixes and Other Changes

Fixes an undefined behavior causing a segfault in tf.raw_ops.Switch (CVE-2020-15190)

Fixes three vulnerabilities in conversion to DLPack format (CVE-2020-15191, CVE-2020-15192, CVE-2020-15193)

Fixes two vulnerabilities in SparseFillEmptyRowsGrad (CVE-2020-15194, CVE-2020-15195)

Fixes an integer truncation vulnerability in code using the work sharder API (CVE-2020-15202)

Fixes a format string vulnerability in tf.strings.as_string (CVE-2020-15203)

Fixes segfault raised by calling session-only ops in eager mode (CVE-2020-15204)

Fixes data leak and potential ASLR violation from tf.raw_ops.StringNGrams (CVE-2020-15205)

Fixes segfaults caused by incomplete SavedModel validation (CVE-2020-15206)

Fixes a data corruption due to a bug in negative indexing support in TFLite (CVE-2020-15207)

Fixes a data corruption due to dimension mismatch in TFLite (CVE-2020-15208)

Fixes several vulnerabilities in TFLite saved model format (CVE-2020-15209, CVE-2020-15210, CVE-2020-15211)

Updates sqlite3 to 3.33.00 to handle CVE-2020-9327, CVE-2020-11655, CVE-2020-11656, CVE-2020-13434, CVE-2020-13435, CVE-2020-13630, CVE-2020-13631, CVE-2020-13871, and CVE-2020-15358.

Fixes #41630 by including max_seq_length in CuDNN descriptor cache key

Pins numpy to 1.18.5 to prevent ABI breakage when compiling code that uses both NumPy and TensorFlow headers.

Release 2.3.0

Major Features and Improvements

tf.data adds two new mechanisms to solve input pipeline bottlenecks and save resources:

... (truncated)

Commits

df8c55c Merge pull request #43442 from tensorflow-jenkins/version-numbers-1.15.4-31571

0e8cbcb Update version numbers to 1.15.4

5b65bf2 Merge pull request #43437 from tensorflow-jenkins/relnotes-1.15.4-10691

814e8d8 Update RELEASE.md

757085e Insert release notes place-fill

e99e53d Merge pull request #43410 from tensorflow/mm-fix-1.15

bad36df Add missing import

f3f1835 No disable_tfrt present on this branch

7ef5c62 Merge pull request #43406 from tensorflow/mihaimaruseac-patch-1

abbf34a Remove import that is not needed

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 0
Bump tensorflow from 1.15.0 to 1.15.2 in /Deployment/rnn_app
Bumps tensorflow from 1.15.0 to 1.15.2.

Release notes

Sourced from tensorflow's releases.

TensorFlow 1.15.2

Release 1.15.2

Note that this release no longer has a single pip package for GPU and CPU. Please see #36347 for history and details

Bug Fixes and Other Changes

Fixes a security vulnerability where converting a Python string to a tf.float16 value produces a segmentation fault (CVE-2020-5215)

Updates curl to 7.66.0 to handle CVE-2019-5482 and CVE-2019-5481

Updates sqlite3 to 3.30.01 to handle CVE-2019-19646, CVE-2019-19645 and CVE-2019-16168

Changelog

Sourced from tensorflow's changelog.

Release 1.15.2

Bug Fixes and Other Changes

Fixes a security vulnerability where converting a Python string to a tf.float16 value produces a segmentation fault (CVE-2020-5215)

Updates curl to 7.66.0 to handle CVE-2019-5482 and CVE-2019-5481

Updates sqlite3 to 3.30.01 to handle CVE-2019-19646, CVE-2019-19645 and CVE-2019-16168

Release 2.1.0

TensorFlow 2.1 will be the last TF release supporting Python 2. Python 2 support officially ends on January 1, 2020. As announced earlier, TensorFlow will also stop supporting Python 2 starting January 1, 2020, and no more releases are expected in 2019.

Major Features and Improvements

The tensorflow pip package now includes GPU support by default (same as tensorflow-gpu) for both Linux and Windows. This runs on machines with and without NVIDIA GPUs. tensorflow-gpu is still available, and CPU-only packages can be downloaded at tensorflow-cpu for users who are concerned about package size.

Windows users: Officially-released tensorflow Pip packages are now built with Visual Studio 2019 version 16.4 in order to take advantage of the new /d2ReducedOptimizeHugeFunctions compiler flag. To use these new packages, you must install "Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019", available from Microsoft's website here.

This does not change the minimum required version for building TensorFlow from source on Windows, but builds enabling EIGEN_STRONG_INLINE can take over 48 hours to compile without this flag. Refer to configure.py for more information about EIGEN_STRONG_INLINE and /d2ReducedOptimizeHugeFunctions.

If either of the required DLLs, msvcp140.dll (old) or msvcp140_1.dll (new), are missing on your machine, import tensorflow will print a warning message.

The tensorflow pip package is built with CUDA 10.1 and cuDNN 7.6.

tf.keras

Experimental support for mixed precision is available on GPUs and Cloud TPUs. See usage guide.

Introduced the TextVectorization layer, which takes as input raw strings and takes care of text standardization, tokenization, n-gram generation, and vocabulary indexing. See this end-to-end text classification example.

Keras .compile .fit .evaluate and .predict are allowed to be outside of the DistributionStrategy scope, as long as the model was constructed inside of a scope.

Experimental support for Keras .compile, .fit, .evaluate, and .predict is available for Cloud TPUs, for all types of Keras models (sequential, functional and subclassing models).

Automatic outside compilation is now enabled for Cloud TPUs. This allows tf.summary to be used more conveniently with Cloud TPUs.

Dynamic batch sizes with DistributionStrategy and Keras are supported on Cloud TPUs.

Support for .fit, .evaluate, .predict on TPU using numpy data, in addition to tf.data.Dataset.

Keras reference implementations for many popular models are available in the TensorFlow Model Garden.

tf.data

Changes rebatching for tf.data datasets + DistributionStrategy for better performance. Note that the dataset also behaves slightly differently, in that the rebatched dataset cardinality will always be a multiple of the number of replicas.

tf.data.Dataset now supports automatic data distribution and sharding in distributed environments, including on TPU pods.

Distribution policies for tf.data.Dataset can now be tuned with:

1. tf.data.experimental.AutoShardPolicy(OFF, AUTO, FILE, DATA)
2. tf.data.experimental.ExternalStatePolicy(WARN, IGNORE, FAIL)

tf.debugging

Add tf.debugging.enable_check_numerics() and tf.debugging.disable_check_numerics() to help debugging the root causes of issues involving infinities and NaNs.

tf.distribute

Custom training loop support on TPUs and TPU pods is available through strategy.experimental_distribute_dataset, strategy.experimental_distribute_datasets_from_function, strategy.experimental_run_v2, and strategy.reduce.

Support for a global distribution strategy through tf.distribute.experimental_set_strategy(), in addition to strategy.scope().

TensorRT

TensorRT 6.0 is now supported and enabled by default. This adds support for more TensorFlow ops including Conv3D, Conv3DBackpropInputV2, AvgPool3D, MaxPool3D, ResizeBilinear, and ResizeNearestNeighbor. In addition, the TensorFlow-TensorRT python conversion API is exported as tf.experimental.tensorrt.Converter.

Environment variable TF_DETERMINISTIC_OPS has been added. When set to "true" or "1", this environment variable makes tf.nn.bias_add operate deterministically (i.e. reproducibly), but currently only when XLA JIT compilation is not enabled. Setting TF_DETERMINISTIC_OPS to "true" or "1" also makes cuDNN convolution and max-pooling operate deterministically. This makes Keras Conv*D and MaxPool*D layers operate deterministically in both the forward and backward directions when running on a CUDA-enabled GPU.
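
As a minimal sketch of the note above, the variable can be set from Python before TensorFlow is imported. Setting it this early is an assumption about the safest ordering, and deterministic_ops_requested is a hypothetical helper added only for illustration; it is not part of TensorFlow.

```python
import os

# Assumption: set the flag before `import tensorflow` so the runtime
# sees it from the start. The release notes accept "true" or "1".
os.environ["TF_DETERMINISTIC_OPS"] = "1"


def deterministic_ops_requested(environ=os.environ):
    """Hypothetical helper: report whether the flag is set to "true" or "1"."""
    return environ.get("TF_DETERMINISTIC_OPS", "") in ("true", "1")


# import tensorflow as tf  # would follow here in a real script
```

The same effect can be had by exporting the variable in the shell that launches the training script.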

Breaking Changes

Deletes Operation.traceback_with_start_lines for which we know of no usages.

Removed id from tf.Tensor.__repr__() as id is not useful other than internal debugging.

Some tf.assert_* methods now raise assertions at operation creation time if the input tensors' values are known at that time, not during the session.run(). This only changes behavior when the graph execution would have resulted in an error. When this happens, a noop is returned and the input tensors are marked non-feedable. In other words, if they are used as keys in feed_dict argument to session.run(), an error will be raised. Also, because some assert ops don't make it into the graph, the graph structure changes. A different graph can result in different per-op random seeds when they are not given explicitly (most often).

The following APIs are no longer experimental: tf.config.list_logical_devices, tf.config.list_physical_devices, tf.config.get_visible_devices, tf.config.set_visible_devices, tf.config.get_logical_device_configuration, tf.config.set_logical_device_configuration.

tf.config.experimental.VirtualDeviceConfiguration has been renamed to tf.config.LogicalDeviceConfiguration.

tf.config.experimental_list_devices has been removed; please use tf.config.list_logical_devices instead.
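
Code that has to run on both sides of this removal could use a small compatibility shim. The sketch below is one possible approach, not an official migration recipe: list_device_names is a hypothetical helper, and it takes the imported TensorFlow module as an argument so that the block itself needs no TensorFlow installation.

```python
def list_device_names(tf_module):
    """Hypothetical shim: return device names across the API change.

    Prefers tf.config.list_logical_devices (returns device objects with a
    .name attribute) and falls back to the removed
    tf.config.experimental_list_devices (returned a list of name strings).
    """
    config = tf_module.config
    if hasattr(config, "list_logical_devices"):
        return [device.name for device in config.list_logical_devices()]
    return list(config.experimental_list_devices())
```

Typical usage would be `import tensorflow as tf` followed by `list_device_names(tf)`.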

Bug Fixes and Other Changes
... (truncated)

Commits

5d80e1e Merge pull request #36215 from tensorflow-jenkins/version-numbers-1.15.2-8214

71e9d8f Update version numbers to 1.15.2

e50120e Merge pull request #36214 from tensorflow-jenkins/relnotes-1.15.2-2203

1a7e9fb Releasing 1.15.2 instead of 1.15.1

85f7aab Insert release notes place-fill

e75a6d6 Merge pull request #36190 from tensorflow/mm-r1.15-fix-v2-build

a6d8973 Use config=v1 as this is r1.15 branch.

fdb8589 Merge pull request #35912 from tensorflow-jenkins/relnotes-1.15.1-31298

a6051e8 Add CVE number for main patch

360b2e3 Merge pull request #34532 from ROCmSoftwarePlatform/r1.15-rccl-upstream-patch

Additional commits viewable in compare view

dependencies
opened by dependabot[bot] 0
Question about How fast are NumPy ops.ipynb

Hey just wondering, for the How fast are NumPy ops.ipynb

When measuring the speed of computing log10 of all the elements in the NumPy array a1, shouldn't you also include the creation of the initial NumPy array?

Line 50 is this:

    t1 = time.time()
    a2 = np.log10(a1)
    t2 = time.time()
    print("With direct NumPy log10 method it took {} seconds".format(t2-t1))
    speed.append(t2-t1)

But isn't it more fair to make it this:

    t1 = time.time()
    a1 = np.array(l1)
    a2 = np.log10(a1)
    t2 = time.time()
    print("With direct NumPy log10 method it took {} seconds".format(t2-t1))
    speed.append(t2-t1)

Considering that it is an additional step not present in the other methods? In your code the bolded line is line 40.
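
One way to compare the two measurements side by side is sketched below. time_log10 is a hypothetical helper, not code from the notebook, and it uses time.perf_counter rather than time.time as an assumption about which clock suits short timings better.

```python
import time

import numpy as np


def time_log10(values, include_conversion):
    """Time np.log10 on `values`, optionally counting list-to-array conversion.

    Returns (elapsed_seconds, result_array).
    """
    # When conversion is excluded, build the array before the clock starts.
    arr = None if include_conversion else np.array(values)
    start = time.perf_counter()
    if include_conversion:
        arr = np.array(values)  # conversion cost is inside the timed region
    result = np.log10(arr)
    elapsed = time.perf_counter() - start
    return elapsed, result


if __name__ == "__main__":
    l1 = [float(i) for i in range(1, 1_000_001)]
    without, _ = time_log10(l1, include_conversion=False)
    with_conv, _ = time_log10(l1, include_conversion=True)
    print("log10 only: {:.4f} s".format(without))
    print("conversion + log10: {:.4f} s".format(with_conv))
```

Reporting both numbers lets the reader see how much of the total cost comes from building the array.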

opened by GitwellAnyohub 0
Prefixed the CSV file-path by 'Datasets/'

As the data are in a specific directory, namely 'Datasets', the latter must be added as a prefix to all the data file paths (in all the notebooks, not only this one). By the way, since Pandas natively reads .bz2 compressed files, you could take this opportunity to compress all the data files with BZip2.
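
A rough sketch of both suggestions, using only the standard library, might look like this. dataset_path and compress_csv are hypothetical helper names, and the sample file in the demo is made up for illustration.

```python
import bz2
import shutil
import tempfile
from pathlib import Path

DATA_DIR = Path("Datasets")  # directory name taken from the suggestion above


def dataset_path(filename, data_dir=DATA_DIR):
    """Prefix a data file name with the datasets directory."""
    return Path(data_dir) / filename


def compress_csv(csv_path):
    """Write a BZip2-compressed copy next to the CSV and return its path."""
    csv_path = Path(csv_path)
    bz2_path = csv_path.with_name(csv_path.name + ".bz2")
    with open(csv_path, "rb") as src, bz2.open(bz2_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return bz2_path


if __name__ == "__main__":
    # Demo on a throwaway file so no real dataset is touched.
    with tempfile.TemporaryDirectory() as tmp:
        sample = dataset_path("sample.csv", tmp)
        sample.write_text("a,b\n1,2\n")
        print(compress_csv(sample).name)
```

pandas.read_csv infers the compression from the .bz2 extension by default, so a notebook would only need to change the file path.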

opened by da115115 0
Statistically significant function in regression model

Hi,

I'm wondering what the yes_no function does in the following notebook: https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regression_Diagnostics.ipynb

    def yes_no(b):
        if b:
            return 'Yes'
        else:
            return 'No'

Is it supposed to decide whether a parameter is statistically significant for the model? What does b refer to, and what is the threshold at which it decides a parameter is not statistically significant?

I usually look at the p-values in the statsmodels OLS table, and when they fall below 0.05, they are significant. In this notebook something else seems to be happening, and I'm wondering if you could elaborate a bit on it: What is b? How is it calculated? What is the threshold for b? How can the threshold be changed from 0.01 to 0.05? When the p-value in the OLS table is above 0.05 but the yes_no function decides it's significant, what should I do (leave the parameter out or not)?

Kind regards, Matthias
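
For readers following this question: yes_no as quoted only turns a boolean into a label, so any threshold has to live in the expression that produces b. The sketch below shows one common way such a boolean could be derived from p-values. It is an assumption for illustration, not the notebook's actual computation; significance_labels, alpha, and the sample p-values are all hypothetical.

```python
def yes_no(b):
    """Map a boolean to a 'Yes'/'No' label, as in the quoted function."""
    return "Yes" if b else "No"


def significance_labels(p_values, alpha=0.05):
    """Hypothetical helper: label each parameter by comparing p-value to alpha.

    `p_values` maps parameter names to p-values; a parameter is labeled
    'Yes' when its p-value is below `alpha`.
    """
    return {name: yes_no(p < alpha) for name, p in p_values.items()}


if __name__ == "__main__":
    example = {"x1": 0.003, "x2": 0.03, "x3": 0.2}  # made-up p-values
    print(significance_labels(example, alpha=0.05))
    print(significance_labels(example, alpha=0.01))
```

Under this reading, changing the threshold from 0.01 to 0.05 means changing alpha, not yes_no itself.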

opened by MatthiVH 1
Wrong interpretation of the Shapiro-Wilk test

In the Regression_diagnostics notebook, you are presenting the Shapiro-Wilk test.

The Shapiro-Wilk test's null hypothesis is that the data come from a Gaussian distribution. Therefore, the lower the p-value, the higher the chance of rejecting the Gaussian distribution. The notebook says the opposite:

opened by F-A 2
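
The decision rule described in this issue can be written down as a short sketch. interpret_shapiro_wilk is a hypothetical helper, and the 0.05 default for alpha is a conventional choice assumed here, not a value taken from the notebook.

```python
def interpret_shapiro_wilk(p_value, alpha=0.05):
    """Apply the Shapiro-Wilk decision rule to a p-value.

    Null hypothesis: the data come from a Gaussian distribution.
    A p-value below `alpha` is evidence against normality.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < alpha:
        return "reject normality"
    return "fail to reject normality"
```

The p-value itself can be obtained from scipy.stats.shapiro, which returns the test statistic and the p-value for a sample.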