A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Overview

Machine Learning Notebooks, 3rd edition

This project aims at teaching you the fundamentals of Machine Learning in Python. It contains the example code and solutions to the exercises in the third edition of my O'Reilly book Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow.

Note: If you are looking for the second edition notebooks, check out ageron/handson-ml2. For the first edition, see ageron/handson-ml.

Quick Start

Want to play with these notebooks online without having to install anything?

Use any of the following services (I recommend Colab or Kaggle, since they offer free GPUs and TPUs).

WARNING: Please be aware that these services provide temporary environments: anything you do will be deleted after a while, so make sure you download any data you care about.

  • Open In Colab

  • Open in Kaggle

  • Launch binder

  • Launch in Deepnote

Just want to quickly look at some notebooks, without executing any code?

  • Render nbviewer

github.com's notebook viewer also works, but it's not ideal: it's slower, the math equations are not always displayed correctly, and large notebooks often fail to open.

Want to run this project using a Docker image?

Read the Docker instructions.

Want to install this project on your own machine?

Start by installing Anaconda (or Miniconda), git, and if you have a TensorFlow-compatible GPU, install the GPU driver, as well as the appropriate version of CUDA and cuDNN (see TensorFlow's documentation for more details).

Next, clone this project by opening a terminal and typing the following commands (do not type the first $ signs on each line, they just indicate that these are terminal commands):

$ git clone https://github.com/ageron/handson-ml3.git
$ cd handson-ml3

Next, run the following commands:

$ conda env create -f environment.yml
$ conda activate homl3
$ python -m ipykernel install --user --name=python3

Finally, start Jupyter:

$ jupyter notebook

If you need further instructions, read the detailed installation instructions.

FAQ

Which Python version should I use?

I recommend Python 3.8. If you follow the installation instructions above, that's the version you will get. Most code will work with other versions of Python 3, but some libraries do not support Python 3.9 or 3.10 yet, which is why I recommend Python 3.8.

I'm getting an error when I call load_housing_data()

Make sure you call fetch_housing_data() before you call load_housing_data(). If you're getting an HTTP error, make sure you're running the exact same code as in the notebook (copy/paste it if needed). If the problem persists, please check your network configuration.
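
For clarity, a minimal sketch of the intended call order (the function bodies live in the Chapter 2 notebook; the comments here describe what they do):

fetch_housing_data()           # downloads and extracts datasets/housing/housing.csv
housing = load_housing_data()  # loads the CSV into a pandas DataFrame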

I'm getting an SSL error on macOS

You probably need to install the SSL certificates (see this StackOverflow question). If you downloaded Python from the official website, then run /Applications/Python\ 3.8/Install\ Certificates.command in a terminal (change 3.8 to whatever version you installed). If you installed Python using MacPorts, run sudo port install curl-ca-bundle in a terminal.

I've installed this project locally. How do I update it to the latest version?

See INSTALL.md

How do I update my Python libraries to the latest versions, when using Anaconda?

See INSTALL.md

Contributors

I would like to thank everyone who contributed to this project, either by providing useful feedback, filing issues, or submitting pull requests. Special thanks go to Haesun Park and Ian Beauregard, who reviewed every notebook and submitted many PRs, including help on some of the exercise solutions. Thanks as well to Steven Bunkley and Ziembla, who created the docker directory, and to GitHub user SuperYorio, who helped with some exercise solutions.

Comments
  • Install tensorflow-gpu with conda

    The following sentence from "INSTALL.md" is somewhat outdated:

    but the good news is that they will be installed automatically when you install the tensorflow-gpu package from Anaconda.

    This is because the official command below installs TensorFlow 2.4, which is not compatible with the Jupyter notebooks for homl3.

    conda install -c anaconda tensorflow-gpu

    Cf. https://anaconda.org/anaconda/tensorflow-gpu

    Is there any other easy way to use tensorflow-gpu?
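
    One alternative worth trying (an assumption based on TensorFlow's own installation guide, not on this project's docs): install the CUDA libraries from conda-forge and TensorFlow itself from pip, instead of the older tensorflow-gpu conda package:

    conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1
    pip install "tensorflow>=2.8"  # versions are illustrative; check tensorflow.org/install for the exact pairing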

    opened by liganega 9
  • pydot requirement missing for keras "plot_model" functionality

    Cloned the repo and created a new env from the environment.yml file today. When I ran through notebook 10_neural_nets_with_keras, the tf.keras.utils.plot_model(...) cell failed with a missing pydot requirement. The cell is marked as "extra code", so overall this doesn't really impact running through the notebook.
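
    A likely fix, for reference (pydot and the Graphviz binaries are the documented requirements of tf.keras.utils.plot_model; the exact commands below are a suggestion, not from this repo):

    pip install pydot
    conda install -c conda-forge graphviz  # or install Graphviz via your OS package manager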

    opened by immersinn 6
  • [BUG] feature_names_out raises TypeError in Chapter 2

    Describe the bug: In Part 1, Chapter 2 (end-to-end ML project), supplying feature_names_out as an argument when creating the ColumnTransformer raises a TypeError.

    To Reproduce:

    def ratio_pipeline(name=None):
        return make_pipeline(
            SimpleImputer(strategy="median"),
            FunctionTransformer(column_ratio, feature_names_out=lambda input_features: [name]),
            StandardScaler())
    

    Full stacktrace:

    Traceback (most recent call last):
      File "F:\*****\*****\*****\CHAPTER2-END_TO_END_PROJECT\final-example.py", line 49, in <module>
        ("bedrooms_ratio", ratio_pipeline("bedrooms_ratio"), ["total_bedrooms", "total_rooms"]),
      File "F:\*****\*****\*****\CHAPTER2-END_TO_END_PROJECT\final-example.py", line 38, in ratio_pipeline
        FunctionTransformer(column_ratio, feature_names_out= lambda input_features: [name]),
    TypeError: FunctionTransformer.__init__() got an unexpected keyword argument 'feature_names_out'
    

    Expected behavior: The preprocessing pipeline fits and transforms the dataset.

    Versions:

    • OS: Windows 10
    • Python: 3.10
    • TensorFlow: ****
    • Scikit-Learn: 1.0.2
    • Other libraries that may be connected with the issue:*****

    Additional context: this is my full code.

    import numpy as np
    from sklearn.base import BaseEstimator, TransformerMixin
    from sklearn.cluster import KMeans
    from sklearn.compose import ColumnTransformer, make_column_selector
    from sklearn.impute import SimpleImputer
    from sklearn.metrics.pairwise import rbf_kernel
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer, StandardScaler, OneHotEncoder
    from data import housing
    
    
    class ClusterSimilarity(BaseEstimator, TransformerMixin):
        def __init__(self, n_clusters=10, gamma=1.0, random_state=None):
            self.n_clusters = n_clusters
            self.gamma = gamma
            self.random_state = random_state
    
        def fit(self, X, y=None, sample_weight=None):
            self.kmeans_ = KMeans(self.n_clusters, random_state=self.random_state)
            self.kmeans_.fit(X, sample_weight=sample_weight)
            return self  # always return self!
    
        def transform(self, X):
            return rbf_kernel(X, self.kmeans_.cluster_centers_, gamma=self.gamma)
    
        def get_feature_names_out(self, names=None):
            return [f"Cluster {i} similarity" for i in range(self.n_clusters)]
    
    
    def column_ratio(X):
        return X[:, [0]] / X[:, [1]]
    
    
    def ratio_pipeline(name=None):
        return make_pipeline(
            SimpleImputer(strategy="median"),
            FunctionTransformer(column_ratio, feature_names_out=lambda input_features: [name]),
            StandardScaler())
    
    
    log_pipeline = make_pipeline(SimpleImputer(strategy="median"), FunctionTransformer(np.log), StandardScaler())
    cluster_simil = ClusterSimilarity(n_clusters=10, gamma=1., random_state=42)
    default_num_pipeline = make_pipeline(SimpleImputer(strategy="median"),StandardScaler())
    
    cat_pipeline = make_pipeline(SimpleImputer(strategy="most_frequent"), OneHotEncoder(handle_unknown="ignore"))
    
    preprocessing = ColumnTransformer([
        ("bedrooms_ratio", ratio_pipeline("bedrooms_ratio"), ["total_bedrooms", "total_rooms"]),
        ("rooms_per_house", ratio_pipeline("rooms_per_house"), ["total_rooms", "households"]),
        ("people_per_house", ratio_pipeline("people_per_house"), ["population", "households"]),
        ("log", log_pipeline, ["total_bedrooms", "total_rooms", "population", "households", "median_income"]),
        ("geo", cluster_simil, ["latitude", "longitude"]),
        ("cat", cat_pipeline, make_column_selector(dtype_include=np.object)),
    ], remainder=default_num_pipeline)  # one column remaining: housing_median_age
    
    
    if __name__ == '__main__':
        housing_prepared = preprocessing.fit_transform(housing)
        print(housing_prepared.shape)
        print(preprocessing.get_feature_names_out())
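
    For context: the feature_names_out parameter was only added to FunctionTransformer in scikit-learn 1.1, and the versions listed above include scikit-learn 1.0.2, which explains the TypeError. A minimal version guard (a sketch using the packaging library):

    from packaging import version
    import sklearn

    # feature_names_out was added to FunctionTransformer in scikit-learn 1.1;
    # releases before that (such as the 1.0.2 reported above) raise the TypeError.
    assert version.parse(sklearn.__version__) >= version.parse("1.1"), "upgrade scikit-learn"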
    
    opened by alme7airbi93 5
  • [QUESTION] Ridge regression cost function

    What's the right definition of the cost function for Ridge regression?

    1. The doc version (scikit-learn): $\min_{w} \|Xw - y\|_2^2 + \alpha \|w\|_2^2$
    2. The book version: $J(\theta) = \mathrm{MSE}(\theta) + \dfrac{\alpha}{m} \sum_{i=1}^{n} \theta_i^2$

    In the 2nd edition, 2 was used instead of m. Either way, it's not obvious why m is used.
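
    One way to reconcile the two, using the convention that $\mathrm{MSE}(\theta) = \frac{1}{m}\|X\theta - y\|_2^2$ and that the bias term is excluded from the sum: with $m$ in the denominator, the book's cost is exactly scikit-learn's objective scaled by $\frac{1}{m}$, so both have the same minimizer for the same $\alpha$:

    $$J(\theta) = \mathrm{MSE}(\theta) + \frac{\alpha}{m}\sum_{i=1}^{n}\theta_i^2 = \frac{1}{m}\left(\|X\theta - y\|_2^2 + \alpha\|\theta\|_2^2\right)$$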

    opened by liganega 4
  • Ch7: clarify comments, fix typo, update score outputs

    Hi @ageron,

    This PR clarifies some comments, fixes several typos and the deprecated np.object, and updates the output scores in the solutions to exercises 8 and 9, since the scores affect the accompanying comments.

    I also updated the answers to Solutions 6 and 7 about decreasing/increasing the learning rate. Please double-check: it makes sense to me, but I may be wrong here.

    opened by vi3itor 3
  • Chap 16, cell [49], Reusing Pretrained Embeddings and Language Models

    Hello, I'm using an Nvidia GeForce 1050 Ti (4 GB). After running cell [49], I get the error below. Kindly share the GPU requirements that will run the notebooks in this book smoothly.

    W tensorflow/core/framework/op_kernel.cc:1733] RESOURCE_EXHAUSTED: failed to allocate memory
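
    A common workaround for RESOURCE_EXHAUSTED on small GPUs (a suggestion, not from the book): reduce the batch size, or enable memory growth so TensorFlow allocates GPU memory on demand. This must run before the GPU is first used:

    import tensorflow as tf

    # Allocate GPU memory on demand instead of grabbing it all up front.
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)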

    opened by Asjad22 2
  • Chapter 4: Fix figure numbering and correct typos

    Hi @ageron,

    This PR updates some figure labels, corrects several typos (mostly in the solution section) and updates the link to NumPy documentation.

    Also, you finished Chapter 4 with the following sentences: "We've opened up the first Machine Learning black boxes! In the next chapters, we will open many more, starting with Decision Trees." Did you mean "Support Vector Machines"?

    opened by vi3itor 2
  • Update matplotlib tutorial

    Hi @ageron,

    Yet another PR :)

    • updated documentation links,
    • removed an unnecessary import for 3D plotting: it hasn't been required since Matplotlib 3.2.0 (March 2020),
    • added the link to matplotlib's tutorial for 3d plotting,
    • corrected a few typos,
    • replaced "attributes" with "parameters" and "arguments" depending on the context.
    opened by vi3itor 2
  • [QUESTION] SVC's default value for `decision_function_shape`

    According to scikit-learn's documentation, the default value of the decision_function_shape hyperparameter for the SVC model is "ovr", not "ovo". But in the book, "ovo" is mentioned and explained in the section about multiclass classification in Chapter 3.

    On the other hand, it is written in the jupyter notebook as follows:

    If you want decision_function() to return all 45 scores, you can set the decision_function_shape hyperparameter to "ovo". The default value is "ovr", but don't let this confuse you: SVC always uses OvO for training. This hyperparameter only affects whether the 45 scores get aggregated or not:

    This should also be explained in the book.
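
    A minimal sketch illustrating the quoted behavior (the dataset choice is arbitrary; any 10-class problem yields 10 × 9 / 2 = 45 pairwise scores):

    from sklearn.datasets import load_digits
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)                 # 10 classes
    svc_ovr = SVC().fit(X, y)                           # default: decision_function_shape="ovr"
    svc_ovo = SVC(decision_function_shape="ovo").fit(X, y)
    print(svc_ovr.decision_function(X[:1]).shape)       # (1, 10): aggregated one-vs-rest scores
    print(svc_ovo.decision_function(X[:1]).shape)       # (1, 45): raw one-vs-one pairwise scores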

    opened by liganega 2
  • [QUESTION] About the keyword argument "feature_names_out" of the "FunctionTransformer" class

    The definition of the ratio_pipeline() function in the Chapter 2 Jupyter notebook contains the following transformer:

    FunctionTransformer(column_ratio,
                        feature_names_out=[name]),
    

    But the feature_names_out keyword is not listed as an argument in the scikit-learn documentation for the FunctionTransformer class or its parent classes, and it's not clear how to interpret it.

    Is it just for naming the transformer to be built, or does it serve some other purpose? And why isn't it listed as a keyword argument of FunctionTransformer's constructor or of its parent classes'?
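
    For what it's worth, a minimal sketch of how the parameter behaves (assuming scikit-learn >= 1.1, the release that introduced it): the callable receives the transformer itself plus the input feature names, and whatever it returns is what get_feature_names_out() reports.

    import numpy as np
    from sklearn.preprocessing import FunctionTransformer

    ratio = FunctionTransformer(
        lambda X: X[:, [0]] / X[:, [1]],
        feature_names_out=lambda transformer, input_features: ["ratio"])

    X = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(ratio.fit_transform(X))         # [[0.5], [0.75]]
    print(ratio.get_feature_names_out())  # ['ratio']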

    opened by liganega 2
  • Code indentation error in load_housing_data()

    The load_housing_data() function had an indentation error: if housing.tgz already existed on the drive, the code would not extract housing.csv, so datasets/housing/housing.csv would not exist.
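
    For illustration, a hedged sketch of the robust pattern (a simplification, not the exact notebook code; the dataset URL is the one the 3rd edition uses, shown here as an assumption): guard on the extracted CSV rather than only on the tarball.

    from pathlib import Path
    import tarfile
    import urllib.request
    import pandas as pd

    def load_housing_data():
        csv_path = Path("datasets/housing/housing.csv")
        if not csv_path.is_file():                       # guard on the final artifact,
            tarball_path = Path("datasets/housing.tgz")  # not only on the tarball
            if not tarball_path.is_file():
                Path("datasets").mkdir(parents=True, exist_ok=True)
                url = "https://github.com/ageron/data/raw/main/housing.tgz"
                urllib.request.urlretrieve(url, tarball_path)
            with tarfile.open(tarball_path) as housing_tarball:
                housing_tarball.extractall(path="datasets")
        return pd.read_csv(csv_path)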

    opened by AniketYadav17 1
  • [BUG] Chapter 2 cell 72: '.toarray()' missing in df_test example

    Describe the bug: Running cell 72 in Chapter 2 returns a 2x5 sparse matrix, not a NumPy array. I ran this in JetBrains DataSpell.

    To Reproduce (cell 72):

    cat_encoder.transform(df_test)
    

    This is not an exception, and the output itself is correct; it just skips the step of converting the result to a NumPy array. The array output is shown both in the book (top of page 74) and in the notebook.

    <2x5 sparse matrix of type '<class 'numpy.float64'>'
    	with 2 stored elements in Compressed Sparse Row format>
    

    Expected behavior: the output shown in the book (p. 74) and in the notebook (Out[72]). I get that result by appending .toarray() to the code.
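
    For reference, the one-line fix described above (cat_encoder being the fitted OneHotEncoder from the notebook):

    cat_encoder.transform(df_test).toarray()  # densify the sparse matrix into a NumPy array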


    Versions:

    • OS: MacOS Ventura 13.1
    • Python: 3.10
    • TensorFlow: as per requirements.txt
    • Scikit-Learn: as per requirements.txt
    • Other libraries that may be connected with the issue: Using Jetbrains DataSpell


    opened by Pattkopp 0
  • Problem creating HOML3 environment

    Having trouble creating the HOML3 environment on Windows 10. I thought this might be due to having standalone Python 3.10 installed, so I uninstalled that, then uninstalled and reinstalled Anaconda. Each time I retry creating the HOML3 environment, I first clean up by deleting envs\homl3 in the Anaconda folder.

    Here's the error. Before trying to create the HOML3 environment this time, I ran "python -m pip install libtorrent" and got "Requirement already satisfied: libtorrent in c:\users\x475a\anaconda3\lib\site-packages (2.0.7)".

    Thank you.

    Building wheel for AutoROM.accept-rom-license (pyproject.toml): finished with status 'error'
    Failed to build AutoROM.accept-rom-license

    Pip subprocess error:
    error: subprocess-exited-with-error

    × Building wheel for AutoROM.accept-rom-license (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [66 lines of output]
        C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\config\setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
          warnings.warn(msg, warning_class)
        running bdist_wheel
        running build
        running build_py
        creating build
        creating build\lib
        copying AutoROM.py -> build\lib
        installing to build\bdist.win-amd64\wheel
        running install
        running install_lib
        creating build\bdist.win-amd64
        creating build\bdist.win-amd64\wheel
        copying build\lib\AutoROM.py -> build\bdist.win-amd64\wheel.
        running install_egg_info
        running egg_info
        writing AutoROM.accept_rom_license.egg-info\PKG-INFO
        writing dependency_links to AutoROM.accept_rom_license.egg-info\dependency_links.txt
        writing requirements to AutoROM.accept_rom_license.egg-info\requires.txt
        writing top-level names to AutoROM.accept_rom_license.egg-info\top_level.txt
        reading manifest file 'AutoROM.accept_rom_license.egg-info\SOURCES.txt'
        reading manifest template 'MANIFEST.in'
        adding license file 'LICENSE.txt'
        writing manifest file 'AutoROM.accept_rom_license.egg-info\SOURCES.txt'
        Copying AutoROM.accept_rom_license.egg-info to build\bdist.win-amd64\wheel.\AutoROM.accept_rom_license-0.5.0-py3.10.egg-info
        running install_scripts
        Traceback (most recent call last):
          File "C:\Users\x475a\anaconda3\envs\homl3\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py", line 351, in <module>
            main()
          File "C:\Users\x475a\anaconda3\envs\homl3\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py", line 333, in main
            json_out['return_val'] = hook(**hook_input['kwargs'])
          File "C:\Users\x475a\anaconda3\envs\homl3\lib\site-packages\pip\_vendor\pep517\in_process\_in_process.py", line 249, in build_wheel
            return _build_backend().build_wheel(wheel_directory, config_settings,
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\build_meta.py", line 413, in build_wheel
            return self._build_with_temp_dir(['bdist_wheel'], '.whl',
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\build_meta.py", line 398, in _build_with_temp_dir
            self.run_setup()
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\build_meta.py", line 484, in run_setup
            super(BuildMetaLegacyBackend,
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\build_meta.py", line 335, in run_setup
            exec(code, locals())
          File "<string>", line 18, in <module>
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\__init__.py", line 87, in setup
            return distutils.core.setup(**attrs)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
            return run_commands(dist)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
            dist.run_commands()
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
            self.run_command(cmd)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\dist.py", line 1208, in run_command
            super().run_command(command)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
            cmd_obj.run()
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\wheel\bdist_wheel.py", line 360, in run
            self.run_command("install")
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
            self.distribution.run_command(command)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\dist.py", line 1208, in run_command
            super().run_command(command)
          File "C:\Users\x475a\AppData\Local\Temp\pip-build-env-vk00i65p\overlay\Lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
            cmd_obj.run()
          File "<string>", line 11, in run
          File "C:\Users\x475a\AppData\Local\Temp\pip-install-n3yw4sxy\autorom-accept-rom-license_e037a052a27841a5a54970ef670daf18\AutoROM.py", line 13, in <module>
            import libtorrent as lt
        ImportError: DLL load failed while importing libtorrent: The specified module could not be found.
        [end of output]

    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for AutoROM.accept-rom-license
    ERROR: Could not build wheels for AutoROM.accept-rom-license, which is required to install pyproject.toml-based projects

    failed

    CondaEnvException: Pip failed

    opened by suburbanmd 0
  • [IDEA] Requesting to Include Graph Neural Networks

    The field of graph representation learning has grown at an incredible and sometimes unwieldy pace over the past few years, and many new algorithms and innovations have appeared. I read the second edition in full: it's a power-packed guide for beginners to gain knowledge of both machine learning and deep learning. But I found this important piece of the puzzle missing, so I request the author to add a chapter on graph neural networks in upcoming editions. Hands-on experience with graph neural networks would be useful for all readers. Thanks.

    opened by Kirushikesh 0
  • [BUG] Link not working in chapter 11

    I'm currently reading the book on the O'Reilly website, and in chapter 11 there is one link that is not working. It is under the Unsupervised Pretraining header: https://homl.info/extra-anns redirects to https://colab.research.google.com/github/ageron/handson-ml3/blob/main/extra_ann_architectures.ipynb, but this does not work, and when I checked the repo I didn't find the notebook there either.

    @ageron, can you please update it?

    opened by amitmeel 0
  • Chapter 15, cell [41], Multivariate Time Series

    After training mulvar_model with train_mulvar_ds and valid_mulvar_ds, I am trying to predict on mulvar_test using model.predict(mulvar_test). It returns an array of 914 floats (presumably because the shape of mulvar_test is (914, 5)), and I guess that's the array of predictions for rail. All I want to know is: is there any way for model.predict to return only one value instead of an array of 914 values?
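
    One way to do that (a sketch; the names follow the notebook, and seq_length is assumed to be the window length the model was trained on): predict() returns one forecast per input window, so pass a batch containing a single window to get a single value.

    import numpy as np

    one_window = mulvar_test.to_numpy()[np.newaxis, -seq_length:]  # shape (1, seq_length, 5)
    y_pred = mulvar_model.predict(one_window)                      # shape (1, 1): one prediction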

    opened by Asjad22 1
  • [BUG] docker compose build fails with unresolved packages

    => ERROR [ 4/12] RUN echo ' - pyvirtualdisplay' >> /tmp/environment. 14.1s

    [ 4/12] RUN echo ' - pyvirtualdisplay' >> /tmp/environment.yml && conda env create -f /tmp/environment.yml && conda clean -afy && find /opt/conda/ -follow -type f -name '*.a' -delete && find /opt/conda/ -follow -type f -name '*.pyc' -delete && find /opt/conda/ -follow -type f -name '*.js.map' -delete && rm /tmp/environment.yml:
    #8 0.444 Collecting package metadata (repodata.json): ...working... done
    #8 12.95 Solving environment: ...working... failed
    #8 12.96
    #8 12.96 ResolvePackageNotFound:
    #8 12.96   - pyglet=1.5
    #8 12.96   - box2d-py
    #8 12.96


    executor failed running [/bin/sh -c echo ' - pyvirtualdisplay' >> /tmp/environment.yml && conda env create -f /tmp/environment.yml && conda clean -afy && find /opt/conda/ -follow -type f -name '*.a' -delete && find /opt/conda/ -follow -type f -name '*.pyc' -delete && find /opt/conda/ -follow -type f -name '*.js.map' -delete && rm /tmp/environment.yml]: exit code: 1
    ERROR: Service 'handson-ml3' failed to build : Build failed

    opened by sinharahul 1
Owner
Aurélien Geron
Author of the book Hands-On Machine Learning with Scikit-Learn and TensorFlow. Former PM of YouTube video classification and founder & CTO of a telco operator.