Learn how to responsibly deliver value with ML.

 Made With ML

Applied ML · MLOps · Production
Join 30K+ developers in learning how to responsibly deliver value with ML.

     
🔥   Among the top MLOps repositories on GitHub


Foundations

Learn the foundations of ML through intuitive explanations, clean code and visuals.

  • 🛠 Toolkit: Notebooks, Python, NumPy, Pandas, PyTorch
  • 🔥 Machine Learning: Linear Regression, Logistic Regression, Neural Network, Data Quality, Utilities
  • 🤖 Deep Learning: CNNs, Embeddings, RNNs, Attention, Transformers

📆   More topics coming soon!
Subscribe for our monthly updates on new content.


MLOps

Learn how to apply ML to build a production-grade product that delivers value.

  • 📦 Product: Objective, Solution, Iteration
  • 🔢 Data: Labeling, Preprocessing, Exploratory data analysis, Splitting, Augmentation
  • 📈 Modeling: Evaluation, Experiment tracking, Optimization
  • 📝 Scripting: Organization, Packaging, Documentation, Styling, Makefile, Logging
  • 📦 Interfaces: Command-line, RESTful API
  • Testing: Code, Data, Models
  • ♻️ Reproducibility: Git, Pre-commit, Versioning, Docker
  • 🚀 Production: Dashboard, CI/CD workflows, Infrastructure, Monitoring, Feature store, Pipelines, Continual learning

📆   New lessons every month!
Subscribe for our monthly updates on new content.


FAQ

Who is this content for?

  • Software engineers looking to learn ML and become even better software engineers.
  • Data scientists who want to learn how to responsibly deliver value with ML.
  • College graduates looking to learn the practical skills they'll need for the industry.
  • Product Managers who want to develop a technical foundation for ML applications.

What is the structure?

Lessons will be released weekly and each one will include:

  • intuition: high-level overview of the concepts that will be covered and how they fit together.
  • code: simple code examples to illustrate the concept.
  • application: applying the concept to our specific task.
  • extensions: brief look at other tools and techniques that will be useful for different situations.

What makes this content unique?

  • hands-on: If you search production ML or MLOps online, you'll find great blog posts and tweets. But to really understand these concepts, you need to implement them. Unfortunately, you don't see a lot of the inner workings of running production ML because of scale, proprietary content and expensive tools. However, Made With ML is free, open and live, which makes it a perfect learning opportunity for the community.
  • intuition-first: We will never jump straight to code. In every lesson, we will develop intuition for the concepts and think about them from a product perspective.
  • software engineering: This course isn't just about ML. In fact, it's mostly about clean software engineering! We'll cover important concepts like versioning, testing, logging, etc. that really make a product production-grade.
  • focused yet holistic: For every concept, we'll not only cover what's most important for our specific task (this is the case study aspect) but we'll also cover related methods (this is the guide aspect) which may prove to be useful in other situations.

Who is the author?

  • I've deployed large scale ML systems at Apple as well as smaller systems with constraints at startups and want to share the common principles I've learned.
  • Connect with me on Twitter and LinkedIn

Why is this free?

While this content is for everyone, it's especially targeted towards people who don't have as much opportunity to learn. I believe that creativity and intelligence are randomly distributed while opportunities are siloed. I want to enable more people to create and contribute to innovation.


To cite this content, please use:
@misc{madewithml,
    author       = {Goku Mohandas},
    title        = {Made With ML},
    howpublished = {\url{https://madewithml.com/}},
    year         = {2021}
}
Comments
  • sns.barplot in EDA

    I think that instead of ax = sns.barplot(list(tags), list(tag_counts)) it should be ax = sns.barplot(x=list(tags), y=list(tag_counts)) in the code at https://madewithml.com/courses/mlops/exploratory-data-analysis/

    opened by gexahedron 4
  • Alternative to Colab and Binder for running `practicalAI` in the cloud

    Hi @GokuMohandas,

    I've been recently taking a look at the sample Notebooks in this project and I found them really interesting and valuable for teaching purposes. We're even thinking about adding part of them to our curriculum at https://rmotr.com/ (cofounder and teacher here), in our Data Science program.

    We have a small service at RMOTR that lets you run a Jupyter environment online in a single click. It is similar to Google Colab or Binder, but also lets you install custom requirements, clone an entire GH repo, etc. We use it for our students, so they don't have to hit the initial wall of installing the whole local Jupyter setup when they are getting started in the DS world.

    You can see what practicalAI looks like in the service using this link: https://notebooks.rmotr.com/clone/gh/GokuMohandas/practicalAI

    Note that all requirements listed in requirements.txt are already installed when the env is loaded, so people can start using it right away. That gives you the flexibility of adding any requirement, and not being tied to what Colab provides by default.

    Do you think it would be a good choice to add it as a third launching option, as an alternative to Colab and Binder, which are already listed in the README?

    I hope you like it, and I truly appreciate any feedback.

    thanks.

    opened by martinzugnoni 4
  • Foundations --> Embeddings

    1. Typo under Model section: "3. We'll apply convolution via filters (filter_size, vocab_size, num_filters)". Should embedding_dim replace vocab_size?
    2. Typo under Experiments: "first have to decice".
    3. Typo under Interpretability: in "padding our inputs before convolution to result is outputs", "is" should be "in".
    4. Could there be a general explanation of moving models/data across devices? My current understanding is that they have to be both on the same place (cpu/gpu). If on gpu, just stay on gpu through the whole train/eval/predict session. I couldn't understand why under Inference device = torch.device("cpu") moves things back to cpu.
    5. interpretable_trainer.predict_step(dataloader) breaks with AttributeError: 'list' object has no attribute 'dim'. The precise step is F.softmax(z), where for interpretable_model, z is a list of 3 items and it was trying to softmax a list instead of a tensor.
    opened by gitgithan 3
  • Foundations --> Linear regression (Error in implementation)

    Under Pytorch --> Interpretability: b_unscaled = b * y_scaler.scale_ + y_scaler.mean_ - np.sum(W_unscaled*X_scaler.mean_) This line seems to be missing a * (y_scaler.scale_/X_scaler.scale_) in the last np.sum term.

    The table for W unscaled was also confusing. It has a sum term shown there, which means if X began with 2 predictors (this lesson only used 1 predictor), the scaled W will have 2 predictors while the sum will aggregate the 2 weights into 1 unscaled weight? Can't wrap my head around this.

    Also, under Pytorch --> Interpretability, W_unscaled = W * (y_scaler.scale_/X_scaler.scale_) there was no sum used here, so looks inconsistent with the formula in the table.

    opened by gitgithan 3
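The relationship between the scaled and unscaled parameters discussed in this issue can be checked numerically. The sketch below assumes both inputs and outputs were standardized as (value - mean) / scale; the weights, means and scales are made-up numbers and every name is hypothetical rather than taken from the lesson. Under that assumption each unscaled weight is W_j * y_scale / x_scale_j (element-wise, no sum), and the sum only appears in the bias.

```python
# Hypothetical standardization statistics for two predictors and one target.
x_mean, x_scale = [3.0, -1.0], [2.0, 0.5]
y_mean, y_scale = 10.0, 4.0

# Hypothetical parameters learned on the standardized data.
W, b = [0.7, -1.2], 0.3


def predict_scaled(x):
    """Standardize x, apply the scaled model, then undo the y standardization."""
    x_std = [(xi - m) / s for xi, m, s in zip(x, x_mean, x_scale)]
    y_std = sum(w * xs for w, xs in zip(W, x_std)) + b
    return y_std * y_scale + y_mean


# One unscaled weight per predictor (element-wise, no sum).
W_unscaled = [w * y_scale / s for w, s in zip(W, x_scale)]
# The sum appears only in the bias term, and it uses the unscaled weights.
b_unscaled = b * y_scale + y_mean - sum(w * m for w, m in zip(W_unscaled, x_mean))


def predict_unscaled(x):
    """Apply the unscaled parameters directly to raw inputs."""
    return sum(w * xi for w, xi in zip(W_unscaled, x)) + b_unscaled


x = [5.0, 2.0]
assert abs(predict_scaled(x) - predict_unscaled(x)) < 1e-9
```

If the sum in the bias were written with the scaled W instead of W_unscaled, the factor y_scale / x_scale would have to appear inside the sum, which may be the source of the confusion.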
  • alternative for colab notebook service in mainland china

    Hi! I appreciate your work here; my friends and I have learned a lot from it.

    We happened to find a platform in mainland China called KESCI (www.kesci.com) that provides a service similar to Google Colab and Kaggle (as you may know, there are connectivity problems with Google services in mainland China). They provide dev-ready, up-to-date Python and R CPU environments for free, with GPU support upcoming.

    We also translated the whole series into Chinese and applied for a column to publish it on KESCI as a series. You can access it here: https://www.kesci.com/home/column/5c20e4c5916b6200104eea63

    The Computer Vision notebook has already been translated but is still being trained in the transfer-learning section.

    Do you think it is possible to add this as another launching option? I think there must be more people in China who could learn from your tutorials!

    opened by gerard0315 3
  • Recreated content authorized by the original copyright owner

    Hi GokuMohandas, I translated all of the Made-With-ML content into Chinese and posted it on my blog (https://franztao.github.io) and my WeChat blog. As the original copyright owner, would you be willing to authorize this recreated content?

    opened by franztao 2
  • Silly question: LabelEncoder

    While creating the LabelEncoder class, I couldn't understand why the method fit(self, y) ends with return self. My understanding is that calling this method already updates the object's variables, so there is no need to return self? Please correct me if I'm wrong; I'm just trying to reason through each step of the code.

        def fit(self, y):
            classes = np.unique(y)
            for i, class_ in enumerate(classes):
                self.class_to_index[class_] = i
            self.index_to_class = {v: k for k,v in self.class_to_index.items()}
            self.classes = list(self.class_to_index.keys())
            return self #Why?
    
    opened by knosing 2
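A common reason for ending fit with return self is method chaining, a convention that scikit-learn estimators also follow. The question's reasoning is right that the object is updated in place either way; returning self only makes calls easier to combine. The class below is a minimal stand-in written for illustration (the encode method and the sample labels are hypothetical, not the course code).

```python
class LabelEncoder:
    """Minimal stand-in for the encoder in the question (not the course code)."""

    def __init__(self):
        self.class_to_index = {}

    def fit(self, y):
        # sorted(set(y)) plays the role of np.unique(y) here.
        for i, class_ in enumerate(sorted(set(y))):
            self.class_to_index[class_] = i
        return self  # hand the fitted object back to the caller

    def encode(self, y):
        return [self.class_to_index[class_] for class_ in y]


# Because fit returns self, construction, fitting and use can be chained.
encoded = LabelEncoder().fit(["b", "a", "c"]).encode(["c", "a"])
print(encoded)  # [2, 0]
```

Without return self, fit would return None and the chained call to encode would raise an AttributeError.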
  • Issue in viewing the experiment in MLflow

    I am running the tagifai.ipynb notebook on the windows platform but facing difficulty viewing the experiment in MLflow.

    Steps Done:

    1. Cloned the repo
    2. Running the "mlops-course\notebooks\tagifai.ipynb" in vs code locally.
    3. Ran the server with "mlflow server -h 0.0.0.0 -p 8000 --backend-store-uri /experiments/" from the location of the notebook; experiments is a folder inside that location. ($PWD is omitted because of Windows.)
    4. Opening the "http://localhost:8000/#/"

    Observation :

    1. No signs of experiment run.
    2. Image attached below for ref. image

    Please provide assistance with this issue.

    Thanks

    opened by mukul74 2
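One possible cause, offered as a guess rather than a diagnosis: a path that starts with a slash, such as /experiments/, is resolved from the filesystem (or drive) root and not from the notebook's folder, so the server and the notebook may be looking at different stores. The sketch below uses only pathlib to build an absolute file URI for a folder next to the notebook; the helper name and the folder layout are assumptions.

```python
from pathlib import Path


def local_store_uri(base_dir, folder="experiments"):
    """Build an absolute file URI for a folder that lives under base_dir."""
    return (Path(base_dir) / folder).resolve().as_uri()


# "." stands for the directory the notebook runs in.
uri = local_store_uri(".")
print(uri)
```

The printed value could then be used for both the server's --backend-store-uri option and the notebook's tracking URI, so that both sides refer to the same location.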
  • Foundations --> Transformers

    Hi Goku... I am really thankful for all your amazing tutorials.

    I however was facing some issues in the Transformers lecture. There are a few minor bugs here with missing variables and imports; which was not an issue.

    The training code however is missing the block:

    # Train
    best_model = trainer.train(
        num_epochs, patience, train_dataloader, val_dataloader)
    

    Also, when I wrote this and ran it, I got an error:

    /usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:14: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
      
    /usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:15: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
      from ipykernel import kernelapp as app
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-68-8d0f0dee99db> in <module>()
          1 # Train
          2 best_model = trainer.train(
    ----> 3     num_epochs, patience, train_dataloader, val_dataloader)
    
    6 frames
    /usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in dropout(input, p, training, inplace)
       1277     if p < 0.0 or p > 1.0:
       1278         raise ValueError("dropout probability has to be between 0 and 1, " "but got {}".format(p))
    -> 1279     return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
       1280 
       1281 
    
    TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str
    

    Apparently, the issue comes from the line :

    seq, pool = self.transformer(input_ids=ids, attention_mask=masks)
    

    wherein the "pool" returned is of class string. Upon printing the type and the value of it i get the following :

    <class 'str'>
    pooler_output
    

    Can you please have a look into this? Thanks in advance!

    opened by shashankvasisht 2
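The symptom reported in this issue (pool being the string pooler_output) is what tuple-unpacking a dict-like object produces, because unpacking iterates over keys. Recent versions of Hugging Face transformers return dict-like outputs by default; whether that applies here is an assumption, since the issue does not state the installed version. The sketch uses a plain OrderedDict with placeholder strings as a stand-in for the model output.

```python
from collections import OrderedDict

# Stand-in for a dict-like model output; the values are placeholders, not tensors.
output = OrderedDict(
    last_hidden_state="<sequence tensor>",
    pooler_output="<pooled tensor>",
)

# Tuple-unpacking iterates over the keys, so both names receive strings.
seq, pool = output
print(type(pool), pool)  # <class 'str'> pooler_output

# Looking the fields up by key returns the values instead.
seq, pool = output["last_hidden_state"], output["pooler_output"]
print(pool)  # <pooled tensor>
```

If this is indeed the cause, reading the fields by name (or passing return_dict=False to the model call, where the installed version supports it) should give dropout a tensor again.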
  • Lambda function missing

    Hi Goku,

    I'm going through the Pandas lesson and I noticed that in the Feature engineering section you mention applying a lambda function to create a new feature, but the code for it does not appear. I think it's just a minor omission.

    Regards, Roberto

    opened by jroberayalas 2
  • Lessons page, Basic ML: "Notebook not found".

    Problem: Starting from either https://practicalai.me/learn/lessons/ or https://github.com/practicalAI/practicalAI, when attempting to click any of the lessons I see "Notebook not found". Proposed fix: Possibly "basic_ml" should be added to the path?

    When I click "authorize with Github" I see the same thing: image

    The link given then does not work: image

    In the case of the "linear regression" notebook, the non-working link given on the "lessons" page is https://colab.research.google.com/github/practicalAI/practicalAI/blob/master/notebooks/04_Linear_Regression.ipynb

    Whereas if you go find it on github directly, it is https://colab.research.google.com/github/practicalAI/practicalAI/blob/master/notebooks/basic_ml/04_Linear_Regression.ipynb

    opened by ColinLeongUDRI 2
  • Update 07_Logistic_Regression.ipynb - softmax definition fix

    Changed $\hat{y} = \frac{e^{XW_y}}{\sum_j e^{XW}}$ to $\hat{y} = \frac{e^{XW}}{\sum_j e^{XW_j}}$ in the equation defining softmax (located after the second paragraph).

    opened by matospiso 1
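For reference, a minimal sketch of the usual softmax: the numerator uses the logit of the class being scored, and the denominator sums the exponentials over all classes j. The logits below are hypothetical and stand for one row of XW; subtracting the maximum is a common numerical-stability step and is not part of the notebook's equation.

```python
import math


def softmax(logits):
    """Return exp(z_y) / sum_j exp(z_j) for every class y."""
    shift = max(logits)  # subtracting the max avoids overflow, result unchanged
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # hypothetical logits for three classes
assert abs(sum(probs) - 1.0) < 1e-9
```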
  • Update 07_Logistic_Regression.ipynb

    f-string format error caused by too many double quotes, as in the first example of this error: print (f"m:b = {class_counts["malignant"]/class_counts["benign"]:.2f}"). The outer quotes should be single quotes, like print (f'm:b = {class_counts["malignant"]/class_counts["benign"]:.2f}')

    opened by data-steve 0
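The quoting fix in this PR can be illustrated with a small sketch. Before Python 3.12, reusing the f-string's own quote character inside the braces is a SyntaxError; from 3.12 on it is allowed, but mixing quote styles works on every version. The counts below are hypothetical.

```python
class_counts = {"malignant": 50, "benign": 100}  # hypothetical counts

# Single quotes delimit the f-string, so the double-quoted keys inside are fine.
message = f'm:b = {class_counts["malignant"]/class_counts["benign"]:.2f}'
print(message)  # m:b = 0.50
```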
Owner
Goku Mohandas
Founder @madewithml. AI Research @apple. Author @oreillymedia. ML Lead @Ciitizen. Alum @hopkinsmedicine and @gatech