Learn how to responsibly deliver value with ML.

Goku Mohandas

Last update: Dec 30, 2022

Related tags

Machine Learning python data-science machine-learning natural-language-processing deep-learning pytorch mlops

Overview

Made With ML

Applied ML · MLOps · Production
Join 30K+ developers in learning how to responsibly deliver value with ML.

🔥 Among the top MLOps repositories on GitHub

Foundations

Learn the foundations of ML through intuitive explanations, clean code and visuals.

Lessons: https://madewithml.com/#foundations
Code: GokuMohandas/MadeWithML/tree/master/notebooks

🛠 Toolkit	🔥 Machine Learning	🤖 Deep Learning
Notebooks	Linear Regression	CNNs
Python	Logistic Regression	Embeddings
NumPy	Neural Network	RNNs
Pandas	Data Quality	Attention
PyTorch	Utilities	Transformers

📆 More topics coming soon!
Subscribe for our monthly updates on new content.

MLOps

Learn how to apply ML to build a production-grade product that delivers value.

Lessons: https://madewithml.com/courses/mlops/
Code: GokuMohandas/MLOps

📦 Product	📝 Scripting	♻️ Reproducibility
Objective	Organization	Git
Solution	Packaging	Pre-commit
Iteration	Documentation	Versioning
🔢 Data	Styling	Docker
Labeling	Makefile	🚀 Production
Preprocessing	Logging	Dashboard
Exploratory data analysis	📦 Interfaces	CI/CD workflows
Splitting	Command-line	Infrastructure
Augmentation	RESTful API	Monitoring
📈 Modeling	✅ Testing	Feature store
Evaluation	Code	Pipelines
Experiment tracking	Data	Continual learning
Optimization	Models

📆 New lessons every month!
Subscribe for our monthly updates on new content.

FAQ

Who is this content for?

Software engineers looking to learn ML and become even better software engineers.
Data scientists who want to learn how to responsibly deliver value with ML.
College graduates looking to learn the practical skills they'll need for the industry.
Product Managers who want to develop a technical foundation for ML applications.

What is the structure?

Lessons will be released weekly and each one will include:

intuition: high-level overview of the concepts that will be covered and how they all fit together.
code: simple code examples to illustrate the concept.
application: applying the concept to our specific task.
extensions: brief look at other tools and techniques that will be useful for different situations.

What makes this content unique?

hands-on: If you search production ML or MLOps online, you'll find great blog posts and tweets. But in order to really understand these concepts, you need to implement them. Unfortunately, you don’t see a lot of the inner workings of running production ML because of scale, proprietary content & expensive tools. However, Made With ML is free, open and live which makes it a perfect learning opportunity for the community.
intuition-first: We will never jump straight to code. In every lesson, we will develop intuition for the concepts and think about them from a product perspective.
software engineering: This course isn't just about ML. In fact, it's mostly about clean software engineering! We'll cover important concepts like versioning, testing, logging, etc. that really make something a production-grade product.
focused yet holistic: For every concept, we'll not only cover what's most important for our specific task (this is the case study aspect) but we'll also cover related methods (this is the guide aspect) which may prove to be useful in other situations.

Who is the author?

I've deployed large-scale ML systems at Apple as well as smaller systems with constraints at startups, and I want to share the common principles I've learned.
Connect with me on Twitter and LinkedIn

Why is this free?

While this content is for everyone, it's especially targeted towards people who don't have as much opportunity to learn. I believe that creativity and intelligence are randomly distributed while opportunities are siloed. I want to enable more people to create and contribute to innovation.

To cite this content, please use:

@misc{madewithml,
    author       = {Goku Mohandas},
    title        = {Made With ML},
    howpublished = {\url{https://madewithml.com/}},
    year         = {2021}
}

Comments

sns.barplot in EDA

I think that instead of ax = sns.barplot(list(tags), list(tag_counts)) it should be ax = sns.barplot(x=list(tags), y=list(tag_counts)) in the code at https://madewithml.com/courses/mlops/exploratory-data-analysis/

opened by gexahedron 4
Alternative to Colab and Binder for running `practicalAI` in the cloud

Hi @GokuMohandas,

I've been recently taking a look at the sample Notebooks in this project and I found them really interesting and valuable for teaching purposes. We're even thinking about adding part of them to our curriculum at https://rmotr.com/ (cofounder and teacher here), in our Data Science program.

We have a small service at RMOTR that lets you run a Jupyter environment online in a single click. It is similar to Google Colab or Binder, but with the added ability to install custom requirements, clone an entire GH repo, etc. We use it for our students, so they don't have to hit the initial wall of installing the whole local Jupyter setup when they are getting started in the DS world.

You can see what practicalAI looks like in the service using this link: https://notebooks.rmotr.com/clone/gh/GokuMohandas/practicalAI

Note that all requirements listed in requirements.txt are already installed when the env is loaded, so people can start using it right away. That gives you the flexibility of adding any requirement, and not being tied to what Colab provides by default.

Do you think it would be a good choice to add it as a third launching option, as an alternative to Colab and Binder, which are already listed in the README?

I hope you like it, and I truly appreciate any feedback.

thanks.

opened by martinzugnoni 4
Foundations --> Embeddings
Typo under Model section: in "3. We'll apply convolution via filters (filter_size, vocab_size, num_filters)", should vocab_size be replaced with embedding_dim?

Typo under Experiments: "first have to decice"

Typo under Interpretability: in "padding our inputs before convolution to result is outputs", "is" should be "in".

Could there be a general explanation of moving models/data across devices? My current understanding is that they both have to be in the same place (cpu/gpu). If on gpu, just stay on gpu through the whole train/eval/predict session. I couldn't understand why, under Inference, device = torch.device("cpu") moves things back to cpu.

interpretable_trainer.predict_step(dataloader) breaks with AttributeError: 'list' object has no attribute 'dim'. The precise step is F.softmax(z), where for interpretable_model, z is a list of 3 items and it was trying to softmax a list instead of a tensor.
opened by gitgithan 3
Foundations --> Linear regression (Error in implementation)

Under Pytorch --> Interpretability: b_unscaled = b * y_scaler.scale_ + y_scaler.mean_ - np.sum(W_unscaled*X_scaler.mean_) This line seems to be missing a * (y_scaler.scale_/X_scaler.scale_) in the last np.sum term.

The table for W unscaled was also confusing. It has a sum term shown there, which means if X began with 2 predictors (this lesson only used 1 predictor), the scaled W will have 2 predictors while the sum will aggregate the 2 weights into 1 unscaled weight? Can't wrap my head around this.

Also, under Pytorch --> Interpretability, W_unscaled = W * (y_scaler.scale_/X_scaler.scale_) there was no sum used here, so looks inconsistent with the formula in the table.
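One way to check the formulas discussed above is to compare predictions made in scaled space against predictions made with the unscaled parameters. The sketch below is only an illustration under stated assumptions: it assumes standard scaling (subtract the mean, divide by the population standard deviation, as scikit-learn's StandardScaler does), and the data, the weights W and the bias b are hypothetical values rather than anything from the lesson. It uses two predictors to show that each predictor keeps its own unscaled weight and that the sum appears only in the bias term, where the factor y_scale / X_scale is already carried by W_unscaled.

```python
import statistics

# Hypothetical data: 4 samples, 2 predictors (not from the lesson).
X = [[1.0, 10.0], [2.0, 8.0], [3.0, 15.0], [4.0, 11.0]]
y = [3.0, 5.0, 4.0, 8.0]

# Per-feature mean and population std, mirroring a standard scaler's mean_ and scale_.
cols = list(zip(*X))
X_mean = [statistics.mean(c) for c in cols]
X_scale = [statistics.pstdev(c) for c in cols]
y_mean, y_scale = statistics.mean(y), statistics.pstdev(y)

# Hypothetical parameters, as if learned on the scaled data.
W = [0.7, -0.2]
b = 0.1

# One unscaled weight per predictor: no sum is involved here.
W_unscaled = [w * y_scale / s for w, s in zip(W, X_scale)]
# The sum over predictors appears only in the bias.
b_unscaled = b * y_scale + y_mean - sum(w * m for w, m in zip(W_unscaled, X_mean))


def predict_scaled(x):
    """Standardize x, apply (W, b), then undo the scaling of y."""
    z = sum(w * (xi - m) / s for w, xi, m, s in zip(W, x, X_mean, X_scale)) + b
    return z * y_scale + y_mean


def predict_unscaled(x):
    """Apply the unscaled parameters directly to the raw input."""
    return sum(w * xi for w, xi in zip(W_unscaled, x)) + b_unscaled


for x in X:
    print(round(predict_scaled(x), 6), round(predict_unscaled(x), 6))
```

Both columns printed by the loop should agree for every sample, whatever values are chosen for W and b.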

opened by gitgithan 3
alternative for colab notebook service in mainland china

Hi! We appreciate your work here; my friends and I have really learned a lot. We happened to find a platform in mainland China called KESCI (www.kesci.com) that provides a service similar to Google Colab and Kaggle (as you may know, there are connectivity problems with Google services in mainland China). They provide dev-ready and up-to-date Python and R CPU environments, all for free, with GPU support upcoming.

We also managed to translate the whole series into Chinese and applied for a column to publish it on KESCI as a series. You can access it here: https://www.kesci.com/home/column/5c20e4c5916b6200104eea63

The Computer Vision notebook has already been translated but is still being trained in the transfer-learning section. Also, do you think it is possible to add this as another launching option? I think there must be more people in China who could learn from your tutorials!

opened by gerard0315 3
Recreated content authorized by the original copyright owner

Hi GokuMohandas, I have translated all the content of Made-With-ML into Chinese and posted it on my blog (https://franztao.github.io) and my WeChat blog. As the original copyright owner, would you agree to this recreated content?

opened by franztao 2
Silly question: LabelEncoder
While creating the LabelEncoder class, I couldn't understand why the method fit(self, y) ends with return self. My understanding is that when we call this method, the object variables are updated, so there is no need to return self? Please correct me if I'm wrong, I'm just trying to reason through each step of the code.

def fit(self, y):
    classes = np.unique(y)
    for i, class_ in enumerate(classes):
        self.class_to_index[class_] = i
    self.index_to_class = {v: k for k, v in self.class_to_index.items()}
    self.classes = list(self.class_to_index.keys())
    return self  # Why?
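A common reason for returning self from fit is that it allows calls to be chained on one line; the attributes are indeed updated either way. The sketch below is a simplified stand-in for the lesson's class, not its actual implementation: it replaces np.unique with sorted(set(y)) to stay dependency-free, and the encode method and the labels are hypothetical additions for illustration.

```python
class LabelEncoder:
    """Simplified stand-in: sorted(set(y)) replaces np.unique(y)."""

    def __init__(self):
        self.class_to_index = {}
        self.index_to_class = {}
        self.classes = []

    def fit(self, y):
        for i, class_ in enumerate(sorted(set(y))):
            self.class_to_index[class_] = i
        self.index_to_class = {v: k for k, v in self.class_to_index.items()}
        self.classes = list(self.class_to_index.keys())
        return self  # hand the fitted object back so calls can be chained

    def encode(self, y):
        # Hypothetical helper: map each label to its integer index.
        return [self.class_to_index[label] for label in y]


labels = ["nlp", "cv", "nlp"]  # hypothetical labels

# Because fit returns self, construction, fitting and encoding chain together.
encoded = LabelEncoder().fit(labels).encode(labels)
print(encoded)  # [1, 0, 1]
```

Without return self, fit would return None and the chained .encode(...) call would raise an AttributeError, so the two steps would have to be written on separate lines.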
opened by knosing 2
Issue in viewing the experiment in MLflow
I am running the tagifai.ipynb notebook on the windows platform but facing difficulty viewing the experiment in MLflow.

Steps Done:

Cloned the repo

Running the "mlops-course\notebooks\tagifai.ipynb" in vs code locally.

To run the server "mlflow server -h 0.0.0.0 -p 8000 --backend-store-uri /experiments/" from the location of the notebook, experiments is the next folder inside it. # $PWD is omitted because of windows.

Opening the "http://localhost:8000/#/"

Observation :

No signs of experiment run.

Image attached below for ref.

Please provide assistance with this issue.
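One possible cause, offered only as a guess: with $PWD omitted, /experiments/ no longer points at the folder next to the notebook, so the server may be reading an empty store. The sketch below builds an absolute file URI for that folder with the standard library and prints a server command that uses it. It assumes the notebook writes its runs to an experiments folder in the current working directory; nothing here calls MLflow itself.

```python
from pathlib import Path

# Assumption: runs are stored in an "experiments" folder inside the current directory.
store_dir = Path.cwd() / "experiments"

# Absolute file URI, e.g. file:///C:/.../experiments on Windows.
store_uri = store_dir.as_uri()

# Print the command instead of running it, so this sketch starts nothing.
command = f"mlflow server -h 0.0.0.0 -p 8000 --backend-store-uri {store_uri}"
print(command)
```

If the printed URI does not match the location the notebook logs to, the server and the notebook are looking at different stores, which would be consistent with an empty experiment list.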

Thanks
opened by mukul74 2

Foundations --> Transformers

Hi Goku... I am really thankful for all your amazing tutorials.

I however was facing some issues in the Transformers lecture. There are a few minor bugs here with missing variables and imports; which was not an issue.

The training code however is missing the block:

# Train
best_model = trainer.train(
    num_epochs, patience, train_dataloader, val_dataloader)

Also, when I wrote this and ran it, I got an error:

/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:14: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:15: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  from ipykernel import kernelapp as app
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-68-8d0f0dee99db> in <module>()
      1 # Train
      2 best_model = trainer.train(
----> 3     num_epochs, patience, train_dataloader, val_dataloader)

6 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in dropout(input, p, training, inplace)
   1277     if p < 0.0 or p > 1.0:
   1278         raise ValueError("dropout probability has to be between 0 and 1, " "but got {}".format(p))
-> 1279     return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
   1280 
   1281 

TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str

Apparently, the issue comes from the line :

seq, pool = self.transformer(input_ids=ids, attention_mask=masks)

wherein the "pool" returned is of class string. Upon printing its type and value, I get the following:

<class 'str'>
pooler_output

Can you please have a look into this. Thanks in Advance!!
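A likely explanation, assuming a recent version of the Hugging Face transformers library in which models return a dict-like output object by default: tuple-unpacking such an object iterates over its keys, which would produce exactly the string pooler_output reported above. The sketch below reproduces that behavior with a plain dict standing in for the model output; the values are hypothetical and no real model is involved.

```python
# Stand-in for a dict-like model output (hypothetical values, no real model).
outputs = {
    "last_hidden_state": [[0.1, 0.2], [0.3, 0.4]],
    "pooler_output": [0.5, 0.6],
}

# Tuple-unpacking a dict iterates over its KEYS, so both names become strings.
seq, pool = outputs
print(type(pool), pool)  # <class 'str'> pooler_output
pool_was_key = pool

# Reading the entries by name returns the actual values instead.
seq, pool = outputs["last_hidden_state"], outputs["pooler_output"]
print(type(pool), pool)  # <class 'list'> [0.5, 0.6]
```

With the real library, commonly suggested remedies are to read the fields by name from the returned object or to pass return_dict=False so that a tuple is returned; which of these applies depends on the installed version and is not confirmed in this thread.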

opened by shashankvasisht 2

Lambda function missing

Hi Goku,

I'm going through the Pandas lesson and I noticed that in the Feature engineering section, you mention applying a lambda function to create a new feature, but the code for it does not appear. I think it's just a minor omission.

Regards, Roberto

opened by jroberayalas 2
Lessons page, Basic ML: "Notebook not found".

Problem: Starting from either https://practicalai.me/learn/lessons/ or https://github.com/practicalAI/practicalAI, when attempting to click any of the lessons I see "Notebook not found". Proposed fix: Possibly "basic_ml" should be added to the path?

When I click "authorize with Github" I see the same thing:

The link given then does not work:

In the case of the "linear regression" notebook, the non-working link given on the "lessons" page is https://colab.research.google.com/github/practicalAI/practicalAI/blob/master/notebooks/04_Linear_Regression.ipynb

Whereas if you go find it on github directly, it is https://colab.research.google.com/github/practicalAI/practicalAI/blob/master/notebooks/basic_ml/04_Linear_Regression.ipynb

opened by ColinLeongUDRI 2
Update 07_Logistic_Regression.ipynb - softmax definition fix

Changed $\hat{y} = \frac{e^{XW_y}}{\sum_j e^{XW}}$ to $\hat{y} = \frac{e^{XW}}{\sum_j e^{XW_j}}$ in the equation defining softmax (located after the second paragraph).
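As a small illustration of what the softmax equation computes, the sketch below normalizes a list of per-class scores so that they sum to 1. It is a minimal standard-library version, not the notebook's code; the logits are hypothetical and stand in for the products of the inputs with each class's weights.

```python
import math


def softmax(logits):
    """Turn a list of per-class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)  # the sum over all classes j in the denominator
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # hypothetical logits, one per class
print([round(p, 3) for p in probs])
```

The numerator is the exponentiated score of one class and the denominator sums the exponentiated scores over all classes, which is why the index j belongs inside the sum.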

opened by matospiso 1
Update 07_Logistic_Regression.ipynb

Fixes an f-string format error caused by nested double quotes. In the first example of this error, print (f"m:b = {class_counts["malignant"]/class_counts["benign"]:.2f}") should use single outer quotes, as in print (f'm:b = {class_counts["malignant"]/class_counts["benign"]:.2f}')
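The quoting rule behind this fix can be shown in a few lines. Before Python 3.12, an f-string cannot reuse its own delimiter inside the braces, so the outer and inner quotes have to differ; from Python 3.12 onward the original form is also accepted. The counts below are hypothetical and only serve to make the snippet runnable.

```python
class_counts = {"malignant": 30, "benign": 60}  # hypothetical counts

# Single quotes outside, double quotes inside: valid on every Python 3.6+ version.
message = f'm:b = {class_counts["malignant"] / class_counts["benign"]:.2f}'
print(message)  # m:b = 0.50
```

Swapping the roles (double quotes outside, single quotes for the dictionary keys) works equally well.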

opened by data-steve 0

Owner

Goku Mohandas

Founder @madewithml. AI Research @apple. Author @oreillymedia. ML Lead @Ciitizen. Alum @hopkinsmedicine and @gatech

GitHub https://madewithml.com

Auto updating website that tracks closed & open issues/PRs on scikit-learn/scikit-learn.

Repository Status for Scikit-learn Live webpage Auto updating website that tracks closed & open issues/PRs on scikit-learn/scikit-learn. Running local

6 Dec 27, 2022

A scikit-learn based module for multi-label et. al. classification

scikit-multilearn scikit-multilearn is a Python module capable of performing multi-label learning tasks. It is built on-top of various scientific Pyth

802 Jan 1, 2023

Highly interpretable classifiers for scikit learn, producing easily understood decision rules instead of black box models

Highly interpretable, sklearn-compatible classifier based on decision rules This is a scikit-learn compatible wrapper for the Bayesian Rule List class

482 Nov 19, 2022

Automated Machine Learning with scikit-learn

auto-sklearn auto-sklearn is an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator. Find the documentation here

6.7k Jan 7, 2023

Relevance Vector Machine implementation using the scikit-learn API.

scikit-rvm scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API. Quicks

204 Nov 18, 2022

Distributed scikit-learn meta-estimators in PySpark

sk-dist: Distributed scikit-learn meta-estimators in PySpark What is it? sk-dist is a Python package for machine learning built on top of scikit-learn

282 Dec 9, 2022

Iris species predictor app is used to classify iris species created using python's scikit-learn, fastapi, numpy and joblib packages.

Iris Species Predictor Iris species predictor app is used to classify iris species using their sepal length, sepal width, petal length and petal width

5 Apr 5, 2022

A collection of Scikit-Learn compatible time series transformers and tools.

tsfeast A collection of Scikit-Learn compatible time series transformers and tools. Installation Create a virtual environment and install: From PyPi p

0 Mar 30, 2022

Penguins species predictor app is used to classify penguins species created using python's scikit-learn, fastapi, numpy and joblib packages.

Penguins Classification App Penguins species predictor app is used to classify penguins species using their island, sex, bill length (mm), bill depth

3 Apr 5, 2022

Scikit learn library models to account for data and concept drift.

liquid_scikit_learn Scikit learn library models to account for data and concept drift. This python library focuses on solving data drift and concept d

7 Nov 18, 2021

Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets

Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets Datasets Used: Iris dataset,

2 Nov 18, 2021

K-Means clustering example with Python and Scikit-learn

Unsupervised-Machine-Learning Flat Clustering K-Means clustering example with Python and Scikit-learn Flat clustering Clustering algorithms group a se

1 Dec 13, 2021

A Python implementation of GRAIL, a generic framework to learn compact time series representations.

GRAIL A Python implementation of GRAIL, a generic framework to learn compact time series representations. Requirements Python 3.6+ numpy scipy tslearn

3 Nov 24, 2021

Scikit-Learn useful pre-defined Pipelines Hub

Scikit-Pipes Scikit-Learn useful pre-defined Pipelines Hub Usage: Install scikit-pipes It's advised to install sklearn-genetic using a virtual env, in

1 Apr 26, 2022

Predicting Baseball Metric Clusters: Clustering Application in Python Using scikit-learn

Clustering Clustering Application in Python Using scikit-learn This repository contains the prediction of baseball metric clusters using MLB Statcast

2 Apr 18, 2022

Learn Python in 100 days: simple steps to follow from beginner to master of every aspect of Python programming, including side projects that you can use as demo projects for your personal portfolio


6 Nov 5, 2022

To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

1 Jan 11, 2022

Book Recommender System Using Sci-kit learn N-neighbours

Model-Based-Recommender-Engine I created a book Recommender System using Sci-kit learn's N-neighbours algorithm for my model and the streamlit library

1 Jan 13, 2022

Painless Machine Learning for python based on scikit-learn

PlainML Painless Machine Learning Library for python based on scikit-learn. Install pip install plainml Example from plainml import KnnModel, load_ir

1 Aug 6, 2022