# Deploying ML models with FastAPI, Docker, and Kubernetes
By: Sayak Paul and Chansung Park
This project shows how to serve an ONNX-optimized image classification model as a RESTful web service with FastAPI, Docker, and Kubernetes (k8s). The idea is to first Dockerize the API and then deploy it on a k8s cluster running on Google Kubernetes Engine (GKE). The whole build-and-deploy flow is automated with GitHub Actions.
## Deploying the model as a service with k8s
- We decouple the model optimization part from our API code. The optimization part is available within the `notebooks/TF_to_ONNX.ipynb` notebook.
- Then we test the API locally. You can find the instructions within the `api` directory.
- To deploy the API, we define our `deployment.yaml` workflow file inside `.github/workflows`. It does the following tasks (a rough manual equivalent is sketched after this list):
  - Looks for any changes in the specified directory. If there are any changes:
    - Builds and pushes the latest Docker image to Google Container Registry (GCR).
    - Deploys the Docker container on the k8s cluster running on GKE.
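For reference, here is roughly what the pipeline does, written out as manual shell steps. This is a minimal sketch, not the actual workflow: the image name `fastapi-onnx`, the `saved_model_dir` path, the cluster placeholders, and the `.kube/` kustomize directory are all illustrative assumptions.

```shell
# 1. Export the TensorFlow model to ONNX (the real conversion lives in
#    notebooks/TF_to_ONNX.ipynb; the tf2onnx CLI is one way to do the same).
python -m tf2onnx.convert --saved-model saved_model_dir --output model.onnx

# 2. Build the API image and smoke-test it locally.
docker build -t fastapi-onnx:latest api/
docker run --rm -p 80:80 fastapi-onnx:latest

# 3. Push the image to GCR and roll it out on the GKE cluster.
gcloud auth configure-docker   # allow docker to push to gcr.io
docker tag fastapi-onnx:latest gcr.io/<PROJECT_ID>/fastapi-onnx:latest
docker push gcr.io/<PROJECT_ID>/fastapi-onnx:latest
gcloud container clusters get-credentials <CLUSTER_NAME> --zone <ZONE>
kubectl apply -k .kube/        # kustomize-managed manifests
```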
## Configurations needed beforehand
- Create a k8s cluster on GKE. Here's a relevant resource. (A scripted version of this setup is sketched after this list.)
- Create a service account key (JSON) file. It's a good practice to only grant it the roles required for the project. For example, for this project, we created a fresh service account and granted it permissions for the following: Storage Admin, GKE Developer, and GCR Developer.
- Create a secret named `GCP_CREDENTIALS` on your GitHub repository and copy-paste the contents of the service account key file into the secret.
- Configure bucket storage related permissions for the service account:

  ```shell
  $ export PROJECT_ID=<PROJECT_ID>
  $ export ACCOUNT=<ACCOUNT>

  $ gcloud -q projects add-iam-policy-binding ${PROJECT_ID} \
      --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \
      --role roles/storage.admin

  $ gcloud -q projects add-iam-policy-binding ${PROJECT_ID} \
      --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \
      --role roles/storage.objectAdmin

  $ gcloud -q projects add-iam-policy-binding ${PROJECT_ID} \
      --member=serviceAccount:${ACCOUNT}@${PROJECT_ID}.iam.gserviceaccount.com \
      --role roles/storage.objectCreator
  ```
- If you're already on the `main` branch, then upon a new push, the workflow defined in `.github/workflows/deployment.yaml` should run automatically. Here's how the final outputs should look (run link):
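If you prefer to script these steps, the sketch below shows one way to do the cluster, service account, and secret setup from the command line. The cluster name, zone, node configuration, and the `gha-deployer` service-account name are illustrative assumptions; only the GKE role is shown (grant the storage roles the same way), and the last step needs the GitHub CLI.

```shell
export PROJECT_ID=<PROJECT_ID>

# 1. Create a small GKE cluster (machine type, size, and zone are up to you).
gcloud container clusters create fastapi-cluster \
    --zone us-central1-b --num-nodes 2 --machine-type e2-standard-2

# 2. Create a dedicated service account, grant it a role, and mint a JSON key.
gcloud iam service-accounts create gha-deployer
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member=serviceAccount:gha-deployer@${PROJECT_ID}.iam.gserviceaccount.com \
    --role=roles/container.developer
gcloud iam service-accounts keys create key.json \
    --iam-account=gha-deployer@${PROJECT_ID}.iam.gserviceaccount.com

# 3. Upload the key as the GCP_CREDENTIALS secret on the GitHub repository.
gh secret set GCP_CREDENTIALS < key.json
```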
## Notes
- Since we use CPU-based pods within the k8s cluster, we apply ONNX optimizations, which are known to provide performance speed-ups in CPU-based environments. If you are using GPU-based pods, then look into TensorRT.
- We use Kustomize to manage the deployment on k8s (see the sketch below).
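Kustomize support is built into `kubectl`, so applying the kustomize-managed manifests looks roughly like the following. This is a sketch: the `.kube/` directory name is an assumption, not necessarily where this repo keeps its manifests.

```shell
# Render the kustomization to stdout first, to inspect what will be applied.
kubectl kustomize .kube/

# Apply it directly; note -k (kustomize directory) instead of -f (plain file).
kubectl apply -k .kube/
```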
## Querying the API endpoint
From workflow outputs, you should see something like so:

```shell
NAME             TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)        AGE
fastapi-server   LoadBalancer   xxxxxxxxxx   xxxxxxxxxx    80:30768/TCP   23m
kubernetes       ClusterIP      xxxxxxxxxx   <none>        443/TCP        160m
```
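This listing is what `kubectl get services` prints. Once your local `kubectl` is pointed at the GKE cluster, you can reproduce it yourself (the cluster name and zone below are placeholders):

```shell
# Fetch cluster credentials, then list the services and their external IPs.
gcloud container clusters get-credentials <CLUSTER_NAME> --zone <ZONE>
kubectl get services
```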
Note the `EXTERNAL-IP` corresponding to `fastapi-server` (if you have named your service that way). Then cURL it:
```shell
curl -X POST -F image_file=@cat.jpg -F with_resize=True -F with_post_process=True \
    http://{EXTERNAL-IP}:80/predict/image
```
You should get the following output (if you're using the `cat.jpg` image present in the `api` directory):

```shell
"{\"Label\": \"tabby\", \"Score\": \"0.538\"}"
```
The request assumes that you have a file called `cat.jpg` present in your working directory.
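Since the endpoint returns the prediction as a JSON-encoded string, you can unwrap it on the command line; a small sketch, assuming `jq` is installed and using the same request as above:

```shell
# The body is a JSON string that itself contains JSON: decode the outer
# string with `jq -r .`, then query the inner object for the label.
curl -s -X POST -F image_file=@cat.jpg -F with_resize=True \
    -F with_post_process=True http://{EXTERNAL-IP}:80/predict/image \
    | jq -r . | jq -r .Label
# -> tabby
```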
## TODOs
- Set up logging for the k8s pods.
- Find a better way to report the latest API endpoint.
## Acknowledgements
ML-GDE program for providing GCP credit support.