BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

BentoML

Last update: Jan 4, 2023


Overview

Model Serving Made Easy

BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Supports multiple ML frameworks, including TensorFlow, PyTorch, Keras, XGBoost, and more
Cloud native deployment with Docker, Kubernetes, AWS, Azure and many more
High-Performance online API serving and offline batch serving
Web dashboards and APIs for model registry and deployment management

BentoML bridges the gap between Data Science and DevOps. By providing a standard interface for describing a prediction service, BentoML abstracts away how to run model inference efficiently and how model serving workloads can integrate with cloud infrastructures. See how it works!

Join our community on Slack 👈

Documentation

BentoML documentation: https://docs.bentoml.org/

Quickstart Guide, try it out on Google Colab
Core Concepts
API References
FAQ
Example projects: bentoml/Gallery

Key Features

Production-ready online serving:

Support multiple ML frameworks including PyTorch, TensorFlow, Scikit-Learn, XGBoost, and many more
Containerized model server for production deployment with Docker, Kubernetes, OpenShift, AWS ECS, Azure, GCP GKE, etc
Adaptive micro-batching for optimal online serving performance
Discover and package all dependencies automatically, including PyPI, conda packages and local python modules
Serve compositions of multiple models
Serve multiple endpoints in one model server
Serve any Python code along with trained models
Automatically generate REST API spec in Swagger/OpenAPI format
Prediction logging and feedback logging endpoint
Health check endpoint and Prometheus /metrics endpoint for monitoring
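
To make the micro-batching item above concrete, here is a toy sketch in plain asyncio. It is not BentoML's implementation and it is not adaptive (the wait window is fixed); MicroBatcher, max_batch_size, and max_wait_s are hypothetical names chosen for illustration. It only shows the core idea: hold individual requests for a short window, hand them to the model as one batch, then return each caller its own result.

```python
import asyncio


class MicroBatcher:
    """Toy micro-batcher (hypothetical, not BentoML's code)."""

    def __init__(self, batch_fn, max_batch_size=4, max_wait_s=0.05):
        self.batch_fn = batch_fn            # called with a list, returns a list
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.queue = asyncio.Queue()        # construct inside a running event loop
        self.batch_sizes = []               # recorded for inspection only

    async def submit(self, item):
        """Queue one request and wait for its individual result."""
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def worker(self):
        """Collect requests into batches and resolve each caller's future."""
        loop = asyncio.get_running_loop()
        while True:
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_wait_s
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            self.batch_sizes.append(len(batch))
            results = self.batch_fn([item for item, _ in batch])
            for (_, future), result in zip(batch, results):
                future.set_result(result)


async def demo():
    # The lambda stands in for a model's batch predict function.
    batcher = MicroBatcher(lambda xs: [x * 2 for x in xs])
    task = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(*(batcher.submit(i) for i in range(5)))
    task.cancel()
    return results, batcher.batch_sizes


results, batch_sizes = asyncio.run(demo())
```

The trade-off in this design is latency against throughput: a longer window produces larger batches but makes the first request in each batch wait longer.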

Standardize model serving and deployment workflow for teams:

Central repository for managing all your team's prediction services via Web UI and API
Launch offline batch inference job from CLI or Python
One-click deployment to cloud platforms including AWS EC2, AWS Lambda, AWS SageMaker, and Azure Functions
Distributed batch or streaming serving with Apache Spark
Utilities that simplify CI/CD pipelines for ML
Automated offline batch inference job with Dask (roadmap)
Advanced model deployment for Kubernetes ecosystem (roadmap)
Integration with training and experimentation management products including MLFlow, Kubeflow (roadmap)

ML Frameworks

Scikit-Learn - Docs | Examples
PyTorch - Docs | Examples
TensorFlow 2 - Docs | Examples
TensorFlow Keras - Docs | Examples
XGBoost - Docs | Examples
LightGBM - Docs | Examples
FastText - Docs | Examples
FastAI - Docs | Examples
H2O - Docs | Examples
ONNX - Docs | Examples
Spacy - Docs | Examples
Statsmodels - Docs | Examples
CoreML - Docs
Transformers - Docs
Gluon - Docs
Detectron - Docs
PaddlePaddle - Docs | Example
EvalML - Docs
EasyOCR - Docs
ONNX-MLIR - Docs

Deployment Options

Be sure to check out the deployment overview doc to understand which deployment option is best suited for your use case.

One-click deployment with BentoML:
Deploy with open-source platforms:
- Docker
- Kubernetes
- Knative
- Kubeflow
- KFServing
- Clipper
Manual cloud deployment guides:

Introduction

BentoML provides APIs for defining a prediction service, a servable model so to speak, which includes the trained ML model itself, plus its pre-processing and post-processing code, input/output specifications, and dependencies. Here's what a simple prediction service looks like in BentoML:

import pandas as pd

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput, JsonOutput
from bentoml.frameworks.sklearn import SklearnModelArtifact

# BentoML packages local python modules automatically for deployment
from my_ml_utils import my_encoder

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('my_model')])
class MyPredictionService(BentoService):
    """
    A simple prediction service exposing a Scikit-learn model
    """

    @api(input=DataframeInput(), output=JsonOutput(), batch=True)
    def predict(self, df: pd.DataFrame):
        """
        An inference API named `predict` that takes tabular data in pandas.DataFrame 
        format as input, and returns Json Serializable value as output.

        A batch API is expected to receive a list of inference inputs and should return
        a list of prediction results.
        """
        model_input_df = my_encoder.fit_transform(df)
        predictions = self.artifacts.my_model.predict(model_input_df)

        return list(predictions)

This can be easily plugged into your model training process: import your bentoml prediction service class, pack it with your trained model, and call save to persist the entire prediction service at the end, which creates a BentoML bundle:

from my_prediction_service import MyPredictionService
svc = MyPredictionService()
svc.pack('my_model', my_sklearn_model)
svc.save()  # saves to $HOME/bentoml/repository/MyPredictionService/{version}/

The generated BentoML bundle is a file directory that contains all the code files, serialized models, and configs required for reproducing this prediction service for inference. BentoML automatically captures all Python dependency information and keeps everything versioned and managed together in one place.

BentoML automatically generates a version ID for this bundle and keeps track of all bundles created under the $HOME/bentoml directory. With a BentoML bundle, a user can start a local API server hosting it, either by its file path or by its name and version:

bentoml serve MyPredictionService:latest

# alternatively
bentoml serve $HOME/bentoml/repository/MyPredictionService/{version}/
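
Once the server is running, the predict API can be called over HTTP. The sketch below uses only the standard library and builds the request separately from sending it. It assumes the server listens on port 5000 (the port mapped in the docker run example in this section), that the API is exposed at /predict, and that DataframeInput accepts a JSON list of rows; the helper names and the four-feature record are hypothetical. Check the generated OpenAPI spec for your own service before relying on this shape.

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://127.0.0.1:5000"):
    """Build (but do not send) a POST request for the `predict` API.

    Assumptions: the endpoint path is /predict and the body is a JSON
    list of rows, one inner list per input record.
    """
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical input: one record with four numeric features.
request = build_predict_request([[5.1, 3.5, 1.4, 0.2]])
# predictions = send(request)  # uncomment with a server running locally
```

Keeping request construction apart from the network call makes the payload easy to inspect or unit-test without a live server.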

A docker container image that's ready for production deployment can be created now with just one command:

bentoml containerize MyPredictionService:latest -t my_prediction_service:v3

docker run -p 5000:5000 my_prediction_service:v3 --workers 2

The container image produced will have all the required dependencies installed. Besides the model inference API, the containerized BentoML model server also comes with Prometheus metrics, health check endpoint, prediction logging, and tracing support out-of-the-box. This makes it super easy for your DevOps team to incorporate your models into production systems.

BentoML's model management component is called Yatai, which means food cart in Japanese; you can think of it as where you'd store your bentos 🍱. Yatai provides a CLI, Web UI, and Python API for accessing the BentoML bundles you have created, and you can start a Yatai server for your team to manage all models on cloud storage (S3, GCS, MinIO, etc.) and build CI/CD workflows around it. Learn more about it here.

Read the Quickstart Guide to learn more about the basic functionalities of BentoML. You can also try it out here on Google Colab.

Why BentoML

Moving trained Machine Learning models to serving applications in production is hard. It is a sequential process across data science, engineering, and DevOps teams: after a model is trained by the data science team, they hand it over to the engineering team to refine and optimize the code and create an API, before DevOps can deploy it.

And most importantly, Data Science teams want to continuously repeat this process, monitor the models deployed in production, and ship new models quickly. It often takes months for an engineering team to build a model serving & deployment solution that allows data science teams to ship new models in a repeatable and reliable way.

BentoML is a framework designed to solve this problem. It provides high-level APIs for data science teams to create prediction services, abstracting away DevOps' infrastructure needs and performance optimizations in the process. This allows DevOps teams to work seamlessly side by side with data science, deploying and operating their models packaged in the BentoML format in production.

Check out the Frequently Asked Questions page to see how BentoML compares to TensorFlow Serving, Clipper, AWS SageMaker, MLFlow, etc.

Contributing

Have questions or feedback? Post a new GitHub issue or discuss in our Slack channel.

Want to help build BentoML? Check out our contributing guide and the development guide.

Releases

BentoML is under active development and is evolving rapidly. It is currently a Beta release; we may change APIs in future releases, and there are still major features being worked on.

Read more about the latest updates from the releases page.

Usage Tracking

BentoML by default collects anonymous usage data using Amplitude. It only collects the BentoML library's own actions and parameters; no user or model data is collected. Here is the code that does it.

This helps the BentoML team understand how the community is using this tool and what to build next. You can easily opt out of usage tracking by running BentoML commands with the --do-not-track option.

% bentoml [command] --do-not-track

or by setting the BENTOML_DO_NOT_TRACK environment variable to True.

% export BENTOML_DO_NOT_TRACK=True
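
For code that drives BentoML from Python rather than from a shell (a notebook or a training script, for example), the same environment variable can be set in-process. This is a minimal sketch; it assumes the variable is read when bentoml is imported or first used, which is why it is set before the import.

```python
import os

# Opt out of usage tracking for this process and any child processes.
# Assumption: this must run before `import bentoml`.
os.environ["BENTOML_DO_NOT_TRACK"] = "True"

# import bentoml  # import only after the variable is set
tracking_disabled = os.environ.get("BENTOML_DO_NOT_TRACK") == "True"
```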

License

Apache License 2.0

Comments

Failure of bento serve in production with AnyIO error

Describe the bug

The sklearn example available at https://docs.bentoml.org/en/latest/quickstart.html#installation fails at inference time with an AnyIO error. The bento service is deployed in production mode with bentoml serve iris_classifier:latest --production. When deployed in development mode, inference works as expected.

To Reproduce

Steps to reproduce the issue:

Install BentoML: pip install bentoml --pre
Train and save bento sklearn model:

import bentoml

from sklearn import svm
from sklearn import datasets

# Load predefined training set to build an example model
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Model Training
clf = svm.SVC(gamma='scale')
clf.fit(X, y)

# Call to bentoml.<FRAMEWORK>.save(<MODEL_NAME>, model)
# In order to save to BentoML's standard format in a local model store
bentoml.sklearn.save("iris_clf", clf)

Create BentoML service

# bento.py
import bentoml
import bentoml.sklearn
import numpy as np

from bentoml.io import NumpyNdarray

# Load the runner for the latest ScikitLearn model we just saved
iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")

# Create the iris_classifier service with the ScikitLearn runner
# Multiple runners may be specified if needed in the runners array
# When packaged as a bento, the runners here will be included
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

# Create API function with pre- and post- processing logic with your new "svc" annotation
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_ndarray: np.ndarray) -> np.ndarray:
    # Define pre-processing logic
    result = iris_clf_runner.run(input_ndarray)
    # Define post-processing logic
    return result

Create BentoML configuration file

# bentofile.yaml
service: "bento.py:svc"  # A convention for locating your service: <YOUR_SERVICE_PY>:<YOUR_SERVICE_ANNOTATION>
include:
 - "*.py"  # A pattern for matching which files to include in the bento
python:
  packages:
   - scikit-learn  # Additional libraries to be included in the bento

Build the BentoML service: bentoml build
Run the bento in production: bentoml serve iris_classifier:latest --production
Send inference request

import requests
response = requests.post(
    "http://127.0.0.1:5000/predict",
    headers={"content-type": "application/json"},
    data="[5,4,3,2]").text
print(response)

Expected behavior

The response should be the classification result, namely 1.

Screenshots/Logs

The error generated by the server is the following:

 Exception on /predict [POST]
                       ╭──────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────╮
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/anyio/from_thread.py:31 in run                                                                                              │
                       │                                                                                                                                                                    │
                       │    28 │                                                                                                                                                            │
                       │    29 │   """                                                                                                                                                      │
                       │    30 │   try:                                                                                                                                                     │
                       │ ❱  31 │   │   asynclib = threadlocals.current_async_module                                                                                                         │
                       │    32 │   except AttributeError:                                                                                                                                   │
                       │    33 │   │   raise RuntimeError('This function can only be run from an AnyIO worker thread')                                                                      │
                       │    34                                                                                                                                                              │
                       ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                       AttributeError: '_thread._local' object has no attribute 'current_async_module'

                       During handling of the above exception, another exception occurred:

                       ╭──────────────────────────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────────────────────────╮
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/server/service_app.py:356 in api_func                                                                     │
                       │                                                                                                                                                                    │
                       │   353 │   │   │   │   │   if isinstance(api.input, Multipart):                                                                                                     │
                       │   354 │   │   │   │   │   │   output: t.Any = await run_in_threadpool(api.func, **input_data)                                                                      │
                       │   355 │   │   │   │   │   else:                                                                                                                                    │
                       │ ❱ 356 │   │   │   │   │   │   output: t.Any = await run_in_threadpool(api.func, input_data)                                                                        │
                       │   357 │   │   │   │   response = await api.output.to_http_response(output)                                                                                         │
                       │   358 │   │   │   except BentoMLException as e:                                                                                                                    │
                       │   359 │   │   │   │   log_exception(request, sys.exc_info())                                                                                                       │
                       │ /usr/local/lib/python3.8/dist-packages/starlette/concurrency.py:40 in run_in_threadpool                                                                            │
                       │                                                                                                                                                                    │
                       │   37 │   elif kwargs:  # pragma: no cover                                                                                                                          │
                       │   38 │   │   # loop.run_in_executor doesn't accept 'kwargs', so bind them in here                                                                                  │
                       │   39 │   │   func = functools.partial(func, **kwargs)                                                                                                              │
                       │ ❱ 40 │   return await loop.run_in_executor(None, func, *args)                                                                                                      │
                       │   41                                                                                                                                                               │
                       │   42                                                                                                                                                               │
                       │   43 class _StopIteration(Exception):                                                                                                                              │
                       │                                                                                                                                                                    │
                       │ /usr/lib/python3.8/concurrent/futures/thread.py:57 in run                                                                                                          │
                       │                                                                                                                                                                    │
                       │    54 │   │   │   return                                                                                                                                           │
                       │    55 │   │                                                                                                                                                        │
                       │    56 │   │   try:                                                                                                                                                 │
                       │ ❱  57 │   │   │   result = self.fn(*self.args, **self.kwargs)                                                                                                      │
                       │    58 │   │   except BaseException as exc:                                                                                                                         │
                       │    59 │   │   │   self.future.set_exception(exc)                                                                                                                   │
                       │    60 │   │   │   # Break a reference cycle with the exception 'exc'                                                                                               │
                       │                                                                                                                                                                    │
                       │ /root/bentoml/bentos/iris_classifier/fgzarmenwoh6jsyx/src/bento.py:20 in predict                                                                                   │
                       │                                                                                                                                                                    │
                       │   17 @svc.api(input=NumpyNdarray(), output=NumpyNdarray())                                                                                                         │
                       │   18 def predict(input_ndarray: np.ndarray) -> np.ndarray:                                                                                                         │
                       │   19 │   # Define pre-processing logic                                                                                                                             │
                       │ ❱ 20 │   result = iris_clf_runner.run(input_ndarray)                                                                                                               │
                       │   21 │   # Define post-processing logic                                                                                                                            │
                       │   22 │   return result                                                                                                                                             │
                       │   23                                                                                                                                                               │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/runner.py:141 in run                                                                               │
                       │                                                                                                                                                                    │
                       │   138 │   │   return await self._impl.async_run_batch(*args, **kwargs)                                                                                             │
                       │   139 │                                                                                                                                                            │
                       │   140 │   def run(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                                   │
                       │ ❱ 141 │   │   return self._impl.run(*args, **kwargs)                                                                                                               │
                       │   142 │                                                                                                                                                            │
                       │   143 │   def run_batch(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                             │
                       │   144 │   │   return self._impl.run_batch(*args, **kwargs)                                                                                                         │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/remote.py:111 in run                                                                               │
                       │                                                                                                                                                                    │
                       │   108 │   def run(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                                   │
                       │   109 │   │   import anyio                                                                                                                                         │
                       │   110 │   │                                                                                                                                                        │
                       │ ❱ 111 │   │   return anyio.from_thread.run(self.async_run, *args, **kwargs)                                                                                        │
                       │   112 │                                                                                                                                                            │
                       │   113 │   def run_batch(self, *args: t.Any, **kwargs: t.Any) -> t.Any:                                                                                             │
                       │   114 │   │   import anyio                                                                                                                                         │
                       │                                                                                                                                                                    │
                       │ /usr/local/lib/python3.8/dist-packages/anyio/from_thread.py:33 in run                                                                                              │
                       │                                                                                                                                                                    │
                       │    30 │   try:                                                                                                                                                     │
                       │    31 │   │   asynclib = threadlocals.current_async_module                                                                                                         │
                       │    32 │   except AttributeError:                                                                                                                                   │
                       │ ❱  33 │   │   raise RuntimeError('This function can only be run from an AnyIO worker thread')                                                                      │
                       │    34 │                                                                                                                                                            │
                       │    35 │   return asynclib.run_async_from_thread(func, *args)                                                                                                       │
                       │    36                                                                                                                                                              │
                       ╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                       RuntimeError: This function can only be run from an AnyIO worker thread
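
The two tracebacks above suggest the mechanism: anyio.from_thread.run looks up a thread-local attribute, and the request handler was dispatched through loop.run_in_executor, whose threads never had that attribute set. The stdlib-only sketch below imitates that check with a hypothetical _threadlocals object and hypothetical helper names. It does not use AnyIO and is not a fix, only an illustration of why the same call can succeed in one kind of thread and fail in another.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the thread-local object AnyIO consults.
_threadlocals = threading.local()


def check_worker_thread():
    """Imitate the guard at the top of anyio.from_thread.run."""
    try:
        _threadlocals.current_async_module
    except AttributeError:
        raise RuntimeError(
            "This function can only be run from an AnyIO worker thread"
        )
    return "ok"


def run_in_marked_thread(func):
    """Run func in a thread that sets the marker first."""
    outcome = {}

    def target():
        # A framework-managed worker thread would set this before running user code.
        _threadlocals.current_async_module = "asyncio"
        outcome["value"] = func()

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
    return outcome["value"]


# Thread prepared with the marker: the guard passes.
marked_result = run_in_marked_thread(check_worker_thread)

# Plain executor thread (the kind loop.run_in_executor uses): the guard fails.
with ThreadPoolExecutor(max_workers=1) as pool:
    try:
        plain_result = pool.submit(check_worker_thread).result()
    except RuntimeError as exc:
        plain_result = f"RuntimeError: {exc}"
```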

Environment:

OS: Ubuntu 20.04
Python Version: 3.8
BentoML Version: 1.0.0.a3
AnyIO version: 3.5.0

Installed the requirements listed at https://github.com/bentoml/BentoML/blob/main/requirements/tests-requirements.txt

Additional context

bug

opened by andreea-anghel 29

[tests] Move yatai e2e tests to Github CI
opened by yubozhao 26
Issue with conda dependencies with custom channels

Describe the bug

This is related to some issues I've mentioned where you're building in an environment with only access to specific repositories.

In short, the Dockerfile will call a script that runs conda env update -n base -f ./environment.yml. The problem is that this will also use the global conda settings. Unlike conda install, there is no --override-channels argument.

So the build will break in such an environment.

To Reproduce

This is quite difficult to reproduce. Other than building in such an environment, you could create a service with a specific conda URL as well as conda_overwrite_channels=True, which will create an environment.yml with only that URL. If you add -v to the conda env update command, you should see it installing from other URLs.

Expected behavior

It should be the case that only the conda channels specified are used.

Potential Fix

One fix is to add the line conda config --system --remove channels defaults to the bentoml-init.sh file, just before the conda env update line. This removes the default URLs so that the only channels used are those specified in the environment.yml file.

Since conda is not used after this step it should not cause other issues. There might be other ways - I've looked for a way to do this via the conda env update command with no luck.

A safer approach might be to do this iff the conda_overwrite_channels=True but this would involve interacting with the bentoml-init.sh file when building which currently isn't done.
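
The conditional variant described above could look roughly like the following sketch. render_conda_steps is a hypothetical helper, not part of BentoML; it only shows the idea of emitting the conda config line when channel overwriting was requested. The two shell commands are the ones quoted in this report.

```python
def render_conda_steps(overwrite_channels: bool) -> list:
    """Return the shell lines for the conda step of an init script."""
    steps = []
    if overwrite_channels:
        # Drop the default channels so only the environment.yml channels are used.
        steps.append("conda config --system --remove channels defaults")
    steps.append("conda env update -n base -f ./environment.yml")
    return steps


strict = render_conda_steps(overwrite_channels=True)
default = render_conda_steps(overwrite_channels=False)
```

Generating the lines at build time keeps the default behavior unchanged for users who did not ask to overwrite channels.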

opened by gregd33 24
Small breaking change to onnx-mlir PyExecution session requires small tweak to open source code

Describe the bug

A change to the onnx-mlir PyExecutionSession requires BentoML's use of PyExecutionSession to be updated: run_main_graph no longer needs to be specified in the PyExecutionSession invocation.

To Reproduce

Any workflow using an old copy of the compiled models will need to refresh its compiled model and repack its serving environment. Any incompatible changes needed beyond that should be discussed with me immediately.

Expected behavior

The model should run as expected and give the expected output.

Screenshots/Logs

Sufficient detail has been provided above. Either I or @andrewsi-z will provide the fix, since he originally authored this code.

Additional context

A fix has been discussed with @andrewsi-z and should be straightforward to implement. I'll work with the team moving forward to implement the PR.
bug

opened by messerb5467 22
[Feature] Easy Ec2 deployment
Description

One-click deployment for AWS EC2 with an autoscaling group.

Motivation and Context

For easy deployment without load balancing

How Has This Been Tested?

Types of changes

[ ] Breaking change (fix or feature that would cause existing functionality to change)

[x] New feature and improvements (non-breaking change which adds/improves functionality)

[ ] Bug fix (non-breaking change which fixes an issue)

[x] Code Refactoring (internal change which is not user facing)

[ ] Documentation

[x] Test, CI, or build

Component(s) if applicable

[x] BentoService (service definition, dependency management, API input/output adapters)

[ ] Model Artifact (model serialization, multi-framework support)

[x] Model Server (micro-batching, dockerisation, logging, OpenAPI, instruments)

[x] YataiService gRPC server (model registry, cloud deployment automation)

[ ] YataiService web server (nodejs HTTP server and web UI)

[ ] Internal (BentoML's own configuration, logging, utility, exception handling)

[x] BentoML CLI

Checklist:

[x] My code follows the bentoml code style, both ./dev/format.sh and ./dev/lint.sh script have passed (instructions).

[ ] My change reduces project test coverage and requires unit tests to be added

[x] I have added unit tests covering my code change

[x] My change requires a change to the documentation

[x] I have updated the documentation accordingly
opened by mayurnewase 21

Docker container stopped working with: ModuleNotFoundError: No module named 'ruamel'

Describe the bug Without any change to my code, new Docker containers aren't working anymore. When I try to run it, I get:

Traceback (most recent call last):
  File "/opt/conda/bin/bentoml", line 5, in <module>
    from bentoml.cli import cli
  File "/opt/conda/lib/python3.6/site-packages/bentoml/__init__.py", line 27, in <module>
    from bentoml.saved_bundle import load, save_to_dir
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/__init__.py", line 15, in <module>
    from bentoml.saved_bundle.bundler import save_to_dir
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/bundler.py", line 31, in <module>
    from bentoml.utils.usage_stats import track_save
  File "/opt/conda/lib/python3.6/site-packages/bentoml/utils/usage_stats.py", line 22, in <module>
    from ruamel.yaml import YAML
ModuleNotFoundError: No module named 'ruamel'

To Reproduce Steps to reproduce the behavior:

Create a BentoService with:

@env(conda_channels=["conda-forge"], conda_dependencies=["libpq=12.3"],
     pip_dependencies=["mxnet==1.4.1", "gluonts==0.5", "numpy==1.16", "pandas==1.0.5", "holidays==0.9.12",
                       "python-dateutil==2.8", "convertdate==2.2", "pydantic==1.6", "luigi==2.8", "sqlalchemy==1.3",
                       "psycopg2==2.8"])

Pack the model and build the Docker Container
When the container is being built, we can see the following package being installed while updating the conda environment: ruamel_yaml-0.15.87
When trying to run the container, ModuleNotFoundError: No module named 'ruamel' appears. If you open a bash shell in the container, you can see that ruamel is installed but can't be imported.

Expected behavior It was expected to run successfully.

Screenshots/Logs

Resulting environment.yml:

name: bentoml-DemandForecaster
channels:
- defaults
- conda-forge
dependencies:
- python=3.6.11
- pip
- libpq=12.3

Log of installed packages through conda:

Downloading and Extracting Packages
idna-2.10            | 50 KB     | ########## | 100% 
python-3.6.11        | 34.1 MB   | ########## | 100% 
yaml-0.2.5           | 75 KB     | ########## | 100% 
cryptography-2.9.2   | 556 KB    | ########## | 100% 
pysocks-1.7.1        | 30 KB     | ########## | 100% 
krb5-1.17.1          | 1.3 MB    | ########## | 100% 
pyopenssl-19.1.0     | 48 KB     | ########## | 100% 
pycparser-2.20       | 94 KB     | ########## | 100% 
cffi-1.14.0          | 223 KB    | ########## | 100% 
sqlite-3.32.3        | 1.4 MB    | ########## | 100% 
tk-8.6.10            | 3.0 MB    | ########## | 100% 
conda-4.8.3          | 2.8 MB    | ########## | 100% 
wheel-0.34.2         | 51 KB     | ########## | 100% 
pycosat-0.6.3        | 82 KB     | ########## | 100% 
pip-20.2.1           | 1.8 MB    | ########## | 100% 
openssl-1.1.1g       | 2.5 MB    | ########## | 100% 
xz-5.2.5             | 341 KB    | ########## | 100% 
tqdm-4.42.1          | 56 KB     | ########## | 100% 
python_abi-3.6       | 4 KB      | ########## | 100% 
urllib3-1.25.9       | 103 KB    | ########## | 100% 
ruamel_yaml-0.15.87  | 270 KB    | ########## | 100% 
conda-package-handli | 797 KB    | ########## | 100% 
libpq-12.3           | 2.6 MB    | ########## | 100% 
certifi-2020.6.20    | 156 KB    | ########## | 100% 
requests-2.24.0      | 56 KB     | ########## | 100% 
setuptools-49.2.1    | 736 KB    | ########## | 100% 
ca-certificates-2020 | 125 KB    | ########## | 100% 
readline-8.0         | 356 KB    | ########## | 100% 
chardet-3.0.4        | 180 KB    | ########## | 100% 
brotlipy-0.7.0       | 323 KB    | ########## | 100% 
six-1.15.0           | 13 KB     | ########## | 100% 
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done

Error when trying to import bentoml:

Traceback (most recent call last):
  File "/opt/conda/bin/bentoml", line 5, in <module>
    from bentoml.cli import cli
  File "/opt/conda/lib/python3.6/site-packages/bentoml/__init__.py", line 27, in <module>
    from bentoml.saved_bundle import load, save_to_dir
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/__init__.py", line 15, in <module>
    from bentoml.saved_bundle.bundler import save_to_dir
  File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/bundler.py", line 31, in <module>
    from bentoml.utils.usage_stats import track_save
  File "/opt/conda/lib/python3.6/site-packages/bentoml/utils/usage_stats.py", line 22, in <module>
    from ruamel.yaml import YAML
ModuleNotFoundError: No module named 'ruamel'

Environment:

OS: Linux Manjaro
Python/BentoML Version: Python 3.6.11, BentoML 0.8.5

Additional context As a workaround, I added ruamel.yaml=0.16 in conda_dependencies.
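
To check from inside the container which module names are actually importable, a small diagnostic such as the following could be used. It is a sketch, not part of BentoML. One plausible explanation, not confirmed in this issue, is that the conda package ruamel_yaml exposes the import name ruamel_yaml rather than ruamel.yaml, so both names are probed here.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the dotted module name can be located by the import system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "ruamel") is itself missing
        return False


# Probe both spellings to see which one the environment provides
for name in ("ruamel.yaml", "ruamel_yaml"):
    print(name, is_importable(name))
```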

bug

opened by fernandocamargoai 21

got internal server error when trying to invoke sagemaker endpoint
Describe the bug I am following the tutorial described in https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html. I got the expected results until I get to this part

Instead of the output with the red characters, I got this message:

I did as the message said and checked the associated CloudWatch log, and these are the error messages:

To Reproduce Steps to reproduce the behavior:

follow the tutorial described in https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html . (The only thing different is the region, I set it to ap-southeast-1)

Expected behavior the output should be like the first screenshot above.

Environment:

OS: [Linux Mint 19.3 Tricia Cinnamon]

Python/BentoML Version [e.g. Python 3.7.4, BentoML 0.7.8]

Additional context What I tried to solve this error: the first message warns that unpickling an SVM created with scikit-learn 0.21.3 using scikit-learn 0.23.1 may cause errors, so I upgraded my local scikit-learn package from 0.21.3 to 0.23.1, but it did nothing. I assume the scikit-learn package referred to here is the one inside the Docker image for AWS SageMaker?
bug
opened by palver7 18
Error in Serverless deployment with AWS Lambda
Describe the bug While running: !bentoml deploy ./model --platform aws-lambda --region us-west-2

this error appears:

[2019-08-27 17:25:40,854] INFO - Using user AWS region: us-west-2
[2019-08-27 17:25:40,855] INFO - Using AWS stage: dev
Encounter error when deploying to aws-lambda Error: 'Service Information' is not in list

To Reproduce sudo npm install serverless@<version> --global (I also tried a different serverless version; it didn't work for the example below either.)

Go to BentoML/examples/deploy-with-serverless/deploy-with-serverless.ipynb

Run all the cells up to !bentoml deploy ./model --platform aws-lambda --region us-west-2

See error

Expected behavior Successful deployment

Environment:

MacOS 10.13.6

serverless 1.49.0

Python 3.7.3

BentoML 0.3.4

ipython 7.6.1

bug
opened by ji-clara 18
Allow user to customize readiness probe

Is your feature request related to a problem? Please describe.

Currently, the readiness probe does not bother to check anything; it just returns 200 OK if the app is started up. However, in case the developer accidentally introduces a bug into the bento/service file when modifying it, the deployment would be marked as ready, when it should not be. This makes a bugged deployment replace a previously working one, resulting in downstream failures.

Describe the solution you'd like

Allow the user to customize the readiness probe / readyz behaviour with a custom function, for instance to call the model with a known valid input, and assert that the model returns a valid output, before marking the deployment as ready. This would also allow developers to assert that connections to external resources such as a feature store are working correctly, before marking a deployment as ready to accept connections.

Describe alternatives you've considered

The developer can create a new route that is not called /readyz (since it is reserved) to perform these checks and then modify the Kubernetes deployment to use this route as the readiness probe. Not sure whether this is compatible with Yatai since I do not use it.

Additional context

None
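
The kind of custom check requested above can be sketched framework-independently. The names make_readiness_check, predict, sample_input and is_valid_output are hypothetical and not part of any BentoML API; the returned pair mimics an HTTP status code and body that a custom route could send back to a Kubernetes readiness probe.

```python
from typing import Any, Callable, Tuple


def make_readiness_check(
    predict: Callable[[Any], Any],
    sample_input: Any,
    is_valid_output: Callable[[Any], bool],
) -> Callable[[], Tuple[int, str]]:
    """Build a check that returns an HTTP-style (status_code, message) pair."""

    def check() -> Tuple[int, str]:
        try:
            output = predict(sample_input)
        except Exception as exc:  # any failure means "not ready"
            return 503, f"prediction failed: {exc}"
        if not is_valid_output(output):
            return 503, "prediction returned an invalid output"
        return 200, "ok"

    return check
```

The same pattern extends to external resources: pass a callable that pings the feature store instead of the model.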
feature

opened by jiewpeng 17
Service with sklearn model fails on my EKS cluster
I have created a simple service:

import numpy as np
import bentoml
from bentoml.io import NumpyNdarray

model_runner = bentoml.sklearn.load_runner("mymodel:latest")

svc = bentoml.Service("myservice", runners=[model_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    return model_runner.run(input_series)

When I run it on my laptop (MacBook Pro M1), using

bentoml serve ./service.py:svc --reload

everything works fine when I invoke the generated classify API.

Now when I push this service to my Yatai server as a bento and deploy it to my K8s cluster (EKS), I get the following error when I invoke the API:

Looking at the code, the problem lies in https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/frameworks/sklearn.py#L163 In my case, _num_threads returns 0. Digging a bit further, resource_quota.cpu is computed here: https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/runner/utils.py#L208. Here are the values I get on the pod running the API:

| source | value |
| --- | --- |
| file /sys/fs/cgroup/cpu/cpu.cfs_quota_us | -1 |
| file /sys/fs/cgroup/cpu/cpu.cfs_period_us | 100000 |
| file /sys/fs/cgroup/cpu/cpu.shares | 2 |
| call to os.cpu_count() | 2 |

Given those values, query_cgroup_cpu_count() will return 0.001953125, which once rounded will end up as 0, meaning n_jobs will always be 0. So the call will always fail on my pods.
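
The arithmetic behind that value can be reproduced in a few lines. This sketch assumes the common cgroup v1 convention of 1024 shares per CPU, which matches the reported 0.001953125 (2 / 1024); it is not BentoML's actual implementation, and safe_n_jobs is a hypothetical guard rather than the fix that was adopted.

```python
import math


def cpu_count_from_shares(cpu_shares: int) -> float:
    """Fractional CPU count implied by cpu.shares (assumes 1024 shares = 1 CPU)."""
    return cpu_shares / 1024


def safe_n_jobs(cpu_count: float) -> int:
    """Hypothetical guard: never hand a worker count below 1 to the model."""
    return max(1, math.ceil(cpu_count))


quota = cpu_count_from_shares(2)
print(quota)               # 0.001953125
print(round(quota))        # 0
print(safe_n_jobs(quota))  # 1
```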
opened by amelki 17

containerize with conda fails, missing install.sh

Describe the bug

bentoml containerize with conda options in bentofile.yaml fails with chmod: cannot access '/home/bentoml/bento/env/python/install.sh': No such file or directory.

bentofile.yaml with:

python:
      packages: 
      - scikit-learn

works without issues.

To Reproduce

sample_bentofile.yaml:

service: "sample_service:svc"
include:
    - "*.py"  
conda:
    dependencies:
    - python=3.8.13
    - pip    
    pip:
    - scikit-learn

sample_service.py:

import numpy as np
import bentoml
from bentoml.io import NumpyNdarray

iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    result = iris_clf_runner.predict.run(input_series)
    return result

bentoml build -f sample_bentofile.yaml
bentoml containerize iris_classifier:latest --debug --no-cache
fails with:

Building BentoML service "iris_classifier:d4lyn4xzho2n2atv" from build context "/home/ubuntu/experiments/mmdetection/bentoml"
Packing model "iris_clf:7o4gi2hvzshtkatv"

██████╗░███████╗███╗░░██╗████████╗░█████╗░███╗░░░███╗██╗░░░░░
██╔══██╗██╔════╝████╗░██║╚══██╔══╝██╔══██╗████╗░████║██║░░░░░
██████╦╝█████╗░░██╔██╗██║░░░██║░░░██║░░██║██╔████╔██║██║░░░░░
██╔══██╗██╔══╝░░██║╚████║░░░██║░░░██║░░██║██║╚██╔╝██║██║░░░░░
██████╦╝███████╗██║░╚███║░░░██║░░░╚█████╔╝██║░╚═╝░██║███████╗
╚═════╝░╚══════╝╚═╝░░╚══╝░░░╚═╝░░░░╚════╝░╚═╝░░░░░╚═╝╚══════╝

Successfully built Bento(tag="iris_classifier:d4lyn4xzho2n2atv")
ubuntu@ip-172-31-12-44 ~/e/m/bentoml (master)> bentoml containerize iris_classifier:latest --debug --no-cache                                                       (bentotest) 
Building docker image for Bento(tag="iris_classifier:d4lyn4xzho2n2atv")...
[+] Building 14.0s (17/22)                                                                                                                                                      
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 2.87kB                                                                                                                                     0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:1.4-labs                                                                                                          0.1s
 => CACHED docker-image://docker.io/docker/dockerfile:1.4-labs@sha256:b50ad4af81d1c76ab7c0e1ffc216909e7adc23e99910243e1c88331c2a8ef52d                                     0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/continuumio/miniconda3:latest                                                                                                  13.6s
 => CACHED [internal] settings cache mount permissions                                                                                                                     0.0s
 => CACHED [cached 1/1] FROM docker.io/continuumio/miniconda3:latest@sha256:977263e8d1e476972fddab1c75fe050dd3cd17626390e874448bd92721fd659b                               0.0s
 => [internal] load build context                                                                                                                                          0.0s
 => => transferring context: 33.62kB                                                                                                                                       0.0s
 => [stage-1  2/13] RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache                 0.3s
 => [stage-1  3/13] RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt     --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt  apt-g  11.3s
 => [stage-1  4/13] RUN rm -rf /var/lib/{apt,cache,log}                                                                                                                    0.4s
 => [stage-1  5/13] RUN groupadd -g 1034 -o bentoml && useradd -m -u 1034 -g 1034 -o -r bentoml                                                                            0.4s 
 => [stage-1  6/13] RUN mkdir /home/bentoml/bento && chown bentoml:bentoml /home/bentoml/bento -R                                                                          0.5s 
 => [stage-1  7/13] WORKDIR /home/bentoml/bento                                                                                                                            0.0s 
 => [stage-1  8/13] COPY --chown=bentoml:bentoml . ./                                                                                                                      0.0s 
 => ERROR [stage-1  9/13] RUN --mount=type=cache,mode=0777,target=/root/.cache/pip     chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento  0.4s 
------
 > [stage-1  9/13] RUN --mount=type=cache,mode=0777,target=/root/.cache/pip     chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento/env/python/install.sh:
#0 0.325 chmod: cannot access '/home/bentoml/bento/env/python/install.sh': No such file or directory
------
error: failed to solve: executor failed running [/bin/sh -c chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento/env/python/install.sh]: exit code: 1
Failed building docker image: Command '['docker', 'buildx', 'build', '--progress', 'auto', '--tag', 'iris_classifier:d4lyn4xzho2n2atv', '--file', 'env/docker/Dockerfile', '--load', '--no-cache', '.']' returned non-zero exit status 1.

Environment:

Ubuntu 18.04.6 LTS
Python 3.8.13
bentoml, version 1.0.0rc2.post24+g25b6e63

bug

opened by smidm 16

bug: RuntimeError: Found no NVIDIA driver on your system.

Describe the bug

I'm not sure if this is actually a bug or an error from my side, so please excuse the latter.

I am able to successfully build a bento that uses the gpu with no problems. However, containerizing it leads to the following error (it does not find the Nvidia GPU drivers):

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

Did I forget to specify more information to use the NVIDIA drivers, maybe in the Dockerfile.template? Note that it runs in a local miniconda environment. Could this be the issue?

Here is the bentofile.yaml:

service: "service:svc"  # Same as the argument passed to `bentoml serve`
#include:
#- "*.py"  # A pattern for matching which files to include in the bento
exclude:
- "examples/"
- "*.png"
- "*.gif"
- "venv/"
- "venv"
docker:
  distro: debian
  dockerfile_template: ./Dockerfile.template
  python_version: "3.10.8"
  cuda_version: "11.6.2"
python:
   packages:  # Additional pip packages required by the service
   - filelock
   - Pillow
   - torch
   - fire
   - humanize
   - requests
   - tqdm
   - matplotlib
   - scikit-image
   - scipy
   - numpy

This is the Dockerfile.template

{% extends bento_base_template %}
{% block SETUP_BENTO_COMPONENTS %}
{{ super() }}
RUN echo "We are running this during bentoml containerize!"
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y git
RUN pip install git+https://github.com/openai/CLIP.git
RUN echo "CLIP installed!"
{% endblock %}

To reproduce

No response

Expected behavior

No response

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.0.12
python: 3.10.8
platform: Linux-5.15.85-1-MANJARO-x86_64-with-glibc2.36
uid_gid: 1000:1000
conda: 22.9.0
in_conda_env: True

conda_packages

name: pointe
channels:
  - defaults
dependencies:
  - _libgcc_mutex=0.1=main
  - _openmp_mutex=5.1=1_gnu
  - bzip2=1.0.8=h7b6447c_0
  - ca-certificates=2022.10.11=h06a4308_0
  - certifi=2022.12.7=py310h06a4308_0
  - ld_impl_linux-64=2.38=h1181459_1
  - libffi=3.4.2=h6a678d5_6
  - libgcc-ng=11.2.0=h1234567_1
  - libgomp=11.2.0=h1234567_1
  - libstdcxx-ng=11.2.0=h1234567_1
  - libuuid=1.41.5=h5eee18b_0
  - ncurses=6.3=h5eee18b_3
  - openssl=1.1.1s=h7f8727e_0
  - pip=22.3.1=py310h06a4308_0
  - python=3.10.8=h7a1cb2a_1
  - readline=8.2=h5eee18b_0
  - setuptools=65.5.0=py310h06a4308_0
  - sqlite=3.40.0=h5082296_0
  - tk=8.6.12=h1ccaba5_0
  - tzdata=2022g=h04d1e81_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - xz=5.2.8=h5eee18b_0
  - zlib=1.2.13=h5eee18b_0
  - pip:
    - aiohttp==3.8.3
    - aiosignal==1.3.1
    - anyio==3.6.2
    - appdirs==1.4.4
    - asgiref==3.6.0
    - async-timeout==4.0.2
    - attrs==22.2.0
    - backoff==2.2.1
    - bentoml==1.0.12
    - build==0.9.0
    - cattrs==22.2.0
    - charset-normalizer==2.1.1
    - circus==0.18.0
    - click==8.1.3
    - click-option-group==0.5.5
    - clip==1.0
    - cloudpickle==2.2.0
    - commonmark==0.9.1
    - contextlib2==21.6.0
    - contourpy==1.0.6
    - cycler==0.11.0
    - deepmerge==1.1.0
    - deprecated==1.2.13
    - exceptiongroup==1.1.0
    - filelock==3.9.0
    - fire==0.5.0
    - fonttools==4.38.0
    - frozenlist==1.3.3
    - fs==2.4.16
    - ftfy==6.1.1
    - googleapis-common-protos==1.57.0
    - h11==0.14.0
    - humanize==4.4.0
    - idna==3.4
    - imageio==2.23.0
    - jinja2==3.1.2
    - kiwisolver==1.4.4
    - markupsafe==2.1.1
    - matplotlib==3.6.2
    - multidict==6.0.4
    - networkx==2.8.8
    - numpy==1.24.1
    - nvidia-cublas-cu11==11.10.3.66
    - nvidia-cuda-nvrtc-cu11==11.7.99
    - nvidia-cuda-runtime-cu11==11.7.99
    - nvidia-cudnn-cu11==8.5.0.96
    - opentelemetry-api==1.14.0
    - opentelemetry-exporter-otlp-proto-http==1.14.0
    - opentelemetry-instrumentation==0.35b0
    - opentelemetry-instrumentation-aiohttp-client==0.35b0
    - opentelemetry-instrumentation-asgi==0.35b0
    - opentelemetry-proto==1.14.0
    - opentelemetry-sdk==1.14.0
    - opentelemetry-semantic-conventions==0.35b0
    - opentelemetry-util-http==0.35b0
    - packaging==21.3
    - pathspec==0.10.3
    - pep517==0.13.0
    - pillow==9.4.0
    - pip-requirements-parser==32.0.1
    - pip-tools==6.12.1
    - prometheus-client==0.15.0
    - protobuf==3.20.3
    - psutil==5.9.4
    - pygments==2.14.0
    - pynvml==11.4.1
    - pyparsing==3.0.9
    - python-dateutil==2.8.2
    - python-json-logger==2.0.4
    - python-multipart==0.0.5
    - pywavelets==1.4.1
    - pyyaml==6.0
    - pyzmq==24.0.1
    - regex==2022.10.31
    - requests==2.28.1
    - rich==13.0.0
    - schema==0.7.5
    - scikit-image==0.19.3
    - scipy==1.9.3
    - simple-di==0.1.5
    - six==1.16.0
    - sniffio==1.3.0
    - starlette==0.23.1
    - termcolor==2.1.1
    - tifffile==2022.10.10
    - tomli==2.0.1
    - torch==1.13.1
    - torchvision==0.14.1
    - tornado==6.2
    - tqdm==4.64.1
    - typing-extensions==4.4.0
    - urllib3==1.26.13
    - uvicorn==0.20.0
    - watchfiles==0.18.1
    - wcwidth==0.2.5
    - wrapt==1.14.1
    - yarl==1.8.2
prefix: /home/be/miniconda3/envs/pointe

pip_packages

aiohttp==3.8.3
aiosignal==1.3.1
anyio==3.6.2
appdirs==1.4.4
asgiref==3.6.0
async-timeout==4.0.2
attrs==22.2.0
backoff==2.2.1
bentoml==1.0.12
build==0.9.0
cattrs==22.2.0
certifi @ file:///croot/certifi_1671487769961/work/certifi
charset-normalizer==2.1.1
circus==0.18.0
click==8.1.3
click-option-group==0.5.5
clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1
cloudpickle==2.2.0
commonmark==0.9.1
contextlib2==21.6.0
contourpy==1.0.6
cycler==0.11.0
deepmerge==1.1.0
Deprecated==1.2.13
exceptiongroup==1.1.0
filelock==3.9.0
fire==0.5.0
fonttools==4.38.0
frozenlist==1.3.3
fs==2.4.16
ftfy==6.1.1
googleapis-common-protos==1.57.0
h11==0.14.0
humanize==4.4.0
idna==3.4
imageio==2.23.0
Jinja2==3.1.2
kiwisolver==1.4.4
MarkupSafe==2.1.1
matplotlib==3.6.2
multidict==6.0.4
networkx==2.8.8
numpy==1.24.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
opentelemetry-api==1.14.0
opentelemetry-exporter-otlp-proto-http==1.14.0
opentelemetry-instrumentation==0.35b0
opentelemetry-instrumentation-aiohttp-client==0.35b0
opentelemetry-instrumentation-asgi==0.35b0
opentelemetry-proto==1.14.0
opentelemetry-sdk==1.14.0
opentelemetry-semantic-conventions==0.35b0
opentelemetry-util-http==0.35b0
packaging==21.3
pathspec==0.10.3
pep517==0.13.0
Pillow==9.4.0
pip-requirements-parser==32.0.1
pip-tools==6.12.1
-e git+https://github.com/openai/point-e.git@fc8a607c08a3ea804cc82bf1ef8628f88a3a5d2f#egg=point_e
prometheus-client==0.15.0
protobuf==3.20.3
psutil==5.9.4
Pygments==2.14.0
pynvml==11.4.1
pyparsing==3.0.9
python-dateutil==2.8.2
python-json-logger==2.0.4
python-multipart==0.0.5
PyWavelets==1.4.1
PyYAML==6.0
pyzmq==24.0.1
regex==2022.10.31
requests==2.28.1
rich==13.0.0
schema==0.7.5
scikit-image==0.19.3
scipy==1.9.3
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.23.1
termcolor==2.1.1
tifffile==2022.10.10
tomli==2.0.1
torch==1.13.1
torchvision==0.14.1
tornado==6.2
tqdm==4.64.1
typing_extensions==4.4.0
urllib3==1.26.13
uvicorn==0.20.0
watchfiles==0.18.1
wcwidth==0.2.5
wrapt==1.14.1
yarl==1.8.2

bug

opened by BEpresent 9

bug: transformers save_model and load_model tasks conflict

Describe the bug

Transformers save_model currently checks that task_name and task_definition are not None before pickling custom pipelines.

This behaviour should be consistent with load_model.

To reproduce

See bentoml/_internal/frameworks/transformers.py#load_model,save_model

Expected behavior

No response

Environment

na
bug

opened by aarnphm 0
fix: quote sys.executable for circus

Make sure sys.executable is quoted, to work around https://github.com/circus-tent/circus/blob/b8c97d34a08b7d44ac3203440872510238b1132a/circus/process.py#L412

Signed-off-by: Aaron Pham
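
Why quoting matters can be illustrated with the standard library alone. The sketch below assumes the linked circus code splits the command string using shell-like rules, so an interpreter path containing a space would otherwise be broken into two tokens. build_command and the example path are hypothetical; the actual change in this PR may look different.

```python
import shlex
import sys


def build_command(executable: str, *args: str) -> str:
    """Join an executable and its arguments into one shell-safe command string."""
    return " ".join(shlex.quote(part) for part in (executable, *args))


# Hypothetical interpreter path containing a space
cmd = build_command("/opt/my env/bin/python", "-m", "my_module")
print(cmd)               # '/opt/my env/bin/python' -m my_module
print(shlex.split(cmd))  # ['/opt/my env/bin/python', '-m', 'my_module']

# In real use the first argument would be sys.executable
real_cmd = build_command(sys.executable, "-m", "my_module")
```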

opened by aarnphm 2
docs: Missing API reference for bentoml.Model

Describe the bug

As part of our integration with BentoML, we need to automate the process of importing and exporting BentoModels. In the documentation, specifically the API Reference section, there is no mention of how to do so through the Python SDK. Examining the code in this repo, we found bentoml.Model.export and bentoml.Model.import_from, but they are not documented anywhere in the API Reference.

To reproduce

Irrelevant

Expected behavior

Anything present in the SDK should be found in the API Reference part of the documentation.

Environment

Irrelevant
documentation

opened by eliorc 0
[BUG] BentoML bentoml.container.build looks for "=" in platform argument
When using the "bentoml.container.build" to containerise a service, if a platform is specified, the code in the link below looks for the "=" sign and fails with an index error:

"name": "IndexError", "message": "list index out of range",

https://github.com/bentoml/BentoML/blob/98f6f63cfe0242a8df230fed00aa323a29735372/src/bentoml/container.py#L404 Removing line 404 fixed the issue. Example code:

import bentoml

bentoml.container.build(
    bento_tag="<service>:<version>",
    tag="<service>:<version>",
    platform="linux/amd64",
)
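
The failure mode described above can be reproduced without BentoML. The two functions below are hypothetical stand-ins, not the code at the linked line: the first assumes every option has the form KEY=VALUE and raises IndexError for a value such as linux/amd64, while the second tolerates a missing "=".

```python
from typing import Tuple


def parse_key_value(option: str) -> Tuple[str, str]:
    """Naive parser that assumes every option looks like KEY=VALUE."""
    parts = option.split("=")
    return parts[0], parts[1]  # IndexError when there is no "="


def parse_key_value_safe(option: str) -> Tuple[str, str]:
    """Tolerant variant: a missing "=" yields an empty value instead of raising."""
    key, _, value = option.partition("=")
    return key, value
```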
opened by drsantos89 2

Releases(v1.0.12)

v1.0.12(Dec 8, 2022)
Important bug fixes.

Fixed runner call failures with keyword arguments.

Fixed incorrect user base image override.

What's Changed

fix(runner): content-type error by @aarnphm in https://github.com/bentoml/BentoML/pull/3302

feat: grpc servicer implementation per version by @aarnphm in https://github.com/bentoml/BentoML/pull/3316

feat(grpc): adding service metadata by @aarnphm in https://github.com/bentoml/BentoML/pull/3278

docs: Update monitoring docs format by @ssheng in https://github.com/bentoml/BentoML/pull/3324

fix(runner): remote run_method with kwargs by @larme in https://github.com/bentoml/BentoML/pull/3326

fix: don't overwrite user base image by @aarnphm in https://github.com/bentoml/BentoML/pull/3329

fix: add upper bound for packaging version by @aarnphm in https://github.com/bentoml/BentoML/pull/3331

fix(container): podman health result string parsing by @aarnphm in https://github.com/bentoml/BentoML/pull/3330

fix: io descriptor backward compatibility by @sauyon in https://github.com/bentoml/BentoML/pull/3327

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.11...v1.0.12
v1.0.11(Dec 7, 2022)
🍱 BentoML v1.0.11 is here featuring the introduction of an inference collection and model monitoring API that can be easily integrated with any model monitoring frameworks.

Introduced the bentoml.monitor API for monitoring any features, predictions, and target data in numerical, categorical, and numerical sequence types.

import numpy as np

import bentoml
from bentoml.io import Text
from bentoml.io import NumpyNdarray

CLASS_NAMES = ["setosa", "versicolor", "virginica"]

iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(
    input=NumpyNdarray.from_sample(np.array([4.9, 3.0, 1.4, 0.2], dtype=np.double)),
    output=Text(),
)
async def classify(features: np.ndarray) -> str:
    with bentoml.monitor("iris_classifier_prediction") as mon:
        # Log each input feature before running inference
        mon.log(features[0], name="sepal length", role="feature", data_type="numerical")
        mon.log(features[1], name="sepal width", role="feature", data_type="numerical")
        mon.log(features[2], name="petal length", role="feature", data_type="numerical")
        mon.log(features[3], name="petal width", role="feature", data_type="numerical")

        results = await iris_clf_runner.predict.async_run([features])
        result = results[0]
        category = CLASS_NAMES[result]

        # Log the predicted class alongside the features
        mon.log(category, name="pred", role="prediction", data_type="categorical")
    return category

Enabled monitoring data collection through log file forwarding using any forwarders (fluentbit, filebeat, logstash) or OTLP exporter implementations.

Configuration for monitoring data collection through log files.

monitoring:
  enabled: true
  type: default
  options:
    log_path: path/to/log/file

Configuration for monitoring data collection through an OTLP exporter.

monitoring:
  enabled: true
  type: otlp
  options:
    endpoint: http://localhost:5000
    insecure: true
    credentials: null
    headers: null
    timeout: 10
    compression: null
    meta_sample_rate: 1.0

Supported third-party monitoring data collector integrations through BentoML Plugins. See bentoml/plugins repository for more details.

🐳 Improved containerization SDK and CLI options, read more in #3164.

Added support for multiple backend builder options (Docker, nerdctl, Podman, Buildah, Buildx) in addition to buildctl (standalone buildkit builder).

Improved Python SDK for containerization with different backend builder options.

import bentoml

bentoml.container.build("iris_classifier:latest", backend="podman", features=["grpc","grpc-reflection"], **kwargs)

Improved CLI to include the newly added options.

bentoml containerize --help

Standardized the generated Dockerfile in bentos to be compatible with all build tools for use cases that require building from a Dockerfile directly.

💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

Learn more about inference data collection and model monitoring capabilities in BentoML.

Learn more about the default metrics that come out-of-the-box and how to add custom metrics in BentoML.

What's Changed

chore: add framework utils functions directory by @larme in https://github.com/bentoml/BentoML/pull/3203

fix: missing f-string in tag validation error message by @csh3695 in https://github.com/bentoml/BentoML/pull/3205

chore(build_config): bypass exception when cuda and conda is specified by @aarnphm in https://github.com/bentoml/BentoML/pull/3188

docs: Update asynchronous API documentation by @ssheng in https://github.com/bentoml/BentoML/pull/3204

style: use relative import inside _internal/ by @larme in https://github.com/bentoml/BentoML/pull/3209

style: fix monitoring type error by @aarnphm in https://github.com/bentoml/BentoML/pull/3208

chore(build): add dependabot for pyproject.toml by @aarnphm in https://github.com/bentoml/BentoML/pull/3139

chore(deps): bump black[jupyter] from 22.8.0 to 22.10.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3217

chore(deps): bump pylint from 2.15.3 to 2.15.5 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3212

chore(deps): bump pytest-asyncio from 0.19.0 to 0.20.1 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3216

chore(deps): bump imageio from 2.22.1 to 2.22.4 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3211

fix: don't index ContextVar at runtime by @sauyon in https://github.com/bentoml/BentoML/pull/3221

chore(deps): bump pyarrow from 9.0.0 to 10.0.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3214

chore: configuration check for development by @aarnphm in https://github.com/bentoml/BentoML/pull/3223

fix bento create by @quandollar in https://github.com/bentoml/BentoML/pull/3220

fix(docs): missing table tag by @nyongja in https://github.com/bentoml/BentoML/pull/3231

docs: grammar corrections by @tbazin in https://github.com/bentoml/BentoML/pull/3234

chore(deps): bump pytest-asyncio from 0.20.1 to 0.20.2 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3238

chore(deps): bump pytest-xdist[psutil] from 2.5.0 to 3.0.2 by @dependabot in https://github.com/bentoml/BentoML/pull/3245

chore(deps): bump pytest from 7.1.3 to 7.2.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3237

chore(deps): bump build[virtualenv] from 0.8.0 to 0.9.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3240

deps: bumping gRPC and OTLP dependencies by @aarnphm in https://github.com/bentoml/BentoML/pull/3228

feat(file): support custom mime type for file proto by @aarnphm in https://github.com/bentoml/BentoML/pull/3095

fix: multipart for client by @sauyon in https://github.com/bentoml/BentoML/pull/3253

fix(json): make sure to parse a list of dict for_sample by @aarnphm in https://github.com/bentoml/BentoML/pull/3229

chore: move test proto to internal tests only by @aarnphm in https://github.com/bentoml/BentoML/pull/3255

fix(framework): external_modules for loading pytorch by @bojiang in https://github.com/bentoml/BentoML/pull/3254

feat(container): builder implementation by @aarnphm in https://github.com/bentoml/BentoML/pull/3164

feat(sdk): implement otlp monitoring exporter by @bojiang in https://github.com/bentoml/BentoML/pull/3257

chore(grpc): add missing init.py by @aarnphm in https://github.com/bentoml/BentoML/pull/3259

docs(metrics): Update docs for the default metrics by @ssheng in https://github.com/bentoml/BentoML/pull/3262

chore: generate plain dockerfile without buildkit syntax by @aarnphm in https://github.com/bentoml/BentoML/pull/3261

style: remove # type: ignore by @aarnphm in https://github.com/bentoml/BentoML/pull/3265

fix: lazy load ONNX utils by @aarnphm in https://github.com/bentoml/BentoML/pull/3266

fix(pytorch): pickle is the unpickler of cloudpickle by @bojiang in https://github.com/bentoml/BentoML/pull/3269

fix: instructions for missing sklearn dependency by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/3271

docs: ONNX signature docs by @larme in https://github.com/bentoml/BentoML/pull/3272

chore(deps): bump pyarrow from 10.0.0 to 10.0.1 by @dependabot in https://github.com/bentoml/BentoML/pull/3273

chore(deps): bump pylint from 2.15.5 to 2.15.6 by @dependabot in https://github.com/bentoml/BentoML/pull/3274

fix(pandas): only set columns when apply_column_names is set by @mqk in https://github.com/bentoml/BentoML/pull/3275

feat: configuration versioning by @aarnphm in https://github.com/bentoml/BentoML/pull/3052

fix(container): support comma in docker env by @larme in https://github.com/bentoml/BentoML/pull/3285

chore(stub): import filetype by @aarnphm in https://github.com/bentoml/BentoML/pull/3260

fix(container): ensure to stream logs when DOCKER_BUILDKIT=0 by @aarnphm in https://github.com/bentoml/BentoML/pull/3294

docs: update instructions for containerize message by @aarnphm in https://github.com/bentoml/BentoML/pull/3289

fix: unset NVIDIA_VISIBLE_DEVICES when cuda image is used by @aarnphm in https://github.com/bentoml/BentoML/pull/3298

fix: multipart logic by @sauyon in https://github.com/bentoml/BentoML/pull/3297

chore(deps): bump pylint from 2.15.6 to 2.15.7 by @dependabot in https://github.com/bentoml/BentoML/pull/3291

docs: wrong arguments when saving by @KimSoungRyoul in https://github.com/bentoml/BentoML/pull/3306

chore(deps): bump pylint from 2.15.7 to 2.15.8 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3308

chore(deps): bump pytest-xdist[psutil] from 3.0.2 to 3.1.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3309

chore(pyproject): bumping python version typeshed to 3.11 by @aarnphm in https://github.com/bentoml/BentoML/pull/3281

fix(monitor): disable validate for Formatter by @bojiang in https://github.com/bentoml/BentoML/pull/3317

doc(monitoring): monitoring guide by @bojiang in https://github.com/bentoml/BentoML/pull/3300

feat: parsing path for env by @aarnphm in https://github.com/bentoml/BentoML/pull/3314

fix: remove assertion for dtype by @aarnphm in https://github.com/bentoml/BentoML/pull/3320

feat: client lazy load by @aarnphm in https://github.com/bentoml/BentoML/pull/3323

chore: provides shim for bentoctl by @aarnphm in https://github.com/bentoml/BentoML/pull/3322

New Contributors

@csh3695 made their first contribution in https://github.com/bentoml/BentoML/pull/3205

@nyongja made their first contribution in https://github.com/bentoml/BentoML/pull/3231

@tbazin made their first contribution in https://github.com/bentoml/BentoML/pull/3234

@KimSoungRyoul made their first contribution in https://github.com/bentoml/BentoML/pull/3306

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.10...v1.0.11
bentoml-1.0.11-py3-none-any.whl(884.06 KB)
bentoml-1.0.11.tar.gz(16.30 MB)
v1.0.10(Nov 9, 2022)
🍱 BentoML v1.0.10 is released to address a recurring broken pipe error reported by the community. This release also includes a list of improvements we’d like to share with the community.

Fixed an aiohttp.client_exceptions.ClientOSError caused by asymmetrical keep alive timeout settings between the API Server and Runner.

aiohttp.client_exceptions.ClientOSError: [Errno 32] Broken pipe
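
The failure mode can be read as a timing rule: a client that keeps idle connections longer than the server does may reuse a socket the server has already closed. The sketch below states that rule in code. The function names and timeout values are hypothetical; the release note does not give the actual BentoML settings.

```python
def reuse_is_safe(client_keepalive_s, server_keepalive_s):
    """An idle pooled connection is safe to reuse only if the client
    discards it before the server would close it."""
    return client_keepalive_s < server_keepalive_s


def stale_window_s(client_keepalive_s, server_keepalive_s):
    """Seconds during which the client may still reuse a connection
    that the server has already closed."""
    return max(0.0, client_keepalive_s - server_keepalive_s)


# Hypothetical values: the client pool keeps connections for 30 s,
# while the server closes idle connections after 5 s.
risky = not reuse_is_safe(30.0, 5.0)
window = stale_window_s(30.0, 5.0)
```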

Added multi-output support for ONNX and TensorFlow frameworks.

Added from_sample support to all IO Descriptors, not just bentoml.io.NumpyNdarray. The sample is reflected in the Swagger UI.

# Pandas Example
@svc.api(
    input=PandasDataFrame.from_sample(
        pd.DataFrame([1, 2, 3, 4])
    ),
    output=PandasDataFrame(),
)

# JSON Example
@svc.api(
    input=JSON.from_sample(
        {"foo": 1, "bar": 2}
    ),
    output=JSON(),
)
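
To illustrate how a sample value can drive generated API documentation, the following stand-alone sketch derives a small JSON-Schema-like description from a sample. It is not BentoML's implementation: `schema_from_sample` and the shape of its output are hypothetical.

```python
def schema_from_sample(sample):
    """Derive a small JSON-Schema-like dict from a sample value."""
    if isinstance(sample, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean", "example": sample}
    if isinstance(sample, int):
        return {"type": "integer", "example": sample}
    if isinstance(sample, float):
        return {"type": "number", "example": sample}
    if isinstance(sample, str):
        return {"type": "string", "example": sample}
    if isinstance(sample, list):
        # Use the first element to describe the items of a non-empty list.
        items = schema_from_sample(sample[0]) if sample else {}
        return {"type": "array", "items": items, "example": sample}
    if isinstance(sample, dict):
        props = {key: schema_from_sample(value) for key, value in sample.items()}
        return {"type": "object", "properties": props, "example": sample}
    raise TypeError(f"unsupported sample type: {type(sample).__name__}")


schema = schema_from_sample({"foo": 1, "bar": 2})
```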

💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

Check out the updated multi-model inference graph guide and example to learn how to compose multiple models in the same Bento service.

Did you know BentoML supports OpenTelemetry tracing out-of-the-box? Check out the Tracing guide for tracing support for OTLP, Jaeger, and Zipkin.

What's Changed

feat(cli): log conditional environment variables by @aarnphm in https://github.com/bentoml/BentoML/pull/3156

fix: ensure conda not use pipefail and unset variables by @aarnphm in https://github.com/bentoml/BentoML/pull/3171

fix(templates): ensure to use python3 and pip3 by @aarnphm in https://github.com/bentoml/BentoML/pull/3170

fix(sdk): montioring log output by @bojiang in https://github.com/bentoml/BentoML/pull/3175

feat: make quickstart batchable by @sauyon in https://github.com/bentoml/BentoML/pull/3172

fix: lazy check for stubs via path when install local wheels by @aarnphm in https://github.com/bentoml/BentoML/pull/3180

fix(openapi): remove summary field under Info by @aarnphm in https://github.com/bentoml/BentoML/pull/3178

docs: Inference graph example by @ssheng in https://github.com/bentoml/BentoML/pull/3183

docs: remove whitespaces in migration guides by @wellshs in https://github.com/bentoml/BentoML/pull/3185

fix(build_config): validation when NoneType by @aarnphm in https://github.com/bentoml/BentoML/pull/3187

fix(docs): indentation in migration.rst by @aarnphm in https://github.com/bentoml/BentoML/pull/3186

doc(example): monitoring example for classification tasks by @bojiang in https://github.com/bentoml/BentoML/pull/3176

refactor(sdk): separate default monitoring impl by @bojiang in https://github.com/bentoml/BentoML/pull/3189

fix(ssl): provide default values in configuration by @aarnphm in https://github.com/bentoml/BentoML/pull/3191

fix: don't ignore logging conf by @sauyon in https://github.com/bentoml/BentoML/pull/3192

feat: tensorflow multi outputs support by @larme in https://github.com/bentoml/BentoML/pull/3115

docs: cleanup whitespace and typo by @aarnphm in https://github.com/bentoml/BentoML/pull/3195

chore: cleanup deadcode by @aarnphm in https://github.com/bentoml/BentoML/pull/3196

fix(runner): set uvicorn keep-alive by @sauyon in https://github.com/bentoml/BentoML/pull/3198

perf: refine onnx implementation by @larme in https://github.com/bentoml/BentoML/pull/3166

feat: from_sample for IO descriptor by @aarnphm in https://github.com/bentoml/BentoML/pull/3143

New Contributors

@wellshs made their first contribution in https://github.com/bentoml/BentoML/pull/3185

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.8...v1.0.9

What's Changed

fix: from_sample override logic by @aarnphm in https://github.com/bentoml/BentoML/pull/3202

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.9...v1.0.10
bentoml-1.0.10-py3-none-any.whl(852.78 KB)
bentoml-1.0.10.tar.gz(15.34 MB)
v1.0.8(Nov 1, 2022)
🍱 BentoML v1.0.8 is released with a list of improvements we hope you’ll find useful.

Introduced Bento Client for easy access to the BentoML service over HTTP. Both sync and async calls are supported. See the Bento Client Guide for more details.

import numpy as np
from bentoml.client import Client

client = Client.from_url("http://localhost:3000")

# Sync call
response = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))

# Async call (must run inside a coroutine)
response = await client.async_classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
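
For comparison, the same kind of call can be sketched with only the Python standard library. This assumes the service exposes the `classify` API at the `/classify` route and accepts a JSON-encoded array; `build_predict_request` is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request


def build_predict_request(base_url, api_name, rows):
    """Build (but do not send) a JSON POST request for one API endpoint."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{api_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request("http://localhost:3000", "classify", [[4.9, 3.0, 1.4, 0.2]])
# To send it against a running service: urllib.request.urlopen(req).read()
```

The client removes this boilerplate by exposing each service API as a method.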

Introduced custom metrics support for easy instrumentation of custom metrics over Prometheus. See Metrics Guide for more details.

# Histogram metric
inference_duration = bentoml.metrics.Histogram(
    name="inference_duration",
    documentation="Duration of inference",
    labelnames=["nltk_version", "sentiment_cls"],
)

# Counter metric
polarity_counter = bentoml.metrics.Counter(
    name="polarity_total",
    documentation="Count total number of analysis by polarity scores",
    labelnames=["polarity"],
)

Full Prometheus style syntax is supported for instrumenting custom metrics inside API and Runner definitions.

# Histogram
inference_duration.labels(
    nltk_version=nltk.__version__,
    sentiment_cls=self.sia.__class__.__name__,
).observe(time.perf_counter() - start)

# Counter
polarity_counter.labels(polarity=is_positive).inc()
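
The `labels(...).inc()` pattern keeps one time series per distinct combination of label values. The dependency-free stand-in below shows that bookkeeping. `LabeledCounter` is a hypothetical toy class, not the `bentoml.metrics` implementation.

```python
from collections import defaultdict


class LabeledCounter:
    """Toy labeled counter: one running total per tuple of label values."""

    def __init__(self, labelnames):
        self.labelnames = tuple(labelnames)
        self.values = defaultdict(float)

    def labels(self, **labels):
        # Every declared label must be supplied, and no others.
        if set(labels) != set(self.labelnames):
            raise ValueError(f"expected labels {self.labelnames}, got {tuple(labels)}")
        key = tuple(str(labels[name]) for name in self.labelnames)
        return _Child(self.values, key)


class _Child:
    """Handle bound to one label combination."""

    def __init__(self, values, key):
        self._values = values
        self._key = key

    def inc(self, amount=1.0):
        self._values[self._key] += amount


polarity_counter = LabeledCounter(labelnames=["polarity"])
for is_positive in (True, True, False):
    polarity_counter.labels(polarity=is_positive).inc()
```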

Improved health checking to also cover the status of runners to avoid returning a healthy status before runners are ready.

Added SSL/TLS support to gRPC serving.

bentoml serve-grpc --ssl-certfile=credentials/cert.pem --ssl-keyfile=credentials/key.pem --production --enable-reflection

Added channelz support for easier debugging of gRPC serving.

Allowed nested requirements with the -r syntax.

# requirements.txt
-r nested/requirements.txt

pydantic
Pillow
fastapi
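
As a rough illustration of what following `-r` includes involves, the sketch below flattens nested requirements files into one list. It is a simplified stand-in rather than pip's or BentoML's parser: `resolve_requirements` is hypothetical, only the `-r <path>` form is handled, comment stripping is naive, and the `numpy` entry in the nested file is made up for the example.

```python
import os
import tempfile


def resolve_requirements(path, _seen=None):
    """Return package lines from a requirements file, following `-r` includes."""
    seen = _seen if _seen is not None else set()
    real = os.path.realpath(path)
    if real in seen:  # guard against include cycles
        return []
    seen.add(real)
    packages = []
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.split("#", 1)[0].strip()  # naive comment stripping
            if not line:
                continue
            if line.startswith("-r "):
                # Nested paths are resolved relative to the including file.
                nested = os.path.join(os.path.dirname(path), line[3:].strip())
                packages.extend(resolve_requirements(nested, seen))
            else:
                packages.append(line)
    return packages


# Example: recreate the layout from the release note in a temporary directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "nested"))
with open(os.path.join(root, "nested", "requirements.txt"), "w", encoding="utf-8") as f:
    f.write("numpy\n")  # hypothetical nested dependency
with open(os.path.join(root, "requirements.txt"), "w", encoding="utf-8") as f:
    f.write("-r nested/requirements.txt\npydantic\nPillow\nfastapi\n")
resolved = resolve_requirements(os.path.join(root, "requirements.txt"))
```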

Improved the auto-tuning of the adaptive batching dispatcher to avoid sporadic request failures caused by batching at the beginning of the runner lifecycle.

Fixed a bug that caused runners to raise a TypeError when overloaded. An HTTP 503 Service Unavailable is now returned when a runner is overloaded.

File "python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 188, in async_run_method
    return tuple(AutoContainer.from_payload(payload) for payload in payloads)
TypeError: 'Response' object is not iterable
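
Because an overloaded runner now surfaces as HTTP 503, callers can treat it as a retryable condition. Below is a minimal client-side sketch using exponential backoff. `call_with_retry` and the fake transport are hypothetical and independent of any BentoML API.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay_s=0.1, sleep=time.sleep):
    """Call `send()` until it returns a status other than 503 or attempts run out.

    `send` is any zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay_s * (2 ** attempt))  # exponential backoff
    return status, body


# Fake transport: overloaded twice, then succeeds.
responses = iter([(503, b""), (503, b""), (200, b"ok")])
delays = []
result = call_with_retry(lambda: next(responses), sleep=delays.append)
```

Passing `sleep=delays.append` records the backoff delays instead of waiting, which keeps the example fast.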

💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

Check out the updated PyTorch Framework Guide on how to use external_modules to save classes or utility functions required by the model.

See the Metrics Guide on how to add custom metrics to your API and custom Runners.

Learn more about how to use the Bento Client to call your BentoML service with Python easily.

Check out the latest blog post on why model serving over gRPC matters to data scientists.

🥂 We’d like to thank the community for your continued support and engagement.

Shout out to @judahrand for multiple contributions to BentoML and bentoctl.

Shout out to @phildamore-phdata, @quandollar, @2JooYeon, and @fortunto2 for their first contribution to BentoML.

bentoml-1.0.8-py3-none-any.whl(859.10 KB)
bentoml-1.0.8.tar.gz(13.68 MB)
v1.0.7(Oct 3, 2022)
🍱 BentoML released v1.0.7 as a patch to quickly fix a critical module import issue introduced in v1.0.6. The import error manifests in the import of any modules under io.* or models.*. The following is an example of a typical error message and traceback. Please upgrade to v1.0.7 to address this import issue.

packages/anyio/_backends/_asyncio.py", line 21, in <module>
    from io import IOBase
ImportError: cannot import name 'IOBase' from 'bentoml.io'

What's Changed

test(grpc): e2e + unit tests by @aarnphm in https://github.com/bentoml/BentoML/pull/2984

feat: support multipart upload for large bento and model by @yetone in https://github.com/bentoml/BentoML/pull/3044

fix(config): respect api_server.workers by @judahrand in https://github.com/bentoml/BentoML/pull/3049

chore(lint): remove unused import by @aarnphm in https://github.com/bentoml/BentoML/pull/3051

fix(import): namespace collision by @aarnphm in https://github.com/bentoml/BentoML/pull/3058

New Contributors

@judahrand made their first contribution in https://github.com/bentoml/BentoML/pull/3049

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.6...v1.0.7
bentoml-1.0.7-py3-none-any.whl(838.18 KB)
bentoml-1.0.7.tar.gz(742.52 KB)
v1.0.6(Sep 27, 2022)
🍱 BentoML has just released v1.0.6 featuring the gRPC preview! Without changing a line of code, you can now serve your Bentos as a gRPC service. Similar to serving over HTTP, BentoML gRPC supports all the ML frameworks, observability features, adaptive batching, and more out-of-the-box, simply by calling the serve-grpc CLI command.

> pip install "bentoml[grpc]"
> bentoml serve-grpc iris_classifier:latest --production

Check out our updated tutorial for a quick 10-minute crash course on BentoML gRPC.

Review the standardized Protobuf definition of service APIs and IO types, NDArray, DataFrame, File/Image, JSON, etc.

Learn more about multi-language client support (Python, Go, Java, Node.js, etc) with working examples.

Customize gRPC service by mounting new servicers and interceptors.

⚠️ gRPC is currently under preview. The public APIs may undergo incompatible changes in future patch releases until the official v1.1.0 minor version release.

Enhanced access logging format to output Trace and Span IDs in the more standard hex encoding by default.
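
For reference, hex encoding here follows the usual tracing convention: a 128-bit trace ID is written as 32 lowercase hex characters and a 64-bit span ID as 16. A minimal sketch with hypothetical helper names; the sample IDs are illustrative values, not BentoML output.

```python
def format_trace_id(trace_id: int) -> str:
    """Render a 128-bit trace ID as 32 zero-padded lowercase hex characters."""
    return format(trace_id, "032x")


def format_span_id(span_id: int) -> str:
    """Render a 64-bit span ID as 16 zero-padded lowercase hex characters."""
    return format(span_id, "016x")


trace_hex = format_trace_id(0x4BF92F3577B34DA6A3CE929D0E0E4736)
span_hex = format_span_id(0x00F067AA0BA902B7)
```

Zero padding matters: without it, IDs with leading zero bytes would come out shorter and fail to match across systems.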

Added request total, duration, and in-progress metrics to Runners, in addition to API Servers.

Added support for XGBoost SKLearn models.

Added support for restricting image mime types in the Image IO descriptor.

🥂 We’d like to thank our community for their contribution and support.

Shout out to @benjamintanweihao for fixing a BentoML CLI bug.

Shout out to @lsh918 for fixing a PyTorch framework issue.

Shout out to @jeffthebear for enhancing the Pandas DataFrame OpenAPI schema.

Shout out to @jiewpeng for adding the support for customizing access logs with Trace and Span ID formats.

What's Changed

fix: log runner errors explicitly by @ssheng in https://github.com/bentoml/BentoML/pull/2952

ci: temp fix for models test by @sauyon in https://github.com/bentoml/BentoML/pull/2949

fix: fix context parameter for multi-input IO descriptors by @sauyon in https://github.com/bentoml/BentoML/pull/2948

fix: use torch.from_numpy() instead of torch.Tensor() to keep data type by @lsh918 in https://github.com/bentoml/BentoML/pull/2951

docs: fix wrong name for example neural net by @ssun-g in https://github.com/bentoml/BentoML/pull/2959

docs: fix bentoml containerize command help message by @aarnphm in https://github.com/bentoml/BentoML/pull/2957

chore(cli): remove unused --no-trunc by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/2965

fix: relax regex for setting environment variables by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/2964

docs: update wrong paths for disabling logs by @creativedutchmen in https://github.com/bentoml/BentoML/pull/2974

feat: track serve update for start subcommands by @ssheng in https://github.com/bentoml/BentoML/pull/2976

feat: logging customization by @jiewpeng in https://github.com/bentoml/BentoML/pull/2961

chore(cli): using quotes instead of backslash by @sauyon in https://github.com/bentoml/BentoML/pull/2981

feat(cli): show full tracebacks in debug mode by @sauyon in https://github.com/bentoml/BentoML/pull/2982

feature(runner): add multiple output support by @larme in https://github.com/bentoml/BentoML/pull/2912

docs: add airflow integration page by @parano in https://github.com/bentoml/BentoML/pull/2990

chore(ci): fix the unit test of transformers by @bojiang in https://github.com/bentoml/BentoML/pull/3003

chore(ci): fix the issue caused by the change of check_task by @bojiang in https://github.com/bentoml/BentoML/pull/3004

fix(multipart): support multipart file inputs to non-file descriptors by @sauyon in https://github.com/bentoml/BentoML/pull/3005

feat(server): add runner metrics; refactoring batch size metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2977

EXPERIMENTAL: gRPC support by @aarnphm in https://github.com/bentoml/BentoML/pull/2808

fix(runner): receive requests before cork by @bojiang in https://github.com/bentoml/BentoML/pull/2996

fix(server): service_name label of runner metrics by @bojiang in https://github.com/bentoml/BentoML/pull/3008

chore(misc): remove mentioned for team member from PR request by @aarnphm in https://github.com/bentoml/BentoML/pull/3009

feat(xgboost): support xgboost sklearn models by @sauyon in https://github.com/bentoml/BentoML/pull/2997

feat(io/image): allow restricting mime types by @sauyon in https://github.com/bentoml/BentoML/pull/2999

fix(grpc): docker message by @aarnphm in https://github.com/bentoml/BentoML/pull/3012

fix: broken legacy metrics by @aarnphm in https://github.com/bentoml/BentoML/pull/3019

fix(e2e): exception test for image IO by @aarnphm in https://github.com/bentoml/BentoML/pull/3017

revert(3017): filter write-only mime type for Image IO by @bojiang in https://github.com/bentoml/BentoML/pull/3020

chore: cleanup containerize utils by @aarnphm in https://github.com/bentoml/BentoML/pull/3014

feat(proto): add serialized_bytes to pb.Part by @aarnphm in https://github.com/bentoml/BentoML/pull/3022

docs: Update README.md by @parano in https://github.com/bentoml/BentoML/pull/3023

chore(grpc): vcs generated stubs by @aarnphm in https://github.com/bentoml/BentoML/pull/3016

feat(io/image): allow writeable mimes as output by @sauyon in https://github.com/bentoml/BentoML/pull/3024

docs: fix descriptor typo by @darioarias in https://github.com/bentoml/BentoML/pull/3027

fix(server): log localhost instead of 0.0.0.0 by @sauyon in https://github.com/bentoml/BentoML/pull/3033

fix(io): Pandas OpenAPI schema by @jeffthebear in https://github.com/bentoml/BentoML/pull/3032

chore(docker): support more cuda versions by @larme in https://github.com/bentoml/BentoML/pull/3035

docs: updates on blocks that failed to render by @aarnphm in https://github.com/bentoml/BentoML/pull/3031

chore: migrate to pyproject.toml by @aarnphm in https://github.com/bentoml/BentoML/pull/3025

docs: gRPC tutorial by @aarnphm in https://github.com/bentoml/BentoML/pull/3013

docs: gRPC advanced guides by @aarnphm in https://github.com/bentoml/BentoML/pull/3034

feat(configuration): override options with envvar by @bojiang in https://github.com/bentoml/BentoML/pull/3018

chore: update links by @aarnphm in https://github.com/bentoml/BentoML/pull/3040

fix(configuration): should validate config early by @aarnphm in https://github.com/bentoml/BentoML/pull/3041

qa(bentos): update latest options by @aarnphm in https://github.com/bentoml/BentoML/pull/3042

qa: ignore tools from distribution by @aarnphm in https://github.com/bentoml/BentoML/pull/3045

dependencies: ignore broken pypi combination by @aarnphm in https://github.com/bentoml/BentoML/pull/3043

feat: gRPC tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/3015

configuration: migrate schema to api_server by @ssheng in https://github.com/bentoml/BentoML/pull/3046

qa: cleanup MLflow by @aarnphm in https://github.com/bentoml/BentoML/pull/2945

New Contributors

@lsh918 made their first contribution in https://github.com/bentoml/BentoML/pull/2951

@ssun-g made their first contribution in https://github.com/bentoml/BentoML/pull/2959

@benjamintanweihao made their first contribution in https://github.com/bentoml/BentoML/pull/2965

@creativedutchmen made their first contribution in https://github.com/bentoml/BentoML/pull/2974

@darioarias made their first contribution in https://github.com/bentoml/BentoML/pull/3027

@jeffthebear made their first contribution in https://github.com/bentoml/BentoML/pull/3032

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.5...v1.0.6
bentoml-1.0.6-py3-none-any.whl(827.79 KB)
bentoml-1.0.6.tar.gz(733.70 KB)
v1.0.5(Aug 30, 2022)
🍱 BentoML v1.0.5 is released as a quick fix to a Yatai incompatibility introduced in v1.0.4.

The incompatibility manifests in the following error message when deploying a bento on Yatai. Upgrading BentoML to v1.0.5 will resolve the issue.
Error while finding module specification for 'bentoml._internal.server.cli.api_server' (ModuleNotFoundError: No module named 'bentoml._internal.server.cli')

The incompatibility resides in all Yatai versions prior to v1.0.0-alpha.*. Alternatively, upgrading Yatai to v1.0.0-alpha.* will also restore the compatibility with bentos built in v1.0.4.

bentoml-1.0.5-py3-none-any.whl(774.05 KB)
bentoml-1.0.5.tar.gz(702.46 KB)
v1.0.4(Aug 26, 2022)
🍱 BentoML v1.0.4 is here!

Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, we can map a list of device IDs directly to a runner through configuration.

runners:
  iris_clf_1:
    resources:
      nvidia.com/gpu: [2, 4]  # Map device 2 and 4 to iris_clf_1 runner
  iris_clf_2:
    resources:
      nvidia.com/gpu: [1, 3]  # Map device 1 and 3 to iris_clf_2 runner
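
One common way to pin a process to specific GPUs is the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below turns a runner-to-device mapping like the one above into such an override. It is an assumption for illustration only: the release note does not say how BentoML applies the mapping, and `cuda_env_for_runner` is a hypothetical helper.

```python
def cuda_env_for_runner(runner_name, gpu_map):
    """Build the environment override for one runner from a name -> device IDs map.

    A runner missing from the map gets an empty value, which hides all GPUs.
    """
    device_ids = gpu_map.get(runner_name, [])
    return {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in device_ids)}


gpu_map = {"iris_clf_1": [2, 4], "iris_clf_2": [1, 3]}
env_1 = cuda_env_for_runner("iris_clf_1", gpu_map)
env_2 = cuda_env_for_runner("iris_clf_2", gpu_map)
```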

Added SSL support for API server through both CLI and configuration.

--ssl-certfile TEXT          SSL certificate file
--ssl-keyfile TEXT           SSL key file
--ssl-keyfile-password TEXT  SSL keyfile password
--ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
--ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
--ssl-ca-certs TEXT          CA certificates file
--ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
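
The INTEGER options refer to constants in Python's standard `ssl` module. A short sketch that prints nothing but collects the relevant integer values; the dictionary and variable names are only for illustration.

```python
import ssl

# Integer values accepted for client-certificate requirements.
CERT_REQS = {
    "none": int(ssl.CERT_NONE),          # client certificate not requested
    "optional": int(ssl.CERT_OPTIONAL),  # requested, but not mandatory
    "required": int(ssl.CERT_REQUIRED),  # handshake fails without a valid certificate
}

# Integer value of the server-side protocol constant negotiating the best TLS version.
TLS_SERVER = int(ssl.PROTOCOL_TLS_SERVER)
```

Looking the values up this way avoids hard-coding magic numbers in deployment scripts.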

Added adaptive batching size histogram metrics, BENTOML_{runner}_{method}_adaptive_batch_size_bucket, for observability of batching mechanism details.
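
The `_bucket` suffix follows the Prometheus histogram convention, in which each bucket counts every observation less than or equal to its upper bound. The sketch below computes such cumulative counts for a list of batch sizes. `cumulative_buckets`, the bucket bounds, and the observed sizes are hypothetical.

```python
import bisect


def cumulative_buckets(observations, upper_bounds):
    """Count observations per cumulative bucket (value <= upper bound)."""
    bounds = sorted(upper_bounds)
    counts = [0] * (len(bounds) + 1)  # the last slot is the +Inf bucket
    for value in observations:
        # bisect_left finds the first bound that is >= value
        counts[bisect.bisect_left(bounds, value)] += 1
    running = 0
    result = {}
    for bound, count in zip(bounds + [float("inf")], counts):
        running += count  # each bucket includes all smaller buckets
        result[bound] = running
    return result


# Hypothetical batch sizes observed by a runner.
buckets = cumulative_buckets([1, 1, 4, 8, 32], upper_bounds=[1, 2, 4, 8, 16])
```

A histogram whose mass sits in the smallest bucket suggests that requests are rarely being batched together.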

Added support for the OpenTelemetry OTLP exporter for tracing. BentoML now configures the OpenTelemetry resource automatically if the user has not explicitly configured it through environment variables. Upgraded OpenTelemetry Python packages to version 0.33b0.

Added support for saving external_modules alongside models in the save_model API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations.

Enhanced Swagger UI to include additional documentation and helper links.

💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

Check out the adaptive batching documentation on how to leverage batching to improve inference latency and efficiency.

Check out the runner configuration documentation on how to customize resource allocation for runners at run time.

🙌 We continue to receive great engagement and support from the BentoML community.

Shout out to @sptowey for their contribution on adding SSL support.

Shout out to @dbuades for their contribution on adding the OTLP exporter.

Shout out to @tweeklab for their contribution on fixing a bug on import_model in the MLflow framework.

What's Changed

refactor: cli to bentoml_cli by @sauyon in https://github.com/bentoml/BentoML/pull/2880

chore: remove typing-extensions dependency by @sauyon in https://github.com/bentoml/BentoML/pull/2879

fix: remove chmod install scripts by @aarnphm in https://github.com/bentoml/BentoML/pull/2830

fix: relative imports to lazy by @aarnphm in https://github.com/bentoml/BentoML/pull/2882

fix(cli): click utilities imports by @aarnphm in https://github.com/bentoml/BentoML/pull/2883

docs: add custom model runner example by @parano in https://github.com/bentoml/BentoML/pull/2885

qa: analytics unit tests by @aarnphm in https://github.com/bentoml/BentoML/pull/2878

chore: script for releasing quickstart bento by @parano in https://github.com/bentoml/BentoML/pull/2892

fix: pushing models from Bento instead of local modelstore by @parano in https://github.com/bentoml/BentoML/pull/2887

fix(containerize): supports passing multiple tags by @aarnphm in https://github.com/bentoml/BentoML/pull/2872

feat: explicit GPU runner mappings by @jjmachan in https://github.com/bentoml/BentoML/pull/2862

fix: setuptools doesn't include bentoml_cli by @bojiang in https://github.com/bentoml/BentoML/pull/2898

feat: Add SSL support for http api servers via bentoml serve by @sptowey in https://github.com/bentoml/BentoML/pull/2886

patch: ssl styling and default value check by @aarnphm in https://github.com/bentoml/BentoML/pull/2899

fix(scheduling): raise an error for invalid resources by @bojiang in https://github.com/bentoml/BentoML/pull/2894

chore(templates): cleanup debian dependency logic by @aarnphm in https://github.com/bentoml/BentoML/pull/2904

fix(ci): unittest failed by @bojiang in https://github.com/bentoml/BentoML/pull/2908

chore(cli): add figlet for CLI by @aarnphm in https://github.com/bentoml/BentoML/pull/2909

feat: codespace by @aarnphm in https://github.com/bentoml/BentoML/pull/2907

feat: use yatai proxy to upload/download bentos/models by @yetone in https://github.com/bentoml/BentoML/pull/2832

fix(scheduling): numpy worker environs are not taking effect by @bojiang in https://github.com/bentoml/BentoML/pull/2893

feat: Adaptive batching size histogram metrics by @ssheng in https://github.com/bentoml/BentoML/pull/2902

chore(swagger): include help links by @parano in https://github.com/bentoml/BentoML/pull/2927

feat(tracing): add support for otlp exporter by @dbuades in https://github.com/bentoml/BentoML/pull/2918

chore: Lock OpenTelemetry versions and add tracing metadata by @ssheng in https://github.com/bentoml/BentoML/pull/2928

revert: unminify CSS by @aarnphm in https://github.com/bentoml/BentoML/pull/2931

fix: importing mlflow:/ urls with no extra path info by @tweeklab in https://github.com/bentoml/BentoML/pull/2930

fix(yatai): make presigned_urls_deprecated optional by @bojiang in https://github.com/bentoml/BentoML/pull/2933

feat: add timeout option for bentoml runner config by @jjmachan in https://github.com/bentoml/BentoML/pull/2890

perf(cli): speed up by @aarnphm in https://github.com/bentoml/BentoML/pull/2934

chore: remove multipart IO descriptor warning by @ssheng in https://github.com/bentoml/BentoML/pull/2936

fix(json): revert eager check by @aarnphm in https://github.com/bentoml/BentoML/pull/2926

chore: remove --config flag to load the bentoml runtime config by @jjmachan in https://github.com/bentoml/BentoML/pull/2939

chore: update README messaging by @ssheng in https://github.com/bentoml/BentoML/pull/2937

fix: use a temporary file for file uploads by @sauyon in https://github.com/bentoml/BentoML/pull/2929

feat(cli): add CLI command to serve a runner by @bojiang in https://github.com/bentoml/BentoML/pull/2920

docs: Runner configuration for batching and resource allocation by @ssheng in https://github.com/bentoml/BentoML/pull/2941

bug: handle bad image file by @parano in https://github.com/bentoml/BentoML/pull/2942

chore(docs): earlier check for buildx by @aarnphm in https://github.com/bentoml/BentoML/pull/2940

fix(cli): helper message default values by @ssheng in https://github.com/bentoml/BentoML/pull/2943

feat(sdk): add external_modules option to save_model by @bojiang in https://github.com/bentoml/BentoML/pull/2895

fix(cli): component name regression by @ssheng in https://github.com/bentoml/BentoML/pull/2944

New Contributors

@sptowey made their first contribution in https://github.com/bentoml/BentoML/pull/2886

@dbuades made their first contribution in https://github.com/bentoml/BentoML/pull/2918

@tweeklab made their first contribution in https://github.com/bentoml/BentoML/pull/2930

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.3...v1.0.4
bentoml-1.0.4-py3-none-any.whl(773.63 KB)
bentoml-1.0.4.tar.gz(702.05 KB)
v1.0.3(Aug 8, 2022)
🍱 BentoML v1.0.3 brings a list of performance and feature improvements.

Improved Runner IO performance by enhancing the underlying serialization and deserialization, especially in models with large input and output sizes. Our image input benchmark showed a 100% throughput improvement.

v1.0.2 🐌

v1.0.3 💨

Added support for specifying URLs to exclude from tracing.

Added support for custom components for OpenAPI generation.

🙌 We continue to receive great engagement and support from the BentoML community.

Shout out to Ben Kessler for helping benchmark performance.

Shout out to Jiew Peng Lim for adding the support for configuring URLs to exclude from tracing.

Shout out to Susana Bouchardet for adding support for the JSON IO Descriptor to return an empty response body.

Thanks to Keming and mplk for contributing their first PRs to BentoML.

What's Changed

chore(deps): bump actions/setup-node from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2846

fix: extend --cache-from consumption to python tuple by @anwang2009 in https://github.com/bentoml/BentoML/pull/2847

feat: add support for excluding urls from tracing by @jiewpeng in https://github.com/bentoml/BentoML/pull/2843

docs: update notice about buildkit by @aarnphm in https://github.com/bentoml/BentoML/pull/2837

chore: add CODEOWNERS by @aarnphm in https://github.com/bentoml/BentoML/pull/2842

doc(frameworks): tensorflow by @bojiang in https://github.com/bentoml/BentoML/pull/2718

feat: add support for specifying urls to exclude from tracing as a list by @jiewpeng in https://github.com/bentoml/BentoML/pull/2851

fix(configuration): merging global runner config to runner specific config by @jjmachan in https://github.com/bentoml/BentoML/pull/2849

fix: Setting status code and cookies by @ssheng in https://github.com/bentoml/BentoML/pull/2854

chore: README typo by @kemingy in https://github.com/bentoml/BentoML/pull/2859

chore: gallery links to bentoml/examples by @aarnphm in https://github.com/bentoml/BentoML/pull/2858

fix(runner): use pickle instead for multi payload parameters by @aarnphm in https://github.com/bentoml/BentoML/pull/2857

doc(framework): pytorch guide by @bojiang in https://github.com/bentoml/BentoML/pull/2735

docs: add missing output to Runner docs by @mplk in https://github.com/bentoml/BentoML/pull/2868

chore: fix push and load interop by @aarnphm in https://github.com/bentoml/BentoML/pull/2863

fix: Usage stats by @ssheng in https://github.com/bentoml/BentoML/pull/2876

fix: JSON(IODescriptor[JSONType]).to_http_response returns empty body when the response is None. by @sbouchardet in https://github.com/bentoml/BentoML/pull/2874

chore: Address comments in the #2874 by @ssheng in https://github.com/bentoml/BentoML/pull/2877

fix: debugger breaks on circus process by @aarnphm in https://github.com/bentoml/BentoML/pull/2875

feat: support custom components for OpenAPI generation by @aarnphm in https://github.com/bentoml/BentoML/pull/2845

New Contributors

@anwang2009 made their first contribution in https://github.com/bentoml/BentoML/pull/2847

@jiewpeng made their first contribution in https://github.com/bentoml/BentoML/pull/2843

@kemingy made their first contribution in https://github.com/bentoml/BentoML/pull/2859

@mplk made their first contribution in https://github.com/bentoml/BentoML/pull/2868

@sbouchardet made their first contribution in https://github.com/bentoml/BentoML/pull/2874

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.2...v1.0.3
bentoml-1.0.3-py3-none-any.whl(761.33 KB)
bentoml-1.0.3.tar.gz(681.53 KB)
v1.0.2(Jul 29, 2022)
🍱 We have just released BentoML v1.0.2 with a number of features and bug fixes requested by the community.

Added support for custom model versions, e.g. bentoml.tensorflow.save_model("model_name:1.2.4", model).

Fixed PyTorch Runner payload serialization issue due to tensor not on CPU.

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first

Fixed Transformers GPU device assignment due to kwargs handling.

Fixed excessive Runner thread spawning issue under high load.

Fixed PyTorch Runner inference error due to saving tensor during inference mode.

RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.

Fixed Keras Runner error when the input has only a single element.

Deprecated the validate_json option in the JSON IO descriptor. Specifying validation logic natively in the Pydantic model is recommended instead.
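
To make the custom model version format from this release concrete: a tag such as "model_name:1.2.4" combines a model name and a version, separated by a colon. The sketch below splits such a string using only the standard library. `split_tag` is a hypothetical helper written for this example, not part of BentoML, and returning None for a missing version is an assumption.

```python
from typing import Optional, Tuple


def split_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split "name:version" into its parts (hypothetical helper, not BentoML API)."""
    name, sep, version = tag.partition(":")
    if not name:
        raise ValueError(f"tag {tag!r} has no model name")
    if sep and not version:
        raise ValueError(f"tag {tag!r} has an empty version")
    # No colon at all means the caller did not pin a version.
    return name, (version if sep else None)


print(split_tag("model_name:1.2.4"))  # ('model_name', '1.2.4')
print(split_tag("model_name"))        # ('model_name', None)
```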

🎨 We added an examples directory, where you will find interesting sample projects demonstrating various applications of BentoML. We welcome your contribution if you have a project idea and would like to share it with the community.

💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

Did you know that a BentoML service supports mounting and calling runners from custom FastAPI and Flask apps?

Did you know that IO descriptors support input and output validation of schema, shape, and data types?

What's Changed

chore: remove all --pre from documentation by @aarnphm in https://github.com/bentoml/BentoML/pull/2738

chore(framework): onnx guide minor improvements by @larme in https://github.com/bentoml/BentoML/pull/2744

fix(framework): fix how pytorch DataContainer convert GPU tensor by @larme in https://github.com/bentoml/BentoML/pull/2739

doc: add missing variable by @robsonpeixoto in https://github.com/bentoml/BentoML/pull/2752

chore(deps): cattrs>=22.1.0 in setup.cfg by @sugatoray in https://github.com/bentoml/BentoML/pull/2758

fix(transformers): kwargs and migrate to framework tests by @ssheng in https://github.com/bentoml/BentoML/pull/2761

chore: add type hint for run and async_run by @aarnphm in https://github.com/bentoml/BentoML/pull/2760

docs: fix typo in SECURITY.md by @parano in https://github.com/bentoml/BentoML/pull/2766

chore: use pypa/build as PEP517 backend by @aarnphm in https://github.com/bentoml/BentoML/pull/2680

chore(e2e): capture log output by @aarnphm in https://github.com/bentoml/BentoML/pull/2767

chore: more robust prometheus directory ensuring by @bojiang in https://github.com/bentoml/BentoML/pull/2526

doc(framework): add scikit-learn section to ONNX documentation by @larme in https://github.com/bentoml/BentoML/pull/2764

chore: clean up dependencies by @sauyon in https://github.com/bentoml/BentoML/pull/2769

docs: misc docs reorganize and cleanups by @parano in https://github.com/bentoml/BentoML/pull/2768

fix(io descriptors): finish removing init_http_response by @sauyon in https://github.com/bentoml/BentoML/pull/2774

chore: fix typo by @aarnphm in https://github.com/bentoml/BentoML/pull/2776

feat(model): allow custom model versions by @sauyon in https://github.com/bentoml/BentoML/pull/2775

chore: add watchfiles as bentoml dependency by @aarnphm in https://github.com/bentoml/BentoML/pull/2777

doc(framework): keras guide by @larme in https://github.com/bentoml/BentoML/pull/2741

docs: Update service schema and validation by @ssheng in https://github.com/bentoml/BentoML/pull/2778

doc(frameworks): fix pip package syntax by @larme in https://github.com/bentoml/BentoML/pull/2782

fix(runner): thread limiter doesn't take effect by @bojiang in https://github.com/bentoml/BentoML/pull/2781

feat: add additional env var configuring num of threads in Runner by @parano in https://github.com/bentoml/BentoML/pull/2786

fix(templates): sharing variables at template level by @aarnphm in https://github.com/bentoml/BentoML/pull/2796

bug: fix JSON io_descriptor validate_json option by @parano in https://github.com/bentoml/BentoML/pull/2803

chore: improve error message when failed importing user service code by @parano in https://github.com/bentoml/BentoML/pull/2806

chore: automatic cache action version update and remove stale bot by @aarnphm in https://github.com/bentoml/BentoML/pull/2798

chore(deps): bump actions/checkout from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2810

chore(deps): bump codecov/codecov-action from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2811

chore(deps): bump github/codeql-action from 1 to 2 by @dependabot in https://github.com/bentoml/BentoML/pull/2813

chore(deps): bump actions/cache from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2812

chore(deps): bump actions/setup-python from 2 to 4 by @dependabot in https://github.com/bentoml/BentoML/pull/2814

fix(datacontainer): pytorch to_payload should disable gradient by @aarnphm in https://github.com/bentoml/BentoML/pull/2821

fix(framework): fix keras single input edge case by @larme in https://github.com/bentoml/BentoML/pull/2822

fix(framework): keras GPU handling by @larme in https://github.com/bentoml/BentoML/pull/2824

docs: update custom bentoserver guide by @parano in https://github.com/bentoml/BentoML/pull/2809

fix(runner): bind limiter to runner_ref instead by @bojiang in https://github.com/bentoml/BentoML/pull/2826

fix(pytorch): inference_mode context is thead local by @bojiang in https://github.com/bentoml/BentoML/pull/2828

fix: address multiple tags for containerize by @aarnphm in https://github.com/bentoml/BentoML/pull/2797

chore: Add gallery projects under examples by @ssheng in https://github.com/bentoml/BentoML/pull/2833

chore: running formatter on examples folder by @aarnphm in https://github.com/bentoml/BentoML/pull/2834

docs: update security auth middleware by @g0nz4rth in https://github.com/bentoml/BentoML/pull/2835

fix(io_descriptor): DataFrame columns check by @alizia in https://github.com/bentoml/BentoML/pull/2836

fix: examples directory structure by @ssheng in https://github.com/bentoml/BentoML/pull/2839

revert: "fix: address multiple tags for containerize (#2797)" by @ssheng in https://github.com/bentoml/BentoML/pull/2840

New Contributors

@robsonpeixoto made their first contribution in https://github.com/bentoml/BentoML/pull/2752

@sugatoray made their first contribution in https://github.com/bentoml/BentoML/pull/2758

@g0nz4rth made their first contribution in https://github.com/bentoml/BentoML/pull/2835

@alizia made their first contribution in https://github.com/bentoml/BentoML/pull/2836

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0...v1.0.1
bentoml-1.0.2-py3-none-any.whl(756.17 KB)
bentoml-1.0.2.tar.gz(677.08 KB)
v1.0.0(Jul 13, 2022)
🍱 The wait is over. BentoML has officially released v1.0.0. We are excited to share the notable feature improvements with you.

Introduced BentoML Runner, an abstraction for parallel model inference. It allows the compute intensive model inference step to scale separately from the transformation and business logic. The Runner is easily instantiated and invoked, but behind the scenes, BentoML is optimizing for micro-batching and fanning out inference if needed. Here’s a simple example of instantiating a Runner. Learn more about using runners.

Redesigned how models are saved, moved, and loaded with BentoML. We introduced new primitives which allow users to call a save_model() method that saves the model in the optimal way based on the recommended practices of the ML framework. The model is then stored in a flexible local repository where users can use “import” and “export” functionality to push and pull “finalized” models from remote locations like S3. Bentos can be built locally or remotely with these models. Once built, Yatai or bentoctl can easily deploy to the cloud service of your choice. Learn more about preparing models and building bentos.

Enhanced micro-batching capability. With the new runner abstraction, batching is even more powerful. When incoming data is spread across different transformation processes, the runner will fan in inferences when inference is invoked. Multiple inputs will be batched into a single inference call. Most ML frameworks implement some form of vectorization, which improves performance when multiple inputs are processed at once. Our adaptive batching not only batches inputs as they are received, but also regresses the time of the last several groups of inputs in order to optimize the batch size and latency windows.

Improved reproducibility of the model by recording and locking the dependent library versions. We use the versions to package the correct dependencies so that the environment in which the model runs in production is identical to the environment it was trained in. All direct and transitive dependencies are recorded and deployed with the model when running in production. In our 1.0 version we now support Conda as well as several different ways to customize your pip packages when “building your Bento”. Learn more about building bentos.

Simplified Docker image creation during containerization to generate the right image for you depending on the features that you’ve decided to implement in your service. For example, if your runner specifies that it can run on a GPU, we will automatically choose the right NVIDIA Docker image as a base when containerizing your service. If needed, we also provide the flexibility to customize your Docker image. Learn more about containerization.

Improved input and output validation with native type validation rules. Numpy and Pandas DataFrame can specify a static shape or even dynamically infer schema by providing sample data. The Pydantic schema that is produced per endpoint also integrates with our Swagger UI so that each endpoint is better documented for sharing. Learn more about service APIs and IO Descriptors.
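
The micro-batching idea described in these notes (several inputs collected, passed to the model as one batched call, and the results fanned back out in order) can be sketched without BentoML. This is a minimal illustration under stated assumptions: `run_in_batches`, `max_batch_size`, and the toy `predict_batch` function are hypothetical names, and the fixed batch size stands in for the adaptive sizing that BentoML derives from recent timings.

```python
from typing import Callable, List, Sequence


def run_in_batches(
    inputs: Sequence[float],
    predict_batch: Callable[[List[float]], List[float]],
    max_batch_size: int,
) -> List[float]:
    """Group inputs into batches, call the model once per batch, and
    return the results in the original input order."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    results: List[float] = []
    for start in range(0, len(inputs), max_batch_size):
        batch = list(inputs[start:start + max_batch_size])
        outputs = predict_batch(batch)  # one inference call for the whole batch
        if len(outputs) != len(batch):
            raise RuntimeError("model returned a different number of outputs")
        results.extend(outputs)
    return results


calls: List[int] = []  # records the size of every batched call, for illustration


def predict_batch(batch: List[float]) -> List[float]:
    # Toy stand-in for a vectorized model: doubles every input.
    calls.append(len(batch))
    return [2 * x for x in batch]


print(run_in_batches([1, 2, 3, 4, 5], predict_batch, max_batch_size=2))  # [2, 4, 6, 8, 10]
print(calls)  # [2, 2, 1]: three model calls instead of five
```

The benefit comes from the model being invoked three times rather than five; with a vectorized framework, each batched call costs little more than a single-input call.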

⚠️ BentoML v1.0.0 is backward incompatible with v0.13.1. If you wish to stay on the v0.13.1 LTS version, please lock the dependency with bentoml==0.13.1. We have also prepared a migration guide from v0.13.1 to v1.0.0 to help with your project migration. We are committed to supporting the v0.13-LTS versions with critical bug fixes and security patches.

🎉 After years of seeing hundreds of model serving use cases, we are proud to present the official release of BentoML 1.0. We could not have done it without the growth and support of our community.
bentoml-1.0.0-py3-none-any.whl(755.87 KB)
bentoml-1.0.0.tar.gz(678.30 KB)
v1.0.0-rc3(Jul 1, 2022)
We have just released BentoML 1.0.0rc3 with a number of highly anticipated features and improvements. Check it out with the following command!

$ pip install -U bentoml --pre

⚠️ BentoML will release the official 1.0.0 version next week, which removes the need to use the --pre flag to install BentoML versions after 1.0.0. If you wish to stay on the 0.13.1 LTS version, please lock the dependency with bentoml==0.13.1.

Added support for framework runners in the following ML frameworks.

fast.ai

CatBoost

ONNX

Added support for Huggingface Transformers custom pipelines.

Fixed a logging issue causing the api_server and runners to not generate error logs.

Optimized Tensorflow inference procedure.

Improved resource request configuration for runners.

Resource requests can now be configured in the BentoML configuration. If unspecified, runners will be scheduled to best utilize the available system resources.

runners:
  resources:
    cpu: 8.0
    nvidia.com/gpu: 4.0

Updated the API for custom runners to declare the types of supported resources.

import bentoml

class MyRunnable(bentoml.Runnable):
    SUPPORTS_CPU_MULTI_THREADING = True  # Deprecated SUPPORT_CPU_MULTI_THREADING
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")  # Deprecated SUPPORT_NVIDIA_GPU
    ...

my_runner = bentoml.Runner(
    MyRunnable,
    runnable_init_params={"foo": foo, "bar": bar},
    name="custom_runner_name",
    ...
)

Deprecated the API for specifying resources from the framework to_runner() and custom Runner APIs. For better flexibility at runtime, it is recommended to specify resources through configuration.

What's Changed

fix(dependencies): require pyyaml>=5 by @sauyon in https://github.com/bentoml/BentoML/pull/2626

refactor(server): merge contexts; add yatai headers by @bojiang in https://github.com/bentoml/BentoML/pull/2621

chore(pylint): update pylint configuration by @sauyon in https://github.com/bentoml/BentoML/pull/2627

fix: Transformers NVIDIA_VISIBLE_DEVICES value type casting by @ssheng in https://github.com/bentoml/BentoML/pull/2624

fix: Server silently crash without logging exceptions by @ssheng in https://github.com/bentoml/BentoML/pull/2635

fix(framework): some GPU related fixes by @larme in https://github.com/bentoml/BentoML/pull/2637

tests: minor e2e test cleanup by @sauyon in https://github.com/bentoml/BentoML/pull/2643

docs: Add model in bentoml.pytorch.save_model() pytorch integration example by @AlexandreNap in https://github.com/bentoml/BentoML/pull/2644

chore(ci): always enable actions on PR by @sauyon in https://github.com/bentoml/BentoML/pull/2646

chore: updates ci by @aarnphm in https://github.com/bentoml/BentoML/pull/2650

fix(docker): templates bash heredoc should pass -ex by @aarnphm in https://github.com/bentoml/BentoML/pull/2651

feat: CatBoost integration by @yetone in https://github.com/bentoml/BentoML/pull/2615

feat: FastAI by @aarnphm in https://github.com/bentoml/BentoML/pull/2571

feat: Support Transformers custom pipeline by @ssheng in https://github.com/bentoml/BentoML/pull/2640

feat(framework): onnx support by @larme in https://github.com/bentoml/BentoML/pull/2629

chore(tensorflow): optimize inference procedure by @bojiang in https://github.com/bentoml/BentoML/pull/2567

fix(runner): validate runner names by @sauyon in https://github.com/bentoml/BentoML/pull/2588

fix(runner): lowercase runner names and add tests by @sauyon in https://github.com/bentoml/BentoML/pull/2656

style: github naming by @aarnphm in https://github.com/bentoml/BentoML/pull/2659

tests(framework): add new framework tests by @sauyon in https://github.com/bentoml/BentoML/pull/2660

docs: missing code annotation by @jjmachan in https://github.com/bentoml/BentoML/pull/2654

perf(templates): cache python installation via conda by @aarnphm in https://github.com/bentoml/BentoML/pull/2662

fix(ci): destroy the runner after init_local by @bojiang in https://github.com/bentoml/BentoML/pull/2665

fix(conda): python installation order by @aarnphm in https://github.com/bentoml/BentoML/pull/2668

fix(tensorflow): casting error on kwargs by @bojiang in https://github.com/bentoml/BentoML/pull/2664

feat(runner): implement resource configuration by @sauyon in https://github.com/bentoml/BentoML/pull/2632

New Contributors

@AlexandreNap made their first contribution in https://github.com/bentoml/BentoML/pull/2644

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc2...v1.0.0-rc3
bentoml-1.0.0rc3-py3-none-any.whl(749.64 KB)
bentoml-1.0.0rc3.tar.gz(665.56 KB)
v1.0.0-rc2(Jun 22, 2022)
We have just released BentoML 1.0.0rc2 with an exciting lineup of improvements. Check it out with the following command!

$ pip install -U bentoml --pre

Standardized logging configuration and improved logging performance.

If imported as a library, BentoML will no longer configure logging explicitly and will respect the logging configuration of the importing Python process. To customize BentoML logging as a library, configurations can be added for the bentoml logger.

formatters:
    ...
handlers:
    ...
loggers:
    ...
    bentoml:
        handlers: [...]
        level: INFO
        ...

If started as a server, BentoML will continue to configure logging format and output to stdout at INFO level. All third party libraries will be configured to log at the WARNING level.

Added LightGBM framework support.

Updated the CLI display of model and bento creation timestamps to use the local timezone for a better user experience, while timestamps in metadata will remain in the UTC timezone.

Improved the reliability of bento build with advanced options including base_image and dockerfile_template.
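
For the library-mode logging behaviour described in this release, the bentoml logger can be configured like any other named logger through Python's standard `logging.config.dictConfig`. The sketch below uses only the standard library, so it runs without BentoML installed. The formatter and handler names (`plain`, `console`) and the format string are illustrative choices, not names that BentoML requires.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(name)s %(levelname)s %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": "ext://sys.stdout",
        },
    },
    "loggers": {
        # Settings for the logger named "bentoml", as described in the release note.
        "bentoml": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger("bentoml")
logger.info("logging configured by the importing application")
```

Because the configuration lives in the importing application, the same dictionary can also carry settings for the application's own loggers.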

Besides all the exciting product work, we also started a blog at modelserving.com to share what we have learned from building BentoML and supporting the MLOps community. Check out our latest post, "Breaking up with Flask & FastAPI: Why ML model serving requires a specialized framework", and share your thoughts with us on our LinkedIn post.

Lastly, a big shoutout to @Mike Kuhlen for adding the LightGBM framework support. 🥂

What's Changed

feat(cli): output times in the local timezone by @sauyon in https://github.com/bentoml/BentoML/pull/2572

fix(store): use >= for time checking by @sauyon in https://github.com/bentoml/BentoML/pull/2574

fix(build): use subprocess to call pip-compile by @sauyon in https://github.com/bentoml/BentoML/pull/2573

docs: fix wrong variable name in comment by @kim-sardine in https://github.com/bentoml/BentoML/pull/2575

feat: improve logging by @sauyon in https://github.com/bentoml/BentoML/pull/2568

fix(service): JsonIO doesn't return a pydantic model by @bojiang in https://github.com/bentoml/BentoML/pull/2578

fix: update conda env yaml file name and default channel by @parano in https://github.com/bentoml/BentoML/pull/2580

chore(runner): add shcedule shortcuts to runners by @bojiang in https://github.com/bentoml/BentoML/pull/2576

fix(cli): cli encoding error on Windows by @bojiang in https://github.com/bentoml/BentoML/pull/2579

fix(bug): Make model.with_options() additive by @ssheng in https://github.com/bentoml/BentoML/pull/2519

feat: dockerfile templates advanced guides by @aarnphm in https://github.com/bentoml/BentoML/pull/2548

docs: add setuptools to docs dependencies by @parano in https://github.com/bentoml/BentoML/pull/2586

test(frameworks): minor test improvements by @sauyon in https://github.com/bentoml/BentoML/pull/2590

feat: Bring LightGBM back by @mqk in https://github.com/bentoml/BentoML/pull/2589

fix(runner): pass init params to runnable by @sauyon in https://github.com/bentoml/BentoML/pull/2587

fix: propagate should be false by @aarnphm in https://github.com/bentoml/BentoML/pull/2594

fix: Remove starlette request log by @ssheng in https://github.com/bentoml/BentoML/pull/2595

fix: Bug fix for 2596 by @timc in https://github.com/bentoml/BentoML/pull/2597

chore(frameworks): update framework template with new checks and remove old framework code by @sauyon in https://github.com/bentoml/BentoML/pull/2592

docs: Update streaming.rst by @ssheng in https://github.com/bentoml/BentoML/pull/2605

bug: Fix Yatai client push bentos with model options by @ssheng in https://github.com/bentoml/BentoML/pull/2604

docs: allow running tutorial from docker by @parano in https://github.com/bentoml/BentoML/pull/2611

fix(model): lock attrs to >=21.1.0 by @bojiang in https://github.com/bentoml/BentoML/pull/2610

docs: Fix documentation links and formats by @ssheng in https://github.com/bentoml/BentoML/pull/2612

fix(model): load ModelOptions lazily by @sauyon in https://github.com/bentoml/BentoML/pull/2608

feat: install.sh for python packages by @aarnphm in https://github.com/bentoml/BentoML/pull/2555

fix/routing path by @aarnphm in https://github.com/bentoml/BentoML/pull/2606

qa: build config by @aarnphm in https://github.com/bentoml/BentoML/pull/2581

fix: invalid build option python_version="None" when base_image is used by @parano in https://github.com/bentoml/BentoML/pull/2623

New Contributors

@kim-sardine made their first contribution in https://github.com/bentoml/BentoML/pull/2575

@timc made their first contribution in https://github.com/bentoml/BentoML/pull/2597

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc1...v1.0.0rc2
bentoml-1.0.0rc2-py3-none-any.whl(734.67 KB)
bentoml-1.0.0rc2.tar.gz(662.59 KB)
v1.0.0-rc1(Jun 8, 2022)
We are very excited to share that BentoML 1.0.0rc1 has just been released with a number of dev experience improvements and bug fixes.

Enabled users to run just bentoml serve from a project directory containing a bentofile.yaml build file.

Added request contexts, opening access to request and response headers.

Introduced a new runner design to simplify the creation of custom runners, and a framework to_runner API to simplify runner creation from a model.

import numpy as np
import bentoml
from bentoml.io import NumpyNdarray

iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def classify(input_series: np.ndarray) -> np.ndarray:
    result = iris_clf_runner.predict.run(input_series)
    return result

Introduced framework save_model, load_model, and to_runnable APIs to complement the new to_runner API in the following frameworks. Other ML frameworks are still being migrated to the new Runner API. Coming in the next release are ONNX, fast.ai, MLflow, and CatBoost.

PyTorch (TorchScript, PyTorch Lightning)

Tensorflow

Keras

Scikit Learn

XGBoost

Huggingface Transformers

Introduced a refreshed documentation website with more content; see https://docs.bentoml.org/.

Enhanced the bentoml containerize command with the following capabilities.

Support for multi-platform Docker image builds with Docker Buildx.

Support for defining environment variables in generated Docker images.

Support for installing system packages via bentofile.yaml.

Support for customizing the generated Dockerfile via user-provided templates.

A big shout out to all the contributors for getting us a step closer to the BentoML 1.0 release. 🎉

What's Changed

docs: update readme installation --pre flag by @parano in https://github.com/bentoml/BentoML/pull/2515

chore(ci): quit immediately for errors e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2517

fix(ci): cover sync endpoints; cover cors by @bojiang in https://github.com/bentoml/BentoML/pull/2520

docs: fix cuda_version string value by @rapidrabbit76 in https://github.com/bentoml/BentoML/pull/2523

fix(framework): fix tf2 and keras class variable names by @larme in https://github.com/bentoml/BentoML/pull/2525

chore(ci): add more edge cases; boost e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2521

fix(docker): remove backslash in comments by @aarnphm in https://github.com/bentoml/BentoML/pull/2527

fix(runner): sync remote runner uri schema with runner_app by @larme in https://github.com/bentoml/BentoML/pull/2531

fix: major bugs fixes about serving and GPU placement by @bojiang in https://github.com/bentoml/BentoML/pull/2535

chore(sdk): allowed single int value as the batch_dim by @bojiang in https://github.com/bentoml/BentoML/pull/2536

chore(ci): cover add_asgi_middleware in e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2537

chore(framework): Add api_version for current implemented frameworks by @larme in https://github.com/bentoml/BentoML/pull/2522

doc(server): remove unnecessary svc.asgi lines by @bojiang in https://github.com/bentoml/BentoML/pull/2543

chore(server): lazy load meters; cover asgi app mounting in e2e test by @bojiang in https://github.com/bentoml/BentoML/pull/2542

feat: push runner to yatai by @yetone in https://github.com/bentoml/BentoML/pull/2528

style(runner): revert b14919db(factor out batching) by @bojiang in https://github.com/bentoml/BentoML/pull/2549

chore(ci): skip unsupported frameworks for now by @bojiang in https://github.com/bentoml/BentoML/pull/2550

doc: fix github action CI badge link by @parano in https://github.com/bentoml/BentoML/pull/2554

doc(server): fix header div by @bojiang in https://github.com/bentoml/BentoML/pull/2557

fix(metrics): filter out non-API endpoints in metrics by @parano in https://github.com/bentoml/BentoML/pull/2559

fix: Update SwaggerUI config by @parano in https://github.com/bentoml/BentoML/pull/2560

fix(server): wrong status code format in metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2561

fix(server): metrics name issue under specify service names by @bojiang in https://github.com/bentoml/BentoML/pull/2556

fix: path for custom dockerfile templates by @aarnphm in https://github.com/bentoml/BentoML/pull/2547

feat: include env build options in bento.yaml by @parano in https://github.com/bentoml/BentoML/pull/2562

chore: minor fixes and docs change from QA by @parano in https://github.com/bentoml/BentoML/pull/2564

fix(qa): allow cuda_version when distro is None with default by @aarnphm in https://github.com/bentoml/BentoML/pull/2565

fix(qa): bento runner resource should limit to user provided configs by @parano in https://github.com/bentoml/BentoML/pull/2566

New Contributors

@rapidrabbit76 made their first contribution in https://github.com/bentoml/BentoML/pull/2523

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc0...v1.0.0-rc1
bentoml-1.0.0rc1-py3-none-any.whl(782.15 KB)
bentoml-1.0.0rc1.tar.gz(691.11 KB)
v1.0.0-rc0(May 30, 2022)
This is a preview release for BentoML 1.0. Check out the quick start guide at https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

Key changes

What's Changed

chore(server): pass runner map through envvar by @bojiang in https://github.com/bentoml/BentoML/pull/2396

fix(server): init prometheus dir for standalone running by @bojiang in https://github.com/bentoml/BentoML/pull/2397

fix(#2316): --quiet should set logger level by @parano in https://github.com/bentoml/BentoML/pull/2399

feat: allow serving from project dir using import str from bentofile.yaml by @parano in https://github.com/bentoml/BentoML/pull/2398

chore(server): default values for entrypoints by @bojiang in https://github.com/bentoml/BentoML/pull/2401

fix(ci): use local bentoml in e2e test by @bojiang in https://github.com/bentoml/BentoML/pull/2403

docs: update advanced guide on building bentos by @splch in https://github.com/bentoml/BentoML/pull/2346

freeze model info and validate metadata entries by @sauyon in https://github.com/bentoml/BentoML/pull/2363

feat: store runners in bento manifest by @yetone in https://github.com/bentoml/BentoML/pull/2407

docs: fix readthedocs build issue by @parano in https://github.com/bentoml/BentoML/pull/2422

docs: update fossa license scan badge by @parano in https://github.com/bentoml/BentoML/pull/2420

fix(server): ensure distributed serving / serving on all platforms by @bojiang in https://github.com/bentoml/BentoML/pull/2414

Docs/core and guides by @timliubentoml in https://github.com/bentoml/BentoML/pull/2417

feat(internal): implement request contexts and check inference API types by @sauyon in https://github.com/bentoml/BentoML/pull/2375

fix: ensure compatibility with attrs 20.1.0 by @sauyon in https://github.com/bentoml/BentoML/pull/2423

chore(server): resource utils by @bojiang in https://github.com/bentoml/BentoML/pull/2370

fix: consistent naming accross docker and build config by @aarnphm in https://github.com/bentoml/BentoML/pull/2426

refactor: runner/runnable interface by @bojiang in https://github.com/bentoml/BentoML/pull/2432

feat(internal): add signature to Model and remove bentoml_version by @sauyon in https://github.com/bentoml/BentoML/pull/2433

runner refactor: Model to_runner/to_runnable interface by @parano in https://github.com/bentoml/BentoML/pull/2435

runnablehandle proposal by @sauyon in https://github.com/bentoml/BentoML/pull/2438

Runnable refactors and Model info update by @sauyon in https://github.com/bentoml/BentoML/pull/2439

Runner Resources implementation by @bojiang in https://github.com/bentoml/BentoML/pull/2436

refactor(runner): clean runner handle by @bojiang in https://github.com/bentoml/BentoML/pull/2441

chore(runner): make runnable scheduling traits constant by @bojiang in https://github.com/bentoml/BentoML/pull/2442

fix(runner): async run by @bojiang in https://github.com/bentoml/BentoML/pull/2443

added details for each paramters in options by @timliubentoml in https://github.com/bentoml/BentoML/pull/2429

Runners refactor: service & bento build changes by @parano in https://github.com/bentoml/BentoML/pull/2440

refactor: runner app by @bojiang in https://github.com/bentoml/BentoML/pull/2445

fix(internal): remove unused response_code field by @sauyon in https://github.com/bentoml/BentoML/pull/2444

Fix ModelInfo cattrs serialization issue by @parano in https://github.com/bentoml/BentoML/pull/2446

feat(internal): File I/O descriptor (re-)implementation by @sauyon in https://github.com/bentoml/BentoML/pull/2272

docs: Update Development.md by @kakokat in https://github.com/bentoml/BentoML/pull/2424

docs: Update DEVELOPMENT.md by @parano in https://github.com/bentoml/BentoML/pull/2452

refactor: datacontainer api changes with ndarray draft by @larme in https://github.com/bentoml/BentoML/pull/2449

feat(server): implement runner app by @sauyon in https://github.com/bentoml/BentoML/pull/2451

chore(runner): use low level nvml API by @bojiang in https://github.com/bentoml/BentoML/pull/2450

fix(server): fix container in runner app IPC by @sauyon in https://github.com/bentoml/BentoML/pull/2454

feat(runner): scheduling strategy by @bojiang in https://github.com/bentoml/BentoML/pull/2453

Fix: attribute error runner_type in bento serve by @parano in https://github.com/bentoml/BentoML/pull/2457

refactor(runner): update Pandas and Default DataContainer by @larme in https://github.com/bentoml/BentoML/pull/2455

chore(yatai): add version and org_uid to tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/2458

chore(internal): fix typing by @sauyon in https://github.com/bentoml/BentoML/pull/2460

tests: fix runner1.0 branch unit tests by @parano in https://github.com/bentoml/BentoML/pull/2462

docs(model): update ModelSignature documentation by @sauyon in https://github.com/bentoml/BentoML/pull/2463

feat(xgboost): 1.0 XGBoost implementation by @sauyon in https://github.com/bentoml/BentoML/pull/2459

feat(frameworks): update framework template by @sauyon in https://github.com/bentoml/BentoML/pull/2461

fix(framework): fix Runnable closing over loop variable bug by @larme in https://github.com/bentoml/BentoML/pull/2466

chore: fix types by @sauyon in https://github.com/bentoml/BentoML/pull/2468

chore: make ModelInfo yaml backwards compatible by @parano in https://github.com/bentoml/BentoML/pull/2470

fix(runner): fix bugs in runner batching by @sauyon in https://github.com/bentoml/BentoML/pull/2469

docs: re-organize docs for 1.0rc release by @parano in https://github.com/bentoml/BentoML/pull/2474

chore: add furo to docs-requirements.txt by @aarnphm in https://github.com/bentoml/BentoML/pull/2475

feat(ci): re-enable e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2456

chore: add runners-1.0 to CI by @aarnphm in https://github.com/bentoml/BentoML/pull/2431

fix(runner): remove unnecessary runnable_self argument by @larme in https://github.com/bentoml/BentoML/pull/2482

docs: update for xgboost doc by @kakokat in https://github.com/bentoml/BentoML/pull/2481

test(runner): update DataContainer tests by @larme in https://github.com/bentoml/BentoML/pull/2476

feat: buildx backend for bentoml containerize by @aarnphm in https://github.com/bentoml/BentoML/pull/2483

refactor(runner): simplify batch dim by @bojiang in https://github.com/bentoml/BentoML/pull/2484

fix(runner): removing inspect by @bojiang in https://github.com/bentoml/BentoML/pull/2485

fix(server): fix development_mode in the config by @bojiang in https://github.com/bentoml/BentoML/pull/2488

fix(server): fix containerize subcommand by @bojiang in https://github.com/bentoml/BentoML/pull/2490

fix(tests): update model unit tests for new batch_dim type by @sauyon in https://github.com/bentoml/BentoML/pull/2487

refactor(server): supervise dev server with circus by @bojiang in https://github.com/bentoml/BentoML/pull/2489

fix(server): correctly use starlette APIs by @sauyon in https://github.com/bentoml/BentoML/pull/2486

fix(internal): revert typing strictness changes by @sauyon in https://github.com/bentoml/BentoML/pull/2494

feat: Transformers framework runner implementation 1.0 by @ssheng in https://github.com/bentoml/BentoML/pull/2479

Runners 1.0 tensorflow_v2 impl by @larme in https://github.com/bentoml/BentoML/pull/2430

Testing framework and runner app update by @sauyon in https://github.com/bentoml/BentoML/pull/2500

refactor(framework): update keras to runners-1.0 branch by @larme in https://github.com/bentoml/BentoML/pull/2498

fix: swagger UI bundle update by @parano in https://github.com/bentoml/BentoML/pull/2501

refactor: Dockerfile generation by @aarnphm in https://github.com/bentoml/BentoML/pull/2473

feat(internal): add save_format_version for BentoML model by @larme in https://github.com/bentoml/BentoML/pull/2502

Revert "feat(internal): add save_format_version for BentoML model" by @larme in https://github.com/bentoml/BentoML/pull/2504

docs: Update documentation for 1.0 by @parano in https://github.com/bentoml/BentoML/pull/2506

fix(framework): adapt changes for Tensorflow DataContainer by @larme in https://github.com/bentoml/BentoML/pull/2507

feat(framework): pytorch by @bojiang in https://github.com/bentoml/BentoML/pull/2499

docs: misc docs updates by @parano in https://github.com/bentoml/BentoML/pull/2511

refactor(framework): move _mapping for tf2 and keras by @larme in https://github.com/bentoml/BentoML/pull/2510

chore: unify circus logs to bentoml + fix circus config parsing for api_server by @aarnphm in https://github.com/bentoml/BentoML/pull/2509

chore: add release candidate backwards compatibility warnings by @parano in https://github.com/bentoml/BentoML/pull/2512

fix: revert pinning pip version for tests by @bojiang in https://github.com/bentoml/BentoML/pull/2514

Merge 1.0 development branch by @parano in https://github.com/bentoml/BentoML/pull/2513

New Contributors

@splch made their first contribution in https://github.com/bentoml/BentoML/pull/2346

@kakokat made their first contribution in https://github.com/bentoml/BentoML/pull/2424

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-a7...v1.0.0-rc0
v1.0.0-a7(Apr 6, 2022)
This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org/

Key changes

BREAKING CHANGE: Default serving port has been changed to 3000

This is due to an issue on newer macOS versions, where port 5000 is always in use.

This will affect default serving port when deploying with Docker. Existing 1.0 preview release users will need to either change deployment config to use port 3000, or pass --port 5000 to the container command, in order to use the previous default port setting.

New import/export API

Users can now export models and bentos from local store to a standalone file

Learn more via bentoml export --help and bentoml models export --help

What's Changed

docs(cli): clean up cli docstrings by @larme in https://github.com/bentoml/BentoML/pull/2342

fix: YataiClientContext initialization missing email argument by @yetone in https://github.com/bentoml/BentoML/pull/2348

chore(ci): run e2e tests in docker by @bojiang in https://github.com/bentoml/BentoML/pull/2349

style: minor typing fixes by @bojiang in https://github.com/bentoml/BentoML/pull/2350

Refactor model save to include labels, metadata and custom_objects by @larme in https://github.com/bentoml/BentoML/pull/2351

fix: better error message in python < 3.9 by @larme in https://github.com/bentoml/BentoML/pull/2352

refactor(internal): move Tag out of types by @sauyon in https://github.com/bentoml/BentoML/pull/2358

fix(frameworks): use bentoml.models.create instead of Model.create by @sauyon in https://github.com/bentoml/BentoML/pull/2360

fix: add change_global_cwd params to bentoml.load by @parano in https://github.com/bentoml/BentoML/pull/2356

fix: import model from S3 by @almirb in https://github.com/bentoml/BentoML/pull/2361

fix: extract correct desired Python version by @matheusMoreno in https://github.com/bentoml/BentoML/pull/2362

fix(service): fix load_bento arguments position when retrying after import_service failed by @larme in https://github.com/bentoml/BentoML/pull/2369

fix: cgroups for cpu should be 1 when <= 0 by @aarnphm in https://github.com/bentoml/BentoML/pull/2372

chore: lock rich to be >=11.2.0 by @aarnphm in https://github.com/bentoml/BentoML/pull/2378

internal: usage tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/2318

feat(internal): try to correct missing latest files by @sauyon in https://github.com/bentoml/BentoML/pull/2383

chore: cleanup 3.6 metadata by @aarnphm in https://github.com/bentoml/BentoML/pull/2388

chore: remove unnecessary model_store by @aarnphm in https://github.com/bentoml/BentoML/pull/2384

fix: not lock typing_extensions to fix rich and pytorch lightning requirements by @aarnphm in https://github.com/bentoml/BentoML/pull/2390

bug: fix CLI command delete with latest tag by @parano in https://github.com/bentoml/BentoML/pull/2391

feat: improve list CLI command output by @parano in https://github.com/bentoml/BentoML/pull/2392

fix: update yatai client to work with BentoInfo changes by @parano in https://github.com/bentoml/BentoML/pull/2393

fix(server): duplicate metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2394

New Contributors

@almirb made their first contribution in https://github.com/bentoml/BentoML/pull/2361

@matheusMoreno made their first contribution in https://github.com/bentoml/BentoML/pull/2362

Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-a6...v1.0.0-a7
v1.0.0-a6(Mar 7, 2022)

This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org/
v1.0.0-a5(Mar 1, 2022)

This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org/
v1.0.0-a4(Feb 15, 2022)

This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org/
v1.0.0-a3(Jan 28, 2022)

This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org
v1.0.0-a2(Jan 20, 2022)

This is a preview release for BentoML 1.0, check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and documentation at http://docs.bentoml.org
v0.13.1(Jul 13, 2021)
Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.13.0...v0.13.1

Overview

BentoML 0.13.1 is a minor release containing mostly bug fixes and internal changes.

Changelog

feat: SLO - API server max latency (#1583)

feat: Save OpenAPI Spec Json in BentoML bundle (#1686)

fix: BentoService loading user-provided env.yml file in runtime (#1695)

fix: BentoArtifact initialize with parameter issue (#1696)

fix: Use $BENTOML_PORT as Dockerfile default port (#1706)

fix: Fix missing s3_endpoint_url (#1708)

fix: Wrap request in sagemaker model_server (#1716)

refactor: Add deprecation warnings for deployment CLI commands (#1718)

refactor: Replace DI framework (#1697)

ci: PaddlePaddle Integration test (#1739)

v0.13.0(Jun 16, 2021)
Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.12.1...v0.13.0

Overview

BentoML 0.13.0 is here! It's a release packed with lots of new features and important bug fixes. We encourage all users to upgrade.

❤️ Contributors

Thanks to @aarnphm @andrewsi-z @larme @gregd33 @bojiang @ssheng @henrywu2019 @yubozhao @jack1902 @illy @sencenan @parano @soeque1 @elia-secchi @Shumpei-Kikuta @StevenReitsma @dsherry @AnvithaGadagi @joaquincabezas for the contributions!

📢 Breaking Changes

Configuration revamp

The bentoml config CLI command has been fully deprecated in this release

A new config system was introduced for configuring the BentoML API server, Yatai, tracing, and more (#1543, #1595, #1615, #1667)

Documentation: https://docs.bentoml.org/en/latest/guides/configuration.html

Add --do-not-track CLI option and environment variable (#1534)

Deprecated --enable-microbatch flag

Use the @api(batch=True|False) option to choose between microbatch enabled API vs. non-batch API

For an API that is defined in batch mode but must serve online traffic without batching behavior, use --mb-max-batch-size=1 instead
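
To illustrate what the batch size limit controls, here is a minimal sketch of micro-batching in plain Python. It is not BentoML's implementation: micro_batch, predict_batch, and max_batch_size are hypothetical names used only for illustration. Queued requests are grouped into batches of at most max_batch_size items and the batch function is called once per group, so a limit of 1 means every request is handled on its own, which is the effect the --mb-max-batch-size=1 setting is meant to achieve.

```python
from typing import Callable, List, Sequence


def micro_batch(
    requests: Sequence[float],
    predict_batch: Callable[[List[float]], List[float]],
    max_batch_size: int,
) -> List[float]:
    """Group queued requests into batches and call the model once per batch."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    results: List[float] = []
    for start in range(0, len(requests), max_batch_size):
        batch = list(requests[start:start + max_batch_size])
        results.extend(predict_batch(batch))
    return results


calls: List[int] = []  # records the size of every batch the "model" receives


def predict_batch(batch: List[float]) -> List[float]:
    # Hypothetical stand-in for a model that accepts a whole batch at once.
    calls.append(len(batch))
    return [x * 2 for x in batch]


queued = [1.0, 2.0, 3.0, 4.0, 5.0]

# With a limit of 4, the five requests are served by two model calls.
batched = micro_batch(queued, predict_batch, max_batch_size=4)
batch_sizes = list(calls)

# With a limit of 1, every request triggers its own model call.
calls.clear()
unbatched = micro_batch(queued, predict_batch, max_batch_size=1)
single_sizes = list(calls)
```

Both runs return the same predictions; only the number of model invocations differs, which is the trade-off between throughput and per-request behavior.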

🎉 New Features

GPU Support

GPU serving guide https://docs.bentoml.org/en/latest/guides/gpu_serving.html

Added docker base image optimized for GPU serving (#1653)

Add support for EvalML (#1603)

Add support for ONNX-MLIR model (#1545)

Add full CORS support for bento API server (#1576)

Monitoring with Prometheus Guide

https://docs.bentoml.org/en/latest/guides/monitoring.html

Optimize BentoML import delay (#1608)

Support upload/download for Yatai backed by local file system storage (#1586)

🐞 Bug Fixes and Other Changes

Add ensure_ascii option in JsonOutput (#1578, #1580)

Fix StringInput with batch=True API (#1581)

Fix docs.json link in API server UI (#1633)

Fix uploading to remote path (#1601)

Fix label missing after uploading Bento to remote Yatai (#1598)

Fixes /metrics endpoints with serve-gunicorn (#1666)

Upgrade conda to 4.9.2 in default docker base image (#1525)

Internal:

Add locking mechanism to yatai server (#1567)

refactor: YataiService Store Abstraction (#1541)

v0.12.1(Apr 15, 2021)
Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.12.0...v0.12.1

PaddlePaddle Support

We are thrilled to announce that BentoML now fully supports the PaddlePaddle framework from Baidu. Users can easily serve their own models created with Paddle via Paddle Inference and serve pre-trained models from PaddleHub, which contains over 300 production-grade pre-trained models.

Tutorial notebooks for using BentoML with PaddlePaddle:

Paddle Inference: https://github.com/bentoml/gallery/blob/master/paddlepaddle/LinearRegression/LinearRegression.ipynb

PaddleHub: https://github.com/bentoml/gallery/blob/master/paddlehub/image-segmentation/image-segmentation.ipynb

See the announcement and release note from PaddleHub: https://github.com/PaddlePaddle/PaddleHub/releases/tag/v2.1.0

Thank you @cqvu @deehrlic for contributing this feature in BentoML.

Bug fixes

#1532 Fix zipkin module not found exception

#1557 Fix aiohttp import issue on Windows

#1566 Fix bundle load in docker when using the requirements_txt_file @env parameter


v0.12.0(Mar 23, 2021)

Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.11.0...v0.12.0

New Features

Breaking Change: Default Model Worker count is set to one #1454
- Please use the --worker CLI argument for specifying a number of workers of your deployment
- For heavy production workloads, we recommend experimenting with different worker counts and benchmarking your BentoML API server on your target hardware to get a better understanding of the model server performance
Breaking Change: Micro-batching layer(Marshal Server) is now enabled by default #1498
- For Inference APIs defined with batch=True, this will enable micro-batching behavior when serving. Users can disable it with the --disable-microbatch flag
- For Inference APIs with batch=False, API requests are now being queued in Marshal and then forwarded to the model backend server
New: Use non-root user in BentoML's API server docker image
New: API/CLI for bulk delete of BentoML bundle in Yatai #1313
Easier dependency management for PyPI and conda
- Support all pip install options via a user-provided requirements.txt file
- Breaking Change: when requirements_txt_file option is in use, other pip package options will be ignored
- conda_override_channels option for using explicit conda channel for conda dependencies: https://docs.bentoml.org/en/latest/concepts.html#conda-packages

Better support for pip install options and remote python dependencies #1421

Let BentoML do it for you:

@bentoml.env(infer_pip_packages=True)

Use the existing pip_packages API to specify a list of dependencies:

@bentoml.env(
    pip_packages=[
      'scikit-learn',
      'pandas @https://github.com/pypa/pip/archive/1.3.1.zip',
    ]
)

Use a requirements.txt file to specify all dependencies:

@bentoml.env(requirements_txt_file='./requirements.txt')

In the ./requirements.txt file, all pip install options can be used:

#
# These requirements were autogenerated by pipenv
# To regenerate from the project's Pipfile, run:
#
#    pipenv lock --requirements
#

-i https://pypi.org/simple

scikit-learn==0.20.3
aws-sam-cli==0.33.1
psycopg2-binary
azure-cli
bentoml
pandas @https://github.com/pypa/pip/archive/1.3.1.zip

https://[username[:password]@]pypi.company.com/simple
https://user:he%2F%2Fo@pypi.company.com

git+https://myvcs.com/some_dependency@sometag#egg=SomeDependency

API/CLI for bulk delete #1313

CLI command for delete:

# Delete all saved Bento with specific name
bentoml delete --name IrisClassifier
bentoml delete --name IrisClassifier -y # do it without confirming with user
bentoml delete --name IrisClassifier --yatai-url=yatai.mycompany.com # delete in remote Yatai

# Delete all saved Bento with specific tag
bentoml delete --labels "env=dev"
bentoml delete --labels "env=dev, user=foobar"
bentoml delete --labels "key1=value1, key2!=value2, key3 In (value3, value3a), key4 DoesNotExist"

# Delete multiple saved Bento by their name:version tag
bentoml delete --tag "IrisClassifier:v1, MyService:v3, FooBar:20200103_Lkj81a"

# Delete all
bentoml delete --all
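
The label selector strings used with --labels above can be read as a comma-separated list of clauses. The following is a rough sketch of how such a selector might be evaluated against a bento's labels. It is not BentoML's parser: split_clauses, clause_matches, and selector_matches are hypothetical names, and the semantics (all clauses must match, and a missing key satisfies !=) are assumptions modeled on Kubernetes-style label selectors.

```python
import re
from typing import Dict, List


def split_clauses(selector: str) -> List[str]:
    """Split a selector on commas that are not inside parentheses."""
    clauses: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in selector:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            clauses.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    clauses.append("".join(current).strip())
    return [clause for clause in clauses if clause]


def clause_matches(clause: str, labels: Dict[str, str]) -> bool:
    """Evaluate a single clause against a dict of labels."""
    m = re.fullmatch(r"(\S+)\s+In\s+\((.*)\)", clause)
    if m:
        allowed = {value.strip() for value in m.group(2).split(",")}
        return labels.get(m.group(1)) in allowed
    m = re.fullmatch(r"(\S+)\s+DoesNotExist", clause)
    if m:
        return m.group(1) not in labels
    m = re.fullmatch(r"(\S+)\s+Exists", clause)
    if m:
        return m.group(1) in labels
    if "!=" in clause:
        key, value = (part.strip() for part in clause.split("!=", 1))
        return labels.get(key) != value
    if "=" in clause:
        key, value = (part.strip() for part in clause.split("=", 1))
        return labels.get(key) == value
    raise ValueError(f"Unsupported clause: {clause!r}")


def selector_matches(selector: str, labels: Dict[str, str]) -> bool:
    """Assumed semantics: a selector matches only if every clause matches."""
    return all(clause_matches(c, labels) for c in split_clauses(selector))


selector = "key1=value1, key2!=value2, key3 In (value3, value3a), key4 DoesNotExist"
matching = {"key1": "value1", "key2": "other", "key3": "value3a"}
not_matching = {"key1": "value1", "key3": "value3", "key4": "present"}

print(selector_matches(selector, matching))
print(selector_matches(selector, not_matching))
```

The first label set satisfies every clause, while the second fails because key4 is present.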

Yatai Client Python API:

yc = get_yatai_client() # local Yatai
yc = get_yatai_client('remote.yatai.com:50051') # remote Yatai

yc.repository.delete(prune, labels, bento_tag, bento_name, bento_version, require_confirm)

"""
Params:
prune: boolean, set to True to delete all bento services
bento_tag: Bento tag
labels: string, label selector to filter bento services to delete
bento_name: string
bento_version: string
require_confirm: boolean, require the user to confirm interactively in CLI
"""

#1334 Customize route of an API endpoint

@env(infer_pip_packages=True)
@artifacts([...])
class MyPredictionService(BentoService):

    @api(route="/my_url_route/foo/bar", batch=True, input=DataframeInput())
    def predict(self, df):
        # instead of "/predict", the URL for this API endpoint will be "/my_url_route/foo/bar"
        ...

#1416 Support custom authentication header in Yatai gRPC server
#1284 Add health check endpoint to Yatai web server
#1409 Fix Postgres disconnect issue with Yatai server


v0.11.0(Jan 14, 2021)
New Features

Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.10.1...v0.11.0

Interactively start and stop Model API Server during development

A new API was introduced in 0.11.0 for users to start and test an API server while developing their BentoService class:

service = MyPredictionService()
service.pack("model", model)

# Start an API model server in the background
service.start_dev_server(port=5000)

# Send test request to the server or open the URL in browser
requests.post(f'http://localhost:5000/predict', data=review, headers=headers)

# Stop the dev server
service.stop_dev_server()

# Modify code and repeat ♻️

Here's an example notebook showcasing this new feature.

More PyTorch eco-system Integrations

PyTorch JIT traced model support #1293

PyTorch Lightning support #1293

Detectron2 support #1272

Logging is fully customizable now!

Users can now use one single YAML file to customize the logging behavior in BentoML, including the prediction logs and feedback logs.

https://docs.bentoml.org/en/latest/guides/logging.html

Two new configs are also introduced for quickly turning on/off console logging and file logging:

https://github.com/bentoml/BentoML/blob/v0.11.0/bentoml/configuration/default_bentoml.cfg#L29

[logging]
console_logging_enabled = true
file_logging_enabled = true

If you are not sure how this config works, here's a new guide on how BentoML's configuration works: https://docs.bentoml.org/en/latest/guides/configuration.html
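
As a quick way to see how the two flags read, the sketch below parses a [logging] section with Python's standard configparser module. This assumes the config file is plain INI-style text; how BentoML itself loads and merges its configuration may differ, and CONFIG_TEXT is just an inline stand-in for a real config file.

```python
import configparser

# Inline stand-in for a config file with the two options from the [logging] section.
CONFIG_TEXT = """
[logging]
console_logging_enabled = true
file_logging_enabled = false
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# getboolean() accepts true/false, yes/no, on/off and 1/0 (case-insensitive).
console_enabled = parser.getboolean("logging", "console_logging_enabled")
file_enabled = parser.getboolean("logging", "file_logging_enabled")

print(f"console logging: {console_enabled}, file logging: {file_enabled}")
```

Here file logging is switched off while console logging stays on, which is a typical setup when running inside a container that already collects stdout.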

More model management APIs

All model management CLI commands and Yatai client Python APIs now support the yatai_url parameter, making it easy to interact with a remote YataiService and centrally manage all your BentoML-packaged ML models.

Support bundling zipimport modules #1261

Bundling zipmodules with BentoML is possible now with this newly added API:

@bentoml.env(zipimport_archives=['nested_zipmodule.zip'])
@bentoml.artifacts([SklearnModelArtifact('model')])
class IrisClassifier(bentoml.BentoService):
    ...

BentoML also manages the sys.path when loading a saved BentoService with zipimport archives, making sure the zip modules can be imported in user code.
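
The underlying Python mechanism can be shown without BentoML: once a zip archive is on sys.path, modules inside it can be imported like any other module. The sketch below builds a throwaway archive and imports from it. The archive name mirrors the example above, while zip_helper and greet are hypothetical names; whether BentoML manipulates sys.path in exactly this way is an assumption.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway zip archive containing a single module (hypothetical names).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "nested_zipmodule.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr(
        "zip_helper.py",
        "def greet(name):\n    return f'hello {name}'\n",
    )

# Putting the archive itself on sys.path lets Python's zipimport machinery
# resolve imports from inside it.
sys.path.insert(0, archive_path)
importlib.invalidate_caches()

zip_helper = importlib.import_module("zip_helper")
greeting = zip_helper.greet("bento")
print(greeting)
```

The imported module's __file__ points inside the archive, which is a simple way to confirm that it was loaded from the zip file rather than from disk.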

Announcements

Monthly Community Meeting

Thank you again to everyone who came to the first community meeting this week! If you have not been invited to the community meeting calendar yet, make sure to join it here: https://github.com/bentoml/BentoML/discussions/1396

Hiring

BentoML team is hiring multiple Software Engineer roles to help build the future of this open-source project and the business behind it - we are looking for someone with experience in one of the following areas: ML infrastructure, backend systems, data engineering, SRE, full-stack, and technical writing. Feel free to pass along the message to anyone you know who might be interested, we'd really appreciate that!
v0.10.1(Dec 10, 2020)

Bug Fix

This is a minor release containing one bug fix for issue #1318, where the docker build process for the BentoML API model server was broken due to an error in the init shell script. The issue has been fixed in #1319 and included in this new release.

Our integration tests did not catch this issue because, in the development and CI/test environments, the generated Dockerfile bundles the "dirty" local BentoML installation, whereas the production release uses the BentoML package installed from PyPI. The issue in #1318 was an edge case that is triggered only when using the released version of BentoML and the published docker image. We are investigating ways to run all our integration tests against a preview release before making a final release, as part of our QA process, which should help prevent this type of bug from getting into final releases in the future.
v0.10.0(Dec 7, 2020)
New Features & Improvements

Improved Model Management APIs #1126 #1241 by @yubozhao

Python APIs for model management:

from bentoml.yatai.client import get_yatai_client

bento_service.save()  # Save and register the bento service locally

# Push the saved bento service to a remote yatai service
yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
yc.repository.push(
    f'{bento_service.name}:{bento_service.version}',
)

# Pull bento service from remote yatai server and register locally
yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
yc.repository.pull(
    'bento_name:version',
)

# Delete in local yatai
yatai_client = get_yatai_client()
yatai_client.repository.delete('name:version')

# Delete in batch by labels
yatai_client = get_yatai_client()
yatai_client.prune(labels='cicd=failed, framework In (sklearn, xgboost)')

# Get bento service metadata
yatai_client.repository.get('bento_name:version', yatai_url='http://staging.yatai.mycompany.com:50050')

# List bento services by label
yatai_client.repository.list(labels='label_key In (value1, value2), label_key2 Exists', yatai_url='http://staging.yatai.mycompany.com:50050')

New CLI commands for model management:

Push local bento service to remote yatai service:

$ bentoml push bento_service_name:version --yatai-url http://staging.yatai.mycompany.com:50050

Added --yatai-url option for the following CLI commands to interact with remote yatai service directly:

bentoml get
bentoml list
bentoml delete
bentoml retrieve
bentoml run
bentoml serve
bentoml serve-gunicorn
bentoml info
bentoml containerize
bentoml open-api-spec

Model Metadata API #1179, shoutout to @jackyzha0 for designing and building this feature!

Ability to save additional metadata for any artifact type, e.g.:

model_metadata = {
    'k1': 'v1',
    'job_id': 'ABC',
    'score': 0.84,
    'datasets': ['A', 'B'],
}

svc.pack("model", test_model, metadata=model_metadata)
svc.save_to_dir(str(tmpdir))

loaded_service = bentoml.load(str(tmpdir))
print(loaded_service.artifacts.get('model').metadata)

Improved Tensorflow Support, by @bojiang

Make the packed model behave the same as after the model was saved and loaded again #1231

TfTensorOutput raise TypeError when micro-batch enabled #1251

Opt auto casting of TfSavedModelArtifact & clearer feedback

Improve KerasModelArtifact to work with tf2 #1295

Automated AWS EC2 deployment #1160 massive 3800+ line PR by @mayurnewase

Create auto-scaling endpoint on AWS EC2 with just one command, see documentation here https://docs.bentoml.org/en/latest/deployment/aws_ec2.html

Add MXNet Gluon support #1264 by @liusy182

Enable input & output data capture in Sagemaker deployment #1189 by @j-hartshorn

Faster docker image rebuild when only model artifacts are updated #1199

Support URL location prefix in yatai-service gRPC/Web server #1063 #1184

Support relative path for showing Swagger UI page in the model server #1207

Add onnxruntime gpu as supported backend #1213

Add option to disable swagger UI #1244 by @liusy182

Add label and artifact metadata display to yatai web ui #1249

Make bentoml module executable #1274

python -m bentoml <subcommand>

Allow setting micro batching parameters from CLI #1282 by @jsemric

bentoml serve-gunicorn --enable-microbatch --mb-max-latency 3333 --mb-max-batch-size 3333 IrisClassifier:20201202154246_C8DC0A

Bug fixes

Allow deleting bento that was previously deleted with the same name and version #1211

Construct docker API client from env #1233

Pin-down SqlAlchemy version #1238

Avoid potential TypeError in batching server #1252

Fix inference API docstring override by default #1302

Documentation

Add examples of queries with requests for adapters #1202

Update import paths to reflect fastai2->fastai rename #1227

Add model artifact metadata information to the core concept page #1259

Update adapters.rst to include new input adapters #1269

Update quickstart guide #1262

Docs for gluon support #1271

Fix CURL commands for posting files in input adapters doc string #1307

Internal, CI, and Tests

Fix installing bundled pip dependencies in Azure and Sagemaker deployments #1214 (affects bentoml developers only)

Add Integration test for Fasttext #1221

Add integration test for spaCy #1236

Add integration test for models using tf native API #1245

Add tests for run_api_server_docker_container microbatch #1247

Add integration test for LightGBM #1243

Update Yatai web ui node dependencies version #1256

Add integration test for bento management #1263

Add yatai server integration tests to Github CI #1265

Update e2e yatai service tests #1266

Include additional information for EC2 test #1270

Refactor CI for TensorFlow2 #1277

Make tensorflow integration tests run faster #1278

Fix overrided protobuf version in CI #1286

Add integration test for tf1 #1285

Refactor yatai service integration test #1290

Refactor Saved Bundle Loader #1291

Fix flaky yatai service integration tests #1298

Refine KerasModelArtifact & its integration test #1295

Improve API server integration tests #1299

Add integration tests for ragged_tensor #1303

Announcements

We have started using the GitHub Projects feature to track roadmap items for BentoML. You can find it here: https://github.com/bentoml/BentoML/projects/1

We are hiring senior engineers and a lead developer advocate to join our team, let us know if you or someone you know might be interested 👉 [email protected]

We apologize for the long wait between the 0.9 and 0.10 releases; we are getting back to our bi-weekly release schedule now! We need help with documenting new features, writing release notes, and QA-testing new releases before they go out. Let us know if you'd be interested in helping out!

Thank you everyone for contributing to this release! @j-hartshorn @withsmilo @yubozhao @bojiang @changhw01 @mayurnewase @telescopic @jackyzha0 @pncnmnp @kishore-ganesh @rhbian @liusy182 @awalvie @cathy-kim @jsemric 🎉🎉🎉
v0.9.2(Oct 17, 2020)
Bug fixes

Fixed retrieving BentoService from S3/MinIO based storage #1174 https://github.com/bentoml/BentoML/pull/1175

Fixed an issue when using inference API function optional parameter tasks / task #1171

v0.9.1(Oct 1, 2020)

A minor release with a bug fix

0.9.1 fixes an issue where the API server failed to start in a docker container when the requirements_txt_file parameter was used in the @env definition. See more details in #1153.