
Overview

Model Serving Made Easy

BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

  • Supports multiple ML frameworks, including TensorFlow, PyTorch, Keras, XGBoost and more
  • Cloud native deployment with Docker, Kubernetes, AWS, Azure and many more
  • High-Performance online API serving and offline batch serving
  • Web dashboards and APIs for model registry and deployment management

BentoML bridges the gap between Data Science and DevOps. By providing a standard interface for describing a prediction service, BentoML abstracts away how to run model inference efficiently and how model serving workloads can integrate with cloud infrastructures. See how it works!

Join our community on Slack 👈



Documentation

BentoML documentation: https://docs.bentoml.org/

Key Features

Production-ready online serving:

  • Support multiple ML frameworks including PyTorch, TensorFlow, Scikit-Learn, XGBoost, and many more
  • Containerized model server for production deployment with Docker, Kubernetes, OpenShift, AWS ECS, Azure, GCP GKE, etc.
  • Adaptive micro-batching for optimal online serving performance
  • Discover and package all dependencies automatically, including PyPI, conda packages and local Python modules
  • Serve compositions of multiple models
  • Serve multiple endpoints in one model server
  • Serve any Python code along with trained models
  • Automatically generate REST API spec in Swagger/OpenAPI format
  • Prediction logging and feedback logging endpoint
  • Health check endpoint and Prometheus /metrics endpoint for monitoring

Standardize model serving and deployment workflow for teams:

  • Central repository for managing all your team's prediction services via Web UI and API
  • Launch offline batch inference job from CLI or Python
  • One-click deployment to cloud platforms including AWS EC2, AWS Lambda, AWS SageMaker, and Azure Functions
  • Distributed batch or streaming serving with Apache Spark
  • Utilities that simplify CI/CD pipelines for ML
  • Automated offline batch inference job with Dask (roadmap)
  • Advanced model deployment for Kubernetes ecosystem (roadmap)
  • Integration with training and experimentation management products including MLflow, Kubeflow (roadmap)

ML Frameworks

Deployment Options

Be sure to check out the deployment overview doc to understand which deployment option is best suited for your use case.

Introduction

BentoML provides APIs for defining a prediction service: a servable model that bundles the trained ML model itself with its pre-processing and post-processing code, input/output specifications, and dependencies. Here's what a simple prediction service looks like in BentoML:

import pandas as pd

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput, JsonOutput
from bentoml.frameworks.sklearn import SklearnModelArtifact

# BentoML packages local python modules automatically for deployment
from my_ml_utils import my_encoder

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('my_model')])
class MyPredictionService(BentoService):
    """
    A simple prediction service exposing a Scikit-learn model
    """

    @api(input=DataframeInput(), output=JsonOutput(), batch=True)
    def predict(self, df: pd.DataFrame):
        """
        An inference API named `predict` that takes tabular data in pandas.DataFrame
        format as input, and returns a JSON-serializable value as output.

        A batch API is expected to receive a list of inference inputs and should
        return a list of prediction results.
        """
        model_input_df = my_encoder.fit_transform(df)
        predictions = self.artifacts.my_model.predict(model_input_df)

        return list(predictions)

This can be easily plugged into your model training process: import your BentoML prediction service class, pack it with your trained model, and call `save` at the end to persist the entire prediction service, which creates a BentoML bundle:

from my_prediction_service import MyPredictionService
svc = MyPredictionService()
svc.pack('my_model', my_sklearn_model)
svc.save()  # saves to $HOME/bentoml/repository/MyPredictionService/{version}/

The generated BentoML bundle is a file directory that contains all the code files, serialized models, and configs required to reproduce this prediction service for inference. BentoML automatically captures all the Python dependency information and keeps everything versioned and managed together in one place.

BentoML automatically generates a version ID for this bundle and keeps track of all bundles created under the $HOME/bentoml directory. With a BentoML bundle, users can start a local API server hosting it, either by its file path or by its name and version:

bentoml serve MyPredictionService:latest

# alternatively
bentoml serve $HOME/bentoml/repository/MyPredictionService/{version}/
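
The saved bundle can also be loaded back into Python; a minimal sketch using the `load` function that BentoML re-exports from bentoml.saved_bundle:

import os

import pandas as pd
from bentoml import load

# {version} is the generated version ID printed by svc.save()
bundle_path = os.path.expandvars("$HOME/bentoml/repository/MyPredictionService/{version}/")
svc = load(bundle_path)

# Inference APIs are callable directly on the loaded service
print(svc.predict(pd.DataFrame([[1, 2, 3]])))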

A Docker container image that's ready for production deployment can now be created with just one command:

bentoml containerize MyPredictionService:latest -t my_prediction_service:v3

docker run -p 5000:5000 my_prediction_service:v3 --workers 2

The container image produced will have all the required dependencies installed. Besides the model inference API, the containerized BentoML model server also comes with Prometheus metrics, health check endpoint, prediction logging, and tracing support out-of-the-box. This makes it super easy for your DevOps team to incorporate your models into production systems.
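
As a quick smoke test once the container is running, the built-in endpoints can be probed directly. A minimal sketch, assuming the default /healthz and /metrics routes and the port mapping shown above:

import requests

# Health check: returns 200 when the model server is up
print(requests.get("http://127.0.0.1:5000/healthz").status_code)

# Prometheus metrics in text exposition format
print(requests.get("http://127.0.0.1:5000/metrics").text[:300])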

BentoML's model management component is called Yatai, which means food cart in Japanese; you can think of it as where you'd store your bentos 🍱. Yatai provides a CLI, Web UI, and Python API for accessing the BentoML bundles you have created, and you can start a Yatai server for your team to manage all models on cloud storage (S3, GCS, MinIO, etc.) and build CI/CD workflows around it. Learn more about it here.
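
On the Python side, a hypothetical sketch based on the 0.13-era client API (get_yatai_client and the repository methods are assumptions; check the docs for your version):

from bentoml.yatai.client import get_yatai_client

# Connect to the local Yatai service, or pass a remote address,
# e.g. get_yatai_client("remote-yatai-host:50051")
yc = get_yatai_client()
for bento in yc.repository.list():  # assumed method; lists saved bundles
    print(bento.name, bento.version)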

Yatai UI

Read the Quickstart Guide to learn more about the basic functionalities of BentoML. You can also try it out here on Google Colab.

Why BentoML

Moving trained machine learning models into serving applications in production is hard. It is a sequential process across data science, engineering, and DevOps teams: after a model is trained by the data science team, they hand it over to the engineering team to refine and optimize the code and create an API, before DevOps can deploy.

And most importantly, data science teams want to continuously repeat this process, monitor the models deployed in production, and ship new models quickly. It often takes months for an engineering team to build a model serving & deployment solution that allows data science teams to ship new models in a repeatable and reliable way.

BentoML is a framework designed to solve this problem. It provides high-level APIs for data science teams to create prediction services, abstracting away DevOps' infrastructure needs and performance optimizations in the process. This allows DevOps teams to work side-by-side with data scientists, deploying and operating models packaged in the BentoML format in production.

Check out the Frequently Asked Questions page on how BentoML compares to TensorFlow Serving, Clipper, AWS SageMaker, MLflow, etc.

Contributing

Have questions or feedback? Post a new GitHub issue or discuss in our Slack channel: join BentoML Slack

Want to help build BentoML? Check out our contributing guide and the development guide.

Releases

BentoML is under active development and is evolving rapidly. It is currently a beta release; we may change APIs in future releases, and there are still major features being worked on.

Read more about the latest updates from the releases page.

Usage Tracking

BentoML by default collects anonymous usage data using Amplitude. It only collects the BentoML library's own actions and parameters; no user or model data is collected. Here is the code that does it.

This helps the BentoML team understand how the community is using this tool and what to build next. You can easily opt out of usage tracking by running BentoML commands with the --do-not-track option:

% bentoml [command] --do-not-track

or by setting the BENTOML_DO_NOT_TRACK environment variable to True.

% export BENTOML_DO_NOT_TRACK=True

License

Apache License 2.0


Comments
  • Failure of bento serve in production with AnyIO error


    Describe the bug

    The sklearn example available here https://docs.bentoml.org/en/latest/quickstart.html#installation fails at inference time with an AnyIO error. The bento service is deployed in production mode with bentoml serve iris_classifier:latest --production. When deployed in development mode, the inference works as expected.

    To Reproduce

    Steps to reproduce the issue:

    1. Install BentoML: pip install bentoml --pre
    2. Train and save bento sklearn model:
    import bentoml
    
    from sklearn import svm
    from sklearn import datasets
    
    # Load predefined training set to build an example model
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    
    # Model Training
    clf = svm.SVC(gamma='scale')
    clf.fit(X, y)
    
    # Call to bentoml.<FRAMEWORK>.save(<MODEL_NAME>, model)
    # In order to save to BentoML's standard format in a local model store
    bentoml.sklearn.save("iris_clf", clf)
    
    3. Create the BentoML service:
    # bento.py
    import bentoml
    import bentoml.sklearn
    import numpy as np
    
    from bentoml.io import NumpyNdarray
    
    # Load the runner for the latest ScikitLearn model we just saved
    iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")
    
    # Create the iris_classifier service with the ScikitLearn runner
    # Multiple runners may be specified if needed in the runners array
    # When packaged as a bento, the runners here will be included
    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])
    
    # Create an API function with pre- and post-processing logic with your new "svc" annotation
    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def predict(input_ndarray: np.ndarray) -> np.ndarray:
        # Define pre-processing logic
        result = iris_clf_runner.run(input_ndarray)
        # Define post-processing logic
        return result
    
    4. Create the BentoML configuration file:
    # bentofile.yaml
    service: "bento.py:svc"  # A convention for locating your service: <YOUR_SERVICE_PY>:<YOUR_SERVICE_ANNOTATION>
    include:
     - "*.py"  # A pattern for matching which files to include in the bento
    python:
      packages:
       - scikit-learn  # Additional libraries to be included in the bento
    
    5. Build the BentoML service: bentoml build
    6. Run the bento in production: bentoml serve iris_classifier:latest --production
    7. Send an inference request:
    import requests
    response = requests.post(
        "http://127.0.0.1:5000/predict",
        headers={"content-type": "application/json"},
        data="[5,4,3,2]").text
    print(response)
    

    Expected behavior

    The response should be the classification result, namely 1.

    Screenshots/Logs

    The error generated by the server is the following:

    Exception on /predict [POST]
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/anyio/from_thread.py", line 31, in run
        asynclib = threadlocals.current_async_module
    AttributeError: '_thread._local' object has no attribute 'current_async_module'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/dist-packages/bentoml/_internal/server/service_app.py", line 356, in api_func
        output: t.Any = await run_in_threadpool(api.func, input_data)
      File "/usr/local/lib/python3.8/dist-packages/starlette/concurrency.py", line 40, in run_in_threadpool
        return await loop.run_in_executor(None, func, *args)
      File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/root/bentoml/bentos/iris_classifier/fgzarmenwoh6jsyx/src/bento.py", line 20, in predict
        result = iris_clf_runner.run(input_ndarray)
      File "/usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/runner.py", line 141, in run
        return self._impl.run(*args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/bentoml/_internal/runner/remote.py", line 111, in run
        return anyio.from_thread.run(self.async_run, *args, **kwargs)
      File "/usr/local/lib/python3.8/dist-packages/anyio/from_thread.py", line 33, in run
        raise RuntimeError('This function can only be run from an AnyIO worker thread')
    RuntimeError: This function can only be run from an AnyIO worker thread
    

    Environment:

    • OS: Ubuntu 20.04
    • Python Version : 3.8
    • BentoML Version: 1.0.0.a3
    • AnyIO version: 3.5.0
    • installed the requirements here https://github.com/bentoml/BentoML/blob/main/requirements/tests-requirements.txt

    Additional context

    bug 
    opened by andreea-anghel 29
  • [tests] Move yatai e2e tests to Github CI


    Description

    Motivation and Context

    How Has This Been Tested?

    Types of changes

    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [ ] New feature and improvements (non-breaking change which adds/improves functionality)
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [ ] Code Refactoring (internal change which is not user facing)
    • [ ] Documentation
    • [ ] Test, CI, or build

    Component(s) if applicable

    • [ ] BentoService (service definition, dependency management, API input/output adapters)
    • [ ] Model Artifact (model serialization, multi-framework support)
    • [ ] Model Server (micro-batching, dockerisation, logging, OpenAPI, instruments)
    • [ ] YataiService gRPC server (model registry, cloud deployment automation)
    • [ ] YataiService web server (nodejs HTTP server and web UI)
    • [ ] Internal (BentoML's own configuration, logging, utility, exception handling)
    • [ ] BentoML CLI

    Checklist:

    • [ ] My code follows the bentoml code style, both ./dev/format.sh and ./dev/lint.sh script have passed (instructions).
    • [ ] My change reduces project test coverage and requires unit tests to be added
    • [ ] I have added unit tests covering my code change
    • [ ] My change requires a change to the documentation
    • [ ] I have updated the documentation accordingly
    opened by yubozhao 26
  • Issue with conda dependencies with custom channels


    Describe the bug

    This is related to some issues I've mentioned where you're building in an environment with only access to specific repositories.

    In short, the Dockerfile will call a script that does conda env update -n base -f ./environment.yml. The problem is that this will also use the global conda settings. Unlike conda install, there is no --override-channels argument.

    So the build will break in such an environment.

    To Reproduce

    This is quite difficult to reproduce. Other than building in such an environment, you could create a service with a specific conda URL as well as conda_overwrite_channels=True, which will create an environment.yml with only that URL. If you add -v to the conda env update, you should see it installing from other URLs.

    Expected behavior

    It should be the case that only the conda channels specified are used.

    Potential Fix

    One fix is to add the line conda config --system --remove channels defaults in the bentoml-init.sh file, just before the conda env update line. This will remove the default URLs so that the only channels used are those specified in the environment.yml file.

    Since conda is not used after this step it should not cause other issues. There might be other ways - I've looked for a way to do this via the conda env update command with no luck.

    A safer approach might be to do this only if conda_overwrite_channels=True, but this would involve modifying the bentoml-init.sh file when building, which currently isn't done.

    opened by gregd33 24
  • Small breaking change to onnx-mlir PyExecution session requires small tweak to open source code


    Describe the bug A change to the onnx-mlir PyExecutionSession requires BentoML's onnx-mlir PyExecutionSession to be updated with a fix, as run_main_graph no longer needs to be specified in the PyExecutionSession invocation (see the attached screenshot).
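
    An illustrative before/after of the invocation change being described (the import path and file name are assumptions, not the exact BentoML patch):

    from PyRuntime import PyExecutionSession  # assumed onnx-mlir Python runtime import

    # Before: the entry point had to be named explicitly
    # session = PyExecutionSession("./model.so", "run_main_graph")

    # After: run_main_graph no longer needs to be specified
    session = PyExecutionSession("./model.so")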

    To Reproduce Any workflow using an old copy of the compiled models will need to refresh their compiled model and repack their serving environment. Any incompatible changes needed beyond that should be discussed with me immediately.

    Expected behavior The model should run as expected and give the expected output.

    Screenshots/Logs Sufficient documentation has been provided above, and I or @andrewsi-z will provide the fix for this since he originally authored this code.

    Additional context A fix has been discussed with @andrewsi-z and should be straightforward in implementation. I'll work with the team moving forward to implement the PR.

    bug 
    opened by messerb5467 22
  • [Feature] Easy Ec2 deployment


    Description

    One click deployment for AWS ec2 with autoscaling group.

    Motivation and Context

    For easy deployment without load balancing

    How Has This Been Tested?

    Types of changes

    • [ ] Breaking change (fix or feature that would cause existing functionality to change)
    • [x] New feature and improvements (non-breaking change which adds/improves functionality)
    • [ ] Bug fix (non-breaking change which fixes an issue)
    • [x] Code Refactoring (internal change which is not user facing)
    • [ ] Documentation
    • [x] Test, CI, or build

    Component(s) if applicable

    • [x] BentoService (service definition, dependency management, API input/output adapters)
    • [ ] Model Artifact (model serialization, multi-framework support)
    • [x] Model Server (micro-batching, dockerisation, logging, OpenAPI, instruments)
    • [x] YataiService gRPC server (model registry, cloud deployment automation)
    • [ ] YataiService web server (nodejs HTTP server and web UI)
    • [ ] Internal (BentoML's own configuration, logging, utility, exception handling)
    • [x] BentoML CLI

    Checklist:

    • [x] My code follows the bentoml code style, both ./dev/format.sh and ./dev/lint.sh script have passed (instructions).
    • [ ] My change reduces project test coverage and requires unit tests to be added
    • [x] I have added unit tests covering my code change
    • [x] My change requires a change to the documentation
    • [x] I have updated the documentation accordingly
    opened by mayurnewase 21
  • Docker container stopped working with: ModuleNotFoundError: No module named 'ruamel'


    Describe the bug Without any change to my code, new Docker containers aren't working anymore. When I try to run it, I get:

    Traceback (most recent call last):
      File "/opt/conda/bin/bentoml", line 5, in <module>
        from bentoml.cli import cli
      File "/opt/conda/lib/python3.6/site-packages/bentoml/__init__.py", line 27, in <module>
        from bentoml.saved_bundle import load, save_to_dir
      File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/__init__.py", line 15, in <module>
        from bentoml.saved_bundle.bundler import save_to_dir
      File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/bundler.py", line 31, in <module>
        from bentoml.utils.usage_stats import track_save
      File "/opt/conda/lib/python3.6/site-packages/bentoml/utils/usage_stats.py", line 22, in <module>
        from ruamel.yaml import YAML
    ModuleNotFoundError: No module named 'ruamel'
    

    To Reproduce Steps to reproduce the behavior:

    1. Create a BentoService with:
    @env(conda_channels=["conda-forge"], conda_dependencies=["libpq=12.3"],
         pip_dependencies=["mxnet==1.4.1", "gluonts==0.5", "numpy==1.16", "pandas==1.0.5", "holidays==0.9.12",
                           "python-dateutil==2.8", "convertdate==2.2", "pydantic==1.6", "luigi==2.8", "sqlalchemy==1.3",
                           "psycopg2==2.8"])
    
    2. Pack the model and build the Docker container
    3. When the container is being built, we can see the following package being installed while updating the conda environment: ruamel_yaml-0.15.87
    4. When trying to run the container, ModuleNotFoundError: No module named 'ruamel' appears. If you open bash into the container, you can see that ruamel is installed, but can't be imported.

    Expected behavior It was expected to run successfully.

    Screenshots/Logs

    Resulting environment.yml:

    name: bentoml-DemandForecaster
    channels:
    - defaults
    - conda-forge
    dependencies:
    - python=3.6.11
    - pip
    - libpq=12.3
    

    Log of installed packages through conda:

    Downloading and Extracting Packages
    idna-2.10            | 50 KB     | ########## | 100% 
    python-3.6.11        | 34.1 MB   | ########## | 100% 
    yaml-0.2.5           | 75 KB     | ########## | 100% 
    cryptography-2.9.2   | 556 KB    | ########## | 100% 
    pysocks-1.7.1        | 30 KB     | ########## | 100% 
    krb5-1.17.1          | 1.3 MB    | ########## | 100% 
    pyopenssl-19.1.0     | 48 KB     | ########## | 100% 
    pycparser-2.20       | 94 KB     | ########## | 100% 
    cffi-1.14.0          | 223 KB    | ########## | 100% 
    sqlite-3.32.3        | 1.4 MB    | ########## | 100% 
    tk-8.6.10            | 3.0 MB    | ########## | 100% 
    conda-4.8.3          | 2.8 MB    | ########## | 100% 
    wheel-0.34.2         | 51 KB     | ########## | 100% 
    pycosat-0.6.3        | 82 KB     | ########## | 100% 
    pip-20.2.1           | 1.8 MB    | ########## | 100% 
    openssl-1.1.1g       | 2.5 MB    | ########## | 100% 
    xz-5.2.5             | 341 KB    | ########## | 100% 
    tqdm-4.42.1          | 56 KB     | ########## | 100% 
    python_abi-3.6       | 4 KB      | ########## | 100% 
    urllib3-1.25.9       | 103 KB    | ########## | 100% 
    ruamel_yaml-0.15.87  | 270 KB    | ########## | 100% 
    conda-package-handli | 797 KB    | ########## | 100% 
    libpq-12.3           | 2.6 MB    | ########## | 100% 
    certifi-2020.6.20    | 156 KB    | ########## | 100% 
    requests-2.24.0      | 56 KB     | ########## | 100% 
    setuptools-49.2.1    | 736 KB    | ########## | 100% 
    ca-certificates-2020 | 125 KB    | ########## | 100% 
    readline-8.0         | 356 KB    | ########## | 100% 
    chardet-3.0.4        | 180 KB    | ########## | 100% 
    brotlipy-0.7.0       | 323 KB    | ########## | 100% 
    six-1.15.0           | 13 KB     | ########## | 100% 
    Preparing transaction: ...working... done
    Verifying transaction: ...working... done
    Executing transaction: ...working... done
    

    Error when trying to import bentoml:

    Traceback (most recent call last):
      File "/opt/conda/bin/bentoml", line 5, in <module>
        from bentoml.cli import cli
      File "/opt/conda/lib/python3.6/site-packages/bentoml/__init__.py", line 27, in <module>
        from bentoml.saved_bundle import load, save_to_dir
      File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/__init__.py", line 15, in <module>
        from bentoml.saved_bundle.bundler import save_to_dir
      File "/opt/conda/lib/python3.6/site-packages/bentoml/saved_bundle/bundler.py", line 31, in <module>
        from bentoml.utils.usage_stats import track_save
      File "/opt/conda/lib/python3.6/site-packages/bentoml/utils/usage_stats.py", line 22, in <module>
        from ruamel.yaml import YAML
    ModuleNotFoundError: No module named 'ruamel'
    

    Environment:

    • OS: Linux Manjaro
    • Python/BentoML Version: Python 3.6.11, BentoML 0.8.5

    Additional context As a workaround, I added ruamel.yaml=0.16 in conda_dependencies.
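
    A sketch of that workaround applied to the @env declaration from the repro above (other arguments unchanged):

    @env(conda_channels=["conda-forge"],
         conda_dependencies=["libpq=12.3", "ruamel.yaml=0.16"],  # pin ruamel.yaml explicitly
         pip_dependencies=["mxnet==1.4.1", "gluonts==0.5", "numpy==1.16", "pandas==1.0.5", "holidays==0.9.12",
                           "python-dateutil==2.8", "convertdate==2.2", "pydantic==1.6", "luigi==2.8", "sqlalchemy==1.3",
                           "psycopg2==2.8"])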

    bug 
    opened by fernandocamargoai 21
  • got internal server error when trying to invoke sagemaker endpoint


    Describe the bug I am following the tutorial described in https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html. I got the expected results until I reached the command that causes the error (see the attached screenshot).

    Instead of the output with the red characters, I got this message (see the "error on terminal" screenshot).

    I did as the message said and checked the associated CloudWatch log; these are the error messages (see the "cloudwatch error logs" screenshot).

    To Reproduce Steps to reproduce the behavior:

    1. Follow the tutorial described in https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html (the only difference is the region; I set it to ap-southeast-1)

    Expected behavior the output should be like the first screenshot above.

    Environment:

    • OS: [Linux Mint 19.3 Tricia Cinnamon]
    • Python/BentoML Version [e.g. Python 3.7.4, BentoML 0.7.8]

    Additional context What I tried to solve this error: the first message, about unpickling an SVM created with scikit-learn 0.21.3 with scikit-learn 0.23.1 possibly causing errors, made me upgrade my local scikit-learn package from 0.21.3 to 0.23.1, but it did nothing. I assume that the scikit-learn package referred to here is the scikit-learn inside the Docker image for AWS SageMaker?

    bug 
    opened by palver7 18
  • Error in Serverless deployment with AWS Lambda


    Describe the bug While running: !bentoml deploy ./model --platform aws-lambda --region us-west-2

    this error appears: [2019-08-27 17:25:40,854] INFO - Using user AWS region: us-west-2 [2019-08-27 17:25:40,855] INFO - Using AWS stage: dev Encounter error when deploying to aws-lambda Error: 'Service Information' is not in list

    To Reproduce sudo npm install [email protected] --global (tried [email protected]; it didn't work for the example below either.)

    1. Go to BentoML/examples/deploy-with-serverless/deploy-with-serverless.ipynb
    2. Run all the codes until !bentoml deploy ./model --platform aws-lambda --region us-west-2
    3. See error

    Expected behavior Successful deployment

    Environment:

    • MacOS 10.13.6
    • serverless 1.49.0
    • Python 3.7.3
    • BentoML 0.3.4
    • ipython 7.6.1
    bug 
    opened by ji-clara 18
  • Allow user to customize readiness probe


    Is your feature request related to a problem? Please describe. Currently, the readiness probe does not actually check anything; it just returns 200 OK once the app has started. However, if the developer accidentally introduces a bug into the bento/service file when modifying it, the deployment would be marked as ready when it should not be. This lets a bugged deployment replace a previously working one, resulting in downstream failures.

    Describe the solution you'd like Allow the user to customize the readiness probe / readyz behaviour with a custom function, for instance to call the model with a known valid input, and assert that the model returns a valid output, before marking the deployment as ready. This would also allow developers to assert that connections to external resources such as a feature store are working correctly, before marking a deployment as ready to accept connections.

    Describe alternatives you've considered The developer can create a new route that is not called /readyz (since it is reserved) to perform these checks and then modify the Kubernetes deployment to use this route as the readiness probe. Not sure whether this is compatible with Yatai since I do not use it.
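
    A minimal sketch of that alternative, assuming the BentoML 1.0 service API (the route name, known-good input, and the route= keyword are illustrative assumptions):

    import numpy as np
    import bentoml
    from bentoml.io import JSON

    iris_clf_runner = bentoml.sklearn.load_runner("iris_clf:latest")
    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

    # A custom check route (deliberately not /readyz, which is reserved);
    # point the Kubernetes readinessProbe httpGet path here instead.
    @svc.api(input=JSON(), output=JSON(), route="/model_ready")
    def model_ready(_: dict) -> dict:
        # Call the model with a known valid input and check the output
        result = iris_clf_runner.run(np.array([5.0, 4.0, 3.0, 2.0]))
        return {"ready": result is not None}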

    Additional context None

    feature 
    opened by jiewpeng 17
  • Service with sklearn model fails on my EKS cluster


    I have created a simple service:

    model_runner = bentoml.sklearn.load_runner("mymodel:latest")
    svc = bentoml.Service("myservice", runners=[model_runner])
    
    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def classify(input_series: np.ndarray) -> np.ndarray:
        return model_runner.run(input_series)
    

    When I run it on my laptop (MacBook Pro M1), using

    bentoml serve ./service.py:svc --reload
    

    everything works fine when I invoke the generated classify API.

    Now when I push this service to my Yatai server as a bento and deploy it to my K8s cluster (EKS), I get the following error when I invoke the API:

    [screenshot of the error]

    Looking at the code, the problem lies in https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/frameworks/sklearn.py#L163. In my case, _num_threads returns 0. Digging a bit further, resource_quota.cpu is computed here: https://github.com/bentoml/BentoML/blob/119b103e2417291b18127d64d38f092893c8de4f/bentoml/_internal/runner/utils.py#L208. Here are the values I get on the pod running the API:

    | source | value |
    | --- | --- |
    | file /sys/fs/cgroup/cpu/cpu.cfs_quota_us | -1 |
    | file /sys/fs/cgroup/cpu/cpu.cfs_period_us | 100000 |
    | file /sys/fs/cgroup/cpu/cpu.shares | 2 |
    | call to os.cpu_count() | 2 |

    Given those values, query_cgroup_cpu_count() will return 0.001953125, which once rounded will end up as 0, meaning n_jobs will always be 0. So the call will always fail on my pods.
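
    A small sketch reproducing that arithmetic (illustrative, not the exact BentoML implementation):

    # Values observed on the pod (from the table above)
    cfs_quota_us = -1        # no CPU quota set
    cfs_period_us = 100000
    cpu_shares = 2

    # With no quota, the fallback weighs cpu.shares against the conventional 1024
    cpu_count = cpu_shares / 1024 if cfs_quota_us < 0 else cfs_quota_us / cfs_period_us
    print(cpu_count)         # 0.001953125
    print(round(cpu_count))  # 0 -> n_jobs ends up as 0 and every call fails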

    opened by amelki 17
  • containerize with conda fails, missing install.sh


    Describe the bug

    bentoml containerize with conda options in bentofile.yaml fails with chmod: cannot access '/home/bentoml/bento/env/python/install.sh': No such file or directory.

    bentofile.yaml with:

    python:
      packages:
      - scikit-learn
    

    works without issues.

    To Reproduce

    sample_bentofile.yaml:

    service: "sample_service:svc"
    include:
        - "*.py"  
    conda:
        dependencies:
        - python=3.8.13
        - pip    
        pip:
        - scikit-learn
    

    sample_service.py:

    import numpy as np
    import bentoml
    from bentoml.io import NumpyNdarray
    
    iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
    
    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])
    
    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def classify(input_series: np.ndarray) -> np.ndarray:
        result = iris_clf_runner.predict.run(input_series)
        return result
    
    1. bentoml build -f sample_bentofile.yaml
    2. bentoml containerize iris_classifier:latest --debug --no-cache
    3. fails with:
    Building BentoML service "iris_classifier:d4lyn4xzho2n2atv" from build context "/home/ubuntu/experiments/mmdetection/bentoml"
    Packing model "iris_clf:7o4gi2hvzshtkatv"
    
    โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ•—โ–‘โ–‘โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–ˆโ–ˆโ–ˆโ•—โ–‘โ–‘โ–‘โ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•—โ–‘โ–‘โ–‘โ–‘โ–‘
    โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ•โ•โ•โ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–ˆโ–ˆโ•‘โ•šโ•โ•โ–ˆโ–ˆโ•”โ•โ•โ•โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–‘โ–‘
    โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•ฆโ•โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–‘โ–‘โ–ˆโ–ˆโ•”โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ–ˆโ–ˆโ–ˆโ–ˆโ•”โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–‘โ–‘
    โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ•โ–‘โ–‘โ–ˆโ–ˆโ•‘โ•šโ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ•šโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–‘โ–‘
    โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•ฆโ•โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘โ–‘โ•šโ–ˆโ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ–ˆโ–ˆโ•‘โ–‘โ–‘โ–‘โ•šโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘โ–‘โ•šโ•โ•โ–‘โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—
    โ•šโ•โ•โ•โ•โ•โ•โ–‘โ•šโ•โ•โ•โ•โ•โ•โ•โ•šโ•โ•โ–‘โ–‘โ•šโ•โ•โ•โ–‘โ–‘โ–‘โ•šโ•โ•โ–‘โ–‘โ–‘โ–‘โ•šโ•โ•โ•โ•โ•โ–‘โ•šโ•โ•โ–‘โ–‘โ–‘โ–‘โ–‘โ•šโ•โ•โ•šโ•โ•โ•โ•โ•โ•โ•
    
    Successfully built Bento(tag="iris_classifier:d4lyn4xzho2n2atv")
    ubuntu@ip-172-31-12-44 ~/e/m/bentoml (master)> bentoml containerize iris_classifier:latest --debug --no-cache                                                       (bentotest) 
    Building docker image for Bento(tag="iris_classifier:d4lyn4xzho2n2atv")...
    [+] Building 14.0s (17/22)                                                                                                                                                      
     => [internal] load build definition from Dockerfile                                                                                                                       0.0s
     => => transferring dockerfile: 2.87kB                                                                                                                                     0.0s
     => [internal] load .dockerignore                                                                                                                                          0.0s
     => => transferring context: 2B                                                                                                                                            0.0s
     => resolve image config for docker.io/docker/dockerfile:1.4-labs                                                                                                          0.1s
     => CACHED docker-image://docker.io/docker/dockerfile:1.4-labs@sha256:b50ad4af81d1c76ab7c0e1ffc216909e7adc23e99910243e1c88331c2a8ef52d                                     0.0s
     => [internal] load build definition from Dockerfile                                                                                                                       0.0s
     => [internal] load .dockerignore                                                                                                                                          0.0s
     => [internal] load metadata for docker.io/continuumio/miniconda3:latest                                                                                                  13.6s
     => CACHED [internal] settings cache mount permissions                                                                                                                     0.0s
     => CACHED [cached 1/1] FROM docker.io/continuumio/miniconda3:latest@sha256:977263e8d1e476972fddab1c75fe050dd3cd17626390e874448bd92721fd659b                               0.0s
     => [internal] load build context                                                                                                                                          0.0s
     => => transferring context: 33.62kB                                                                                                                                       0.0s
     => [stage-1  2/13] RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache                 0.3s
     => [stage-1  3/13] RUN --mount=type=cache,from=cached,sharing=shared,target=/var/cache/apt     --mount=type=cache,from=cached,sharing=shared,target=/var/lib/apt  apt-g  11.3s
     => [stage-1  4/13] RUN rm -rf /var/lib/{apt,cache,log}                                                                                                                    0.4s
     => [stage-1  5/13] RUN groupadd -g 1034 -o bentoml && useradd -m -u 1034 -g 1034 -o -r bentoml                                                                            0.4s 
     => [stage-1  6/13] RUN mkdir /home/bentoml/bento && chown bentoml:bentoml /home/bentoml/bento -R                                                                          0.5s 
     => [stage-1  7/13] WORKDIR /home/bentoml/bento                                                                                                                            0.0s 
     => [stage-1  8/13] COPY --chown=bentoml:bentoml . ./                                                                                                                      0.0s 
     => ERROR [stage-1  9/13] RUN --mount=type=cache,mode=0777,target=/root/.cache/pip     chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento  0.4s 
    ------
     > [stage-1  9/13] RUN --mount=type=cache,mode=0777,target=/root/.cache/pip     chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento/env/python/install.sh:
    #0 0.325 chmod: cannot access '/home/bentoml/bento/env/python/install.sh': No such file or directory
    ------
    error: failed to solve: executor failed running [/bin/sh -c chmod +x /home/bentoml/bento/env/python/install.sh &&     bash /home/bentoml/bento/env/python/install.sh]: exit code: 1
    Failed building docker image: Command '['docker', 'buildx', 'build', '--progress', 'auto', '--tag', 'iris_classifier:d4lyn4xzho2n2atv', '--file', 'env/docker/Dockerfile', '--load', '--no-cache', '.']' returned non-zero exit status 1.
    

    Environment:

    • Ubuntu 18.04.6 LTS
    • Python 3.8.13
    • bentoml, version 1.0.0rc2.post24+g25b6e63
    bug 
    opened by smidm 16
  • bug: RuntimeError: Found no NVIDIA driver on your system.


    Describe the bug

    I'm not sure if this is actually a bug or an error from my side, so please excuse the latter.

    I am able to successfully build a bento that uses the gpu with no problems. However, containerizing it leads to the following error (it does not find the Nvidia GPU drivers):

    RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

    Did I forget to specify more information to use the NVIDIA drivers, maybe in the Dockerfile.template? Note that it runs in a local miniconda environment. Could this be the issue?

    Here is the bentofile.yaml :

    service: "service:svc"  # Same as the argument passed to `bentoml serve`
    #include:
    #- "*.py"  # A pattern for matching which files to include in the bento
    exclude:
    - "examples/"
    - "*.png"
    - "*.gif"
    - "venv/"
    - "venv"
    docker:
      distro: debian
      dockerfile_template: ./Dockerfile.template
      python_version: "3.10.8"
      cuda_version: "11.6.2"
    python:
       packages:  # Additional pip packages required by the service
       - filelock
       - Pillow
       - torch
       - fire
       - humanize
       - requests
       - tqdm
       - matplotlib
       - scikit-image
       - scipy
       - numpy
    

    This is the Dockerfile.template

    {% extends bento_base_template %}
    {% block SETUP_BENTO_COMPONENTS %}
    {{ super() }}
    RUN echo "We are running this during bentoml containerize!"
    RUN apt-get update && \
        apt-get upgrade -y && \
        apt-get install -y git
    RUN pip install git+https://github.com/openai/CLIP.git
    RUN echo "CLIP installed!"
    {% endblock %} 
    

    To reproduce

    No response

    Expected behavior

    No response

    Environment

    Environment variable

    BENTOML_DEBUG=''
    BENTOML_QUIET=''
    BENTOML_BUNDLE_LOCAL_BUILD=''
    BENTOML_DO_NOT_TRACK=''
    BENTOML_CONFIG=''
    BENTOML_CONFIG_OPTIONS=''
    BENTOML_PORT=''
    BENTOML_HOST=''
    BENTOML_API_WORKERS=''
    

    System information

    bentoml: 1.0.12
    python: 3.10.8
    platform: Linux-5.15.85-1-MANJARO-x86_64-with-glibc2.36
    uid_gid: 1000:1000
    conda: 22.9.0
    in_conda_env: True

    conda_packages
    name: pointe
    channels:
      - defaults
    dependencies:
      - _libgcc_mutex=0.1=main
      - _openmp_mutex=5.1=1_gnu
      - bzip2=1.0.8=h7b6447c_0
      - ca-certificates=2022.10.11=h06a4308_0
      - certifi=2022.12.7=py310h06a4308_0
      - ld_impl_linux-64=2.38=h1181459_1
      - libffi=3.4.2=h6a678d5_6
      - libgcc-ng=11.2.0=h1234567_1
      - libgomp=11.2.0=h1234567_1
      - libstdcxx-ng=11.2.0=h1234567_1
      - libuuid=1.41.5=h5eee18b_0
      - ncurses=6.3=h5eee18b_3
      - openssl=1.1.1s=h7f8727e_0
      - pip=22.3.1=py310h06a4308_0
      - python=3.10.8=h7a1cb2a_1
      - readline=8.2=h5eee18b_0
      - setuptools=65.5.0=py310h06a4308_0
      - sqlite=3.40.0=h5082296_0
      - tk=8.6.12=h1ccaba5_0
      - tzdata=2022g=h04d1e81_0
      - wheel=0.37.1=pyhd3eb1b0_0
      - xz=5.2.8=h5eee18b_0
      - zlib=1.2.13=h5eee18b_0
      - pip:
        - aiohttp==3.8.3
        - aiosignal==1.3.1
        - anyio==3.6.2
        - appdirs==1.4.4
        - asgiref==3.6.0
        - async-timeout==4.0.2
        - attrs==22.2.0
        - backoff==2.2.1
        - bentoml==1.0.12
        - build==0.9.0
        - cattrs==22.2.0
        - charset-normalizer==2.1.1
        - circus==0.18.0
        - click==8.1.3
        - click-option-group==0.5.5
        - clip==1.0
        - cloudpickle==2.2.0
        - commonmark==0.9.1
        - contextlib2==21.6.0
        - contourpy==1.0.6
        - cycler==0.11.0
        - deepmerge==1.1.0
        - deprecated==1.2.13
        - exceptiongroup==1.1.0
        - filelock==3.9.0
        - fire==0.5.0
        - fonttools==4.38.0
        - frozenlist==1.3.3
        - fs==2.4.16
        - ftfy==6.1.1
        - googleapis-common-protos==1.57.0
        - h11==0.14.0
        - humanize==4.4.0
        - idna==3.4
        - imageio==2.23.0
        - jinja2==3.1.2
        - kiwisolver==1.4.4
        - markupsafe==2.1.1
        - matplotlib==3.6.2
        - multidict==6.0.4
        - networkx==2.8.8
        - numpy==1.24.1
        - nvidia-cublas-cu11==11.10.3.66
        - nvidia-cuda-nvrtc-cu11==11.7.99
        - nvidia-cuda-runtime-cu11==11.7.99
        - nvidia-cudnn-cu11==8.5.0.96
        - opentelemetry-api==1.14.0
        - opentelemetry-exporter-otlp-proto-http==1.14.0
        - opentelemetry-instrumentation==0.35b0
        - opentelemetry-instrumentation-aiohttp-client==0.35b0
        - opentelemetry-instrumentation-asgi==0.35b0
        - opentelemetry-proto==1.14.0
        - opentelemetry-sdk==1.14.0
        - opentelemetry-semantic-conventions==0.35b0
        - opentelemetry-util-http==0.35b0
        - packaging==21.3
        - pathspec==0.10.3
        - pep517==0.13.0
        - pillow==9.4.0
        - pip-requirements-parser==32.0.1
        - pip-tools==6.12.1
        - prometheus-client==0.15.0
        - protobuf==3.20.3
        - psutil==5.9.4
        - pygments==2.14.0
        - pynvml==11.4.1
        - pyparsing==3.0.9
        - python-dateutil==2.8.2
        - python-json-logger==2.0.4
        - python-multipart==0.0.5
        - pywavelets==1.4.1
        - pyyaml==6.0
        - pyzmq==24.0.1
        - regex==2022.10.31
        - requests==2.28.1
        - rich==13.0.0
        - schema==0.7.5
        - scikit-image==0.19.3
        - scipy==1.9.3
        - simple-di==0.1.5
        - six==1.16.0
        - sniffio==1.3.0
        - starlette==0.23.1
        - termcolor==2.1.1
        - tifffile==2022.10.10
        - tomli==2.0.1
        - torch==1.13.1
        - torchvision==0.14.1
        - tornado==6.2
        - tqdm==4.64.1
        - typing-extensions==4.4.0
        - urllib3==1.26.13
        - uvicorn==0.20.0
        - watchfiles==0.18.1
        - wcwidth==0.2.5
        - wrapt==1.14.1
        - yarl==1.8.2
    prefix: /home/be/miniconda3/envs/pointe
    
    pip_packages
    aiohttp==3.8.3
    aiosignal==1.3.1
    anyio==3.6.2
    appdirs==1.4.4
    asgiref==3.6.0
    async-timeout==4.0.2
    attrs==22.2.0
    backoff==2.2.1
    bentoml==1.0.12
    build==0.9.0
    cattrs==22.2.0
    certifi @ file:///croot/certifi_1671487769961/work/certifi
    charset-normalizer==2.1.1
    circus==0.18.0
    click==8.1.3
    click-option-group==0.5.5
    clip @ git+https://github.com/openai/CLIP.git@d50d76daa670286dd6cacf3bcd80b5e4823fc8e1
    cloudpickle==2.2.0
    commonmark==0.9.1
    contextlib2==21.6.0
    contourpy==1.0.6
    cycler==0.11.0
    deepmerge==1.1.0
    Deprecated==1.2.13
    exceptiongroup==1.1.0
    filelock==3.9.0
    fire==0.5.0
    fonttools==4.38.0
    frozenlist==1.3.3
    fs==2.4.16
    ftfy==6.1.1
    googleapis-common-protos==1.57.0
    h11==0.14.0
    humanize==4.4.0
    idna==3.4
    imageio==2.23.0
    Jinja2==3.1.2
    kiwisolver==1.4.4
    MarkupSafe==2.1.1
    matplotlib==3.6.2
    multidict==6.0.4
    networkx==2.8.8
    numpy==1.24.1
    nvidia-cublas-cu11==11.10.3.66
    nvidia-cuda-nvrtc-cu11==11.7.99
    nvidia-cuda-runtime-cu11==11.7.99
    nvidia-cudnn-cu11==8.5.0.96
    opentelemetry-api==1.14.0
    opentelemetry-exporter-otlp-proto-http==1.14.0
    opentelemetry-instrumentation==0.35b0
    opentelemetry-instrumentation-aiohttp-client==0.35b0
    opentelemetry-instrumentation-asgi==0.35b0
    opentelemetry-proto==1.14.0
    opentelemetry-sdk==1.14.0
    opentelemetry-semantic-conventions==0.35b0
    opentelemetry-util-http==0.35b0
    packaging==21.3
    pathspec==0.10.3
    pep517==0.13.0
    Pillow==9.4.0
    pip-requirements-parser==32.0.1
    pip-tools==6.12.1
    -e git+https://github.com/openai/point-e.git@fc8a607c08a3ea804cc82bf1ef8628f88a3a5d2f#egg=point_e
    prometheus-client==0.15.0
    protobuf==3.20.3
    psutil==5.9.4
    Pygments==2.14.0
    pynvml==11.4.1
    pyparsing==3.0.9
    python-dateutil==2.8.2
    python-json-logger==2.0.4
    python-multipart==0.0.5
    PyWavelets==1.4.1
    PyYAML==6.0
    pyzmq==24.0.1
    regex==2022.10.31
    requests==2.28.1
    rich==13.0.0
    schema==0.7.5
    scikit-image==0.19.3
    scipy==1.9.3
    simple-di==0.1.5
    six==1.16.0
    sniffio==1.3.0
    starlette==0.23.1
    termcolor==2.1.1
    tifffile==2022.10.10
    tomli==2.0.1
    torch==1.13.1
    torchvision==0.14.1
    tornado==6.2
    tqdm==4.64.1
    typing_extensions==4.4.0
    urllib3==1.26.13
    uvicorn==0.20.0
    watchfiles==0.18.1
    wcwidth==0.2.5
    wrapt==1.14.1
    yarl==1.8.2
    
    bug 
    opened by BEpresent 9
  • bug: transformers save_model and load_model tasks conflict

    bug: transformers save_model and load_model tasks conflict

    Describe the bug

    Transformers save_model currently checks for task_name and task_definition to be not None before pickling custom pipelines.

    This behavior should be consistent with load_model.

    To reproduce

    See bentoml/_internal/frameworks/transformers.py#load_model,save_model

    Expected behavior

    No response

    Environment

    na

    bug 
    opened by aarnphm 0
  • fix: quote sys.executable for circus

    fix: quote sys.executable for circus

    Make sure to quote sys.executable to work around https://github.com/circus-tent/circus/blob/b8c97d34a08b7d44ac3203440872510238b1132a/circus/process.py#L412
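
    As a minimal sketch of the failure mode this guards against (the interpreter path below is illustrative): circus splits the command string with shell-style parsing, so an unquoted interpreter path containing spaces breaks into multiple arguments, while a quoted one stays intact.

      import shlex

      # Hypothetical interpreter path containing a space, standing in for sys.executable.
      executable = "/opt/my python/bin/python3"

      # Unquoted, shell-style splitting breaks the path into two arguments.
      print(shlex.split(f"{executable} -m http.server"))
      # ['/opt/my', 'python/bin/python3', '-m', 'http.server']

      # Quoted, the path survives as a single argument, which is what this fix does.
      print(shlex.split(f'"{executable}" -m http.server'))
      # ['/opt/my python/bin/python3', '-m', 'http.server']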

    Signed-off-by: Aaron Pham [email protected]

    opened by aarnphm 2
  • docs: Missing API reference for bentoml.Model

    docs: Missing API reference for bentoml.Model

    Describe the bug

    As a part of the integration of BentoML, we need to automate the process of importing and exporting BentoModels. Examining the documentation, specifically the API Reference section, there is no mention of the ability to do so through the Python SDK. Examining the code in this repo, we found bentoml.Model.export and bentoml.Model.import_from, but they are not documented anywhere in the API reference.

    To reproduce

    Irrelevant

    Expected behavior

    Anything present in the SDK should be found in the API Reference part of the documentation.

    Environment

    Irrelevant

    documentation 
    opened by eliorc 0
  • [BUG] BentoML bentoml.container.build looks for "=" in platform argument

    [BUG] BentoML bentoml.container.build looks for "=" in platform argument

    When using bentoml.container.build to containerize a service, if a platform is specified, the code in the link below looks for an "=" sign and fails with an IndexError:

    "name": "IndexError",
    "message": "list index out of range",
    

    See https://github.com/bentoml/BentoML/blob/98f6f63cfe0242a8df230fed00aa323a29735372/src/bentoml/container.py#L404. Removing line 404 fixed the issue. Example code:

    import bentoml
    bentoml.container.build(
        bento_tag="<service>:<version>",
        tag="<service>:<version>",
        platform="linux/amd64",
    )
    
    
    opened by drsantos89 2
Releases(v1.0.12)
  • v1.0.12(Dec 8, 2022)

    Important bug fixes.

    • Fixed runner call failures with keyword arguments.
    • Fixed incorrect user base image override.

    What's Changed

    • fix(runner): content-type error by @aarnphm in https://github.com/bentoml/BentoML/pull/3302
    • feat: grpc servicer implementation per version by @aarnphm in https://github.com/bentoml/BentoML/pull/3316
    • feat(grpc): adding service metadata by @aarnphm in https://github.com/bentoml/BentoML/pull/3278
    • docs: Update monitoring docs format by @ssheng in https://github.com/bentoml/BentoML/pull/3324
    • fix(runner): remote run_method with kwargs by @larme in https://github.com/bentoml/BentoML/pull/3326
    • fix: don't overwrite user base image by @aarnphm in https://github.com/bentoml/BentoML/pull/3329
    • fix: add upper bound for packaging version by @aarnphm in https://github.com/bentoml/BentoML/pull/3331
    • fix(container): podman health result string parsing by @aarnphm in https://github.com/bentoml/BentoML/pull/3330
    • fix: io descriptor backward compatibility by @sauyon in https://github.com/bentoml/BentoML/pull/3327

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.11...v1.0.12

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.12-py3-none-any.whl(888.38 KB)
    bentoml-1.0.12.tar.gz(16.30 MB)
  • v1.0.11(Dec 7, 2022)

    ๐Ÿฑย BentoML v1.0.11 is here featuring the introduction of an inference collection and model monitoring API that can be easily integrated with any model monitoring frameworks.


    • Introduced the bentoml.monitor API for monitoring any features, predictions, and target data in numerical, categorical, and numerical sequence types.

      import numpy as np

      import bentoml
      from bentoml.io import Text
      from bentoml.io import NumpyNdarray
      
      CLASS_NAMES = ["setosa", "versicolor", "virginica"]
      
      iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
      svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])
      
      @svc.api(
          input=NumpyNdarray.from_sample(np.array([4.9, 3.0, 1.4, 0.2], dtype=np.double)),
          output=Text(),
      )
      async def classify(features: np.ndarray) -> str:
          with bentoml.monitor("iris_classifier_prediction") as mon:
              mon.log(features[0], name="sepal length", role="feature", data_type="numerical")
              mon.log(features[1], name="sepal width", role="feature", data_type="numerical")
              mon.log(features[2], name="petal length", role="feature", data_type="numerical")
              mon.log(features[3], name="petal width", role="feature", data_type="numerical")
      
              results = await iris_clf_runner.predict.async_run([features])
              result = results[0]
              category = CLASS_NAMES[result]
      
              mon.log(category, name="pred", role="prediction", data_type="categorical")
          return category
      
    • Enabled monitoring data collection through log file forwarding using any forwarders (fluentbit, filebeat, logstash) or OTLP exporter implementations.

      • Configuration for monitoring data collection through log files.

        monitoring:
          enabled: true
          type: default
          options:
            log_path: path/to/log/file
        
      • Configuration for monitoring data collection through an OTLP exporter.

        monitoring:
          enabled: true
          type: otlp
          options:
            endpoint: http://localhost:5000
            insecure: true
            credentials: null
            headers: null
            timeout: 10
            compression: null
            meta_sample_rate: 1.0
        
    • Supported third-party monitoring data collector integrations through BentoML Plugins. See bentoml/plugins repository for more details.

    ๐Ÿณย Improved containerization SDK and CLI options, read more in #3164.

    • Added support for multiple backend builder options (Docker, nerdctl, Podman, Buildah, Buildx) in addition to buildctl (standalone buildkit builder).

    • Improved Python SDK for containerization with different backend builder options.

      import bentoml
      
      bentoml.container.build("iris_classifier:latest", backend="podman", features=["grpc","grpc-reflection"], **kwargs)
      
    • Improved CLI to include the newly added options.

      bentoml containerize --help
      
    • Standardized the generated Dockerfile in bentos to be compatible with all build tools for use cases that require building from a Dockerfile directly.
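
      As a minimal sketch of that last use case (the bento tag is illustrative, and the store accessors below are assumptions based on the bento's internal layout, where the generated file lives at env/docker/Dockerfile):

        import bentoml

        # Fetch a built bento from the local bento store and resolve the path of
        # its generated Dockerfile, which any build tool can now consume directly.
        bento = bentoml.bentos.get("iris_classifier:latest")
        dockerfile = bento.path_of("env/docker/Dockerfile")
        print(f"docker build -f {dockerfile} {bento.path}")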

    💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

    What's Changed

    • chore: add framework utils functions directory by @larme in https://github.com/bentoml/BentoML/pull/3203
    • fix: missing f-string in tag validation error message by @csh3695 in https://github.com/bentoml/BentoML/pull/3205
    • chore(build_config): bypass exception when cuda and conda is specified by @aarnphm in https://github.com/bentoml/BentoML/pull/3188
    • docs: Update asynchronous API documentation by @ssheng in https://github.com/bentoml/BentoML/pull/3204
    • style: use relative import inside _internal/ by @larme in https://github.com/bentoml/BentoML/pull/3209
    • style: fix monitoring type error by @aarnphm in https://github.com/bentoml/BentoML/pull/3208
    • chore(build): add dependabot for pyproject.toml by @aarnphm in https://github.com/bentoml/BentoML/pull/3139
    • chore(deps): bump black[jupyter] from 22.8.0 to 22.10.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3217
    • chore(deps): bump pylint from 2.15.3 to 2.15.5 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3212
    • chore(deps): bump pytest-asyncio from 0.19.0 to 0.20.1 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3216
    • chore(deps): bump imageio from 2.22.1 to 2.22.4 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3211
    • fix: don't index ContextVar at runtime by @sauyon in https://github.com/bentoml/BentoML/pull/3221
    • chore(deps): bump pyarrow from 9.0.0 to 10.0.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3214
    • chore: configuration check for development by @aarnphm in https://github.com/bentoml/BentoML/pull/3223
    • fix bento create by @quandollar in https://github.com/bentoml/BentoML/pull/3220
    • fix(docs): missing table tag by @nyongja in https://github.com/bentoml/BentoML/pull/3231
    • docs: grammar corrections by @tbazin in https://github.com/bentoml/BentoML/pull/3234
    • chore(deps): bump pytest-asyncio from 0.20.1 to 0.20.2 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3238
    • chore(deps): bump pytest-xdist[psutil] from 2.5.0 to 3.0.2 by @dependabot in https://github.com/bentoml/BentoML/pull/3245
    • chore(deps): bump pytest from 7.1.3 to 7.2.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3237
    • chore(deps): bump build[virtualenv] from 0.8.0 to 0.9.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3240
    • deps: bumping gRPC and OTLP dependencies by @aarnphm in https://github.com/bentoml/BentoML/pull/3228
    • feat(file): support custom mime type for file proto by @aarnphm in https://github.com/bentoml/BentoML/pull/3095
    • fix: multipart for client by @sauyon in https://github.com/bentoml/BentoML/pull/3253
    • fix(json): make sure to parse a list of dict for_sample by @aarnphm in https://github.com/bentoml/BentoML/pull/3229
    • chore: move test proto to internal tests only by @aarnphm in https://github.com/bentoml/BentoML/pull/3255
    • fix(framework): external_modules for loading pytorch by @bojiang in https://github.com/bentoml/BentoML/pull/3254
    • feat(container): builder implementation by @aarnphm in https://github.com/bentoml/BentoML/pull/3164
    • feat(sdk): implement otlp monitoring exporter by @bojiang in https://github.com/bentoml/BentoML/pull/3257
    • chore(grpc): add missing init.py by @aarnphm in https://github.com/bentoml/BentoML/pull/3259
    • docs(metrics): Update docs for the default metrics by @ssheng in https://github.com/bentoml/BentoML/pull/3262
    • chore: generate plain dockerfile without buildkit syntax by @aarnphm in https://github.com/bentoml/BentoML/pull/3261
    • style: remove # type: ignore by @aarnphm in https://github.com/bentoml/BentoML/pull/3265
    • fix: lazy load ONNX utils by @aarnphm in https://github.com/bentoml/BentoML/pull/3266
    • fix(pytorch): pickle is the unpickler of cloudpickle by @bojiang in https://github.com/bentoml/BentoML/pull/3269
    • fix: instructions for missing sklearn dependency by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/3271
    • docs: ONNX signature docs by @larme in https://github.com/bentoml/BentoML/pull/3272
    • chore(deps): bump pyarrow from 10.0.0 to 10.0.1 by @dependabot in https://github.com/bentoml/BentoML/pull/3273
    • chore(deps): bump pylint from 2.15.5 to 2.15.6 by @dependabot in https://github.com/bentoml/BentoML/pull/3274
    • fix(pandas): only set columns when apply_column_names is set by @mqk in https://github.com/bentoml/BentoML/pull/3275
    • feat: configuration versioning by @aarnphm in https://github.com/bentoml/BentoML/pull/3052
    • fix(container): support comma in docker env by @larme in https://github.com/bentoml/BentoML/pull/3285
    • chore(stub): import filetype by @aarnphm in https://github.com/bentoml/BentoML/pull/3260
    • fix(container): ensure to stream logs when DOCKER_BUILDKIT=0 by @aarnphm in https://github.com/bentoml/BentoML/pull/3294
    • docs: update instructions for containerize message by @aarnphm in https://github.com/bentoml/BentoML/pull/3289
    • fix: unset NVIDIA_VISIBLE_DEVICES when cuda image is used by @aarnphm in https://github.com/bentoml/BentoML/pull/3298
    • fix: multipart logic by @sauyon in https://github.com/bentoml/BentoML/pull/3297
    • chore(deps): bump pylint from 2.15.6 to 2.15.7 by @dependabot in https://github.com/bentoml/BentoML/pull/3291
    • docs: wrong arguments when saving by @KimSoungRyoul in https://github.com/bentoml/BentoML/pull/3306
    • chore(deps): bump pylint from 2.15.7 to 2.15.8 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3308
    • chore(deps): bump pytest-xdist[psutil] from 3.0.2 to 3.1.0 in /requirements by @dependabot in https://github.com/bentoml/BentoML/pull/3309
    • chore(pyproject): bumping python version typeshed to 3.11 by @aarnphm in https://github.com/bentoml/BentoML/pull/3281
    • fix(monitor): disable validate for Formatter by @bojiang in https://github.com/bentoml/BentoML/pull/3317
    • doc(monitoring): monitoring guide by @bojiang in https://github.com/bentoml/BentoML/pull/3300
    • feat: parsing path for env by @aarnphm in https://github.com/bentoml/BentoML/pull/3314
    • fix: remove assertion for dtype by @aarnphm in https://github.com/bentoml/BentoML/pull/3320
    • feat: client lazy load by @aarnphm in https://github.com/bentoml/BentoML/pull/3323
    • chore: provides shim for bentoctl by @aarnphm in https://github.com/bentoml/BentoML/pull/3322

    New Contributors

    • @csh3695 made their first contribution in https://github.com/bentoml/BentoML/pull/3205
    • @nyongja made their first contribution in https://github.com/bentoml/BentoML/pull/3231
    • @tbazin made their first contribution in https://github.com/bentoml/BentoML/pull/3234
    • @KimSoungRyoul made their first contribution in https://github.com/bentoml/BentoML/pull/3306

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.10...v1.0.11

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.11-py3-none-any.whl(884.06 KB)
    bentoml-1.0.11.tar.gz(16.30 MB)
  • v1.0.10(Nov 9, 2022)

    ๐Ÿฑย BentoML v1.0.10 is released to address a recurring broken pipe reported by the community. Also included in this release, is a list of improvements weโ€™d like to share with the community.

    • Fixed an aiohttp.client_exceptions.ClientOSError caused by asymmetrical keep-alive timeout settings between the API Server and Runner.

      aiohttp.client_exceptions.ClientOSError: [Errno 32] Broken pipe
      
    • Added multi-output support for ONNX and TensorFlow frameworks.

    • Added from_sample support to all IO Descriptors, in addition to just bentoml.io.NumpyNdarray, and the sample is reflected in the Swagger UI.

      import pandas as pd

      from bentoml.io import JSON, PandasDataFrame

      # Pandas Example (assumes an existing bentoml.Service instance `svc`)
      @svc.api(
          input=PandasDataFrame.from_sample(
              pd.DataFrame([1, 2, 3, 4])
          ),
          output=PandasDataFrame(),
      )
      def predict_df(df: pd.DataFrame) -> pd.DataFrame:  # stub handler for illustration
          ...

      # JSON Example
      @svc.api(
          input=JSON.from_sample(
              {"foo": 1, "bar": 2}
          ),
          output=JSON(),
      )
      def predict_json(data: dict) -> dict:  # stub handler for illustration
          ...
      


    💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

    What's Changed

    • feat(cli): log conditional environment variables by @aarnphm in https://github.com/bentoml/BentoML/pull/3156
    • fix: ensure conda not use pipefail and unset variables by @aarnphm in https://github.com/bentoml/BentoML/pull/3171
    • fix(templates): ensure to use python3 and pip3 by @aarnphm in https://github.com/bentoml/BentoML/pull/3170
    • fix(sdk): montioring log output by @bojiang in https://github.com/bentoml/BentoML/pull/3175
    • feat: make quickstart batchable by @sauyon in https://github.com/bentoml/BentoML/pull/3172
    • fix: lazy check for stubs via path when install local wheels by @aarnphm in https://github.com/bentoml/BentoML/pull/3180
    • fix(openapi): remove summary field under Info by @aarnphm in https://github.com/bentoml/BentoML/pull/3178
    • docs: Inference graph example by @ssheng in https://github.com/bentoml/BentoML/pull/3183
    • docs: remove whitespaces in migration guides by @wellshs in https://github.com/bentoml/BentoML/pull/3185
    • fix(build_config): validation when NoneType by @aarnphm in https://github.com/bentoml/BentoML/pull/3187
    • fix(docs): indentation in migration.rst by @aarnphm in https://github.com/bentoml/BentoML/pull/3186
    • doc(example): monitoring example for classification tasks by @bojiang in https://github.com/bentoml/BentoML/pull/3176
    • refactor(sdk): separate default monitoring impl by @bojiang in https://github.com/bentoml/BentoML/pull/3189
    • fix(ssl): provide default values in configuration by @aarnphm in https://github.com/bentoml/BentoML/pull/3191
    • fix: don't ignore logging conf by @sauyon in https://github.com/bentoml/BentoML/pull/3192
    • feat: tensorflow multi outputs support by @larme in https://github.com/bentoml/BentoML/pull/3115
    • docs: cleanup whitespace and typo by @aarnphm in https://github.com/bentoml/BentoML/pull/3195
    • chore: cleanup deadcode by @aarnphm in https://github.com/bentoml/BentoML/pull/3196
    • fix(runner): set uvicorn keep-alive by @sauyon in https://github.com/bentoml/BentoML/pull/3198
    • perf: refine onnx implementation by @larme in https://github.com/bentoml/BentoML/pull/3166
    • feat: from_sample for IO descriptor by @aarnphm in https://github.com/bentoml/BentoML/pull/3143

    New Contributors

    • @wellshs made their first contribution in https://github.com/bentoml/BentoML/pull/3185

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.8...v1.0.9

    What's Changed

    • fix: from_sample override logic by @aarnphm in https://github.com/bentoml/BentoML/pull/3202

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.9...v1.0.10

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.10-py3-none-any.whl(852.78 KB)
    bentoml-1.0.10.tar.gz(15.34 MB)
  • v1.0.8(Nov 1, 2022)

    ๐Ÿฑย BentoML v1.0.8 is released with a list of improvement we hope that youโ€™ll find useful.

    • Introduced Bento Client for easy access to the BentoML service over HTTP. Both sync and async calls are supported. See the Bento Client Guide for more details.

      import numpy as np

      from bentoml.client import Client
      
      client = Client.from_url("http://localhost:3000")
      
      # Sync call
      response = client.classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
      
      # Async call
      response = await client.async_classify(np.array([[4.9, 3.0, 1.4, 0.2]]))
      
    • Introduced custom metrics support for easy instrumentation of user-defined metrics over Prometheus. See the Metrics Guide for more details.

      import bentoml

      # Histogram metric
      inference_duration = bentoml.metrics.Histogram(
          name="inference_duration",
          documentation="Duration of inference",
          labelnames=["nltk_version", "sentiment_cls"],
      )
      
      # Counter metric
      polarity_counter = bentoml.metrics.Counter(
          name="polarity_total",
          documentation="Count total number of analysis by polarity scores",
          labelnames=["polarity"],
      )
      

      Full Prometheus style syntax is supported for instrumenting custom metrics inside API and Runner definitions.

      # Histogram
      inference_duration.labels(
          nltk_version=nltk.__version__, sentiment_cls=self.sia.__class__.__name__
      ).observe(time.perf_counter() - start)
      
      # Counter
      polarity_counter.labels(polarity=is_positive).inc()
      
    • Improved health checking to also cover the status of runners to avoid returning a healthy status before runners are ready.

    • Added SSL/TLS support to gRPC serving.

      bentoml serve-grpc --ssl-certfile=credentials/cert.pem --ssl-keyfile=credentials/key.pem --production --enable-reflection
      
    • Added channelz support for easier debugging of gRPC serving.

    • Allowed nested requirements with the -r syntax.

      # requirements.txt
      -r nested/requirements.txt
      
      pydantic
      Pillow
      fastapi
      
    • Improved the adaptive batching dispatcher's auto-tuning ability to avoid sporadic request failures due to batching at the beginning of the runner lifecycle.

    • Fixed a bug where runners would raise a TypeError when overloaded. Now an HTTP 503 Service Unavailable is returned when a runner is overloaded.

      File "python3.9/site-packages/bentoml/_internal/runner/runner_handle/remote.py", line 188, in async_run_method
          return tuple(AutoContainer.from_payload(payload) for payload in payloads)
      TypeError: 'Response' object is not iterable
      

    💡 We continue to update the documentation and examples on every release to help the community unlock the full power of BentoML.

    🥂 We'd like to thank the community for your continued support and engagement.

    • Shout out to @judahrand for multiple contributions to BentoML and bentoctl.
    • Shout out to @phildamore-phdata, @quandollar, @2JooYeon, and @fortunto2 for their first contribution to BentoML.
    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.8-py3-none-any.whl(859.10 KB)
    bentoml-1.0.8.tar.gz(13.68 MB)
  • v1.0.7(Oct 3, 2022)

    ๐Ÿฑย BentoML released v1.0.7 as a patch to quickly fix a critical module import issue introduced in v1.0.6. The import error manifests in the import of any modules under io.* or models.*. The following is an example of a typical error message and traceback. Please upgrade to v1.0.7 to address this import issue.

    packages/anyio/_backends/_asyncio.py", line 21, in <module>
        from io import IOBase
    ImportError: cannot import name 'IOBase' from 'bentoml.io'
    

    What's Changed

    • test(grpc): e2e + unit tests by @aarnphm in https://github.com/bentoml/BentoML/pull/2984
    • feat: support multipart upload for large bento and model by @yetone in https://github.com/bentoml/BentoML/pull/3044
    • fix(config): respect api_server.workers by @judahrand in https://github.com/bentoml/BentoML/pull/3049
    • chore(lint): remove unused import by @aarnphm in https://github.com/bentoml/BentoML/pull/3051
    • fix(import): namespace collision by @aarnphm in https://github.com/bentoml/BentoML/pull/3058

    New Contributors

    • @judahrand made their first contribution in https://github.com/bentoml/BentoML/pull/3049

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.6...v1.0.7

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.7-py3-none-any.whl(838.18 KB)
    bentoml-1.0.7.tar.gz(742.52 KB)
  • v1.0.6(Sep 27, 2022)

    ๐Ÿฑย BentoML has just released v1.0.6 featuring the gRPC preview! Without changing a line of code, you can now serve your Bentos as a gRPC service. Similar to serving over HTTP, BentoML gRPC supports all the ML frameworks, observability features, adaptive batching, and more out-of-the-box, simply by calling the serve-grpc CLI command.

    > pip install "bentoml[grpc]"
    > bentoml serve-grpc iris_classifier:latest --production
    

    โš ๏ธย gRPC is current under preview. The public APIs may undergo incompatible changes in the future patch releases until the official v1.1.0 minor version release.

    • Enhanced access logging format to output Trace and Span IDs in the more standard hex encoding by default.
    • Added request total, duration, and in-progress metrics to Runners, in addition to API Servers.
    • Added support for XGBoost SKLearn models; see the sketch after this list.
    • Added support for restricting image mime types in the Image IO descriptor.
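
    A minimal sketch of the new XGBoost SKLearn support (the model name and toy dataset are illustrative):

      import bentoml
      from sklearn.datasets import load_iris
      from xgboost import XGBClassifier

      # Train a toy scikit-learn style XGBoost estimator...
      X, y = load_iris(return_X_y=True)
      clf = XGBClassifier(n_estimators=10).fit(X, y)

      # ...and save it with the xgboost framework module, which now accepts
      # sklearn-style models in addition to native Boosters.
      bentoml.xgboost.save_model("iris_xgb_sklearn", clf)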

    🥂 We'd like to thank our community for their contribution and support.

    • Shout out to @benjamintanweihao for fixing a BentoML CLI bug.
    • Shout out to @lsh918 for fixing a PyTorch framework issue.
    • Shout out to @jeffthebear for enhancing the Pandas DataFrame OpenAPI schema.
    • Shout out to @jiewpeng for adding the support for customizing access logs with Trace and Span ID formats.

    What's Changed

    • fix: log runner errors explicitly by @ssheng in https://github.com/bentoml/BentoML/pull/2952
    • ci: temp fix for models test by @sauyon in https://github.com/bentoml/BentoML/pull/2949
    • fix: fix context parameter for multi-input IO descriptors by @sauyon in https://github.com/bentoml/BentoML/pull/2948
    • fix: use torch.from_numpy() instead of torch.Tensor() to keep data type by @lsh918 in https://github.com/bentoml/BentoML/pull/2951
    • docs: fix wrong name for example neural net by @ssun-g in https://github.com/bentoml/BentoML/pull/2959
    • docs: fix bentoml containerize command help message by @aarnphm in https://github.com/bentoml/BentoML/pull/2957
    • chore(cli): remove unused --no-trunc by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/2965
    • fix: relax regex for setting environment variables by @benjamintanweihao in https://github.com/bentoml/BentoML/pull/2964
    • docs: update wrong paths for disabling logs by @creativedutchmen in https://github.com/bentoml/BentoML/pull/2974
    • feat: track serve update for start subcommands by @ssheng in https://github.com/bentoml/BentoML/pull/2976
    • feat: logging customization by @jiewpeng in https://github.com/bentoml/BentoML/pull/2961
    • chore(cli): using quotes instead of backslash by @sauyon in https://github.com/bentoml/BentoML/pull/2981
    • feat(cli): show full tracebacks in debug mode by @sauyon in https://github.com/bentoml/BentoML/pull/2982
    • feature(runner): add multiple output support by @larme in https://github.com/bentoml/BentoML/pull/2912
    • docs: add airflow integration page by @parano in https://github.com/bentoml/BentoML/pull/2990
    • chore(ci): fix the unit test of transformers by @bojiang in https://github.com/bentoml/BentoML/pull/3003
    • chore(ci): fix the issue caused by the change of check_task by @bojiang in https://github.com/bentoml/BentoML/pull/3004
    • fix(multipart): support multipart file inputs to non-file descriptors by @sauyon in https://github.com/bentoml/BentoML/pull/3005
    • feat(server): add runner metrics; refactoring batch size metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2977
    • EXPERIMENTAL: gRPC support by @aarnphm in https://github.com/bentoml/BentoML/pull/2808
    • fix(runner): receive requests before cork by @bojiang in https://github.com/bentoml/BentoML/pull/2996
    • fix(server): service_name label of runner metrics by @bojiang in https://github.com/bentoml/BentoML/pull/3008
    • chore(misc): remove mentioned for team member from PR request by @aarnphm in https://github.com/bentoml/BentoML/pull/3009
    • feat(xgboost): support xgboost sklearn models by @sauyon in https://github.com/bentoml/BentoML/pull/2997
    • feat(io/image): allow restricting mime types by @sauyon in https://github.com/bentoml/BentoML/pull/2999
    • fix(grpc): docker message by @aarnphm in https://github.com/bentoml/BentoML/pull/3012
    • fix: broken legacy metrics by @aarnphm in https://github.com/bentoml/BentoML/pull/3019
    • fix(e2e): exception test for image IO by @aarnphm in https://github.com/bentoml/BentoML/pull/3017
    • revert(3017): filter write-only mime type for Image IO by @bojiang in https://github.com/bentoml/BentoML/pull/3020
    • chore: cleanup containerize utils by @aarnphm in https://github.com/bentoml/BentoML/pull/3014
    • feat(proto): add serialized_bytes to pb.Part by @aarnphm in https://github.com/bentoml/BentoML/pull/3022
    • docs: Update README.md by @parano in https://github.com/bentoml/BentoML/pull/3023
    • chore(grpc): vcs generated stubs by @aarnphm in https://github.com/bentoml/BentoML/pull/3016
    • feat(io/image): allow writeable mimes as output by @sauyon in https://github.com/bentoml/BentoML/pull/3024
    • docs: fix descriptor typo by @darioarias in https://github.com/bentoml/BentoML/pull/3027
    • fix(server): log localhost instead of 0.0.0.0 by @sauyon in https://github.com/bentoml/BentoML/pull/3033
    • fix(io): Pandas OpenAPI schema by @jeffthebear in https://github.com/bentoml/BentoML/pull/3032
    • chore(docker): support more cuda versions by @larme in https://github.com/bentoml/BentoML/pull/3035
    • docs: updates on blocks that failed to render by @aarnphm in https://github.com/bentoml/BentoML/pull/3031
    • chore: migrate to pyproject.toml by @aarnphm in https://github.com/bentoml/BentoML/pull/3025
    • docs: gRPC tutorial by @aarnphm in https://github.com/bentoml/BentoML/pull/3013
    • docs: gRPC advanced guides by @aarnphm in https://github.com/bentoml/BentoML/pull/3034
    • feat(configuration): override options with envvar by @bojiang in https://github.com/bentoml/BentoML/pull/3018
    • chore: update links by @aarnphm in https://github.com/bentoml/BentoML/pull/3040
    • fix(configuration): should validate config early by @aarnphm in https://github.com/bentoml/BentoML/pull/3041
    • qa(bentos): update latest options by @aarnphm in https://github.com/bentoml/BentoML/pull/3042
    • qa: ignore tools from distribution by @aarnphm in https://github.com/bentoml/BentoML/pull/3045
    • dependencies: ignore broken pypi combination by @aarnphm in https://github.com/bentoml/BentoML/pull/3043
    • feat: gRPC tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/3015
    • configuration: migrate schema to api_server by @ssheng in https://github.com/bentoml/BentoML/pull/3046
    • qa: cleanup MLflow by @aarnphm in https://github.com/bentoml/BentoML/pull/2945

    New Contributors

    • @lsh918 made their first contribution in https://github.com/bentoml/BentoML/pull/2951
    • @ssun-g made their first contribution in https://github.com/bentoml/BentoML/pull/2959
    • @benjamintanweihao made their first contribution in https://github.com/bentoml/BentoML/pull/2965
    • @creativedutchmen made their first contribution in https://github.com/bentoml/BentoML/pull/2974
    • @darioarias made their first contribution in https://github.com/bentoml/BentoML/pull/3027
    • @jeffthebear made their first contribution in https://github.com/bentoml/BentoML/pull/3032

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.5...v1.0.6

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.6-py3-none-any.whl(827.79 KB)
    bentoml-1.0.6.tar.gz(733.70 KB)
  • v1.0.5(Aug 30, 2022)

    ๐Ÿฑย BentoML v1.0.5 is released as a quick fix to a Yatai incompatibility introduced in v1.0.4.

    • The incompatibility manifests in the following error message when deploying a bento on Yatai. Upgrading BentoML to v1.0.5 will resolve the issue.
      Error while finding module specification for 'bentoml._internal.server.cli.api_server' (ModuleNotFoundError: No module named 'bentoml._internal.server.cli')
      
    • The incompatibility resides in all Yatai versions prior to v1.0.0-alpha.*. Alternatively, upgrading Yatai to v1.0.0-alpha.* will also restore the compatibility with bentos built in v1.0.4.
    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.5-py3-none-any.whl(774.05 KB)
    bentoml-1.0.5.tar.gz(702.46 KB)
  • v1.0.4(Aug 26, 2022)

    ๐Ÿฑย BentoML v1.0.4 is here!

    • Added support for explicit GPU mapping for runners. In addition to specifying the number of GPU devices allocated to a runner, we can map a list of device IDs directly to a runner through configuration.

      runners:
        iris_clf_1:
          resources:
            nvidia.com/gpu: [2, 4] # Map device 2 and 4 to iris_clf_1 runner
        iris_clf_2:
          resources:
            nvidia.com/gpu: [1, 3] # Map device 1 and 3 to iris_clf_2 runner
      
    • Added SSL support for the API server through both CLI and configuration.

        --ssl-certfile TEXT          SSL certificate file
        --ssl-keyfile TEXT           SSL key file
        --ssl-keyfile-password TEXT  SSL keyfile password
        --ssl-version INTEGER        SSL version to use (see stdlib 'ssl' module)
        --ssl-cert-reqs INTEGER      Whether client certificate is required (see stdlib 'ssl' module)
        --ssl-ca-certs TEXT          CA certificates file
        --ssl-ciphers TEXT           Ciphers to use (see stdlib 'ssl' module)
      
    • Added adaptive batching size histogram metrics, BENTOML_{runner}_{method}_adaptive_batch_size_bucket, for observability of batching mechanism details.


    • Added support for the OpenTelemetry OTLP exporter for tracing, and configured the OpenTelemetry resource automatically if the user has not explicitly set it through environment variables. Upgraded OpenTelemetry Python packages to version 0.33b0.


    • Added support for saving external_modules alongside models in the save_model API. Saving external Python modules is useful for models with external dependencies, such as tokenizers, preprocessors, and configurations. See the sketch below.
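
      A minimal sketch, assuming a hypothetical local module my_preprocessing that the model's pickled state references:

        import bentoml
        import torch

        import my_preprocessing  # hypothetical local module the model depends on

        model = torch.nn.Linear(4, 3)  # stand-in for a trained model

        # Modules listed in external_modules are archived together with the
        # model so they can be resolved when the model is loaded in production.
        bentoml.pytorch.save_model(
            "demo_model",
            model,
            external_modules=[my_preprocessing],
        )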

    • Enhanced Swagger UI to include additional documentation and helper links.


    💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

    • Check out the adaptive batching documentation on how to leverage batching to improve inference latency and efficiency.
    • Check out the runner configuration documentation on how to customize resource allocation for runners at run time.

    🙌 We continue to receive great engagement and support from the BentoML community.

    • Shout out to @sptowey for their contribution on adding SSL support.
    • Shout out to @dbuades for their contribution on adding the OTLP exporter.
    • Shout out to @tweeklab for their contribution on fixing a bug on import_model in the MLflow framework.

    What's Changed

    • refactor: cli to bentoml_cli by @sauyon in https://github.com/bentoml/BentoML/pull/2880
    • chore: remove typing-extensions dependency by @sauyon in https://github.com/bentoml/BentoML/pull/2879
    • fix: remove chmod install scripts by @aarnphm in https://github.com/bentoml/BentoML/pull/2830
    • fix: relative imports to lazy by @aarnphm in https://github.com/bentoml/BentoML/pull/2882
    • fix(cli): click utilities imports by @aarnphm in https://github.com/bentoml/BentoML/pull/2883
    • docs: add custom model runner example by @parano in https://github.com/bentoml/BentoML/pull/2885
    • qa: analytics unit tests by @aarnphm in https://github.com/bentoml/BentoML/pull/2878
    • chore: script for releasing quickstart bento by @parano in https://github.com/bentoml/BentoML/pull/2892
    • fix: pushing models from Bento instead of local modelstore by @parano in https://github.com/bentoml/BentoML/pull/2887
    • fix(containerize): supports passing multiple tags by @aarnphm in https://github.com/bentoml/BentoML/pull/2872
    • feat: explicit GPU runner mappings by @jjmachan in https://github.com/bentoml/BentoML/pull/2862
    • fix: setuptools doesn't include bentoml_cli by @bojiang in https://github.com/bentoml/BentoML/pull/2898
    • feat: Add SSL support for http api servers via bentoml serve by @sptowey in https://github.com/bentoml/BentoML/pull/2886
    • patch: ssl styling and default value check by @aarnphm in https://github.com/bentoml/BentoML/pull/2899
    • fix(scheduling): raise an error for invalid resources by @bojiang in https://github.com/bentoml/BentoML/pull/2894
    • chore(templates): cleanup debian dependency logic by @aarnphm in https://github.com/bentoml/BentoML/pull/2904
    • fix(ci): unittest failed by @bojiang in https://github.com/bentoml/BentoML/pull/2908
    • chore(cli): add figlet for CLI by @aarnphm in https://github.com/bentoml/BentoML/pull/2909
    • feat: codespace by @aarnphm in https://github.com/bentoml/BentoML/pull/2907
    • feat: use yatai proxy to upload/download bentos/models by @yetone in https://github.com/bentoml/BentoML/pull/2832
    • fix(scheduling): numpy worker environs are not taking effect by @bojiang in https://github.com/bentoml/BentoML/pull/2893
    • feat: Adaptive batching size histogram metrics by @ssheng in https://github.com/bentoml/BentoML/pull/2902
    • chore(swagger): include help links by @parano in https://github.com/bentoml/BentoML/pull/2927
    • feat(tracing): add support for otlp exporter by @dbuades in https://github.com/bentoml/BentoML/pull/2918
    • chore: Lock OpenTelemetry versions and add tracing metadata by @ssheng in https://github.com/bentoml/BentoML/pull/2928
    • revert: unminify CSS by @aarnphm in https://github.com/bentoml/BentoML/pull/2931
    • fix: importing mlflow:/ urls with no extra path info by @tweeklab in https://github.com/bentoml/BentoML/pull/2930
    • fix(yatai): make presigned_urls_deprecated optional by @bojiang in https://github.com/bentoml/BentoML/pull/2933
    • feat: add timeout option for bentoml runner config by @jjmachan in https://github.com/bentoml/BentoML/pull/2890
    • perf(cli): speed up by @aarnphm in https://github.com/bentoml/BentoML/pull/2934
    • chore: remove multipart IO descriptor warning by @ssheng in https://github.com/bentoml/BentoML/pull/2936
    • fix(json): revert eager check by @aarnphm in https://github.com/bentoml/BentoML/pull/2926
    • chore: remove --config flag to load the bentoml runtime config by @jjmachan in https://github.com/bentoml/BentoML/pull/2939
    • chore: update README messaging by @ssheng in https://github.com/bentoml/BentoML/pull/2937
    • fix: use a temporary file for file uploads by @sauyon in https://github.com/bentoml/BentoML/pull/2929
    • feat(cli): add CLI command to serve a runner by @bojiang in https://github.com/bentoml/BentoML/pull/2920
    • docs: Runner configuration for batching and resource allocation by @ssheng in https://github.com/bentoml/BentoML/pull/2941
    • bug: handle bad image file by @parano in https://github.com/bentoml/BentoML/pull/2942
    • chore(docs): earlier check for buildx by @aarnphm in https://github.com/bentoml/BentoML/pull/2940
    • fix(cli): helper message default values by @ssheng in https://github.com/bentoml/BentoML/pull/2943
    • feat(sdk): add external_modules option to save_model by @bojiang in https://github.com/bentoml/BentoML/pull/2895
    • fix(cli): component name regression by @ssheng in https://github.com/bentoml/BentoML/pull/2944

    New Contributors

    • @sptowey made their first contribution in https://github.com/bentoml/BentoML/pull/2886
    • @dbuades made their first contribution in https://github.com/bentoml/BentoML/pull/2918
    • @tweeklab made their first contribution in https://github.com/bentoml/BentoML/pull/2930

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.3...v1.0.4

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.4-py3-none-any.whl(773.63 KB)
    bentoml-1.0.4.tar.gz(702.05 KB)
  • v1.0.3(Aug 8, 2022)

    ๐Ÿฑย BentoML v1.0.3 release has brought a list of performance and feature improvement.

    • Improved Runner IO performance by enhancing the underlying serialization and deserialization, especially in models with large input and output sizes. Our image input benchmark showed a 100% throughput improvement.

      • v1.0.2 ๐ŸŒ image

      • v1.0.3 ๐Ÿ’จ image

    • Added support for specifying URLs to exclude from tracing.

    • Added support for custom components for OpenAPI generation.

    🙌 We continue to receive great engagement and support from the BentoML community.

    • Shout out to Ben Kessler for helping benchmark performance.
    • Shout out to Jiew Peng Lim for adding the support for configuring URLs to exclude from tracing.
    • Shout out to Susana Bouchardet for adding support for the JSON IO Descriptor to return an empty response body.
    • Thanks to Keming and mplk for contributing their first PRs in BentoML.

    What's Changed

    • chore(deps): bump actions/setup-node from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2846
    • fix: extend --cache-from consumption to python tuple by @anwang2009 in https://github.com/bentoml/BentoML/pull/2847
    • feat: add support for excluding urls from tracing by @jiewpeng in https://github.com/bentoml/BentoML/pull/2843
    • docs: update notice about buildkit by @aarnphm in https://github.com/bentoml/BentoML/pull/2837
    • chore: add CODEOWNERS by @aarnphm in https://github.com/bentoml/BentoML/pull/2842
    • doc(frameworks): tensorflow by @bojiang in https://github.com/bentoml/BentoML/pull/2718
    • feat: add support for specifying urls to exclude from tracing as a list by @jiewpeng in https://github.com/bentoml/BentoML/pull/2851
    • fix(configuration): merging global runner config to runner specific config by @jjmachan in https://github.com/bentoml/BentoML/pull/2849
    • fix: Setting status code and cookies by @ssheng in https://github.com/bentoml/BentoML/pull/2854
    • chore: README typo by @kemingy in https://github.com/bentoml/BentoML/pull/2859
    • chore: gallery links to bentoml/examples by @aarnphm in https://github.com/bentoml/BentoML/pull/2858
    • fix(runner): use pickle instead for multi payload parameters by @aarnphm in https://github.com/bentoml/BentoML/pull/2857
    • doc(framework): pytorch guide by @bojiang in https://github.com/bentoml/BentoML/pull/2735
    • docs: add missing output to Runner docs by @mplk in https://github.com/bentoml/BentoML/pull/2868
    • chore: fix push and load interop by @aarnphm in https://github.com/bentoml/BentoML/pull/2863
    • fix: Usage stats by @ssheng in https://github.com/bentoml/BentoML/pull/2876
    • fix: JSON(IODescriptor[JSONType]).to_http_response returns empty body when the response is None. by @sbouchardet in https://github.com/bentoml/BentoML/pull/2874
    • chore: Address comments in the #2874 by @ssheng in https://github.com/bentoml/BentoML/pull/2877
    • fix: debugger breaks on circus process by @aarnphm in https://github.com/bentoml/BentoML/pull/2875
    • feat: support custom components for OpenAPI generation by @aarnphm in https://github.com/bentoml/BentoML/pull/2845

    New Contributors

    • @anwang2009 made their first contribution in https://github.com/bentoml/BentoML/pull/2847
    • @jiewpeng made their first contribution in https://github.com/bentoml/BentoML/pull/2843
    • @kemingy made their first contribution in https://github.com/bentoml/BentoML/pull/2859
    • @mplk made their first contribution in https://github.com/bentoml/BentoML/pull/2868
    • @sbouchardet made their first contribution in https://github.com/bentoml/BentoML/pull/2874

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.2...v1.0.3

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.3-py3-none-any.whl(761.33 KB)
    bentoml-1.0.3.tar.gz(681.53 KB)
  • v1.0.2(Jul 29, 2022)

    ๐Ÿฑย We have just released BentoML v1.0.2 with a number of features and bug fixes requested by the community.

    • Added support for custom model versions, e.g. bentoml.tensorflow.save_model("model_name:1.2.4", model).
    • Fixed PyTorch Runner payload serialization issue due to tensor not on CPU.
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first
    
    • Fixed Transformers GPU device assignment due to kwargs handling.
    • Fixed excessive Runner thread spawning issue under high load.
    • Fixed PyTorch Runner inference error due to saving tensor during inference mode.
    RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
    
    • Fixed Keras Runner error when the input has only a single element.
    • Deprecated the validate_json option in the JSON IO descriptor and recommended specifying validation logic natively in the Pydantic model; see the sketch after this list.
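
    A minimal sketch of the recommended approach, using the JSON descriptor's pydantic_model option (service and field names are illustrative):

      import bentoml
      from bentoml.io import JSON
      from pydantic import BaseModel

      class IrisFeatures(BaseModel):
          sepal_len: float
          sepal_width: float
          petal_len: float
          petal_width: float

      svc = bentoml.Service("iris_classifier_svc")

      # Validation now lives in the Pydantic model itself; payloads that do
      # not match the schema are rejected before the handler runs.
      @svc.api(input=JSON(pydantic_model=IrisFeatures), output=JSON())
      def classify(features: IrisFeatures) -> dict:
          return {"received": features.dict()}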

    🎨 We added an examples directory; in it you will find interesting sample projects demonstrating various applications of BentoML. We welcome your contribution if you have a project idea you'd like to share with the community.

    💡 We continue to update the documentation on every release to help our users unlock the full power of BentoML.

    What's Changed

    • chore: remove all --pre from documentation by @aarnphm in https://github.com/bentoml/BentoML/pull/2738
    • chore(framework): onnx guide minor improvements by @larme in https://github.com/bentoml/BentoML/pull/2744
    • fix(framework): fix how pytorch DataContainer convert GPU tensor by @larme in https://github.com/bentoml/BentoML/pull/2739
    • doc: add missing variable by @robsonpeixoto in https://github.com/bentoml/BentoML/pull/2752
    • chore(deps): cattrs>=22.1.0 in setup.cfg by @sugatoray in https://github.com/bentoml/BentoML/pull/2758
    • fix(transformers): kwargs and migrate to framework tests by @ssheng in https://github.com/bentoml/BentoML/pull/2761
    • chore: add type hint for run and async_run by @aarnphm in https://github.com/bentoml/BentoML/pull/2760
    • docs: fix typo in SECURITY.md by @parano in https://github.com/bentoml/BentoML/pull/2766
    • chore: use pypa/build as PEP517 backend by @aarnphm in https://github.com/bentoml/BentoML/pull/2680
    • chore(e2e): capture log output by @aarnphm in https://github.com/bentoml/BentoML/pull/2767
    • chore: more robust prometheus directory ensuring by @bojiang in https://github.com/bentoml/BentoML/pull/2526
    • doc(framework): add scikit-learn section to ONNX documentation by @larme in https://github.com/bentoml/BentoML/pull/2764
    • chore: clean up dependencies by @sauyon in https://github.com/bentoml/BentoML/pull/2769
    • docs: misc docs reorganize and cleanups by @parano in https://github.com/bentoml/BentoML/pull/2768
    • fix(io descriptors): finish removing init_http_response by @sauyon in https://github.com/bentoml/BentoML/pull/2774
    • chore: fix typo by @aarnphm in https://github.com/bentoml/BentoML/pull/2776
    • feat(model): allow custom model versions by @sauyon in https://github.com/bentoml/BentoML/pull/2775
    • chore: add watchfiles as bentoml dependency by @aarnphm in https://github.com/bentoml/BentoML/pull/2777
    • doc(framework): keras guide by @larme in https://github.com/bentoml/BentoML/pull/2741
    • docs: Update service schema and validation by @ssheng in https://github.com/bentoml/BentoML/pull/2778
    • doc(frameworks): fix pip package syntax by @larme in https://github.com/bentoml/BentoML/pull/2782
    • fix(runner): thread limiter doesn't take effect by @bojiang in https://github.com/bentoml/BentoML/pull/2781
    • feat: add additional env var configuring num of threads in Runner by @parano in https://github.com/bentoml/BentoML/pull/2786
    • fix(templates): sharing variables at template level by @aarnphm in https://github.com/bentoml/BentoML/pull/2796
    • bug: fix JSON io_descriptor validate_json option by @parano in https://github.com/bentoml/BentoML/pull/2803
    • chore: improve error message when failed importing user service code by @parano in https://github.com/bentoml/BentoML/pull/2806
    • chore: automatic cache action version update and remove stale bot by @aarnphm in https://github.com/bentoml/BentoML/pull/2798
    • chore(deps): bump actions/checkout from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2810
    • chore(deps): bump codecov/codecov-action from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2811
    • chore(deps): bump github/codeql-action from 1 to 2 by @dependabot in https://github.com/bentoml/BentoML/pull/2813
    • chore(deps): bump actions/cache from 2 to 3 by @dependabot in https://github.com/bentoml/BentoML/pull/2812
    • chore(deps): bump actions/setup-python from 2 to 4 by @dependabot in https://github.com/bentoml/BentoML/pull/2814
    • fix(datacontainer): pytorch to_payload should disable gradient by @aarnphm in https://github.com/bentoml/BentoML/pull/2821
    • fix(framework): fix keras single input edge case by @larme in https://github.com/bentoml/BentoML/pull/2822
    • fix(framework): keras GPU handling by @larme in https://github.com/bentoml/BentoML/pull/2824
    • docs: update custom bentoserver guide by @parano in https://github.com/bentoml/BentoML/pull/2809
    • fix(runner): bind limiter to runner_ref instead by @bojiang in https://github.com/bentoml/BentoML/pull/2826
    • fix(pytorch): inference_mode context is thead local by @bojiang in https://github.com/bentoml/BentoML/pull/2828
    • fix: address multiple tags for containerize by @aarnphm in https://github.com/bentoml/BentoML/pull/2797
    • chore: Add gallery projects under examples by @ssheng in https://github.com/bentoml/BentoML/pull/2833
    • chore: running formatter on examples folder by @aarnphm in https://github.com/bentoml/BentoML/pull/2834
    • docs: update security auth middleware by @g0nz4rth in https://github.com/bentoml/BentoML/pull/2835
    • fix(io_descriptor): DataFrame columns check by @alizia in https://github.com/bentoml/BentoML/pull/2836
    • fix: examples directory structure by @ssheng in https://github.com/bentoml/BentoML/pull/2839
    • revert: "fix: address multiple tags for containerize (#2797)" by @ssheng in https://github.com/bentoml/BentoML/pull/2840

    New Contributors

    • @robsonpeixoto made their first contribution in https://github.com/bentoml/BentoML/pull/2752
    • @sugatoray made their first contribution in https://github.com/bentoml/BentoML/pull/2758
    • @g0nz4rth made their first contribution in https://github.com/bentoml/BentoML/pull/2835
    • @alizia made their first contribution in https://github.com/bentoml/BentoML/pull/2836

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0...v1.0.1

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.2-py3-none-any.whl(756.17 KB)
    bentoml-1.0.2.tar.gz(677.08 KB)
  • v1.0.0(Jul 13, 2022)

    ๐Ÿฑย The wait is over. BentoML has officially releasedย v1.0.0. We are excited to share with you the notable features improvements.

    • Introduced BentoML Runner, an abstraction for parallel model inference. It allows the compute-intensive model inference step to scale separately from the transformation and business logic. The Runner is easily instantiated and invoked, but behind the scenes, BentoML is optimizing for micro-batching and fanning out inference if needed. A simple example of instantiating a Runner follows this list. Learn more about using runners.
    • Redesigned how models are saved, moved, and loaded with BentoML. We introduced new primitives which allow users to call a save_model() method which saves the model in the most optimal way based on the recommended practices of the ML framework. The model is then stored in a flexible local repository where users can use "import" and "export" functionality to push and pull "finalized" models from remote locations like S3. Bentos can be built locally or remotely with these models. Once built, Yatai or bentoctl can easily deploy to the cloud service of your choice. Learn more about preparing models and building bentos.
    • Enhanced micro-batching capability: with the new runner abstraction, batching is even more powerful. When incoming data is spread across different transformation processes, the runner will fan in the inference requests when inference is invoked, batching multiple inputs into a single inference call. Most ML frameworks implement some form of vectorization which improves performance for multiple inputs at once. Our adaptive batching not only batches inputs as they are received, but also regresses the timing of the last several groups of inputs in order to optimize the batch size and latency windows.
    • Improved reproducibility of the model by recording and locking the dependent library versions. We use the versions to package the correct dependencies so that the environment in which the model runs in production is identical to the environment it was trained in. All direct and transitive dependencies are recorded and deployed with the model when running in production. In our 1.0 version we now support Conda as well as several different ways to customize your pip packages when "building your Bento". Learn more about building bentos.
    • Simplified Docker image creation during containerization to generate the right image for you depending on the features that you've decided to implement in your service. For example, if your runner specifies that it can run on a GPU, we will automatically choose the right Nvidia docker image as a base when containerizing your service. If needed, we also provide the flexibility to customize your docker image as well. Learn more about containerization.
    • Improved input and output validation with native type validation rules. Numpy and Pandas DataFrame can specify a static shape or even dynamically infer schema by providing sample data. The Pydantic schema that is produced per endpoint also integrates with our Swagger UI so that each endpoint is better documented for sharing. Learn more about service APIs and IO Descriptors.
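
    Below is a minimal sketch of instantiating and calling a Runner, mirroring the scikit-learn example from the 1.0.0-rc1 notes further down; the model tag iris_clf is assumed to already exist in the local model store:

    import numpy as np
    import bentoml
    from bentoml.io import NumpyNdarray

    # Wrap a saved model in a Runner; BentoML schedules it in dedicated worker processes
    iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()

    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])

    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def classify(input_series: np.ndarray) -> np.ndarray:
        # run() fans the call out to the runner, with adaptive micro-batching
        return iris_clf_runner.predict.run(input_series)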

    โš ๏ธย BentoML v1.0.0 is backward incompatible with v0.13.1. If you wish to stay on the v0.13.1 LTS version, please lock the dependency with bentoml==0.13.1. We have also prepared a migration guide from v0.13.1 to v1.0.0 to help with your project migration. We are committed to supporting the v0.13-LTS versions with critical bug fixes and security patches.

    🎉 After years of seeing hundreds of model serving use cases, we are proud to present the official release of BentoML 1.0. We could not have done it without the growth and support of our community.

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.0-py3-none-any.whl(755.87 KB)
    bentoml-1.0.0.tar.gz(678.30 KB)
  • v1.0.0-rc3(Jul 1, 2022)

    We have just released BentoML 1.0.0rc3 with a number of highly anticipated features and improvements. Check it out with the following command!

    $ pip install -U bentoml --pre
    

    โš ๏ธย BentoML will release the official 1.0.0 version next week and remove the need to use --pre tag to install BentoML versions after 1.0.0. If you wish to stay on the 0.13.1 LTS version, please lock the dependency with bentoml==0.13.1.

    • Added support for framework runners in the following ML frameworks: CatBoost, FastAI, and ONNX (see What's Changed below).
    • Added support for Huggingface Transformers custom pipelines.
    • Fixed a logging issue causing the api_server and runners to not generate error logs.
    • Optimized Tensorflow inference procedure.
    • Improved resource request configuration for runners.
      • Resource request can now be configured in the BentoML configuration. If unspecified, runners will be scheduled to best utilize the available system resources.

        runners:
          resources:
            cpu: 8.0
            nvidia.com/gpu: 4.0
        
      • Updated the API for custom runners to declare the types of supported resources.

        import bentoml
        
        class MyRunnable(bentoml.Runnable):
            SUPPORTS_CPU_MULTI_THREADING = True  # Deprecated SUPPORT_CPU_MULTI_THREADING
            SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")  # Deprecated SUPPORT_NVIDIA_GPU
            ...
        
        my_runner = bentoml.Runner(
            MyRunnable,
            runnable_init_params={"foo": foo, "bar": bar},
            name="custom_runner_name",
            ...
        )
        
      • Deprecated the API for specifying resources from the framework to_runner() and custom Runner APIs. For better flexibility at runtime, it is recommended to specify resources through configuration.

    What's Changed

    • fix(dependencies): require pyyaml>=5 by @sauyon in https://github.com/bentoml/BentoML/pull/2626
    • refactor(server): merge contexts; add yatai headers by @bojiang in https://github.com/bentoml/BentoML/pull/2621
    • chore(pylint): update pylint configuration by @sauyon in https://github.com/bentoml/BentoML/pull/2627
    • fix: Transformers NVIDIA_VISIBLE_DEVICES value type casting by @ssheng in https://github.com/bentoml/BentoML/pull/2624
    • fix: Server silently crash without logging exceptions by @ssheng in https://github.com/bentoml/BentoML/pull/2635
    • fix(framework): some GPU related fixes by @larme in https://github.com/bentoml/BentoML/pull/2637
    • tests: minor e2e test cleanup by @sauyon in https://github.com/bentoml/BentoML/pull/2643
    • docs: Add model in bentoml.pytorch.save_model() pytorch integration example by @AlexandreNap in https://github.com/bentoml/BentoML/pull/2644
    • chore(ci): always enable actions on PR by @sauyon in https://github.com/bentoml/BentoML/pull/2646
    • chore: updates ci by @aarnphm in https://github.com/bentoml/BentoML/pull/2650
    • fix(docker): templates bash heredoc should pass -ex by @aarnphm in https://github.com/bentoml/BentoML/pull/2651
    • feat: CatBoost integration by @yetone in https://github.com/bentoml/BentoML/pull/2615
    • feat: FastAI by @aarnphm in https://github.com/bentoml/BentoML/pull/2571
    • feat: Support Transformers custom pipeline by @ssheng in https://github.com/bentoml/BentoML/pull/2640
    • feat(framework): onnx support by @larme in https://github.com/bentoml/BentoML/pull/2629
    • chore(tensorflow): optimize inference procedure by @bojiang in https://github.com/bentoml/BentoML/pull/2567
    • fix(runner): validate runner names by @sauyon in https://github.com/bentoml/BentoML/pull/2588
    • fix(runner): lowercase runner names and add tests by @sauyon in https://github.com/bentoml/BentoML/pull/2656
    • style: github naming by @aarnphm in https://github.com/bentoml/BentoML/pull/2659
    • tests(framework): add new framework tests by @sauyon in https://github.com/bentoml/BentoML/pull/2660
    • docs: missing code annotation by @jjmachan in https://github.com/bentoml/BentoML/pull/2654
    • perf(templates): cache python installation via conda by @aarnphm in https://github.com/bentoml/BentoML/pull/2662
    • fix(ci): destroy the runner after init_local by @bojiang in https://github.com/bentoml/BentoML/pull/2665
    • fix(conda): python installation order by @aarnphm in https://github.com/bentoml/BentoML/pull/2668
    • fix(tensorflow): casting error on kwargs by @bojiang in https://github.com/bentoml/BentoML/pull/2664
    • feat(runner): implement resource configuration by @sauyon in https://github.com/bentoml/BentoML/pull/2632

    New Contributors

    • @AlexandreNap made their first contribution in https://github.com/bentoml/BentoML/pull/2644

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc2...v1.0.0-rc3

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.0rc3-py3-none-any.whl(749.64 KB)
    bentoml-1.0.0rc3.tar.gz(665.56 KB)
  • v1.0.0-rc2(Jun 22, 2022)

    We have just released BentoML 1.0.0rc2 with an exciting lineup of improvements. Check it out with the following command!

    $ pip install -U bentoml --pre
    
    • Standardized logging configuration and improved logging performance.

      • If imported as a library, BentoML will no longer configure logging explicitly and will respect the logging configuration of the importing Python process. To customize BentoML logging as a library, configurations can be added for the bentoml logger.
      
      formatters:
        ...
      handlers:
        ...
      loggers:
        ...
        bentoml:
          handlers: [...]
          level: INFO
          ...
      
      • If started as a server, BentoML will continue to configure logging format and output to stdout at INFO level. All third party libraries will be configured to log at the WARNING level.
    • Added LightGBM framework support.

    • Updated model and bento creation timestamps CLI display to use the local timezone for a better user experience, while timestamps in metadata will remain in the UTC timezone.

    • Improved the reliability of bento build with advanced options including base_image and dockerfile_template.

    Besides all the exciting product work, we also started a blog at modelserving.com sharing our learnings gained from building BentoML and supporting the MLOps community. Check out our latest blog, Breaking up with Flask & FastAPI: Why ML model serving requires a specialized framework, and share your thoughts with us on our LinkedIn post.

    Lastly, a big shoutout to @Mike Kuhlen for adding the LightGBM framework support. 🥂

    What's Changed

    • feat(cli): output times in the local timezone by @sauyon in https://github.com/bentoml/BentoML/pull/2572
    • fix(store): use >= for time checking by @sauyon in https://github.com/bentoml/BentoML/pull/2574
    • fix(build): use subprocess to call pip-compile by @sauyon in https://github.com/bentoml/BentoML/pull/2573
    • docs: fix wrong variable name in comment by @kim-sardine in https://github.com/bentoml/BentoML/pull/2575
    • feat: improve logging by @sauyon in https://github.com/bentoml/BentoML/pull/2568
    • fix(service): JsonIO doesn't return a pydantic model by @bojiang in https://github.com/bentoml/BentoML/pull/2578
    • fix: update conda env yaml file name and default channel by @parano in https://github.com/bentoml/BentoML/pull/2580
    • chore(runner): add shcedule shortcuts to runners by @bojiang in https://github.com/bentoml/BentoML/pull/2576
    • fix(cli): cli encoding error on Windows by @bojiang in https://github.com/bentoml/BentoML/pull/2579
    • fix(bug): Make model.with_options() additive by @ssheng in https://github.com/bentoml/BentoML/pull/2519
    • feat: dockerfile templates advanced guides by @aarnphm in https://github.com/bentoml/BentoML/pull/2548
    • docs: add setuptools to docs dependencies by @parano in https://github.com/bentoml/BentoML/pull/2586
    • test(frameworks): minor test improvements by @sauyon in https://github.com/bentoml/BentoML/pull/2590
    • feat: Bring LightGBM back by @mqk in https://github.com/bentoml/BentoML/pull/2589
    • fix(runner): pass init params to runnable by @sauyon in https://github.com/bentoml/BentoML/pull/2587
    • fix: propagate should be false by @aarnphm in https://github.com/bentoml/BentoML/pull/2594
    • fix: Remove starlette request log by @ssheng in https://github.com/bentoml/BentoML/pull/2595
    • fix: Bug fix for 2596 by @timc in https://github.com/bentoml/BentoML/pull/2597
    • chore(frameworks): update framework template with new checks and remove old framework code by @sauyon in https://github.com/bentoml/BentoML/pull/2592
    • docs: Update streaming.rst by @ssheng in https://github.com/bentoml/BentoML/pull/2605
    • bug: Fix Yatai client push bentos with model options by @ssheng in https://github.com/bentoml/BentoML/pull/2604
    • docs: allow running tutorial from docker by @parano in https://github.com/bentoml/BentoML/pull/2611
    • fix(model): lock attrs to >=21.1.0 by @bojiang in https://github.com/bentoml/BentoML/pull/2610
    • docs: Fix documentation links and formats by @ssheng in https://github.com/bentoml/BentoML/pull/2612
    • fix(model): load ModelOptions lazily by @sauyon in https://github.com/bentoml/BentoML/pull/2608
    • feat: install.sh for python packages by @aarnphm in https://github.com/bentoml/BentoML/pull/2555
    • fix/routing path by @aarnphm in https://github.com/bentoml/BentoML/pull/2606
    • qa: build config by @aarnphm in https://github.com/bentoml/BentoML/pull/2581
    • fix: invalid build option python_version="None" when base_image is used by @parano in https://github.com/bentoml/BentoML/pull/2623

    New Contributors

    • @kim-sardine made their first contribution in https://github.com/bentoml/BentoML/pull/2575
    • @timc made their first contribution in https://github.com/bentoml/BentoML/pull/2597

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc1...v1.0.0rc2

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.0rc2-py3-none-any.whl(734.67 KB)
    bentoml-1.0.0rc2.tar.gz(662.59 KB)
  • v1.0.0-rc1(Jun 8, 2022)

    We are very excited to share that BentoML 1.0.0rc1 has just been released with a number of dev experience improvements and bug fixes.

    import numpy as np
    import bentoml
    from bentoml.io import NumpyNdarray
    
    iris_clf_runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
    
    svc = bentoml.Service("iris_classifier", runners=[iris_clf_runner])
    
    @svc.api(input=NumpyNdarray(), output=NumpyNdarray())
    def classify(input_series: np.ndarray) -> np.ndarray:
        result = iris_clf_runner.predict.run(input_series)
        return result
    
    • Introduced framework save_model, load_model, and to_runnable APIs to complement the new to_runner API in the following frameworks (a minimal sketch follows this list). Other ML frameworks are still being migrated to the new Runner API at the moment. Coming in the next release are ONNX, FastAI, MLflow, and CatBoost.
      • PyTorch (TorchScript, PyTorch Lightning)
      • Tensorflow
      • Keras
      • Scikit Learn
      • XGBoost
      • Huggingface Transformers
    • Introduced a refreshed documentation website with more content; see https://docs.bentoml.org/.
    • Enhanced the bentoml containerize command to include the following capabilities (a sample bentofile.yaml sketch follows this list).
      • Support multi-platform docker image build with Docker Buildx.
      • Support for defining Environment Variables in generated docker images.
      • Support for installing system packages via bentofile.yaml.
      • Support for customizing the generated Dockerfile via user-provided templates.
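
    A minimal sketch of the new model APIs named above, using scikit-learn; the tag iris_clf and the toy model are placeholders:

    import bentoml
    from sklearn import datasets, svm

    # Train a toy model
    iris = datasets.load_iris()
    clf = svm.SVC(gamma="scale")
    clf.fit(iris.data, iris.target)

    # save_model stores a new version of "iris_clf" in the local model store
    saved_model = bentoml.sklearn.save_model("iris_clf", clf)

    # load_model returns the native model object
    model = bentoml.sklearn.load_model("iris_clf:latest")

    # to_runnable returns a Runnable class, the building block for custom Runners
    runnable_cls = bentoml.sklearn.get("iris_clf:latest").to_runnable()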

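    And a sketch of how these containerization options might appear in a bentofile.yaml; field names follow the bento build documentation, and all values are placeholders:

    service: "service:svc"
    include:
      - "*.py"
    python:
      packages:
        - scikit-learn
    docker:
      env:
        MY_ENV_VAR: value
      system_packages:
        - libgomp1
      dockerfile_template: ./Dockerfile.template

    A multi-platform image could then be built with something like bentoml containerize iris_classifier:latest --platform linux/arm64, where the --platform flag is an assumption based on the Buildx support above.
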
    A big shout out to all the contributors for getting us a step closer to the BentoML 1.0 release. 🎉

    What's Changed

    • docs: update readme installation --pre flag by @parano in https://github.com/bentoml/BentoML/pull/2515
    • chore(ci): quit immediately for errors e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2517
    • fix(ci): cover sync endpoints; cover cors by @bojiang in https://github.com/bentoml/BentoML/pull/2520
    • docs: fix cuda_version string value by @rapidrabbit76 in https://github.com/bentoml/BentoML/pull/2523
    • fix(framework): fix tf2 and keras class variable names by @larme in https://github.com/bentoml/BentoML/pull/2525
    • chore(ci): add more edge cases; boost e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2521
    • fix(docker): remove backslash in comments by @aarnphm in https://github.com/bentoml/BentoML/pull/2527
    • fix(runner): sync remote runner uri schema with runner_app by @larme in https://github.com/bentoml/BentoML/pull/2531
    • fix: major bugs fixes about serving and GPU placement by @bojiang in https://github.com/bentoml/BentoML/pull/2535
    • chore(sdk): allowed single int value as the batch_dim by @bojiang in https://github.com/bentoml/BentoML/pull/2536
    • chore(ci): cover add_asgi_middleware in e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2537
    • chore(framework): Add api_version for current implemented frameworks by @larme in https://github.com/bentoml/BentoML/pull/2522
    • doc(server): remove unnecessary svc.asgi lines by @bojiang in https://github.com/bentoml/BentoML/pull/2543
    • chore(server): lazy load meters; cover asgi app mounting in e2e test by @bojiang in https://github.com/bentoml/BentoML/pull/2542
    • feat: push runner to yatai by @yetone in https://github.com/bentoml/BentoML/pull/2528
    • style(runner): revert b14919db(factor out batching) by @bojiang in https://github.com/bentoml/BentoML/pull/2549
    • chore(ci): skip unsupported frameworks for now by @bojiang in https://github.com/bentoml/BentoML/pull/2550
    • doc: fix github action CI badge link by @parano in https://github.com/bentoml/BentoML/pull/2554
    • doc(server): fix header div by @bojiang in https://github.com/bentoml/BentoML/pull/2557
    • fix(metrics): filter out non-API endpoints in metrics by @parano in https://github.com/bentoml/BentoML/pull/2559
    • fix: Update SwaggerUI config by @parano in https://github.com/bentoml/BentoML/pull/2560
    • fix(server): wrong status code format in metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2561
    • fix(server): metrics name issue under specify service names by @bojiang in https://github.com/bentoml/BentoML/pull/2556
    • fix: path for custom dockerfile templates by @aarnphm in https://github.com/bentoml/BentoML/pull/2547
    • feat: include env build options in bento.yaml by @parano in https://github.com/bentoml/BentoML/pull/2562
    • chore: minor fixes and docs change from QA by @parano in https://github.com/bentoml/BentoML/pull/2564
    • fix(qa): allow cuda_version when distro is None with default by @aarnphm in https://github.com/bentoml/BentoML/pull/2565
    • fix(qa): bento runner resource should limit to user provided configs by @parano in https://github.com/bentoml/BentoML/pull/2566

    New Contributors

    • @rapidrabbit76 made their first contribution in https://github.com/bentoml/BentoML/pull/2523

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-rc0...v1.0.0-rc1

    Source code(tar.gz)
    Source code(zip)
    bentoml-1.0.0rc1-py3-none-any.whl(782.15 KB)
    bentoml-1.0.0rc1.tar.gz(691.11 KB)
  • v1.0.0-rc0(May 30, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

    Key changes

    What's Changed

    • chore(server): pass runner map through envvar by @bojiang in https://github.com/bentoml/BentoML/pull/2396
    • fix(server): init prometheus dir for standalone running by @bojiang in https://github.com/bentoml/BentoML/pull/2397
    • fix(#2316): --quiet should set logger level by @parano in https://github.com/bentoml/BentoML/pull/2399
    • feat: allow serving from project dir using import str from bentofile.yaml by @parano in https://github.com/bentoml/BentoML/pull/2398
    • chore(server): default values for entrypoints by @bojiang in https://github.com/bentoml/BentoML/pull/2401
    • fix(ci): use local bentoml in e2e test by @bojiang in https://github.com/bentoml/BentoML/pull/2403
    • docs: update advanced guide on building bentos by @splch in https://github.com/bentoml/BentoML/pull/2346
    • freeze model info and validate metadata entries by @sauyon in https://github.com/bentoml/BentoML/pull/2363
    • feat: store runners in bento manifest by @yetone in https://github.com/bentoml/BentoML/pull/2407
    • docs: fix readthedocs build issue by @parano in https://github.com/bentoml/BentoML/pull/2422
    • docs: update fossa license scan badge by @parano in https://github.com/bentoml/BentoML/pull/2420
    • fix(server): ensure distributed serving / serving on all platforms by @bojiang in https://github.com/bentoml/BentoML/pull/2414
    • Docs/core and guides by @timliubentoml in https://github.com/bentoml/BentoML/pull/2417
    • feat(internal): implement request contexts and check inference API types by @sauyon in https://github.com/bentoml/BentoML/pull/2375
    • fix: ensure compatibility with attrs 20.1.0 by @sauyon in https://github.com/bentoml/BentoML/pull/2423
    • chore(server): resource utils by @bojiang in https://github.com/bentoml/BentoML/pull/2370
    • fix: consistent naming accross docker and build config by @aarnphm in https://github.com/bentoml/BentoML/pull/2426
    • refactor: runner/runnable interface by @bojiang in https://github.com/bentoml/BentoML/pull/2432
    • feat(internal): add signature to Model and remove bentoml_version by @sauyon in https://github.com/bentoml/BentoML/pull/2433
    • runner refactor: Model to_runner/to_runnable interface by @parano in https://github.com/bentoml/BentoML/pull/2435
    • runnablehandle proposal by @sauyon in https://github.com/bentoml/BentoML/pull/2438
    • Runnable refactors and Model info update by @sauyon in https://github.com/bentoml/BentoML/pull/2439
    • Runner Resources implementation by @bojiang in https://github.com/bentoml/BentoML/pull/2436
    • refactor(runner): clean runner handle by @bojiang in https://github.com/bentoml/BentoML/pull/2441
    • chore(runner): make runnable scheduling traits constant by @bojiang in https://github.com/bentoml/BentoML/pull/2442
    • fix(runner): async run by @bojiang in https://github.com/bentoml/BentoML/pull/2443
    • added details for each paramters in options by @timliubentoml in https://github.com/bentoml/BentoML/pull/2429
    • Runners refactor: service & bento build changes by @parano in https://github.com/bentoml/BentoML/pull/2440
    • refactor: runner app by @bojiang in https://github.com/bentoml/BentoML/pull/2445
    • fix(internal): remove unused response_code field by @sauyon in https://github.com/bentoml/BentoML/pull/2444
    • Fix ModelInfo cattrs serialization issue by @parano in https://github.com/bentoml/BentoML/pull/2446
    • feat(internal): File I/O descriptor (re-)implementation by @sauyon in https://github.com/bentoml/BentoML/pull/2272
    • docs: Update Development.md by @kakokat in https://github.com/bentoml/BentoML/pull/2424
    • docs: Update DEVELOPMENT.md by @parano in https://github.com/bentoml/BentoML/pull/2452
    • refactor: datacontainer api changes with ndarray draft by @larme in https://github.com/bentoml/BentoML/pull/2449
    • feat(server): implement runner app by @sauyon in https://github.com/bentoml/BentoML/pull/2451
    • chore(runner): use low level nvml API by @bojiang in https://github.com/bentoml/BentoML/pull/2450
    • fix(server): fix container in runner app IPC by @sauyon in https://github.com/bentoml/BentoML/pull/2454
    • feat(runner): scheduling strategy by @bojiang in https://github.com/bentoml/BentoML/pull/2453
    • Fix: attribute error runner_type in bento serve by @parano in https://github.com/bentoml/BentoML/pull/2457
    • refactor(runner): update Pandas and Default DataContainer by @larme in https://github.com/bentoml/BentoML/pull/2455
    • chore(yatai): add version and org_uid to tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/2458
    • chore(internal): fix typing by @sauyon in https://github.com/bentoml/BentoML/pull/2460
    • tests: fix runner1.0 branch unit tests by @parano in https://github.com/bentoml/BentoML/pull/2462
    • docs(model): update ModelSignature documentation by @sauyon in https://github.com/bentoml/BentoML/pull/2463
    • feat(xgboost): 1.0 XGBoost implementation by @sauyon in https://github.com/bentoml/BentoML/pull/2459
    • feat(frameworks): update framework template by @sauyon in https://github.com/bentoml/BentoML/pull/2461
    • fix(framework): fix Runnable closing over loop variable bug by @larme in https://github.com/bentoml/BentoML/pull/2466
    • chore: fix types by @sauyon in https://github.com/bentoml/BentoML/pull/2468
    • chore: make ModelInfo yaml backwards compatible by @parano in https://github.com/bentoml/BentoML/pull/2470
    • fix(runner): fix bugs in runner batching by @sauyon in https://github.com/bentoml/BentoML/pull/2469
    • docs: re-organize docs for 1.0rc release by @parano in https://github.com/bentoml/BentoML/pull/2474
    • chore: add furo to docs-requirements.txt by @aarnphm in https://github.com/bentoml/BentoML/pull/2475
    • feat(ci): re-enable e2e tests by @bojiang in https://github.com/bentoml/BentoML/pull/2456
    • chore: add runners-1.0 to CI by @aarnphm in https://github.com/bentoml/BentoML/pull/2431
    • fix(runner): remove unnecessary runnable_self arugment by @larme in https://github.com/bentoml/BentoML/pull/2482
    • docs: update for xgboost doc by @kakokat in https://github.com/bentoml/BentoML/pull/2481
    • test(runner): update DataContainer tests by @larme in https://github.com/bentoml/BentoML/pull/2476
    • feat: buildx backend for bentoml containerize by @aarnphm in https://github.com/bentoml/BentoML/pull/2483
    • refactor(runner): simplify batch dim by @bojiang in https://github.com/bentoml/BentoML/pull/2484
    • fix(runner): removing inspect by @bojiang in https://github.com/bentoml/BentoML/pull/2485
    • fix(server): fix development_mode in the config by @bojiang in https://github.com/bentoml/BentoML/pull/2488
    • fix(server): fix containerize subcommand by @bojiang in https://github.com/bentoml/BentoML/pull/2490
    • fix(tests): update model unit tests for new batch_dim type by @sauyon in https://github.com/bentoml/BentoML/pull/2487
    • refactor(server): supervise dev server with circus by @bojiang in https://github.com/bentoml/BentoML/pull/2489
    • fix(server): correctly use starlette APIs by @sauyon in https://github.com/bentoml/BentoML/pull/2486
    • fix(internal): revert typing strictness changes by @sauyon in https://github.com/bentoml/BentoML/pull/2494
    • feat: Transformers framework runner implementation 1.0 by @ssheng in https://github.com/bentoml/BentoML/pull/2479
    • Runners 1.0 tensorflow_v2 impl by @larme in https://github.com/bentoml/BentoML/pull/2430
    • Testing framework and runner app update by @sauyon in https://github.com/bentoml/BentoML/pull/2500
    • refactor(framework): update keras to runners-1.0 branch by @larme in https://github.com/bentoml/BentoML/pull/2498
    • fix: swagger UI bundle update by @parano in https://github.com/bentoml/BentoML/pull/2501
    • refactor: Dockerfile generation by @aarnphm in https://github.com/bentoml/BentoML/pull/2473
    • feat(internal): add save_format_version for BentoML model by @larme in https://github.com/bentoml/BentoML/pull/2502
    • Revert "feat(internal): add save_format_version for BentoML model" by @larme in https://github.com/bentoml/BentoML/pull/2504
    • docs: Update documentation for 1.0 by @parano in https://github.com/bentoml/BentoML/pull/2506
    • fix(framework): adapt changes for Tensorflow DataContainer by @larme in https://github.com/bentoml/BentoML/pull/2507
    • feat(framework): pytorch by @bojiang in https://github.com/bentoml/BentoML/pull/2499
    • docs: misc docs updates by @parano in https://github.com/bentoml/BentoML/pull/2511
    • refactor(framework): move _mapping for tf2 and keras by @larme in https://github.com/bentoml/BentoML/pull/2510
    • chore: unify circus logs to bentoml + fix circus config parsing for api_server by @aarnphm in https://github.com/bentoml/BentoML/pull/2509
    • chore: add release candidate backwards compatibility warnings by @parano in https://github.com/bentoml/BentoML/pull/2512
    • fix: revert pining pip version for tests by @bojiang in https://github.com/bentoml/BentoML/pull/2514
    • Merge 1.0 development branch by @parano in https://github.com/bentoml/BentoML/pull/2513

    New Contributors

    • @splch made their first contribution in https://github.com/bentoml/BentoML/pull/2346
    • @kakokat made their first contribution in https://github.com/bentoml/BentoML/pull/2424

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-a7...v1.0.0-rc0

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a7(Apr 6, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

    Key changes

    • BREAKING CHANGE: Default serving port has been changed to 3000
      • This is due to an issue with newer macOS versions, where port 5000 is always in use.
      • This will affect the default serving port when deploying with Docker. Existing 1.0 preview release users will need to either change their deployment config to use port 3000, or pass --port 5000 to the container command to keep the previous default port.
    • New import/export API
      • Users can now export models and bentos from local store to a standalone file
      • Learn more via bentoml export --help and bentoml models export --help; a short sketch follows this list.
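
    A sketch of the new CLI, with placeholder file names (see bentoml export --help for exact usage):

    # Export a model or a bento from the local store to a standalone file
    $ bentoml models export iris_clf:latest ./iris_clf.bentomodel
    $ bentoml export iris_classifier:latest ./iris_classifier.bento

    # Import them back, e.g. on another machine
    $ bentoml models import ./iris_clf.bentomodel
    $ bentoml import ./iris_classifier.bento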

    What's Changed

    • docs(cli): clean up cli docstrings by @larme in https://github.com/bentoml/BentoML/pull/2342
    • fix: YataiClientContext initialization missing email argument by @yetone in https://github.com/bentoml/BentoML/pull/2348
    • chore(ci): run e2e tests in docker by @bojiang in https://github.com/bentoml/BentoML/pull/2349
    • style: minor typing fixes by @bojiang in https://github.com/bentoml/BentoML/pull/2350
    • Refactor model save to include labels, metadata and custom_objects by @larme in https://github.com/bentoml/BentoML/pull/2351
    • fix: better error message in python < 3.9 by @larme in https://github.com/bentoml/BentoML/pull/2352
    • refactor(internal): move Tag out of types by @sauyon in https://github.com/bentoml/BentoML/pull/2358
    • fix(frameworks): use bentoml.models.create instead of Model.create by @sauyon in https://github.com/bentoml/BentoML/pull/2360
    • fix: add change_global_cwd params to bentoml.load by @parano in https://github.com/bentoml/BentoML/pull/2356
    • fix: import model from S3 by @almirb in https://github.com/bentoml/BentoML/pull/2361
    • fix: extract correct desired Python version by @matheusMoreno in https://github.com/bentoml/BentoML/pull/2362
    • fix(service): fix load_bento arguments position when retrying after import_service failed by @larme in https://github.com/bentoml/BentoML/pull/2369
    • fix: cgroups for cpu should be 1 when <= 0 by @aarnphm in https://github.com/bentoml/BentoML/pull/2372
    • chore: lock rich to be >=11.2.0 by @aarnphm in https://github.com/bentoml/BentoML/pull/2378
    • internal: usage tracking by @aarnphm in https://github.com/bentoml/BentoML/pull/2318
    • feat(internal): try to correct missing latest files by @sauyon in https://github.com/bentoml/BentoML/pull/2383
    • chore: cleanup 3.6 metadata by @aarnphm in https://github.com/bentoml/BentoML/pull/2388
    • chore: remove unecessary model_store by @aarnphm in https://github.com/bentoml/BentoML/pull/2384
    • fix: not lock typing_extensions to fix rich and pytorch lightning requirements by @aarnphm in https://github.com/bentoml/BentoML/pull/2390
    • bug: fix CLI command delete with latest tag by @parano in https://github.com/bentoml/BentoML/pull/2391
    • feat: improve list CLI command output by @parano in https://github.com/bentoml/BentoML/pull/2392
    • fix: update yatai client to work with BentoInfo changes by @parano in https://github.com/bentoml/BentoML/pull/2393
    • fix(server): duplicate metrics by @bojiang in https://github.com/bentoml/BentoML/pull/2394

    New Contributors

    • @almirb made their first contribution in https://github.com/bentoml/BentoML/pull/2361
    • @matheusMoreno made their first contribution in https://github.com/bentoml/BentoML/pull/2362

    Full Changelog: https://github.com/bentoml/BentoML/compare/v1.0.0-a6...v1.0.0-a7

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a6(Mar 7, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a5(Mar 1, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a4(Feb 15, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org/

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a3(Jan 28, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0-a2(Jan 20, 2022)

    This is a preview release for BentoML 1.0; check out the quick start guide here: https://docs.bentoml.org/en/latest/quickstart.html and the documentation at http://docs.bentoml.org

    Source code(tar.gz)
    Source code(zip)
  • v0.13.1(Jul 13, 2021)

    Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.13.0...v0.13.1

    Overview

    BentoML 0.13.1 is a minor release containing mostly bug fixes and internal changes.

    Changelog

    • feat: SLO - API server max latency (#1583)

    • feat: Save OpenAPI Spec Json in BentoML bundle (#1686)

    • fix: BentoService loading user-provided env.yml file in runtime (#1695)

    • fix: BentoArtifact initialize with parameter issue (#1696)

    • fix: Use $BENTOML_PORT as Dockerfile default port (#1706)

    • fix: Fix missing s3_endpoint_url (#1708)

    • fix: Wrap request in sagemaker model_server (#1716)

    • refactor: Add deprecation warnings for deployment CLI commands (#1718)

    • refactor replace di framework (#1697)

    • ci: PaddlePaddle Intergration test (#1739)

    Source code(tar.gz)
    Source code(zip)
    BentoML-0.13.1-py3-none-any.whl(3.82 MB)
    BentoML-0.13.1.tar.gz(3.49 MB)
  • v0.13.0(Jun 16, 2021)

    Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.12.1...v0.13.0

    Overview

    BentoML 0.13.0 is here! It's a release packed with lots of new features and important bug fixes. We encourage all users to upgrade.

    โค๏ธ Contributors

    Thanks to @aarnphm @andrewsi-z @larme @gregd33 @bojiang @ssheng @henrywu2019 @yubozhao @jack1902 @illy @sencenan @parano @soeque1 @elia-secchi @Shumpei-Kikuta @StevenReitsma @dsherry @AnvithaGadagi @joaquincabezas for the contributions!

    📢 Breaking Changes

    • Configuration revamp

      • The bentoml config CLI command has been fully deprecated in this release
      • New config system was introduced for configuring BentoML api server, yatai, tracing and more (#1543, #1595, #1615, #1667)
      • Documentation: https://docs.bentoml.org/en/latest/guides/configuration.html
      • Add --do-not-track CLI option and environment variable (#1534)
    • Deprecated --enable-microbatch flag

      • Use the @api(batch=True|False) option to choose between a micro-batch-enabled API and a non-batch API
      • For an API defined in batch mode that needs to serve online traffic without batching behavior, use --mb-max-batch-size=1 instead (see the sketch below)
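
    For example, building on the serve-gunicorn flags shown in the 0.10.0 notes further down, a batch-mode API can be served without batching behavior via:

    $ bentoml serve-gunicorn IrisClassifier:latest --mb-max-batch-size=1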

    🎉 New Features

    • GPU Support

      • GPU serving guide https://docs.bentoml.org/en/latest/guides/gpu_serving.html
      • Added docker base image optimized for GPU serving (#1653)
    • Add support for EvalML (#1603)

    • Add support for ONNX-MLIR model (#1545)

    • Add full CORS support for bento API server (#1576)

    • Monitoring with Prometheus guide

      • https://docs.bentoml.org/en/latest/guides/monitoring.html
    • Optimize BentoML import delay (#1608)

    • Support upload/download for Yatai backed by local file system storage (#1586)

    ๐Ÿž Bug Fixes and Other Changes

    • Add ensure_ascii option in JsonOutput (#1578, #1580)

    • Fix StringInput with batch=True API (#1581)

    • Fix docs.json link in API server UI (#1633)

    • Fix uploading to remote path (#1601)

    • Fix label missing after uploading Bento to remote Yatai (#1598)

    • Fixes /metrics endpoints with serve-gunicorn (#1666)

    • Upgrade conda to 4.9.2 in default docker base image (#1525)

    • Internal:

      • Add locking mechanism to yatai server (#1567)
      • refactor: YataiService Store Abstraction (#1541)
    Source code(tar.gz)
    Source code(zip)
    BentoML-0.13.0-py3-none-any.whl(4.77 MB)
    BentoML-0.13.0.tar.gz(3.49 MB)
  • v0.12.1(Apr 15, 2021)

    Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.12.0...v0.12.1

    PaddlePaddle Support

    We are thrilled to announce that BentoML now fully supports the PaddlePaddle framework from Baidu. Users can easily serve their own models created with Paddle via Paddle Inference and serve pre-trained models from PaddleHub, which contains over 300 production-grade pre-trained models.

    Tutorial notebooks for using BentoML with PaddlePaddle:

    • Paddle Inference: https://github.com/bentoml/gallery/blob/master/paddlepaddle/LinearRegression/LinearRegression.ipynb
    • PaddleHub: https://github.com/bentoml/gallery/blob/master/paddlehub/image-segmentation/image-segmentation.ipynb

    See the announcement and release note from PaddleHub: https://github.com/PaddlePaddle/PaddleHub/releases/tag/v2.1.0

    Thank you @cqvu and @deehrlic for contributing this feature to BentoML.

    Bug fixes

    • #1532 Fix zipkin module not found exception
    • #1557 Fix aiohttp import issue on Windows
    • #1566 Fix bundle load in docker when using the requirement_txt_file @env parameter
    Source code(tar.gz)
    Source code(zip)
    BentoML-0.12.1-py3-none-any.whl(3.63 MB)
    BentoML-0.12.1.tar.gz(3.31 MB)
  • v0.12.0(Mar 23, 2021)

    Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.11.0...v0.12.0

    New Features

    • Breaking Change: Default Model Worker count is set to one #1454

      • Please use the --worker CLI argument to specify the number of workers for your deployment
      • For heavy production workloads, we recommend experimenting with different worker counts and benchmarking your BentoML API server on your target hardware to get a better understanding of the model server performance
    • Breaking Change: Micro-batching layer (Marshal Server) is now enabled by default #1498

      • For Inference APIs defined with batch=True, this will enable micro-batching behavior when serving. Users can disable it with the --disable-microbatch flag
      • For Inference APIs with batch=False, API requests are now being queued in Marshal and then forwarded to the model backend server
    • New: Use non-root user in BentoML's API server docker image

    • New: API/CLI for bulk delete of BentoML bundle in Yatai #1313

    • Easier dependency management for PyPI and conda

      • Support all pip install options via a user-provided requirements.txt file
      • Breaking Change: when requirements_txt_file option is in use, other pip package options will be ignored
      • conda_override_channels option for using explicit conda channel for conda dependencies: https://docs.bentoml.org/en/latest/concepts.html#conda-packages

    • Better support for pip install options and remote python dependencies #1421
    1. Let BentoML do it for you:
    @bentoml.env(infer_pip_packages=True)
    
    2. Use the existing "pip_packages" API to specify a list of dependencies:
    @bentoml.env(
        pip_packages=[
          'scikit-learn',
          'pandas @https://github.com/pypa/pip/archive/1.3.1.zip',
        ]
    )
    
    3. Use a requirements.txt file to specify all dependencies:
    @bentoml.env(requirements_txt_file='./requirements.txt')
    

    In the ./requirements.txt file, all pip install options can be used:

    #
    # These requirements were autogenerated by pipenv
    # To regenerate from the project's Pipfile, run:
    #
    #    pipenv lock --requirements
    #
    
    -i https://pypi.org/simple
    
    scikit-learn==0.20.3
    aws-sam-cli==0.33.1
    psycopg2-binary
    azure-cli
    bentoml
    pandas @https://github.com/pypa/pip/archive/1.3.1.zip
    
    https://[username[:password]@]pypi.company.com/simple
    https://user:he%2F%[email protected]
    
    git+https://myvcs.com/some_dependency@sometag#egg=SomeDependency
    
    • API/CLI for bulk delete #1313

    CLI command for delete:

    # Delete all saved Bento with specific name
    bentoml delete --name IrisClassifier
    bentoml delete --name IrisClassifier -y # do it without confirming with user
    bentoml delete --name IrisClassifier --yatai-url=yatai.mycompany.com # delete in remote Yatai
    
    # Delete all saved Bento with specific labels
    bentoml delete --labels "env=dev"
    bentoml delete --labels "env=dev, user=foobar"
    bentoml delete --labels "key1=value1, key2!=value2, key3 In (value3, value3a), key4 DoesNotExist"
    
    # Delete multiple saved Bento by their name:version tag
    bentoml delete --tag "IrisClassifier:v1, MyService:v3, FooBar:20200103_Lkj81a"
    
    # Delete all
    bentoml delete --all
    

    Yatai Client Python API:

    from bentoml.yatai.client import get_yatai_client

    yc = get_yatai_client() # local Yatai
    yc = get_yatai_client('remote.yatai.com:50051') # remote Yatai

    yc.repository.delete(prune, labels, bento_tag, bento_name, bento_version, require_confirm)

    """
    Params:
    prune: boolean, set True to delete all bento services
    bento_tag: string, Bento tag in name:version format
    labels: string, label selector to filter bento services to delete
    bento_name: string, Bento name
    bento_version: string, Bento version
    require_confirm: boolean, require the user to confirm interactively in the CLI
    """
    
    • #1334 Customize route of an API endpoint
    @env(infer_pip_packages=True)
    @artifacts([...])
    class MyPredictionService(BentoService):
    
       @api(route="/my_url_route/foo/bar", batch=True, input=DataframeInput())
       def predict(self, df):
         # instead of "/predict", the URL for this API endpoint will be "/my_url_route/foo/bar"
         ...
    
    • #1416 Support custom authentication header in Yatai gRPC server
    • #1284 Add health check endpoint to Yatai web server
    • #1409 Fix Postgres disconnect issue with Yatai server
    Source code(tar.gz)
    Source code(zip)
    BentoML-0.12.0-py3-none-any.whl(4.61 MB)
    BentoML-0.12.0.tar.gz(3.30 MB)
  • v0.11.0(Jan 14, 2021)

    New Features

    Detailed Changelog: https://github.com/bentoml/BentoML/compare/v0.10.1...v0.11.0

    Interactively start and stop Model API Server during development

    A new API was introduced in 0.11.0 for users to start and test an API server while developing their BentoService class:

    import requests

    service = MyPredictionService()
    service.pack("model", model)

    # Start an API model server in the background
    service.start_dev_server(port=5000)

    # Send a test request to the server, or open the URL in a browser
    # (assumes `review` and `headers` are defined elsewhere)
    requests.post('http://localhost:5000/predict', data=review, headers=headers)
    
    # Stop the dev server
    service.stop_dev_server()
    
    # Modify code and repeat ♻️
    

    Here's an example notebook showcasing this new feature.

    More PyTorch ecosystem integrations

    • PyTorch JIT traced model support #1293
    • PyTorch Lightning support #1293
    • Detectron2 support #1272

    Logging is fully customizable now!

    Users can now use a single YAML file to customize the logging behavior in BentoML, including the prediction logs and feedback logs.

    https://docs.bentoml.org/en/latest/guides/logging.html

    Two new configs are also introduced for quickly turning on/off console logging and file logging:

    https://github.com/bentoml/BentoML/blob/v0.11.0/bentoml/configuration/default_bentoml.cfg#L29

    [logging]
    console_logging_enabled = true
    file_logging_enabled = true
    

    If you are not sure how this config works, here's a new guide on how BentoML's configuration works: https://docs.bentoml.org/en/latest/guides/configuration.html

    More model management APIs

    All model management CLI commands and Yatai client Python APIs now support the yatai_url parameter, making it easy to interact with a remote YataiService for centrally managing all your BentoML-packaged ML models.

    Support bundling zipimport modules #1261

    Bundling zip modules with BentoML is now possible with this newly added API:

    @bentoml.env(zipimport_archives=['nested_zipmodule.zip'])
    @bentoml.artifacts([SklearnModelArtifact('model')])
    class IrisClassifier(bentoml.BentoService):
        ...
    

    BentoML also manages the sys.path when loading a saved BentoService with zipimport archives, making sure the zip modules can be imported in user code.

    Announcements

    Monthly Community Meeting

    Thank you again to everyone who came to the first community meeting this week! If you are not yet invited to the community meeting calendar, make sure to join it here: https://github.com/bentoml/BentoML/discussions/1396

    Hiring

    BentoML team is hiring multiple Software Engineer roles to help build the future of this open-source project and the business behind it - we are looking for someone with experience in one of the following areas: ML infrastructure, backend systems, data engineering, SRE, full-stack, and technical writing. Feel free to pass along the message to anyone you know who might be interested, we'd really appreciate that!

    Source code(tar.gz)
    Source code(zip)
    BentoML-0.11.0-py3-none-any.whl(3.62 MB)
    BentoML-0.11.0.tar.gz(3.29 MB)
  • v0.10.1(Dec 10, 2020)

    Bug Fix

    This is a minor release containing one bug fix for issue #1318, where the docker build process for the BentoML API model server was broken due to an error in the init shell script. The issue has been fixed in #1319 and included in this new release.

    Our integration tests did not catch this issue because, in the development and CI/test environments, we bundle the "dirty" BentoML installation into the generated docker file, whereas the production release of BentoML uses the version installed from PyPI. The issue in #1318 was an edge case that can be triggered only when using the released version of BentoML and the published docker image. As part of our QA process, we are investigating ways to run all our integration tests against a preview release before making a final release, which should help prevent this type of bug from getting into future releases.

    Source code(tar.gz)
    Source code(zip)
    BentoML-0.10.1-py3-none-any.whl(3.59 MB)
    BentoML-0.10.1.tar.gz(3.27 MB)
  • v0.10.0(Dec 7, 2020)

    New Features & Improvements

    • Improved Model Management APIs #1126 #1241 by @yubozhao. Python APIs for model management:
    from bentoml.yatai.client import get_yatai_client
    
    bento_service.save() # Save and register the bento service locally
    
    # push to save bento service to remote yatai service.
    yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
    yc.repository.push(
        f'{bento_service.name}:{bento_service.version}',
    ) 
    
    # Pull bento service from remote yatai server and register locally
    yc = get_yatai_client('http://staging.yatai.mycompany.com:50050')
    yc.repository.pull(
        'bento_name:version',
    )
    
    # delete in local yatai
    yatai_client = get_yatai_client()
    yatai_client.repository.delete('name:version')
    
    # delete in batch by labels
    yatai_client = get_yatai_client()
    yatai_client.prune(labels='cicd=failed, framework In (sklearn, xgboost)')
    
    # Get bento service metadata
    yatai_client.repository.get('bento_name:version', yatai_url='http://staging.yatai.mycompany.com:50050')
    
    # List bento services by label
    yatai_client.repository.list(labels='label_key In (value1, value2), label_key2 Exists', yatai_url='http://staging.yatai.mycompany.com:50050')
    

    New CLI commands for model management. Push a local bento service to a remote yatai service:

    $ bentoml push bento_service_name:version --yatai-url http://staging.yatai.mycompany.com:50050
    

    Added --yatai-url option for the following CLI commands to interact with remote yatai service directly:

    bentoml get
    bentoml list
    bentoml delete
    bentoml retrieve
    bentoml run
    bentoml serve
    bentoml serve-gunicorn
    bentoml info
    bentoml containerize
    bentoml open-api-spec
    
    • Model Metadata API #1179, shoutout to @jackyzha0 for designing and building this feature! Ability to save additional metadata for any artifact type, e.g.:
        model_metadata = {
            'k1': 'v1',
            'job_id': 'ABC',
            'score': 0.84,
            'datasets': ['A', 'B'],
        }
        svc.pack("model", test_model, metadata=model_metadata)
    
        svc.save_to_dir(str(tmpdir))
        loaded_service = bentoml.load(str(tmpdir))
        print(loaded_service.artifacts.get('model').metadata)
    
    • Improved Tensorflow Support, by @bojiang

      • Make the packed model behave the same as after the model was saved and loaded again #1231
      • TfTensorOutput raise TypeError when micro-batch enabled #1251
      • Opt auto casting of TfSavedModelArtifact & clearer feedback
      • Improve KerasModelArtifact to work with tf2 #1295
    • Automated AWS EC2 deployment #1160 massive 3800+ line PR by @mayurnewase

      • Create an auto-scaling endpoint on AWS EC2 with just one command; see the documentation here: https://docs.bentoml.org/en/latest/deployment/aws_ec2.html
    • Add MXNet Gluon support #1264 by @liusy182

    • Enable input & output data capture in Sagemaker deployment #1189 by @j-hartshorn

    • Faster docker image rebuild when only model artifacts are updated #1199

    • Support URL location prefix in yatai-service gRPC/Web server #1063 #1184

    • Support relative path for showing Swagger UI page in the model server #1207

    • Add onnxruntime gpu as supported backend #1213

    • Add option to disable swagger UI #1244 by @liusy182

    • Add label and artifact metadata display to yatai web ui #1249

    • Make bentoml module executable #1274

    python -m bentoml <subcommand>
    
    • Allow setting micro batching parameters from CLI #1282 by @jsemric
    bentoml serve-gunicorn --enable-microbatch --mb-max-latency 3333 --mb-max-batch-size 3333 IrisClassifier:20201202154246_C8DC0A                                                                                                                                   
    

    Bug fixes

    • Allow deleting bento that was previously deleted with the same name and version #1211
    • Construct docker API client from env #1233
    • Pin-down SqlAlchemy version #1238
    • Avoid potential TypeError in batching server #1252
    • Fix inference API docstring override by default #1302

    Documentation

    • Add examples of queries with requests for adapters #1202
    • Update import paths to reflect fastai2->fastai rename #1227
    • Add model artifact metadata information to the core concept page #1259
    • Update adapters.rst to include new input adapters #1269
    • Update quickstart guide #1262
    • Docs for gluon support #1271
    • Fix CURL commands for posting files in input adapters doc string #1307

    Internal, CI, and Tests

    • Fix installing bundled pip dependencies in Azure and Sagemaker deployments #1214 (affects bentoml developers only)
    • Add Integration test for Fasttext #1221
    • Add integration test for spaCy #1236
    • Add integration test for models using tf native API #1245
    • Add tests for run_api_server_docker_container microbatch #1247
    • Add integration test for LightGBM #1243
    • Update Yatai web ui node dependencies version #1256
    • Add integration test for bento management #1263
    • Add yatai server integration tests to Github CI #1265
    • Update e2e yatai service tests #1266
    • Include additional information for EC2 test #1270
    • Refactor CI for TensorFlow2 #1277
    • Make tensorflow integration tests run faster #1278
    • Fix overrided protobuf version in CI #1286
    • Add integration test for tf1 #1285
    • Refactor yatai service integration test #1290
    • Refactor Saved Bundle Loader #1291
    • Fix flaky yatai service integration tests #1298
    • Refine KerasModelArtifact & its integration test #1295
    • Improve API server integration tests #1299
    • Add integration tests for ragged_tensor #1303

    Announcements

    • We have started using Github Projects feature to track roadmap items for BentoML, you can find it here: https://github.com/bentoml/BentoML/projects/1
    • We are hiring senior engineers and a lead developer advocate to join our team; let us know if you or someone you know might be interested 👉 [email protected]
    • Apologies for the long wait between the 0.9 and 0.10 releases; we are getting back to our bi-weekly release schedule now! We need help with documenting new features, writing release notes, and QA-ing new releases before they go out. Let us know if you'd be interested in helping out!

    Thank you everyone for contributing to this release! @j-hartshorn @withsmilo @yubozhao @bojiang @changhw01 @mayurnewase @telescopic @jackyzha0 @pncnmnp @kishore-ganesh @rhbian @liusy182 @awalvie @cathy-kim @jsemric 🎉🎉🎉

    Source code(tar.gz)
    Source code(zip)
    BentoML-0.10.0-py3-none-any.whl(3.59 MB)
    BentoML-0.10.0.tar.gz(3.27 MB)
  • v0.9.2(Oct 17, 2020)

  • v0.9.1(Oct 1, 2020)
