MLServer

An open source inference server to serve your machine learning models.

⚠️ This is a Work in Progress.

Overview

MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec.

You can read more about the goals of this project in the initial design document.

Usage

You can install the mlserver package by running:

pip install mlserver

Note that to use any of the optional inference runtimes, you'll need to install the relevant package. For example, to serve a scikit-learn model, you would need to install the mlserver-sklearn package:

pip install mlserver-sklearn

For further information on how to use MLServer, you can check any of the available examples.
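As a rough sketch of the overall workflow (the folder and file names below are placeholders), serving a model boils down to placing a model-settings.json file next to your serialised model artifact and starting MLServer from that folder:

./my-model/
├── model-settings.json   (tells MLServer which runtime and model to load)
└── model.joblib          (your serialised model artifact)

mlserver start ./my-model

By default, this exposes the model over REST on port 8080 and gRPC on port 8081.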

Inference Runtimes

Inference runtimes allow you to define how your model should be used within MLServer. Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common ML frameworks. This allows you to start serving models saved in these frameworks straight away.

To avoid bringing in dependencies for frameworks that you don't need, these runtimes are implemented as independent optional packages. This mechanism also allows you to roll out your own custom runtimes very easily.

To pick which runtime you want to use for your model, you just need to make sure that the right package is installed, and then point to the correct runtime class in your model-settings.json file.
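For example, a minimal model-settings.json for a scikit-learn model could look like the following (the model name and URI are placeholders):

{
  "name": "my-sklearn-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}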

The included runtimes are:

Framework      Package Name        Implementation Class              Example                 Source Code
Scikit-Learn   mlserver-sklearn    mlserver_sklearn.SKLearnModel     Scikit-Learn example    ./runtimes/sklearn
XGBoost        mlserver-xgboost    mlserver_xgboost.XGBoostModel     XGBoost example         ./runtimes/xgboost
Spark MLlib    mlserver-mllib      mlserver_mllib.MLlibModel         Coming Soon             ./runtimes/mllib
LightGBM       mlserver-lightgbm   mlserver_lightgbm.LightGBMModel   Coming Soon             ./runtimes/lightgbm

Examples

Below, you can find a few examples of how you can leverage mlserver to start serving your machine learning models.

Developer Guide

Versioning

Both the main mlserver package and the inference runtime packages follow the same versioning scheme. To bump the version across all of them, you can use the ./hack/update-version.sh script. For example:

./hack/update-version.sh 0.2.0.dev1
Comments
  • multi model serving IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'

    multi model serving IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'

    Hi, I'm having this error (full log) when serving two sklearn models on my local kind cluster, following this guide. I chatted with Alejandro on Slack about this issue and he was able to reproduce it; link to the thread: https://seldondev.slack.com/archives/C03DQFTFXMX/p1659988373014049

      File "/usr/local/bin/mlserver", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 79, in main
        root()
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 20, in wrapper
        return asyncio.run(f(*args, **kwargs))
      File "/usr/local/lib/python3.8/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "uvloop/loop.pyx", line 1501, in uvloop.loop.Loop.run_until_complete
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 44, in start
        await server.start(models_settings)
      File "/usr/local/lib/python3.8/site-packages/mlserver/server.py", line 98, in start
        await asyncio.gather(
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 272, in load
        return await self._models[model_settings.name].load(model_settings)
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 143, in load
        await self._load_model(new_model)
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 151, in _load_model
        await model.load()
      File "/usr/local/lib/python3.8/site-packages/mlserver_sklearn/sklearn.py", line 36, in load
        self._model = joblib.load(model_uri)
      File "/usr/local/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 579, in load
        with open(filename, 'rb') as f:
    IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
        self = reduction.pickle.load(from_parent)
      File "/usr/local/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
        self._semlock = _multiprocessing.SemLock._rebuild(*state)
    FileNotFoundError: [Errno 2] No such file or directory
    

    My Seldon deployment definition file:

    metadata:
      name: multi-model
      namespace: seldon
    spec:
      protocol: v2
      name: multi-model
      predictors:
      - graph:
          type: MODEL
          implementation: SKLEARN_SERVER
          modelUri: gs://<bucket_name>/MODELS/MultiModel
          name: multi
          parameters:
            - name: method
              type: STRING
              value: predict
          envSecretRefName: seldon-rclone-secret
        name: default
        replicas: 1
    

    Inside MultiModel directory:

    ├── IrisModel
    │   ├── model-settings.json
    │   └── model.joblib
    ├── RandomForestModel
    │   ├── model-settings.json
    │   └── model.joblib
    ├── multi_model.yaml
    └── settings.json
    

    The model-settings.json files:

    { "name": "RandomForestModel", "implementation": "mlserver_sklearn.SKLearnModel" }

    and

    { "name": "RandomForestModel", "implementation": "mlserver_sklearn.SKLearnModel" }

    opened by yc2984 15
  • Hugging face token-classification output causes non JSON serializable error

    Hugging face token-classification output causes non JSON serializable error

    Looks like there was a recent MR to support all types of Hugging Face inputs/outputs; however, token-classification outputs are failing for me. I was able to reproduce this with a couple of token-classification models; cmarkea/distilcamembert-base-ner, for example, threw the following error.

    Using docker image index.docker.io/seldonio/mlserver@sha256:7f7806e8ed781979bb5ef4d7774156a31046c8832d76b57403127add33064872

    2022-12-16 19:42:28,391 [mlserver] INFO - Loaded model 'huggingface-token-class' succesfully.
    2022-12-16 19:42:28,436 [mlserver] INFO - Loaded model 'huggingface-token-class' succesfully.
    2022-12-16 19:43:05,194 [mlserver] DEBUG - Payload id='e05f34eb-014f-4b76-9c00-8ec9b29ff3aa' parameters=Parameters(content_type=None, headers={'host': 'localhost:8080', 'user-agent': 'python-requests/2.28.1', 'content-length': '131', 'accept': '*/*', 'accept-encoding': 'gzip, deflate, br', 'content-type': 'application/json', 'x-forwarded-for': '127.0.0.1', 'Ce-Specversion': '0.3', 'Ce-Source': 'io.seldon.serving.deployment.mlserver.default', 'Ce-Type': 'io.seldon.serving.inference.request', 'Ce-Modelid': 'huggingface-token-class', 'Ce-Inferenceservicename': 'mlserver', 'Ce-Endpoint': 'huggingface-token-class', 'Ce-Id': 'e05f34eb-014f-4b76-9c00-8ec9b29ff3aa', 'Ce-Requestid': 'e05f34eb-014f-4b76-9c00-8ec9b29ff3aa', 'Ce-Namespace': 'default'}) inputs=[RequestInput(name='args', shape=[1], datatype='BYTES', parameters=None, data=TensorData(__root__=['My name is Clara and I live in Berkeley, California.']))] outputs=None
    2022-12-16 19:43:05,272 [mlserver] DEBUG - Prediction [[{'entity': 'I-MISC', 'score': 0.58051825, 'index': 5, 'word': '▁Clara', 'start': 10, 'end': 16}, {'entity': 'I-LOC', 'score': 0.9975036, 'index': 10, 'word': '▁Ber', 'start': 30, 'end': 34}, {'entity': 'I-LOC', 'score': 0.99758935, 'index': 11, 'word': 'ke', 'start': 34, 'end': 36}, {'entity': 'I-LOC', 'score': 0.99756134, 'index': 12, 'word': 'ley', 'start': 36, 'end': 39}, {'entity': 'I-LOC', 'score': 0.99869055, 'index': 14, 'word': '▁Cali', 'start': 40, 'end': 45}, {'entity': 'I-LOC', 'score': 0.99901426, 'index': 15, 'word': 'for', 'start': 45, 'end': 48}, {'entity': 'I-LOC', 'score': 0.9989453, 'index': 16, 'word': 'nia', 'start': 48, 'end': 51}]]
    2022-12-16 19:43:05,273 [mlserver.parallel] ERROR - An error occurred calling method 'predict' from model 'huggingface-token-class'.
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 122, in _process_request
        return_value = await method(*request.method_args, **request.method_kwargs)
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/runtime.py", line 109, in predict
        str_out = [json.dumps(pred, cls=NumpyEncoder) for pred in prediction]
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/runtime.py", line 109, in <listcomp>
        str_out = [json.dumps(pred, cls=NumpyEncoder) for pred in prediction]
      File "/opt/conda/lib/python3.8/json/__init__.py", line 234, in dumps
        return cls(
      File "/opt/conda/lib/python3.8/json/encoder.py", line 199, in encode
        chunks = self.iterencode(o, _one_shot=True)
      File "/opt/conda/lib/python3.8/json/encoder.py", line 257, in iterencode
        return _iterencode(o, 0)
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/common.py", line 135, in default
        return json.JSONEncoder.default(self, obj)
      File "/opt/conda/lib/python3.8/json/encoder.py", line 179, in default
        raise TypeError(f'Object of type {o.__class__.__name__} '
    TypeError: Object of type float32 is not JSON serializable
    INFO:     None:0 - "POST /v2/models/huggingface-token-class/infer HTTP/1.1" 500 Internal Server Error
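
    A minimal sketch of the kind of fix this error points to (not the actual mlserver_huggingface implementation): have the JSON encoder convert NumPy scalar and array types into native Python values before encoding.

    import json

    import numpy as np

    class NumpySafeEncoder(json.JSONEncoder):
        def default(self, obj):
            # Convert NumPy scalars (e.g. the np.float32 scores above) to plain Python values
            if isinstance(obj, np.generic):
                return obj.item()
            # Convert NumPy arrays to nested lists
            if isinstance(obj, np.ndarray):
                return obj.tolist()
            return super().default(obj)

    pred = [{"entity": "I-LOC", "score": np.float32(0.9975), "index": 10}]
    print(json.dumps(pred, cls=NumpySafeEncoder))  # serialises without the TypeError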
    
    opened by ajsalow 10
  • Uvicorn logging settings

    Uvicorn logging settings

    Closes #530

    Add support for uvicorn logging settings:

    • add a parameter to the application settings that points to a logging configuration file
    • pass this parameter through when initialising the uvicorn Config (sketched below)
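    A minimal sketch of the idea, assuming the logging configuration file path is read from the application settings and handed to uvicorn's existing log_config parameter (the app import string is a placeholder):

    import uvicorn

    # log_config is a standard uvicorn parameter accepting a logging config file path or dict
    config = uvicorn.Config(
        "my_app:app",                 # placeholder ASGI app import string
        host="0.0.0.0",
        port=8080,
        log_config="./logging.conf",  # value taken from the new application setting
    )
    server = uvicorn.Server(config)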
    opened by pablobgar 10
  • Seldon MLServer only supports a single input tensor

    Seldon MLServer only supports a single input tensor

    We are getting the error below when trying a KFServing V2 request with multiple tensors:

    InvalidArgument (3)
    inference.GRPCInferenceService/ModelInfer: INVALID_ARGUMENT: SKLearnModel only supports a single input tensor (4 were received)
    

    we have checked the implementations and it looks like it has been designed to work with only single tensor. https://github.com/SeldonIO/MLServer/blob/56a90a45119ed5ab3cd7c99c68a109f5426828f6/runtimes/sklearn/mlserver_sklearn/sklearn.py#L47

    Any thoughts on how to make it work with multiple tensors? Thanks.
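
    For reference, a request shaped the way the runtime currently expects it, i.e. a single V2 input tensor (the tensor name and values are just an illustration):

    {
      "inputs": [
        {
          "name": "input-0",
          "shape": [1, 4],
          "datatype": "FP32",
          "data": [5.1, 3.5, 1.4, 0.2]
        }
      ]
    }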

    opened by amnpandey 10
  • HuggingFace speech models not supported

    HuggingFace speech models not supported

    The MLServer HuggingFace runtime cannot work with speech models in batched mode. The pipeline accepts a list of arrays [(request1), (request2), (request3), (request4), (request5)], where each request is a NumPy array. However, MLServer stacks the NumPy data into an array of arrays of shape (batch_size, input_data), which results in the following error when it is sent to the HuggingFace pipeline: the pipeline treats the batched inputs as a single multi-channel input rather than as batched single-channel inputs.

    raise ValueError("We expect a single channel audio input for AutomaticSpeechRecognitionPipeline")
    ValueError: We expect a single channel audio input for AutomaticSpeechRecognitionPipeline
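    A small NumPy illustration of the mismatch described above (the array sizes are arbitrary):

    import numpy as np

    req1 = np.random.rand(16000)   # one single-channel audio request
    req2 = np.random.rand(16000)   # another single-channel audio request

    as_list = [req1, req2]              # what the ASR pipeline expects: a list of 1-D arrays
    as_array = np.stack([req1, req2])   # what stacking produces: one array of shape (2, 16000)

    # The 2-D array looks like a single multi-channel input to the pipeline,
    # which is why it raises the "single channel audio input" error above.
    print(as_array.shape)  # (2, 16000)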
    
    opened by saeid93 9
  • MLServer not working with deployments options with Ambassador, seldon-core

    MLServer not working with deployments options with Ambassador, seldon-core

    I have been trying to deploy a microservice we developed to be used with MLServer. We had previously been deploying it with seldon-core, and we are using Ambassador as well.

    The seldon_deployment.yaml file is given below:

    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: enmltranslation 
    spec:
      protocol: kfserving
      annotations:
        project_name: enmltranslation
        deployment_version: v0.1.0
        seldon.io/rest-timeout: '60000'
        seldon.io/rest-connection-timeout: '60000'
        seldon.io/grpc-read-timeout: '60000'
        seldon.io/ambassador-config: |
          ---
          apiVersion: ambassador/v1
          kind: Mapping
          name: enmltrans_mapping_v0.1.0
          prefix: /microservices/nlp/enmltrans/v0/getpredictions
          service: enmltranslation-pod.default:8000
          rewrite: /nlp/enml/predictions
    
      predictors:
        - name: default
          graph:
            name: manglish-model 
            type: MODEL
          componentSpecs:
            - spec:
                containers:
                  - name: manglish-model  
                    image: manglishdummy:v0.1.0 
                    ports:
                      - containerPort: 8080
                        name: http
                        protocol: TCP
                      - containerPort: 8081
                        name: grpc
                        protocol: TCP
    

    When accessing the URL via Ambassador, we get a 503 HTTP error indicating the service is unavailable.

    Update: (April 1, 2022)

    I was able to bring up a normal Kubernetes deployment by following a deployment.yaml similar to the one provided in the tutorial. Yet Ambassador support for MLServer does not seem to be working at the moment.

    Update (April 20, 2022)

    With the help of @adriangonz's solution, by passing no-executors in seldon-core we are now able to customize the URL with Ambassador. Yet, is it possible to do this without passing no-executors, so that we can still make use of seldon-core's graph functionality?

    opened by kurianbenoy-sentient 8
  • Problems with own logging configuration

    Problems with own logging configuration

    Currently I have the problem that my logging configuration is not applied everywhere. As soon as the REST server starts (the Uvicorn worker), my logging configuration is ignored. I have created a repo that reproduces my scenario and shows the configuration used. Maybe my configuration is just wrong. In the model itself, I print out all the loggers with their associated handlers and formatters, and I can see that everything should actually fit. Do you have any ideas?

    Here is my small example repo: https://github.com/JustinDroege/mlserver-logging

    opened by JustinDroege 7
  • How can I use python3.7 as the python version?

    How can I use python3.7 as the python version?

    Hello, I'm currently building the docker image for one of my models, but one of its dependencies requires Python 3.7. However, the latest mlserver uses Python 3.8. May I ask if it's possible to use Python 3.7 when building the docker image and, if so, what's the best way to do it? Thank you!

    opened by zhangk430 7
  • problem following Huggingface example

    problem following Huggingface example

    Using the following json:

    {
        "name": "transformer",
        "implementation": "mlserver_huggingface.HuggingFaceRuntime",
        "parallel_workers": 0,
        "parameters": {
            "extra": {
                "task": "text-generation",
                "pretrained_model": "distilgpt2",
                "optimum_model": true
            }
        }
    }
    

    from the Hugging Face example resulted in the following error:

    2022-09-04 16:07:24,722 [mlserver] INFO - Using asyncio event-loop policy: uvloop
    2022-09-04 16:07:25.588056: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
    2022-09-04 16:07:25.588085: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    Traceback (most recent call last):
      File "/home/cc/miniconda3/envs/central/bin/mlserver", line 8, in <module>
        sys.exit(main())
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 79, in main
        root()
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 20, in wrapper
        return asyncio.run(f(*args, **kwargs))
      File "/home/cc/miniconda3/envs/central/lib/python3.8/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "uvloop/loop.pyx", line 1501, in uvloop.loop.Loop.run_until_complete
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 41, in start
        settings, models_settings = await load_settings(folder)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/serve.py", line 36, in load_settings
        models_settings = await repository.list()
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/repository.py", line 32, in list
        model_settings = self._load_model_settings(model_settings_path)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/repository.py", line 45, in _load_model_settings
        model_settings = ModelSettings.parse_file(model_settings_path)
      File "pydantic/main.py", line 564, in pydantic.main.BaseModel.parse_file
      File "pydantic/main.py", line 521, in pydantic.main.BaseModel.parse_obj
      File "pydantic/env_settings.py", line 38, in pydantic.env_settings.BaseSettings.__init__
      File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
    pydantic.error_wrappers.ValidationError: 1 validation error for ModelSettings
    implementation
      ensure this value contains valid import path or valid callable: cannot import name 'deepspeed_reinit' from 'transformers.deepspeed' (/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/transformers/deepspeed.py) (type=type_error.pyobject; error_message=cannot import name 'deepspeed_reinit' from 'transformers.deepspeed' (/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/transformers/deepspeed.py))
    

    I couldn't figure out what is wrong with the pydantic validation. I copy-pasted the config from the README notebook. Installed packages:

    pip freeze | grep mlser
    mlserver==1.1.0
    mlserver-huggingface==1.1.0
    
    opened by saeid93 7
  • [FR] Return the name or ID of the model artifact when running inference

    [FR] Return the name or ID of the model artifact when running inference

    It would be useful for traceability to be able to know exactly which model a prediction came from. A use-case for this would be:

    • I have a model called my-model deployed with mlserver
    • I change the internal implementation of my-model (or re-train it or whatever) and re-deploy it
    • The client services still use the same old endpoint, but underneath my new model is running
    • If the model ID was returned, the client could trace each prediction down to a single model artifact, be it an mlflow model or something else

    Another use-case:

    • I have two different models running, main and challenger
    • The client is set up to make daily inference with both of them and compares which is better
    • The implementation of challenger changes over time
    • After some time the client wants to go back and see exactly which model performed best, and trace it all the way back to the model artifact or even back to the code and the training data itself

    Do anyone else have a similar use-case?

    Potential solution

    • We return a header with some metadata, e.g. mlserver-model-id: my-model-123, when we run inference (see the sketch after this list)
    • We make this configurable and disabled by default so that we don't risk exposing any secret information for users. Something like MLSERVER_EXPOSE_MODEL_ID_HEADER=1 I guess?
    • The model ID is inferred from the parameters.uri key in model-settings.json, or if that fails or doesn't exist, the model name
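
    If something along these lines were implemented, a client could pick the header up roughly as follows (the header and environment variable names come from the proposal above, not from an existing API):

    import requests

    # Hypothetical: assumes the server runs with MLSERVER_EXPOSE_MODEL_ID_HEADER=1
    response = requests.post(
        "http://localhost:8080/v2/models/my-model/infer",
        json={"inputs": [{"name": "input-0", "shape": [1], "datatype": "INT32", "data": [1]}]},
    )
    model_id = response.headers.get("mlserver-model-id")  # e.g. "my-model-123"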

    Let me know what you think!

    PS. I can potentially help implement it.

    opened by dingobar 7
  • Worker doesn't pickup requests when using MLServer parallel inference

    Worker doesn't pickup requests when using MLServer parallel inference

    Hi,

    I encountered a performance issue with MLServer when trying to serve an MLflow model. Here are my environment variables (I didn't use a settings.json):

    MLSERVER_MODEL_IMPLEMENTATION=mlserver_mlflow.MLflowRuntime
    MLSERVER_MODEL_PARALLEL_WORKERS=4
    MLSERVER_MODEL_URI=model
    MLSERVER_PARALLEL_WORKERS=4
    MLSERVER_MODEL_NAME=default
    

    After starting the server, I could see that multiple processes had been created:

    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0 21 04:16 ?        00:34:04 /opt/conda/envs/mlflow-env/bin/python3.9 /opt/conda/envs/mlflow-env/bin//mlserver start home
    root           5       1  0 04:16 ?        00:00:00 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.resource_tracker import main;main(14)
    root           6       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=21) --multiprocessing-fork
    root           7       1  0 04:16 ?        00:00:02 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=25) --multiprocessing-fork
    root           8       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=29) --multiprocessing-fork
    root           9       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=33) --multiprocessing-fork
    

    However, when I sent parallel requests to the server, it seemed all the requests were handled by the main mlserver process, while the others stayed idle. The requests did succeed though, so the server was functioning.

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND  PPID
          1 root      20   0 1662248 121124     52 R  97.0  0.7  31:35.43 mlserver    0
          5 root      20   0   26308   5368      4 S   0.0  0.0   0:00.04 python3.9   1
          6 root      20   0  628088 119340     12 S   0.0  0.7   0:01.97 python3.9   1
          7 root      20   0  628096 119324     12 S   0.0  0.7   0:02.04 python3.9   1
          8 root      20   0  628088 119288     12 S   0.0  0.7   0:01.90 python3.9.  1
          9 root      20   0  628088 119424     12 S   0.0  0.7   0:01.97 python3.9   1
    

    Are there any settings that I missed?
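
    For comparison, here is roughly the same configuration expressed through a settings.json / model-settings.json pair instead of environment variables (MLSERVER_* variables map to settings.json fields and MLSERVER_MODEL_* variables to model-settings.json fields; treat the exact mapping as an assumption to double-check against the docs):

    settings.json:

    {
      "parallel_workers": 4
    }

    model-settings.json:

    {
      "name": "default",
      "implementation": "mlserver_mlflow.MLflowRuntime",
      "parameters": {
        "uri": "model"
      }
    }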

    opened by deli816 7
  • Using custom build of MLServer for making model's images

    Using custom build of MLServer for making model's images

    Following up on a community Slack discussion, it would be useful to add the capability of building model images from custom MLServer builds in the mlserver CLI, e.g. by allowing a modified base image to be passed to the CLI when building the image:

    mlserver build . --base-image <modified base image>
    or
    mlserver dockerfile . --base-image <modified base image>

    opened by saeid93 0
  • Dynamic change of batch size

    Dynamic change of batch size

    In some cases we need to be able to change some of the configuration of deployed models, like the batch size, on the fly without reloading the model. I think this could be implemented by adding an endpoint that changes the model settings' values.

    opened by saeid93 0
  • support huggingface translation task suffix

    support huggingface translation task suffix

    When I run the Hugging Face translation task, I get this error: raise ValueError('The task defaults can\'t be correctly selected. You probably meant "translation_XX_to_YY"')

    So, I added a new setting item in HuggingFaceSettings to support a task name suffix (illustrated below).
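
    Illustratively, the runtime's extra settings could then look something like this (the task_suffix field name reflects this PR and should be treated as an assumption; the model is just an example):

    {
      "name": "transformer",
      "implementation": "mlserver_huggingface.HuggingFaceRuntime",
      "parameters": {
        "extra": {
          "task": "translation",
          "task_suffix": "_en_to_fr",
          "pretrained_model": "t5-small"
        }
      }
    }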

    opened by pepesi 0
  • Bump sphinx from 5.3.0 to 6.0.0

    Bump sphinx from 5.3.0 to 6.0.0

    Bumps sphinx from 5.3.0 to 6.0.0.

    Release notes

    Sourced from sphinx's releases.

    v6.0.0

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    v6.0.0b2

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    v6.0.0b1

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    Changelog

    Sourced from sphinx's changelog.

    Release 6.0.0 (released Dec 29, 2022)

    Dependencies

    • #10468: Drop Python 3.6 support
    • #10470: Drop Python 3.7, Docutils 0.14, Docutils 0.15, Docutils 0.16, and Docutils 0.17 support. Patch by Adam Turner

    Incompatible changes

    • #7405: Removed the jQuery and underscore.js JavaScript frameworks.

      These frameworks are no longer be automatically injected into themes from Sphinx 6.0. If you develop a theme or extension that uses the jQuery, $, or $u global objects, you need to update your JavaScript to modern standards, or use the mitigation below.

      The first option is to use the sphinxcontrib.jquery_ extension, which has been developed by the Sphinx team and contributors. To use this, add sphinxcontrib.jquery to the extensions list in conf.py, or call app.setup_extension("sphinxcontrib.jquery") if you develop a Sphinx theme or extension.

      The second option is to manually ensure that the frameworks are present. To re-add jQuery and underscore.js, you will need to copy jquery.js and underscore.js from the Sphinx repository_ to your static directory, and add the following to your layout.html:

      .. code-block:: html+jinja

      {%- block scripts %} {{ super() }} {%- endblock %}

      .. _sphinxcontrib.jquery: https://github.com/sphinx-contrib/jquery/

      Patch by Adam Turner.

    • #10471, #10565: Removed deprecated APIs scheduled for removal in Sphinx 6.0. See :ref:dev-deprecated-apis for details. Patch by Adam Turner.

    • #10901: C Domain: Remove support for parsing pre-v3 style type directives and roles. Also remove associated configuration variables c_allow_pre_v3 and c_warn_on_allowed_pre_v3. Patch by Adam Turner.

    Features added

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies python 
    opened by dependabot[bot] 0
  • Bump tox from 3.27.1 to 4.1.2

    Bump tox from 3.27.1 to 4.1.2

    Bumps tox from 3.27.1 to 4.1.2.

    Release notes

    Sourced from tox's releases.

    4.1.2

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.1.1...4.1.2

    4.1.1

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.1.0...4.1.1

    4.1.0

    What's Changed

    New Contributors

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.19...4.1.0

    4.0.18

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.17...4.0.18

    4.0.17

    What's Changed

    New Contributors

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.16...4.0.17

    4.0.16

    What's Changed

    ... (truncated)

    Changelog

    Sourced from tox's changelog.

    v4.1.2 (2022-12-30)

    Bugfixes - 4.1.2

    - Fix ``--skip-missing-interpreters`` behaviour - by :user:`q0w`. (:issue:`2649`)
    - Restore tox 3 behaviour of showing the output of pip freeze, however now only active when running inside a CI
      environment - by :user:`gaborbernat`. (:issue:`2685`)
    - Fix extracting extras from markers with many extras - by :user:`q0w`. (:issue:`2791`)
    

    v4.1.1 (2022-12-29)

    Bugfixes - 4.1.1

    • Fix logging error with emoji in git branch name. (:issue:2768)

    Improved Documentation - 4.1.1

    - Add faq entry about re-use of environments - by :user:`jugmac00`. (:issue:`2788`)
    

    v4.1.0 (2022-12-29)

    Features - 4.1.0

    - ``-f`` can be used multiple times and on hyphenated factors (e.g. ``-f py311-django -f py39``) - by :user:`sirosen`. (:issue:`2766`)
    

    Improved Documentation - 4.1.0

    • Fix a grammatical typo in docs/user_guide.rst. (:issue:`2787`)

    v4.0.19 (2022-12-28)

    Bugfixes - 4.0.19

    • Create temp_dir if not exists - by :user:`q0w`. (:issue:`2770`)

    v4.0.18 (2022-12-26)

    Bugfixes - 4.0.18

    • Strip leading and trailing whitespace when parsing elements in requirement files - by :user:`gaborbernat`. (:issue:`2773`)

    ... (truncated)

    Commits

    • 6253d62 release 4.1.2
    • 196b20d Fix extracting extras from markers with many extras (#2792)
    • a3d3ec0 Show installed packages after setup in CI envs (#2794)
    • d8c4cb0 Fix --skip-missing-interpreters (#2793)
    • 1d739a2 release 4.1.1
    • b49d118 Fix logging error with emoji in git branch name. (#2790)
    • c838192 Add faq entry about re-use of environments (#2789)
    • e0aed50 release 4.1.0
    • 6cdd99c Improved factor selection to allow multiple uses of -f for "OR" and to allo...
    • 6f056ca Update user_guide.rst (#2787)
    • Additional commits viewable in the compare view: https://github.com/tox-dev/tox/compare/3.27.1...4.1.2


    dependencies python 
    opened by dependabot[bot] 0
Releases(1.2.1)
  • 1.2.1(Dec 19, 2022)

  • 1.2.0(Nov 25, 2022)

    What's Changed

    Simplified Interface for Custom Runtimes

    MLServer now exposes an alternative “simplified” interface which can be used to write custom runtimes. This interface can be enabled by decorating your predict() method with the mlserver.codecs.decode_args decorator, and it lets you specify in the method signature both how you want your request payload to be decoded and how to encode the response back.

    Based on the information provided in the method signature, MLServer will automatically decode the request payload into the different inputs specified as keyword arguments. Under the hood, this is implemented through MLServer’s codecs and content types system.

    from typing import List

    import numpy as np

    from mlserver import MLModel
    from mlserver.codecs import decode_args

    class MyCustomRuntime(MLModel):

      async def load(self) -> bool:
        # TODO: Replace for custom logic to load a model artifact
        self._model = load_my_custom_model()
        self.ready = True
        return self.ready

      @decode_args
      async def predict(self, questions: List[str], context: List[str]) -> np.ndarray:
        # TODO: Replace for custom logic to run inference
        return self._model.predict(questions, context)
    

    Built-in Templates for Custom Runtimes

    To make it easier to write your own custom runtimes, MLServer now ships with an mlserver init command that will generate a templated project. This project will include a skeleton with folders, unit tests, Dockerfiles, etc. for you to fill in.


    Dynamic Loading of Custom Runtimes

    MLServer now lets you load custom runtimes dynamically into a running instance of MLServer. Once you have your custom runtime ready, all you need to do is to move it to your model folder, next to your model-settings.json configuration file.

    For example, if we assume a flat model repository where each folder represents a model, you would end up with a folder structure like the one below:

    .
    ├── models
    │   └── sum-model
    │       ├── model-settings.json
    │       ├── models.py
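
    Here, models.py contains the custom runtime class (for instance, a skeleton like the MyCustomRuntime example above), and model-settings.json points at it by module path; the class name below is illustrative:

    {
      "name": "sum-model",
      "implementation": "models.SumModel"
    }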
    

    Batch Inference Client

    This release of MLServer introduces a new mlserver infer command, which lets you run inference over a large batch of input data on the client side. Under the hood, this command will stream a large set of inference requests from the specified input file, arrange them in microbatches, orchestrate the request / response lifecycle, and finally write the obtained responses back into an output file.
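
    As a rough illustration of the shape of such a call (the flag names below are an assumption, so double-check them against mlserver infer --help):

    mlserver infer -u localhost:8080 -m sum-model -i ./requests.txt -o ./responses.txt --workers 10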

    Parallel Inference Improvements

    The 1.2.0 release of MLServer includes a number of fixes around the parallel inference pool, focused on improving the architecture to optimise memory usage and reduce latency. These changes include (but are not limited to):

    • The main MLServer process won’t load an extra replica of the model anymore. Instead, all computing will occur on the parallel inference pool.
    • The worker pool will now ensure that all requests are executed on each worker’s AsyncIO loop, thus optimising compute time vs IO time.
    • Several improvements around logging from the inference workers.

    Dropped support for Python 3.7

    MLServer has now dropped support for Python 3.7. Going forward, only 3.8, 3.9 and 3.10 will be supported (with 3.8 being used in our official set of images).

    Move to UBI Base Images

    The official set of MLServer images has now moved to use UBI 9 as a base image. This ensures support to run MLServer in OpenShift clusters, as well as a well-maintained baseline for our images.

    Support for MLflow 2.0

    In line with MLServer’s close relationship with the MLflow team, this release of MLServer introduces support for the recently released MLflow 2.0. This includes changes to the drop-in MLflow “scoring protocol” support in the MLflow runtime for MLServer, to ensure it’s aligned with MLflow 2.0.

    MLServer is also shipped as a dependency of MLflow, so you can try it out today by installing MLflow as:

    $ pip install mlflow[extras]
    

    To learn more about how to use MLServer directly from the MLflow CLI, check out the MLflow docs.

    New Contributors

    • @johnpaulett made their first contribution in https://github.com/SeldonIO/MLServer/pull/633
    • @saeid93 made their first contribution in https://github.com/SeldonIO/MLServer/pull/711
    • @RafalSkolasinski made their first contribution in https://github.com/SeldonIO/MLServer/pull/720
    • @dumaas made their first contribution in https://github.com/SeldonIO/MLServer/pull/742
    • @Salehbigdeli made their first contribution in https://github.com/SeldonIO/MLServer/pull/776
    • @regen100 made their first contribution in https://github.com/SeldonIO/MLServer/pull/839

    Full Changelog: https://github.com/SeldonIO/MLServer/compare/1.1.0...1.2.0

    Source code(tar.gz)
    Source code(zip)
  • 1.2.0.dev1(Aug 1, 2022)

Owner
Seldon
Machine Learning Deployment for Kubernetes