MLServer

An open source inference server to serve your machine learning models.

⚠️ This is a Work in Progress.

Overview

MLServer aims to provide an easy way to start serving your machine learning models through a REST and gRPC interface, fully compliant with KFServing's V2 Dataplane spec.

You can read more about the goals of this project in the initial design document.

Usage

You can install the mlserver package by running:

pip install mlserver

Note that to use any of the optional inference runtimes, you'll need to install the relevant package. For example, to serve a scikit-learn model, you would need to install the mlserver-sklearn package:

pip install mlserver-sklearn

For further information on how to use MLServer, you can check any of the available examples.
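As a rough sketch of the overall workflow (the folder and file names below are placeholders), serving a model boils down to placing a model-settings.json file next to your serialised model artifact and starting MLServer from that folder:

./my-model/
├── model-settings.json   (tells MLServer which runtime and model to load)
└── model.joblib          (your serialised model artifact)

mlserver start ./my-model

By default, this exposes the model over REST on port 8080 and gRPC on port 8081.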

Inference Runtimes

Inference runtimes allow you to define how your model should be used within MLServer. Out of the box, MLServer comes with a set of pre-packaged runtimes which let you interact with a subset of common ML frameworks. This allows you to start serving models saved in these frameworks straight away.

To avoid bringing in dependencies for frameworks that you don't need, these runtimes are implemented as independent optional packages. This mechanism also allows you to roll out your own custom runtimes very easily.

To pick which runtime you want to use for your model, you just need to make sure that the right package is installed, and then point to the correct runtime class in your model-settings.json file.
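For example, a minimal model-settings.json for a scikit-learn model could look like the following (the model name and URI are placeholders):

{
  "name": "my-sklearn-model",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "uri": "./model.joblib"
  }
}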

The included runtimes are:

Framework      Package Name        Implementation Class              Example                 Source Code
Scikit-Learn   mlserver-sklearn    mlserver_sklearn.SKLearnModel     Scikit-Learn example    ./runtimes/sklearn
XGBoost        mlserver-xgboost    mlserver_xgboost.XGBoostModel     XGBoost example         ./runtimes/xgboost
Spark MLlib    mlserver-mllib      mlserver_mllib.MLlibModel         Coming Soon             ./runtimes/mllib
LightGBM       mlserver-lightgbm   mlserver_lightgbm.LightGBMModel   Coming Soon             ./runtimes/lightgbm

Examples

Below, you can find a few examples of how you can leverage mlserver to start serving your machine learning models.

Developer Guide

Versioning

Both the main mlserver package and the inference runtime packages follow the same versioning scheme. To bump the version across all of them, you can use the ./hack/update-version.sh script. For example:

./hack/update-version.sh 0.2.0.dev1
Comments
  • multi model serving IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'

    multi model serving IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'

    Hi, I'm having this error (full log) when serving two sklearn models on my local kind cluster, following this guide. I chatted with Alejandro on Slack about this issue and he was able to reproduce it; link to the thread: https://seldondev.slack.com/archives/C03DQFTFXMX/p1659988373014049

      File "/usr/local/bin/mlserver", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 79, in main
        root()
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
        return __callback(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 20, in wrapper
        return asyncio.run(f(*args, **kwargs))
      File "/usr/local/lib/python3.8/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "uvloop/loop.pyx", line 1501, in uvloop.loop.Loop.run_until_complete
      File "/usr/local/lib/python3.8/site-packages/mlserver/cli/main.py", line 44, in start
        await server.start(models_settings)
      File "/usr/local/lib/python3.8/site-packages/mlserver/server.py", line 98, in start
        await asyncio.gather(
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 272, in load
        return await self._models[model_settings.name].load(model_settings)
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 143, in load
        await self._load_model(new_model)
      File "/usr/local/lib/python3.8/site-packages/mlserver/registry.py", line 151, in _load_model
        await model.load()
      File "/usr/local/lib/python3.8/site-packages/mlserver_sklearn/sklearn.py", line 36, in load
        self._model = joblib.load(model_uri)
      File "/usr/local/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 579, in load
        with open(filename, 'rb') as f:
    IsADirectoryError: [Errno 21] Is a directory: '/mnt/models'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "/usr/local/lib/python3.8/multiprocessing/spawn.py", line 126, in _main
        self = reduction.pickle.load(from_parent)
      File "/usr/local/lib/python3.8/multiprocessing/synchronize.py", line 110, in __setstate__
        self._semlock = _multiprocessing.SemLock._rebuild(*state)
    FileNotFoundError: [Errno 2] No such file or directory
    

    My Seldon deployment definition file:

    metadata:
      name: multi-model
      namespace: seldon
    spec:
      protocol: v2
      name: multi-model
      predictors:
      - graph:
          type: MODEL
          implementation: SKLEARN_SERVER
          modelUri: gs://<bucket_name>/MODELS/MultiModel
          name: multi
          parameters:
            - name: method
              type: STRING
              value: predict
          envSecretRefName: seldon-rclone-secret
        name: default
        replicas: 1
    

    Inside MultiModel directory:

    ├── IrisModel
    │   ├── model-settings.json
    │   └── model.joblib
    ├── RandomForestModel
    │   ├── model-settings.json
    │   └── model.joblib
    ├── multi_model.yaml
    └── settings.json
    

    The model-settings.json files:

    { "name": "RandomForestModel", "implementation": "mlserver_sklearn.SKLearnModel" }

    and

    { "name": "RandomForestModel", "implementation": "mlserver_sklearn.SKLearnModel" }

    opened by yc2984 15
  • Hugging face token-classification output causes non JSON serializable error

    Hugging face token-classification output causes non JSON serializable error

    Looks like there was a recent MR to support all types of Hugging Face inputs/outputs; however, token-classification outputs are failing for me. I was able to reproduce this with a couple of token-classification models; cmarkea/distilcamembert-base-ner, for example, threw the following error.

    Using docker image index.docker.io/seldonio/mlserver@sha256:7f7806e8ed781979bb5ef4d7774156a31046c8832d76b57403127add33064872

    2022-12-16 19:42:28,391 [mlserver] INFO - Loaded model 'huggingface-token-class' succesfully.
    2022-12-16 19:42:28,436 [mlserver] INFO - Loaded model 'huggingface-token-class' succesfully.
    2022-12-16 19:43:05,194 [mlserver] DEBUG - Payload id='e05f34eb-014f-4b76-9c00-8ec9b29ff3aa' parameters=Parameters(content_type=None, headers={'host': 'localhost:8080', 'user-agent': 'python-requests/2.28.1', 'content-length': '131', 'accept': '*/*', 'accept-encoding': 'gzip, deflate, br', 'content-type': 'application/json', 'x-forwarded-for': '127.0.0.1', 'Ce-Specversion': '0.3', 'Ce-Source': 'io.seldon.serving.deployment.mlserver.default', 'Ce-Type': 'io.seldon.serving.inference.request', 'Ce-Modelid': 'huggingface-token-class', 'Ce-Inferenceservicename': 'mlserver', 'Ce-Endpoint': 'huggingface-token-class', 'Ce-Id': 'e05f34eb-014f-4b76-9c00-8ec9b29ff3aa', 'Ce-Requestid': 'e05f34eb-014f-4b76-9c00-8ec9b29ff3aa', 'Ce-Namespace': 'default'}) inputs=[RequestInput(name='args', shape=[1], datatype='BYTES', parameters=None, data=TensorData(__root__=['My name is Clara and I live in Berkeley, California.']))] outputs=None
    2022-12-16 19:43:05,272 [mlserver] DEBUG - Prediction [[{'entity': 'I-MISC', 'score': 0.58051825, 'index': 5, 'word': '▁Clara', 'start': 10, 'end': 16}, {'entity': 'I-LOC', 'score': 0.9975036, 'index': 10, 'word': '▁Ber', 'start': 30, 'end': 34}, {'entity': 'I-LOC', 'score': 0.99758935, 'index': 11, 'word': 'ke', 'start': 34, 'end': 36}, {'entity': 'I-LOC', 'score': 0.99756134, 'index': 12, 'word': 'ley', 'start': 36, 'end': 39}, {'entity': 'I-LOC', 'score': 0.99869055, 'index': 14, 'word': '▁Cali', 'start': 40, 'end': 45}, {'entity': 'I-LOC', 'score': 0.99901426, 'index': 15, 'word': 'for', 'start': 45, 'end': 48}, {'entity': 'I-LOC', 'score': 0.9989453, 'index': 16, 'word': 'nia', 'start': 48, 'end': 51}]]
    2022-12-16 19:43:05,273 [mlserver.parallel] ERROR - An error occurred calling method 'predict' from model 'huggingface-token-class'.
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 122, in _process_request
        return_value = await method(*request.method_args, **request.method_kwargs)
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/runtime.py", line 109, in predict
        str_out = [json.dumps(pred, cls=NumpyEncoder) for pred in prediction]
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/runtime.py", line 109, in <listcomp>
        str_out = [json.dumps(pred, cls=NumpyEncoder) for pred in prediction]
      File "/opt/conda/lib/python3.8/json/__init__.py", line 234, in dumps
        return cls(
      File "/opt/conda/lib/python3.8/json/encoder.py", line 199, in encode
        chunks = self.iterencode(o, _one_shot=True)
      File "/opt/conda/lib/python3.8/json/encoder.py", line 257, in iterencode
        return _iterencode(o, 0)
      File "/opt/conda/lib/python3.8/site-packages/mlserver_huggingface/common.py", line 135, in default
        return json.JSONEncoder.default(self, obj)
      File "/opt/conda/lib/python3.8/json/encoder.py", line 179, in default
        raise TypeError(f'Object of type {o.__class__.__name__} '
    TypeError: Object of type float32 is not JSON serializable
    INFO:     None:0 - "POST /v2/models/huggingface-token-class/infer HTTP/1.1" 500 Internal Server Error
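
    A minimal sketch of the kind of fix this error points to (not the actual mlserver_huggingface implementation): have the JSON encoder convert NumPy scalar and array types into native Python values before encoding.

    import json

    import numpy as np

    class NumpySafeEncoder(json.JSONEncoder):
        def default(self, obj):
            # Convert NumPy scalars (e.g. the np.float32 scores above) to plain Python values
            if isinstance(obj, np.generic):
                return obj.item()
            # Convert NumPy arrays to nested lists
            if isinstance(obj, np.ndarray):
                return obj.tolist()
            return super().default(obj)

    pred = [{"entity": "I-LOC", "score": np.float32(0.9975), "index": 10}]
    print(json.dumps(pred, cls=NumpySafeEncoder))  # serialises without the TypeError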
    
    opened by ajsalow 10
  • Uvicorn logging settings

    Uvicorn logging settings

    Closes #530

    Add support for uvicorn logging settings:

    • add a parameter to the application settings that points to a logging configuration file
    • pass this parameter through when initialising the uvicorn Config (sketched below)
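    A minimal sketch of the idea, assuming the logging configuration file path is read from the application settings and handed to uvicorn's existing log_config parameter (the app import string is a placeholder):

    import uvicorn

    # log_config is a standard uvicorn parameter accepting a logging config file path or dict
    config = uvicorn.Config(
        "my_app:app",                 # placeholder ASGI app import string
        host="0.0.0.0",
        port=8080,
        log_config="./logging.conf",  # value taken from the new application setting
    )
    server = uvicorn.Server(config)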
    opened by pablobgar 10
  • Seldon MLServer only supports a single input tensor

    Seldon MLServer only supports a single input tensor

    We are getting the error below when trying a KFServing V2 request with multiple tensors:

    InvalidArgument (3)
    inference.GRPCInferenceService/ModelInfer: INVALID_ARGUMENT: SKLearnModel only supports a single input tensor (4 were received)
    

    we have checked the implementations and it looks like it has been designed to work with only single tensor. https://github.com/SeldonIO/MLServer/blob/56a90a45119ed5ab3cd7c99c68a109f5426828f6/runtimes/sklearn/mlserver_sklearn/sklearn.py#L47

    Any thoughts on how to make it work with multiple tensors? Thanks.
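
    For reference, a request shaped the way the runtime currently expects it, i.e. a single V2 input tensor (the tensor name and values are just an illustration):

    {
      "inputs": [
        {
          "name": "input-0",
          "shape": [1, 4],
          "datatype": "FP32",
          "data": [5.1, 3.5, 1.4, 0.2]
        }
      ]
    }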

    opened by amnpandey 10
  • HuggingFace speech models not supported

    HuggingFace speech models not supported

    The MLServer HuggingFace runtime cannot work with speech models in batched mode. The pipeline accepts a list of arrays [(request1), (request2), (request3), (request4), (request5)], where each request is a NumPy array. However, MLServer stacks the NumPy data into an array of arrays of shape (batch_size, input_data), which results in the following error when it is sent to the HuggingFace pipeline: the pipeline treats the batched inputs as a single multi-channel input rather than as batched single-channel inputs.

    raise ValueError("We expect a single channel audio input for AutomaticSpeechRecognitionPipeline")
    ValueError: We expect a single channel audio input for AutomaticSpeechRecognitionPipeline
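    A small NumPy illustration of the mismatch described above (the array sizes are arbitrary):

    import numpy as np

    req1 = np.random.rand(16000)   # one single-channel audio request
    req2 = np.random.rand(16000)   # another single-channel audio request

    as_list = [req1, req2]              # what the ASR pipeline expects: a list of 1-D arrays
    as_array = np.stack([req1, req2])   # what stacking produces: one array of shape (2, 16000)

    # The 2-D array looks like a single multi-channel input to the pipeline,
    # which is why it raises the "single channel audio input" error above.
    print(as_array.shape)  # (2, 16000)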
    
    opened by saeid93 9
  • MLServer not working with deployments options with Ambassador, seldon-core

    MLServer not working with deployments options with Ambassador, seldon-core

    I have been trying to deploy a microservice we developed to be used with MLServer. We had previously been deploying it with seldon-core, and we are using Ambassador as well.

    The seldon_deployment.yaml file is given below:

    apiVersion: machinelearning.seldon.io/v1
    kind: SeldonDeployment
    metadata:
      name: enmltranslation 
    spec:
      protocol: kfserving
      annotations:
        project_name: enmltranslation
        deployment_version: v0.1.0
        seldon.io/rest-timeout: '60000'
        seldon.io/rest-connection-timeout: '60000'
        seldon.io/grpc-read-timeout: '60000'
        seldon.io/ambassador-config: |
          ---
          apiVersion: ambassador/v1
          kind: Mapping
          name: enmltrans_mapping_v0.1.0
          prefix: /microservices/nlp/enmltrans/v0/getpredictions
          service: enmltranslation-pod.default:8000
          rewrite: /nlp/enml/predictions
    
      predictors:
        - name: default
          graph:
            name: manglish-model 
            type: MODEL
          componentSpecs:
            - spec:
                containers:
                  - name: manglish-model  
                    image: manglishdummy:v0.1.0 
                    ports:
                      - containerPort: 8080
                        name: http
                        protocol: TCP
                      - containerPort: 8081
                        name: grpc
                        protocol: TCP
    

    When accessing the URL via Ambassador, we get a 503 HTTP error indicating the service is unavailable.

    Update: (April 1, 2022)

    I was able to bring up a normal Kubernetes deployment by following a deployment.yaml similar to the one provided in the tutorial. Yet Ambassador support for MLServer does not seem to be working at the moment.

    Update (April 20, 2022)

    With the help of @adriangonz's solution, by passing no-executors in seldon-core we are now able to customize the URL with Ambassador. Yet, is it possible to do this without passing no-executors, so that we can still make use of seldon-core's graph functionality?

    opened by kurianbenoy-sentient 8
  • Problems with own logging configuration

    Problems with own logging configuration

    Currently I have the problem that my logging configuration is not applied everywhere. As soon as the REST server starts (the Uvicorn worker), my logging configuration is ignored. I have created a repo that reproduces my scenario and shows the configuration used. Maybe my configuration is just wrong. In the model itself, I print out all the loggers with their associated handlers and formatters, and I can see that everything should actually fit. Do you have any ideas?

    Here is my small example repo: https://github.com/JustinDroege/mlserver-logging

    opened by JustinDroege 7
  • How can I use python3.7 as the python version?

    How can I use python3.7 as the python version?

    Hello, I'm currently building the docker image for one of my models, but one of its dependencies requires Python 3.7. However, the latest mlserver uses Python 3.8. May I ask if it's possible to use Python 3.7 when building the docker image and, if so, what's the best way to do it? Thank you!

    opened by zhangk430 7
  • problem following Huggingface example

    problem following Huggingface example

    Using the following json:

    {
        "name": "transformer",
        "implementation": "mlserver_huggingface.HuggingFaceRuntime",
        "parallel_workers": 0,
        "parameters": {
            "extra": {
                "task": "text-generation",
                "pretrained_model": "distilgpt2",
                "optimum_model": true
            }
        }
    }
    

    from the Hugging Face example resulted in the following error:

    2022-09-04 16:07:24,722 [mlserver] INFO - Using asyncio event-loop policy: uvloop
    2022-09-04 16:07:25.588056: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
    2022-09-04 16:07:25.588085: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    Traceback (most recent call last):
      File "/home/cc/miniconda3/envs/central/bin/mlserver", line 8, in <module>
        sys.exit(main())
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 79, in main
        root()
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 20, in wrapper
        return asyncio.run(f(*args, **kwargs))
      File "/home/cc/miniconda3/envs/central/lib/python3.8/asyncio/runners.py", line 44, in run
        return loop.run_until_complete(main)
      File "uvloop/loop.pyx", line 1501, in uvloop.loop.Loop.run_until_complete
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/main.py", line 41, in start
        settings, models_settings = await load_settings(folder)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/cli/serve.py", line 36, in load_settings
        models_settings = await repository.list()
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/repository.py", line 32, in list
        model_settings = self._load_model_settings(model_settings_path)
      File "/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/mlserver/repository.py", line 45, in _load_model_settings
        model_settings = ModelSettings.parse_file(model_settings_path)
      File "pydantic/main.py", line 564, in pydantic.main.BaseModel.parse_file
      File "pydantic/main.py", line 521, in pydantic.main.BaseModel.parse_obj
      File "pydantic/env_settings.py", line 38, in pydantic.env_settings.BaseSettings.__init__
      File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
    pydantic.error_wrappers.ValidationError: 1 validation error for ModelSettings
    implementation
      ensure this value contains valid import path or valid callable: cannot import name 'deepspeed_reinit' from 'transformers.deepspeed' (/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/transformers/deepspeed.py) (type=type_error.pyobject; error_message=cannot import name 'deepspeed_reinit' from 'transformers.deepspeed' (/home/cc/miniconda3/envs/central/lib/python3.8/site-packages/transformers/deepspeed.py))
    

    I couldn't figure out what is wrong with the pydantic validation. I copy-pasted the config from the README notebook. Installed packages:

    pip freeze | grep mlser
    mlserver==1.1.0
    mlserver-huggingface==1.1.0
    
    opened by saeid93 7
  • [FR] Return the name or ID of the model artifact when running inference

    [FR] Return the name or ID of the model artifact when running inference

    It would be useful for traceability to be able to know exactly which model a prediction came from. A use-case for this would be:

    • I have a model called my-model deployed with mlserver
    • I change the internal implementation of my-model (or re-train it or whatever) and re-deploy it
    • The client services still use the same old endpoint, but underneath my new model is running
    • If the model ID was returned, the client could trace each prediction down to a single model artifact, be it an mlflow model or something else

    Another use-case:

    • I have two different models running, main and challenger
    • The client is set up to make daily inference with both of them and compares which is better
    • The implementation of challenger changes over time
    • After some time the client wants to go back and see exactly which model performed best, and trace it all the way back to the model artifact or even back to the code and the training data itself

    Do anyone else have a similar use-case?

    Potential solution

    • We return a header with some metadata, e.g. mlserver-model-id: my-model-123, when we run inference (see the sketch after this list)
    • We make this configurable and disabled by default so that we don't risk exposing any secret information for users. Something like MLSERVER_EXPOSE_MODEL_ID_HEADER=1 I guess?
    • The model ID is inferred from the parameters.uri key in model-settings.json, or if that fails or doesn't exist, the model name
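
    If something along these lines were implemented, a client could pick the header up roughly as follows (the header and environment variable names come from the proposal above, not from an existing API):

    import requests

    # Hypothetical: assumes the server runs with MLSERVER_EXPOSE_MODEL_ID_HEADER=1
    response = requests.post(
        "http://localhost:8080/v2/models/my-model/infer",
        json={"inputs": [{"name": "input-0", "shape": [1], "datatype": "INT32", "data": [1]}]},
    )
    model_id = response.headers.get("mlserver-model-id")  # e.g. "my-model-123"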

    Let me know what you think!

    PS. I can potentially help implement it.

    opened by dingobar 7
  • Worker doesn't pickup requests when using MLServer parallel inference

    Worker doesn't pickup requests when using MLServer parallel inference

    Hi,

    I encountered a performance issue with MLServer when trying to serve an MLflow model. Here are my environment variables (I didn't use a settings.json):

    MLSERVER_MODEL_IMPLEMENTATION=mlserver_mlflow.MLflowRuntime
    MLSERVER_MODEL_PARALLEL_WORKERS=4
    MLSERVER_MODEL_URI=model
    MLSERVER_PARALLEL_WORKERS=4
    MLSERVER_MODEL_NAME=default
    

    After starting the server, I could see that multiple processes had been created:

    UID          PID    PPID  C STIME TTY          TIME CMD
    root           1       0 21 04:16 ?        00:34:04 /opt/conda/envs/mlflow-env/bin/python3.9 /opt/conda/envs/mlflow-env/bin//mlserver start home
    root           5       1  0 04:16 ?        00:00:00 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.resource_tracker import main;main(14)
    root           6       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=21) --multiprocessing-fork
    root           7       1  0 04:16 ?        00:00:02 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=25) --multiprocessing-fork
    root           8       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=29) --multiprocessing-fork
    root           9       1  0 04:16 ?        00:00:01 /opt/conda/envs/mlflow-env/bin/python3.9 -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=15, pipe_handle=33) --multiprocessing-fork
    

    However, when I sent parallel requests to the server, it seemed all the requests were handled by the main mlserver process, while the others stayed idle. The requests did succeed though, so the server was functioning.

        PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND  PPID
          1 root      20   0 1662248 121124     52 R  97.0  0.7  31:35.43 mlserver    0
          5 root      20   0   26308   5368      4 S   0.0  0.0   0:00.04 python3.9   1
          6 root      20   0  628088 119340     12 S   0.0  0.7   0:01.97 python3.9   1
          7 root      20   0  628096 119324     12 S   0.0  0.7   0:02.04 python3.9   1
          8 root      20   0  628088 119288     12 S   0.0  0.7   0:01.90 python3.9.  1
          9 root      20   0  628088 119424     12 S   0.0  0.7   0:01.97 python3.9   1
    

    Are there any settings that I missed?
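
    For comparison, here is roughly the same configuration expressed through a settings.json / model-settings.json pair instead of environment variables (MLSERVER_* variables map to settings.json fields and MLSERVER_MODEL_* variables to model-settings.json fields; treat the exact mapping as an assumption to double-check against the docs):

    settings.json:

    {
      "parallel_workers": 4
    }

    model-settings.json:

    {
      "name": "default",
      "implementation": "mlserver_mlflow.MLflowRuntime",
      "parameters": {
        "uri": "model"
      }
    }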

    opened by deli816 7
  • Using custom build of MLServer for making model's images

    Using custom build of MLServer for making model's images

    Following up on a community Slack discussion, it would be useful to add the capability of building model images from custom MLServer builds in the mlserver CLI, e.g. by allowing a modified base image to be passed to the CLI when building the image:

    mlserver build . --base-image <modified base image>
    or
    mlserver dockerfile . --base-image <modified base image>

    opened by saeid93 0
  • Dynamic change of batch size

    Dynamic change of batch size

    In some cases we need to be able to change some of the configuration of deployed models, like the batch size, on the fly without reloading the model. I think this could be implemented by adding an endpoint that changes the model settings' values.

    opened by saeid93 0
  • support huggingface translation task suffix

    support huggingface translation task suffix

    When I run the Hugging Face translation task, I get this error: raise ValueError('The task defaults can\'t be correctly selected. You probably meant "translation_XX_to_YY"')

    So, I added a new setting item in HuggingFaceSettings to support a task name suffix (illustrated below).
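
    Illustratively, the runtime's extra settings could then look something like this (the task_suffix field name reflects this PR and should be treated as an assumption; the model is just an example):

    {
      "name": "transformer",
      "implementation": "mlserver_huggingface.HuggingFaceRuntime",
      "parameters": {
        "extra": {
          "task": "translation",
          "task_suffix": "_en_to_fr",
          "pretrained_model": "t5-small"
        }
      }
    }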

    opened by pepesi 0
  • Bump sphinx from 5.3.0 to 6.0.0

    Bump sphinx from 5.3.0 to 6.0.0

    Bumps sphinx from 5.3.0 to 6.0.0.

    Release notes

    Sourced from sphinx's releases.

    v6.0.0

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    v6.0.0b2

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    v6.0.0b1

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    Changelog

    Sourced from sphinx's changelog.

    Release 6.0.0 (released Dec 29, 2022)

    Dependencies

    • #10468: Drop Python 3.6 support
    • #10470: Drop Python 3.7, Docutils 0.14, Docutils 0.15, Docutils 0.16, and Docutils 0.17 support. Patch by Adam Turner

    Incompatible changes

    • #7405: Removed the jQuery and underscore.js JavaScript frameworks.

      These frameworks are no longer be automatically injected into themes from Sphinx 6.0. If you develop a theme or extension that uses the jQuery, $, or $u global objects, you need to update your JavaScript to modern standards, or use the mitigation below.

      The first option is to use the sphinxcontrib.jquery_ extension, which has been developed by the Sphinx team and contributors. To use this, add sphinxcontrib.jquery to the extensions list in conf.py, or call app.setup_extension("sphinxcontrib.jquery") if you develop a Sphinx theme or extension.

      The second option is to manually ensure that the frameworks are present. To re-add jQuery and underscore.js, you will need to copy jquery.js and underscore.js from the Sphinx repository_ to your static directory, and add the following to your layout.html:

      .. code-block:: html+jinja

      {%- block scripts %} {{ super() }} {%- endblock %}

      .. _sphinxcontrib.jquery: https://github.com/sphinx-contrib/jquery/

      Patch by Adam Turner.

    • #10471, #10565: Removed deprecated APIs scheduled for removal in Sphinx 6.0. See :ref:dev-deprecated-apis for details. Patch by Adam Turner.

    • #10901: C Domain: Remove support for parsing pre-v3 style type directives and roles. Also remove associated configuration variables c_allow_pre_v3 and c_warn_on_allowed_pre_v3. Patch by Adam Turner.

    Features added

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies python 
    opened by dependabot[bot] 0
  • Bump tox from 3.27.1 to 4.1.2

    Bump tox from 3.27.1 to 4.1.2

    Bumps tox from 3.27.1 to 4.1.2.

    Release notes

    Sourced from tox's releases.

    4.1.2

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.1.1...4.1.2

    4.1.1

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.1.0...4.1.1

    4.1.0

    What's Changed

    New Contributors

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.19...4.1.0

    4.0.18

    What's Changed

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.17...4.0.18

    4.0.17

    What's Changed

    New Contributors

    Full Changelog: https://github.com/tox-dev/tox/compare/4.0.16...4.0.17

    4.0.16

    What's Changed

    ... (truncated)

    Changelog

    Sourced from tox's changelog.

    v4.1.2 (2022-12-30)

    Bugfixes - 4.1.2

    - Fix ``--skip-missing-interpreters`` behaviour - by :user:`q0w`. (:issue:`2649`)
    - Restore tox 3 behaviour of showing the output of pip freeze, however now only active when running inside a CI
      environment - by :user:`gaborbernat`. (:issue:`2685`)
    - Fix extracting extras from markers with many extras - by :user:`q0w`. (:issue:`2791`)
    

    v4.1.1 (2022-12-29)

    Bugfixes - 4.1.1

    • Fix logging error with emoji in git branch name. (:issue:2768)

    Improved Documentation - 4.1.1

    - Add faq entry about re-use of environments - by :user:`jugmac00`. (:issue:`2788`)
    

    v4.1.0 (2022-12-29)

    Features - 4.1.0

    - ``-f`` can be used multiple times and on hyphenated factors (e.g. ``-f py311-django -f py39``) - by :user:`sirosen`. (:issue:`2766`)
    

    Improved Documentation - 4.1.0

    • Fix a grammatical typo in docs/user_guide.rst. (:issue:`2787`)

    v4.0.19 (2022-12-28)

    Bugfixes - 4.0.19

    • Create temp_dir if not exists - by :user:`q0w`. (:issue:`2770`)

    v4.0.18 (2022-12-26)

    Bugfixes - 4.0.18

    • Strip leading and trailing whitespace when parsing elements in requirement files - by :user:`gaborbernat`. (:issue:`2773`)

    ... (truncated)

    Commits

    • 6253d62 release 4.1.2
    • 196b20d Fix extracting extras from markers with many extras (#2792)
    • a3d3ec0 Show installed packages after setup in CI envs (#2794)
    • d8c4cb0 Fix --skip-missing-interpreters (#2793)
    • 1d739a2 release 4.1.1
    • b49d118 Fix logging error with emoji in git branch name. (#2790)
    • c838192 Add faq entry about re-use of environments (#2789)
    • e0aed50 release 4.1.0
    • 6cdd99c Improved factor selection to allow multiple uses of -f for "OR" and to allo...
    • 6f056ca Update user_guide.rst (#2787)
    • Additional commits viewable in the compare view: https://github.com/tox-dev/tox/compare/3.27.1...4.1.2


    dependencies python 
    opened by dependabot[bot] 0
Releases(1.2.1)
  • 1.2.1(Dec 19, 2022)

  • 1.2.0(Nov 25, 2022)

    What's Changed

    Simplified Interface for Custom Runtimes

    MLServer now exposes an alternative “simplified” interface which can be used to write custom runtimes. This interface can be enabled by decorating your predict() method with the mlserver.codecs.decode_args decorator, and it lets you specify in the method signature both how you want your request payload to be decoded and how to encode the response back.

    Based on the information provided in the method signature, MLServer will automatically decode the request payload into the different inputs specified as keyword arguments. Under the hood, this is implemented through MLServer’s codecs and content types system.

    from typing import List

    import numpy as np

    from mlserver import MLModel
    from mlserver.codecs import decode_args

    class MyCustomRuntime(MLModel):

      async def load(self) -> bool:
        # TODO: Replace for custom logic to load a model artifact
        self._model = load_my_custom_model()
        self.ready = True
        return self.ready

      @decode_args
      async def predict(self, questions: List[str], context: List[str]) -> np.ndarray:
        # TODO: Replace for custom logic to run inference
        return self._model.predict(questions, context)
    

    Built-in Templates for Custom Runtimes

    To make it easier to write your own custom runtimes, MLServer now ships with an mlserver init command that will generate a templated project. This project will include a skeleton with folders, unit tests, Dockerfiles, etc. for you to fill in.


    Dynamic Loading of Custom Runtimes

    MLServer now lets you load custom runtimes dynamically into a running instance of MLServer. Once you have your custom runtime ready, all you need to do is to move it to your model folder, next to your model-settings.json configuration file.

    For example, if we assume a flat model repository where each folder represents a model, you would end up with a folder structure like the one below:

    .
    ├── models
    │   └── sum-model
    │       ├── model-settings.json
    │       ├── models.py
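
    Here, models.py contains the custom runtime class (for instance, a skeleton like the MyCustomRuntime example above), and model-settings.json points at it by module path; the class name below is illustrative:

    {
      "name": "sum-model",
      "implementation": "models.SumModel"
    }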
    

    Batch Inference Client

    This release of MLServer introduces a new mlserver infer command, which lets you run inference over a large batch of input data on the client side. Under the hood, this command will stream a large set of inference requests from the specified input file, arrange them in microbatches, orchestrate the request / response lifecycle, and finally write the obtained responses back into an output file.
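
    As a rough illustration of the shape of such a call (the flag names below are an assumption, so double-check them against mlserver infer --help):

    mlserver infer -u localhost:8080 -m sum-model -i ./requests.txt -o ./responses.txt --workers 10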

    Parallel Inference Improvements

    The 1.2.0 release of MLServer includes a number of fixes around the parallel inference pool, focused on improving the architecture to optimise memory usage and reduce latency. These changes include (but are not limited to):

    • The main MLServer process won’t load an extra replica of the model anymore. Instead, all computing will occur on the parallel inference pool.
    • The worker pool will now ensure that all requests are executed on each worker’s AsyncIO loop, thus optimising compute time vs IO time.
    • Several improvements around logging from the inference workers.

    Dropped support for Python 3.7

    MLServer has now dropped support for Python 3.7. Going forward, only 3.8, 3.9 and 3.10 will be supported (with 3.8 being used in our official set of images).

    Move to UBI Base Images

    The official set of MLServer images has now moved to use UBI 9 as a base image. This ensures support to run MLServer in OpenShift clusters, as well as a well-maintained baseline for our images.

    Support for MLflow 2.0

    In line with MLServer’s close relationship with the MLflow team, this release of MLServer introduces support for the recently released MLflow 2.0. This includes changes to the drop-in MLflow “scoring protocol” support in the MLflow runtime for MLServer, to ensure it’s aligned with MLflow 2.0.

    MLServer is also shipped as a dependency of MLflow, so you can try it out today by installing MLflow as:

    $ pip install mlflow[extras]
    

    To learn more about how to use MLServer directly from the MLflow CLI, check out the MLflow docs.

    New Contributors

    • @johnpaulett made their first contribution in https://github.com/SeldonIO/MLServer/pull/633
    • @saeid93 made their first contribution in https://github.com/SeldonIO/MLServer/pull/711
    • @RafalSkolasinski made their first contribution in https://github.com/SeldonIO/MLServer/pull/720
    • @dumaas made their first contribution in https://github.com/SeldonIO/MLServer/pull/742
    • @Salehbigdeli made their first contribution in https://github.com/SeldonIO/MLServer/pull/776
    • @regen100 made their first contribution in https://github.com/SeldonIO/MLServer/pull/839

    Full Changelog: https://github.com/SeldonIO/MLServer/compare/1.1.0...1.2.0

    Source code(tar.gz)
    Source code(zip)
  • 1.2.0.dev1(Aug 1, 2022)

Owner
Seldon
Machine Learning Deployment for Kubernetes