Toolkit for developing and maintaining ML models

Overview

modelkit

Python framework for production ML systems.


modelkit is a minimalist yet powerful MLOps library for Python, built for people who want to deploy ML models to production.

It packs several features which make your go-to-production journey a breeze, and ensures that the same exact code will run in production, on your machine, or on data processing pipelines.

Quickstart

modelkit provides a straightforward and consistent way to wrap your prediction code in a Model class:

from modelkit import Model

class MyModel(Model):
    def _predict(self, item):
        # This is where your prediction logic goes
        ...
        return result

Be sure to check out our tutorials in the documentation.

Features

Wrapping your prediction code in modelkit instantly gives access to all of its features:

  • fast Model predictions can be batched for speed (you define the batching logic) with minimal overhead.
  • composable Models can depend on other models, and evaluate them however you need to
  • extensible Models can rely on arbitrary supporting configuration files called assets, hosted on local or cloud object stores
  • type-safe Models' inputs and outputs can be validated by pydantic. You get type annotations for your predictions and can catch errors with static type analysis tools during development.
  • async Models support async and sync prediction functions. modelkit supports calling async code from sync code so you don't have to suffer from partially async code.
  • testable Models carry their own unit test cases, and unit testing fixtures are available for pytest
  • fast to deploy Models can be served in a single CLI call using fastapi

In addition, you will find that modelkit is:

  • simple Use pip to install modelkit, it is just a Python library.
  • robust Follow software development best practices: version and test all your configurations and artifacts.
  • customizable Go beyond off-the-shelf models: custom processing, heuristics, business logic, different frameworks, etc.
  • framework agnostic Bring your own framework to the table, and use whatever code or library you want. modelkit is not opinionated about how you build or train your models.
  • organized Version and share your ML library and artifacts with others, as a Python package or as a service. Let others use and evaluate your models!
  • fast to code Just write the prediction logic and that's it. No cumbersome pre- or postprocessing logic, branching options, etc. The boilerplate code is minimal and sensible.

Installation

Install with pip:

pip install modelkit
Comments
  • NamedTuple Error with python 3.9.5

    I had this error running pytest in python 3.9.5

      File "/Users/clemat/.pyenv/versions/3.9.5/lib/python3.9/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
      File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
      File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
      File "/Users/clemat/.pyenv/versions/3.9.5/envs/modelkit/lib/python3.9/site-packages/_pytest/assertion/rewrite.py", line 170, in exec_module
        exec(co, module.__dict__)
      File "/Users/clemat/dev/modelkit/tests/assets/conftest.py", line 18, in <module>
        from modelkit.assets.manager import AssetsManager
      File "/Users/clemat/dev/modelkit/modelkit/__init__.py", line 3, in <module>
        from modelkit.core.library import ModelLibrary, load_model  # NOQA
      File "/Users/clemat/dev/modelkit/modelkit/core/__init__.py", line 1, in <module>
        from modelkit.core.library import ModelLibrary, load_model
      File "/Users/clemat/dev/modelkit/modelkit/core/library.py", line 35, in <module>
        from modelkit.core.model import Asset, AsyncModel, Model
      File "/Users/clemat/dev/modelkit/modelkit/core/model.py", line 31, in <module>
        from modelkit.utils.cache import Cache, CacheItem
      File "/Users/clemat/dev/modelkit/modelkit/utils/cache.py", line 15, in <module>
        class CacheItem(NamedTuple, Generic[ItemType]):
      File "/Users/clemat/.pyenv/versions/3.9.5/lib/python3.9/typing.py", line 1881, in _namedtuple_mro_entries
        raise TypeError("Multiple inheritance with NamedTuple is not supported")
    TypeError: Multiple inheritance with NamedTuple is not supported
    ======================================================================================= short test summary info ========================================================================================
    ERROR  - TypeError: Multiple inheritance with NamedTuple is not supported
    
    opened by CyrilLeMat 8
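
The traceback above comes from combining NamedTuple with Generic in one class, which Python 3.9 rejects. One possible workaround, sketched below, is a frozen generic dataclass. This is only an illustration and not necessarily the fix modelkit adopted; the field names are hypothetical and are not taken from modelkit's cache module.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

ItemType = TypeVar("ItemType")


@dataclass(frozen=True)
class CacheItem(Generic[ItemType]):
    # Hypothetical fields, chosen for illustration only.
    item: Optional[ItemType] = None
    cache_key: Optional[str] = None
    missing: bool = True


# The class can be parametrized like any other generic type.
entry: CacheItem[int] = CacheItem(item=3, cache_key="k", missing=False)
```

A frozen dataclass keeps the immutability of a named tuple, but it does not support tuple unpacking or indexing, so call sites that rely on those would need to change.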
  • Pickable storage driver clients

    Hi !

    This PR aims at adding a way to pickle the ModelLibrary which, for the moment, is impossible due to storage drivers (especially boto3 and gcs).

    It introduces a MODELKIT_LAZY_DRIVER environment variable. When it is set, the Storage Providers no longer store the drivers (boto3, gcs, azure); they store the configuration settings instead, so that the corresponding Storage Provider can build its driver on the fly.

    This makes it much easier to use modelkit with Python libraries that rely on pickle, such as Apache Spark and multiprocessing.

    Thanks for reviewing, as always!

    opened by antoinejeannot 7
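
The lazy-driver idea described in the PR above can be sketched in plain Python: keep only the settings, build the client on first use, and drop it when pickling. The class below is a hypothetical stand-in and does not reproduce modelkit's Storage Provider code; a threading.Lock plays the role of an unpicklable boto3 or gcs client.

```python
import pickle
import threading


class LazyClientProvider:
    """Hypothetical provider that stores settings and builds its client lazily."""

    def __init__(self, **settings):
        self.settings = settings
        self._client = None

    @property
    def client(self):
        # Build the client the first time it is needed, then reuse it.
        if self._client is None:
            self._client = self._build_client()
        return self._client

    def _build_client(self):
        # Stand-in for a real storage client: a lock cannot be pickled.
        return {"settings": self.settings, "lock": threading.Lock()}

    def __getstate__(self):
        # Only the settings are pickled; the client is rebuilt on demand.
        return {"settings": self.settings, "_client": None}


provider = LazyClientProvider(bucket="my-bucket")
_ = provider.client  # force the client to exist before pickling
restored = pickle.loads(pickle.dumps(provider))
```

After the round trip, restored holds the same settings and builds a fresh client the first time restored.client is accessed.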
  • configurations: allow models without CONFIGURATIONS dict

    Tired of having to add empty CONFIGURATIONS attributes to your models?

    Look no further: in this PR, I introduce what I think is a neat little feature for Models. Whenever no CONFIGURATIONS dict is defined, the model is still available under a name corresponding to its class name converted to snake case.

    So,

    from modelkit import Model, ModelLibrary
    
    class ModelWithoutConfig(Model):
        def _predict(self, item):
            return item
    
    library = ModelLibrary(models=ModelWithoutConfig)
    model = library.get("model_without_config")
    

    This is much better, in particular because unless the type is specified fully, mypy will complain, so the current alternative is much more verbose:

    from typing import Any, Dict

    from modelkit import Model

    class ModelWithoutConfig(Model):
        CONFIGURATIONS: Dict[str, Any] = {
            "model_without_config": {}
        }
    
        def _predict(self, item):
            return item
    
    

    What do you think?

    opened by victorbenichoux 6
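
For reference, a class name can be turned into snake case with a short regular expression. This is a minimal sketch with a hypothetical helper name; modelkit's own conversion may handle edge cases such as acronyms differently.

```python
import re


def to_snake_case(class_name: str) -> str:
    """Insert an underscore before each inner capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


name = to_snake_case("ModelWithoutConfig")
```

Here name is "model_without_config", the key used with library.get in the PR description above.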
  • docs: add NLP x Sentiment Analysis tutorial

    In this draft, I propose an overview of most of Modelkit's features by solving a Sentiment Analysis problem.

    I am pretty sure there are still typos, but they will be fixed soon!

    Looking forward to having your opinion on this.

    documentation 
    opened by antoinejeannot 6
  • Assets azure

    In this PR I add support for assets stored in Azure blob storage. It's relatively simple, since I only need to correctly implement the driver in assets/drivers/azure.py.

    The rest follows: I add an assets manager fixture in tests/assets/conftest.py, and then duplicate a bunch of tests using the same approach as for other drivers.

    opened by victorbenichoux 5
  • Support usage of inheritance for model definitions

    After the change made in #123, abstract base models that were previously bypassed during the configure step because they had no CONFIGURATIONS dict are now considered valid Models to load.

    This breaks usages where those abstract base models live in the same module as the concrete derived models and the full Python module is loaded.

    With this change, modelkit supports this use case with a small adaptation of the client code:

    class YourABCModel(AbstractMixin, Asset):
        ...
    
    class YourDerivedModel(ConcreteMixin, YourABCModel):
       ...
    

    Doing so ensures that abstract base models are again bypassed by the configure step, while derived models are not.

    opened by ldeflandre 4
  • Improve model describe method with load time and memory including dependencies

    I noticed that a model's _load_time and _load_memory_increment don't include the information for its dependencies. It makes sense, but I feel that, as we push for composable models, it would be more informative to accumulate the model's data with that of all its dependencies. @victorbenichoux what do you think? Is there a specific need to get the model load time and memory without its dependencies?

    opened by CyrilLeMat 4
  • Assets versioning with dates

    As of today, assets can only be versioned with numbers (e.g. 1.4): https://cornerstone-ondemand.github.io/modelkit/assets/managing_assets/

    Modelkit should support date versioning (e.g. 2021-10-16) to support organisations where models are trained on a monthly, weekly, or daily basis.

    opened by CyrilLeMat 4
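
To illustrate the request above, here is a minimal sketch of how number-based and date-based version strings could be parsed and compared. The function names are hypothetical and this is not modelkit's versioning code; it assumes dates are written in ISO format (YYYY-MM-DD).

```python
from datetime import date
from typing import List, Tuple, Union

Version = Union[Tuple[int, ...], date]


def parse_version(text: str) -> Version:
    """Parse '2021-10-16' as a date, and '1.4' as a tuple of integers."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        return tuple(int(part) for part in text.split("."))


def latest(versions: List[str]) -> str:
    """Return the most recent version string of a homogeneous list."""
    parsed = [parse_version(v) for v in versions]
    if len({type(p) for p in parsed}) > 1:
        raise ValueError("cannot mix number-based and date-based versions")
    return versions[parsed.index(max(parsed))]
```

Comparing parsed values rather than raw strings matters for numbers: as strings, "0.9" sorts after "0.10", while as integer tuples (0, 10) is correctly the later version.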
  • Add tests configurations

    This PR introduces test_cases in the CONFIGURATIONS map defined at class level, which makes it possible to write tests restricted to a given model.

    Moreover, TEST_CASES is now the way to go if you want to define class-level tests, which will be run for every model configuration.

    E.g.:

    class TestableModel(Model[ModelItemType, ModelItemType]):
        CONFIGURATIONS: Dict[str, Dict] = {
            "some_model_a": { 
                "test_cases": {
                    "cases": [
                        {"item": {"x": 1}, "result": {"x": 1}},
                        {"item": {"x": 2}, "result": {"x": 2}},
                    ],
                }
            },
            "some_model_b": {},
        }
        def _predict(self, item):
            return item
    

    Thanks for reviewing

    opened by antoinejeannot 4
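
The shape of the test_cases entries above can be exercised with a small standalone runner. The sketch below is hypothetical: it only mirrors the dictionary layout shown in the PR and does not use modelkit's own test machinery or pytest fixtures.

```python
from typing import Any, Callable, Dict, List, Tuple

CONFIGURATIONS: Dict[str, Dict] = {
    "some_model_a": {
        "test_cases": {
            "cases": [
                {"item": {"x": 1}, "result": {"x": 1}},
                {"item": {"x": 2}, "result": {"x": 2}},
            ],
        }
    },
    "some_model_b": {},
}


def run_test_cases(
    predict: Callable[[Any], Any], configurations: Dict[str, Dict]
) -> List[Tuple[str, Any, Any]]:
    """Return (model name, item, actual result) for every failing case."""
    failures = []
    for model_name, conf in configurations.items():
        # Configurations without test_cases simply contribute no cases.
        for case in conf.get("test_cases", {}).get("cases", []):
            actual = predict(case["item"])
            if actual != case["result"]:
                failures.append((model_name, case["item"], actual))
    return failures


# An identity prediction function passes every case defined above.
failures = run_test_cases(lambda item: item, CONFIGURATIONS)
```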
  • click 8.1.0 breaks spaCy 3.2.3

    Installing with pip install modelkit spacy results in click 8.1.0 and spacy 3.2.3 being installed. But click 8.1.0 contains a breaking change and is incompatible with spaCy 3.2.3 (https://github.com/explosion/spaCy/issues/10564).

    Until an updated spaCy is released, you need to install with pyenv exec pip install modelkit spacy "click==8.0.4"

    opened by mihaimm 3
  • Enable assets manager to run on read-only filesystem when assets do not need to be downloaded

    Current state

    During the asset creation development process, or if you want to embed assets in a container at build time, Modelkit can be configured to use a StorageProvider with a LocalStorageDriver and an AssetsDir pointing to the same location as the LocalStorage. This makes it possible to use only local files without duplicating them on the local file system.

    However, modelkit still requires write permission on the filesystem, because the management of lock files is not disabled in this configuration.

    Fix

    The idea is to disable lock file management in this configuration. This is safe because we are guaranteed not to download anything in this configuration.

    opened by ldeflandre 3
  • Ignore modules in library search path

    Sometimes it makes sense to group helper utilities together in the library search path.

    However, we don't always want these files to be loaded, for example if the model itself is not used.

    The problem is that modelkit traverses and loads all Python modules in the search path. It would be useful to be able to skip over specific files.

    This could maybe be done with a common naming convention, for example prefixing files that should be ignored with some standard prefix: library._my_module would be ignored and skipped over (this follows the private naming convention, with the underscore indicating a module that should not be exposed publicly), while library.my_module would be traversed and loaded.

    opened by nmichlo 0
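
The naming convention proposed above could be checked with a filter like the following. This is a sketch of the suggestion only, with hypothetical function names; it is not behaviour that modelkit is known to implement.

```python
from typing import Iterable, List


def is_private_module(dotted_name: str) -> bool:
    """True if any component of the dotted module path starts with '_'."""
    return any(part.startswith("_") for part in dotted_name.split("."))


def modules_to_load(names: Iterable[str]) -> List[str]:
    """Keep only the modules that the traversal should import."""
    return [name for name in names if not is_private_module(name)]


selected = modules_to_load(["library.my_module", "library._my_module"])
```

Checking every component, and not only the last one, also skips modules nested inside a private package such as library._helpers.utils.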
  • Warn or raise Exception when 2 models have the same name

    If two models have the same configuration name, modelkit will silently use one of them.

    It should at least warn the user, but maybe we just want to raise an Exception?

    enhancement 
    opened by tgenin 0
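
A duplicate-name check of the kind requested above could look like the sketch below. The exception class and function are hypothetical, and the input is a plain mapping from class names to configuration names, not modelkit's internal structures.

```python
from collections import defaultdict
from typing import Dict, List


class DuplicateModelNameError(Exception):
    """Raised when two model classes declare the same configuration name."""


def check_unique_names(configurations_by_class: Dict[str, List[str]]) -> None:
    # Record which classes declare each configuration name.
    owners = defaultdict(list)
    for class_name, names in configurations_by_class.items():
        for name in names:
            owners[name].append(class_name)
    duplicates = {name: cls for name, cls in owners.items() if len(cls) > 1}
    if duplicates:
        details = ", ".join(
            f"{name!r} declared by {', '.join(cls)}"
            for name, cls in sorted(duplicates.items())
        )
        raise DuplicateModelNameError(f"duplicate model names: {details}")
```

Replacing the raise with warnings.warn would give the softer behaviour mentioned in the issue.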
  • Handle breaking batch behaviour options

    Currently, when one example in a prediction batch fails, the whole call breaks and the error is raised.

    We may want another option, for example returning the results of all the batches except the failing ones (set to None or to the Exception), along with a mask or something like that.

    enhancement 
    opened by tgenin 0
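
One way to picture the option suggested above is a wrapper that records the exception in place of each failing result and returns a success mask. This is a sketch with a hypothetical function name that works item by item; it is not an existing modelkit option.

```python
from typing import Any, Callable, List, Tuple


def predict_batch_tolerant(
    predict: Callable[[Any], Any], items: List[Any]
) -> Tuple[List[Any], List[bool]]:
    """Return (results, mask); failing items hold their exception."""
    results: List[Any] = []
    mask: List[bool] = []
    for item in items:
        try:
            results.append(predict(item))
            mask.append(True)
        except Exception as exc:
            # Keep the position of the failing item and remember the error.
            results.append(exc)
            mask.append(False)
    return results, mask


results, mask = predict_batch_tolerant(lambda x: 1 / x, [1, 0, 2])
```

In this example the second item raises ZeroDivisionError, so mask is [True, False, True] and the other two results are still returned.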
  • Be able to push assets directly from remote storage to remote storage (for s3)

    Currently, asset new or asset update pushes data from local storage to MODELKIT_STORAGE_PROVIDER:

    https://cornerstone-ondemand.github.io/modelkit/assets/managing_assets/#create-a-new-asset

    which means the asset must be on local storage to be pushed.

    It could be interesting to be able to push an asset directly from remote to remote, without writing to the local disk.

    Note: the feature seems to be partially implemented for gcs, but it just downloads locally and pushes again:

    https://github.com/Cornerstone-OnDemand/modelkit/blob/6e71fe78155887fd349df4907b3633ade72d565c/modelkit/assets/cli.py#L201

    (I don't know if it still works.)

    So maybe at least implement an automatic re-download and push for s3.

    Or, better, find a way to do it directly (I don't know if it is possible, at least with two remotes that share the same credentials).

    enhancement 
    opened by tgenin 0
Releases(v0.0.25)
  • v0.0.25(Sep 1, 2022)

    What's Changed

    • Make asset requirements for remote storage optional #167
    • tf_serving_fixture expects a new parameter: version of the tensorflow/serving image (used to be pinned to v2.4.0) #162

    This version contains breaking changes:

    • if you are using a remote storage, check this part of the documentation to understand the new optional dependencies.

    Full Changelog: https://github.com/Cornerstone-OnDemand/modelkit/compare/v0.0.24...v0.0.25

  • v0.0.24(Jun 29, 2022)

  • v0.0.21(Jan 17, 2022)

    What's Changed

    • Support python 3.10 by @tbascoul in https://github.com/Cornerstone-OnDemand/modelkit/pull/130

    New Contributors

    • @tbascoul made their first contribution in https://github.com/Cornerstone-OnDemand/modelkit/pull/130

    Full Changelog: https://github.com/Cornerstone-OnDemand/modelkit/compare/v0.0.20...v0.0.21

  • v0.0.20(Jan 6, 2022)

    What's Changed

    • Support usage of inheritance for model definitions by @ldeflandre in https://github.com/Cornerstone-OnDemand/modelkit/pull/128
    • Intuitive abstract support by stopping support of no CONFIGURATIONS by @ldeflandre in https://github.com/Cornerstone-OnDemand/modelkit/pull/129

    Full Changelog: https://github.com/Cornerstone-OnDemand/modelkit/compare/v0.0.19...v0.0.20

  • v0.0.19(Dec 20, 2021)

    What's Changed

    • configurations: allow models without CONFIGURATIONS dict by @victorbenichoux in https://github.com/Cornerstone-OnDemand/modelkit/pull/123
    • Raise ModelsNotFound exception when Model is not found by @tgenin in https://github.com/Cornerstone-OnDemand/modelkit/pull/125
    • Enable read-only filesystem when assets do not need to be downloaded by @ldeflandre in https://github.com/Cornerstone-OnDemand/modelkit/pull/127

    New Contributors

    • @ldeflandre made their first contribution in https://github.com/Cornerstone-OnDemand/modelkit/pull/127

    Full Changelog: https://github.com/Cornerstone-OnDemand/modelkit/compare/v0.0.18...v0.0.19

  • v0.0.18(Dec 9, 2021)

    This release includes:

    • a fix to be able to install and use modelkit on apple silicon https://github.com/Cornerstone-OnDemand/modelkit/issues/119
    • a new storage driver to support azure blob storage
  • v0.0.17(Nov 30, 2021)

  • v0.0.16(Nov 17, 2021)

  • v0.0.11(Jun 24, 2021)

    • validation of downloads in ASSETS_DIR with .SUCCESS files
    • MODELKIT_STORAGE_FORCE_DOWNLOAD to force download of remote assets
    • MODELKIT_STORAGE_TIMEOUT to control timeout on storage download lock
    • Improved debug level logs when fetching assets
  • v0.0.2(Jun 7, 2021)

    This is quite a major update, with numerous breaking changes.

    • separated sync and async Models for clarity (https://github.com/clustree/modelkit/pull/10)
    • reworked batching predict logic: predict no longer automatically switches between single items and batches. Model.predict takes single items, predict_batch only accepts lists of items, and predict_gen knows how to deal with generators (https://github.com/clustree/modelkit/pull/15, https://github.com/clustree/modelkit/pull/16, https://github.com/clustree/modelkit/pull/19, https://github.com/clustree/modelkit/pull/20)
    • caching with native python caches now available (https://github.com/clustree/modelkit/pull/22, https://github.com/clustree/modelkit/pull/23)
    • documentation improvements
  • v0.0.1(May 28, 2021)
