CD) in machine learning projectsImplementing continuous integration & delivery (CI/CD) in machine learning projects

Overview

CML with cloud compute

This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance and then run a neural style transfer on that instance. On a pull request, the following actions will occur:

  • GitHub will deploy a runner with a custom CML Docker image
  • cml-runner will provision an EC2 instance and pass the neural style transfer workflow to it. DVC is used to version the workflow and dependencies.
  • Neural style transfer will be executed on the EC2 instance
  • CML will report results of the style transfer as a comment in the pull request.

The key file enabling these actions is .github/workflows/cml.yaml.

Variables

You must create a personal access token with repository and workflow privileges to supply as a secret, in addition to your AWS credentials (AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY, and optinoally AWS_SESSION_TOKEN).

You might also like...
Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

A data preprocessing package for time series data. Design for machine learning and deep learning.

A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.

A comprehensive repository containing 30+ notebooks on learning machine learning!
A comprehensive repository containing 30+ notebooks on learning machine learning!

A comprehensive repository containing 30+ notebooks on learning machine learning!

MIT-Machine Learning with Python–From Linear Models to Deep Learning

MIT-Machine Learning with Python–From Linear Models to Deep Learning | One of the 5 courses in MIT MicroMasters in Statistics & Data Science Welcome t

Implemented four supervised learning Machine Learning algorithms

Implemented four supervised learning Machine Learning algorithms from an algorithmic family called Classification and Regression Trees (CARTs), details see README_Report.

High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

cuML - RAPIDS Machine Learning Library
cuML - RAPIDS Machine Learning Library

cuML - GPU Machine Learning Algorithms cuML is a suite of libraries that implement machine learning algorithms and mathematical primitives functions t

mlpack: a scalable C++ machine learning library --
mlpack: a scalable C++ machine learning library --

a fast, flexible machine learning library Home | Documentation | Doxygen | Community | Help | IRC Chat Download: current stable version (3.4.2) mlpack

Comments
  • Bump tensorflow-gpu from 2.2.2 to 2.4.0

    Bump tensorflow-gpu from 2.2.2 to 2.4.0

    Bumps tensorflow-gpu from 2.2.2 to 2.4.0.

    Release notes

    Sourced from tensorflow-gpu's releases.

    TensorFlow 2.4.0

    Release 2.4.0

    Major Features and Improvements

    • tf.distribute introduces experimental support for asynchronous training of models via the tf.distribute.experimental.ParameterServerStrategy API. Please see the tutorial to learn more.

    • MultiWorkerMirroredStrategy is now a stable API and is no longer considered experimental. Some of the major improvements involve handling peer failure and many bug fixes. Please check out the detailed tutorial on Multi-worker training with Keras.

    • Introduces experimental support for a new module named tf.experimental.numpy which is a NumPy-compatible API for writing TF programs. See the detailed guide to learn more. Additional details below.

    • Adds Support for TensorFloat-32 on Ampere based GPUs. TensorFloat-32, or TF32 for short, is a math mode for NVIDIA Ampere based GPUs and is enabled by default.

    • A major refactoring of the internals of the Keras Functional API has been completed, that should improve the reliability, stability, and performance of constructing Functional models.

    • Keras mixed precision API tf.keras.mixed_precision is no longer experimental and allows the use of 16-bit floating point formats during training, improving performance by up to 3x on GPUs and 60% on TPUs. Please see below for additional details.

    • TensorFlow Profiler now supports profiling MultiWorkerMirroredStrategy and tracing multiple workers using the sampling mode API.

    • TFLite Profiler for Android is available. See the detailed guide to learn more.

    • TensorFlow pip packages are now built with CUDA11 and cuDNN 8.0.2.

    Breaking Changes

    • TF Core:

      • Certain float32 ops run in lower precsion on Ampere based GPUs, including matmuls and convolutions, due to the use of TensorFloat-32. Specifically, inputs to such ops are rounded from 23 bits of precision to 10 bits of precision. This is unlikely to cause issues in practice for deep learning models. In some cases, TensorFloat-32 is also used for complex64 ops. TensorFloat-32 can be disabled by running tf.config.experimental.enable_tensor_float_32_execution(False).
      • The byte layout for string tensors across the C-API has been updated to match TF Core/C++; i.e., a contiguous array of tensorflow::tstring/TF_TStrings.
      • C-API functions TF_StringDecode, TF_StringEncode, and TF_StringEncodedSize are no longer relevant and have been removed; see core/platform/ctstring.h for string access/modification in C.
      • tensorflow.python, tensorflow.core and tensorflow.compiler modules are now hidden. These modules are not part of TensorFlow public API.
      • tf.raw_ops.Max and tf.raw_ops.Min no longer accept inputs of type tf.complex64 or tf.complex128, because the behavior of these ops is not well defined for complex types.
      • XLA:CPU and XLA:GPU devices are no longer registered by default. Use TF_XLA_FLAGS=--tf_xla_enable_xla_devices if you really need them, but this flag will eventually be removed in subsequent releases.
    • tf.keras:

      • The steps_per_execution argument in model.compile() is no longer experimental; if you were passing experimental_steps_per_execution, rename it to steps_per_execution in your code. This argument controls the number of batches to run during each tf.function call when calling model.fit(). Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or small models with a large Python overhead.
      • A major refactoring of the internals of the Keras Functional API may affect code that is relying on certain internal details:
        • Code that uses isinstance(x, tf.Tensor) instead of tf.is_tensor when checking Keras symbolic inputs/outputs should switch to using tf.is_tensor.
        • Code that is overly dependent on the exact names attached to symbolic tensors (e.g. assumes there will be ":0" at the end of the inputs, treats names as unique identifiers instead of using tensor.ref(), etc.) may break.
        • Code that uses full path for get_concrete_function to trace Keras symbolic inputs directly should switch to building matching tf.TensorSpecs directly and tracing the TensorSpec objects.
        • Code that relies on the exact number and names of the op layers that TensorFlow operations were converted into may have changed.
        • Code that uses tf.map_fn/tf.cond/tf.while_loop/control flow as op layers and happens to work before TF 2.4. These will explicitly be unsupported now. Converting these ops to Functional API op layers was unreliable before TF 2.4, and prone to erroring incomprehensibly or being silently buggy.
        • Code that directly asserts on a Keras symbolic value in cases where ops like tf.rank used to return a static or symbolic value depending on if the input had a fully static shape or not. Now these ops always return symbolic values.
        • Code already susceptible to leaking tensors outside of graphs becomes slightly more likely to do so now.
        • Code that tries directly getting gradients with respect to symbolic Keras inputs/outputs. Use GradientTape on the actual Tensors passed to the already-constructed model instead.
        • Code that requires very tricky shape manipulation via converted op layers in order to work, where the Keras symbolic shape inference proves insufficient.
        • Code that tries manually walking a tf.keras.Model layer by layer and assumes layers only ever have one positional argument. This assumption doesn't hold true before TF 2.4 either, but is more likely to cause issues now.

    ... (truncated)

    Changelog

    Sourced from tensorflow-gpu's changelog.

    Release 2.2.2

    Bug Fixes and Other Changes

    • Fixes an access to unitialized memory in Eigen code (CVE-2020-26266)
    • Fixes a security vulnerability caused by lack of validation in tf.raw_ops.DataFormatVecPermute and tf.raw_ops.DataFormatDimMap (CVE-2020-26267)
    • Fixes a vulnerability caused by attempting to write to immutable memory region in tf.raw_ops.ImmutableConst (CVE-2020-26268
    • Fixes a CHECK-fail in LSTM with zero-length input (CVE-2020-26270)
    • Fixes a security vulnerability caused by accessing heap data outside of bounds when loading a specially crafted SavedModel (CVE-2020-26271)
    • Prevents memory leaks in loading SavedModels that import functions
    • Updates libjpeg-turbo to 2.0.5 to handle CVE-2020-13790.
    • Updates junit to 4.13.1 to handle CVE-2020-15250.
    • Updates PCRE to 8.44 to handle CVE-2019-20838 and CVE-2020-14155.
    • Updates sqlite3 to 3.44.0 to keep in sync with master branch.

    Release 2.1.3

    Bug Fixes and Other Changes

    • Fixes an access to unitialized memory in Eigen code (CVE-2020-26266)
    • Fixes a security vulnerability caused by lack of validation in tf.raw_ops.DataFormatVecPermute and tf.raw_ops.DataFormatDimMap (CVE-2020-26267)
    • Fixes a vulnerability caused by attempting to write to immutable memory region in tf.raw_ops.ImmutableConst (CVE-2020-26268
    • Fixes a CHECK-fail in LSTM with zero-length input (CVE-2020-26270)
    • Fixes a security vulnerability caused by accessing heap data outside of bounds when loading a specially crafted SavedModel (CVE-2020-26271)
    • Updates libjpeg-turbo to 2.0.5 to handle CVE-2020-13790.
    • Updates junit to 4.13.1 to handle CVE-2020-15250.
    • Updates PCRE to 8.44 to handle CVE-2019-20838 and

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 3
  • Configure Renovate

    Configure Renovate

    Mend Renovate

    Welcome to Renovate! This is an onboarding PR to help you understand and configure settings before regular Pull Requests begin.

    🚦 To activate Renovate, merge this Pull Request. To disable Renovate, simply close this Pull Request unmerged.


    Detected Package Files

    • .github/workflows/cml.yaml (github-actions)
    • requirements.txt (pip_requirements)

    Configuration Summary

    Based on the default config's presets, Renovate will:

    • Start dependency updates only once this onboarding PR is merged
    • Enable Renovate Dependency Dashboard creation
    • If semantic commits detected, use semantic commit type fix for dependencies and chore for all others
    • Ignore node_modules, bower_components, vendor and various test/tests directories
    • Autodetect whether to pin dependencies or maintain ranges
    • Rate limit PR creation to a maximum of two per hour
    • Limit to maximum 10 open PRs at any time
    • Group known monorepo packages together
    • Use curated list of recommended non-monorepo package groupings
    • A collection of workarounds for known problems with packages

    🔡 Would you like to change the way Renovate is upgrading your dependencies? Simply edit the renovate.json in this branch with your custom config and the list of Pull Requests in the "What to Expect" section below will be updated the next time Renovate runs.


    What to Expect

    With your current configuration, Renovate will create 3 Pull Requests:

    Update dependency tensorflow-gpu to v2.7.1 [SECURITY]
    • Branch name: renovate/pypi-tensorflow-gpu-vulnerability
    • Merge into: master
    • Upgrade tensorflow-gpu to ==2.7.1
    Update actions/checkout action to v3
    • Schedule: ["at any time"]
    • Branch name: renovate/actions-checkout-3.x
    • Merge into: master
    • Upgrade actions/checkout to v3
    Update actions/setup-python action to v4
    • Schedule: ["at any time"]
    • Branch name: renovate/actions-setup-python-4.x
    • Merge into: master
    • Upgrade actions/setup-python to v4

    🚸 Branch creation will be limited to maximum 2 per hour, so it doesn't swamp any CI resources or spam the project. See docs for prhourlylimit for details.


    ❓ Got questions? Check out Renovate's Docs, particularly the Getting Started section. If you need any further assistance then you can also request help here.


    This PR has been generated by Mend Renovate. View repository job log here.

    opened by renovate[bot] 1
Owner
Iterative
Developer Tools for Machine Learning
Iterative
Machine learning template for projects based on sklearn library.

Machine learning template for projects based on sklearn library.

Janez Lapajne 17 Oct 28, 2022
A collection of neat and practical data science and machine learning projects

Data Science A collection of neat and practical data science and machine learning projects Explore the docs » Report Bug · Request Feature Table of Co

Will Fong 2 Dec 10, 2021
Bodywork deploys machine learning projects developed in Python, to Kubernetes.

Bodywork deploys machine learning projects developed in Python, to Kubernetes. It helps you to: serve models as microservices execute batch jobs run r

Bodywork Machine Learning 409 Jan 1, 2023
Data Version Control or DVC is an open-source tool for data science and machine learning projects

Continuous Machine Learning project integration with DVC Data Version Control or DVC is an open-source tool for data science and machine learning proj

Azaria Gebremichael 2 Jul 29, 2021
A Powerful Serverless Analysis Toolkit That Takes Trial And Error Out of Machine Learning Projects

KXY: A Seemless API to 10x The Productivity of Machine Learning Engineers Documentation https://www.kxy.ai/reference/ Installation From PyPi: pip inst

KXY Technologies, Inc. 35 Jan 2, 2023
LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms

LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms Based on the work by Smith et al. (2021) Query

null 5 Aug 6, 2022
Tools for Optuna, MLflow and the integration of both.

HPOflow - Sphinx DOC Tools for Optuna, MLflow and the integration of both. Detailed documentation with examples can be found here: Sphinx DOC Table of

Telekom Open Source Software 17 Nov 20, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Jan 9, 2023
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Augusto Almeida 84 Nov 25, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

Vowpal Wabbit 8.1k Dec 30, 2022