MLOps

MLOps is the practice of applying continuous integration and continuous delivery to Machine Learning artifacts, treating them as a software product and keeping them inside a loop of Design, Model Development, and Operations.

In this paradigm, teams can easily collaborate on models, with clear tracking of the data throughout cleaning, processing, and feature creation. Automating every repetitive process avoids human error and reduces delivery time, so the team can keep its focus on the business problem.

Some benefits:

  • Versioning data and code makes models auditable and reproducible.

  • Automated tests and builds ensure that artifacts work correctly and are available to the delivery pipelines.

  • An automated cycle makes deploying new models easier and faster.
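
To make the first benefit concrete, here is a minimal sketch of one way to tell whether a data file still matches the version a model was trained on: store a content fingerprint next to the code and compare it later. The function names `file_fingerprint` and `has_changed` are hypothetical and are not part of this project; dedicated data-versioning tools build on a similar content-hash idea but do far more.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, recorded_fingerprint: str) -> bool:
    """Return True when the file no longer matches the recorded fingerprint."""
    return file_fingerprint(path) != recorded_fingerprint
```

Committing the recorded fingerprint together with the training code is what ties a given model to the exact data it saw, which is the property that makes a result auditable.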

The MLOps Project

The MLOps project is a learning path for implementing a case study that is testable and reproducible within the CI/CD methodology, using good programming practices.

The scope of this project is delimited as you can see in the image below.

We will select the best tool to implement every step, integrate them, and build a Machine Learning Orchestrator. In the end, running and delivering new ML experiments will be as simple as typing a terminal command or clicking a button!


Prerequisites

For mlops_project to work correctly, first install the prerequisites.

Contributing

Have an idea of how to improve this project but don't know how to start? Try to contribute.

You can understand the project organization here.

How to use?

If you only want to use this package, follow the steps below.

  1. Clone the repository

    Open a terminal (on Windows, make sure to use Git Bash), navigate to the desired destination folder, and clone the repository:

    git clone https://github.com/Schots/mlops_project.git

    The Makefile in the root folder defines a set of targets that automate repetitive processes in this project. Type "make" in the terminal to see the available targets.


  2. Create an environment & install requirements

    Create a Python virtual environment for the MLOps project on your local machine, using any tool you like. Activate the environment and install the requirements using make:

    make requirements

  3. Download data

    To download the raw dataset, use the get_data target:

    make get_data

    Type the dataset name when prompted. The zip file with the data will be downloaded and unzipped under the data/raw folder.
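
As a rough illustration of the unzip half of the download step, the sketch below extracts an already-downloaded archive under data/raw. It is not the project's actual get_data implementation: the function name `unzip_dataset` is hypothetical, and the download itself is deliberately left out because the data source is not described here.

```python
import zipfile
from pathlib import Path
from typing import List


def unzip_dataset(zip_path: Path, raw_dir: Path = Path("data/raw")) -> List[str]:
    """Extract a dataset archive under raw_dir and return the archive member names."""
    # Create data/raw (or the given folder) if it does not exist yet.
    raw_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(raw_dir)
        return archive.namelist()
```

Keeping the raw files in a single folder that nothing else writes to makes it easier to rerun later processing steps from the same starting point.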


Project based on the cookiecutter data science project template. #cookiecutterdatascience

Comments
  • Bump black from 21.12b0 to 22.1.0

    Bump black from 21.12b0 to 22.1.0

    Bumps black from 21.12b0 to 22.1.0.

    Release notes

    Sourced from black's releases.

    22.1.0

    At long last, Black is no longer a beta product! This is the first non-beta release and the first release covered by our new stability policy.

    Highlights

    • Remove Python 2 support (#2740)
    • Introduce the --preview flag (#2752)

    Style

    • Deprecate --experimental-string-processing and move the functionality under --preview (#2789)
    • For stubs, one blank line between class attributes and methods is now kept if there's at least one pre-existing blank line (#2736)
    • Black now normalizes string prefix order (#2297)
    • Remove spaces around power operators if both operands are simple (#2726)
    • Work around bug that causes unstable formatting in some cases in the presence of the magic trailing comma (#2807)
    • Use parentheses for attribute access on decimal float and int literals (#2799)
    • Don't add whitespace for attribute access on hexadecimal, binary, octal, and complex literals (#2799)
    • Treat blank lines in stubs the same inside top-level if statements (#2820)
    • Fix unstable formatting with semicolons and arithmetic expressions (#2817)
    • Fix unstable formatting around magic trailing comma (#2572)

    Parser

    • Fix mapping cases that contain as-expressions, like case {"key": 1 | 2 as password} (#2686)
    • Fix cases that contain multiple top-level as-expressions, like case 1 as a, 2 as b (#2716)
    • Fix call patterns that contain as-expressions with keyword arguments, like case Foo(bar=baz as quux) (#2749)
    • Tuple unpacking on return and yield constructs now implies 3.8+ (#2700)
    • Unparenthesized tuples on annotated assignments (e.g values: Tuple[int, ...] = 1, 2, 3) now implies 3.8+ (#2708)
    • Fix handling of standalone match() or case() when there is a trailing newline or a comment inside of the parentheses. (#2760)
    • from __future__ import annotations statement now implies Python 3.7+ (#2690)

    Performance

    • Speed-up the new backtracking parser about 4X in general (enabled when --target-version is set to 3.10 and higher). (#2728)
    • Black is now compiled with mypyc for an overall 2x speed-up. 64-bit Windows, MacOS, and Linux (not including musl) are supported. (#1009, #2431)

    Configuration

    • Do not accept bare carriage return line endings in pyproject.toml (#2408)
    • Add configuration option (python-cell-magics) to format cells with custom magics in Jupyter Notebooks (#2744)
    • Allow setting custom cache directory on all platforms with environment variable BLACK_CACHE_DIR (#2739).
    • Enable Python 3.10+ by default, without any extra need to specify --target-version=py310. (#2758)
    • Make passing SRC or --code mandatory and mutually exclusive (#2804)

    Output

    • Improve error message for invalid regular expression (#2678)
    • Improve error message when parsing fails during AST safety check by embedding the underlying SyntaxError (#2693)
    • No longer color diff headers white as it's unreadable in light themed terminals (#2691)
    • Text coloring added in the final statistics (#2712)
    • Verbose mode also now describes how a project root was discovered and which paths will be formatted. (#2526)

    Packaging

    • All upper version bounds on dependencies have been removed (#2718)
    • typing-extensions is no longer a required dependency in Python 3.10+ (#2772)
    • Set click lower bound to 8.0.0 as Black crashes on 7.1.2 (#2791)

    ... (truncated)

    Changelog

    Sourced from black's changelog.

    22.1.0

    At long last, Black is no longer a beta product! This is the first non-beta release and the first release covered by our new stability policy.

    Highlights

    • Remove Python 2 support (#2740)
    • Introduce the --preview flag (#2752)

    Style

    • Deprecate --experimental-string-processing and move the functionality under --preview (#2789)
    • For stubs, one blank line between class attributes and methods is now kept if there's at least one pre-existing blank line (#2736)
    • Black now normalizes string prefix order (#2297)
    • Remove spaces around power operators if both operands are simple (#2726)
    • Work around bug that causes unstable formatting in some cases in the presence of the magic trailing comma (#2807)
    • Use parentheses for attribute access on decimal float and int literals (#2799)
    • Don't add whitespace for attribute access on hexadecimal, binary, octal, and complex literals (#2799)
    • Treat blank lines in stubs the same inside top-level if statements (#2820)
    • Fix unstable formatting with semicolons and arithmetic expressions (#2817)
    • Fix unstable formatting around magic trailing comma (#2572)

    Parser

    • Fix mapping cases that contain as-expressions, like case {"key": 1 | 2 as password} (#2686)
    • Fix cases that contain multiple top-level as-expressions, like case 1 as a, 2 as b (#2716)
    • Fix call patterns that contain as-expressions with keyword arguments, like case Foo(bar=baz as quux) (#2749)
    • Tuple unpacking on return and yield constructs now implies 3.8+ (#2700)
    • Unparenthesized tuples on annotated assignments (e.g values: Tuple[int, ...] = 1, 2, 3) now implies 3.8+ (#2708)
    • Fix handling of standalone match() or case() when there is a trailing newline or a comment inside of the parentheses. (#2760)
    • from __future__ import annotations statement now implies Python 3.7+ (#2690)

    Performance

    • Speed-up the new backtracking parser about 4X in general (enabled when --target-version is set to 3.10 and higher). (#2728)
    • Black is now compiled with mypyc for an overall 2x speed-up. 64-bit Windows, MacOS, and Linux (not including musl) are supported. (#1009, #2431)

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies safe to test 
    opened by dependabot[bot] 2
  • build(deps-dev): bump notebook from 6.4.10 to 6.5.1

    build(deps-dev): bump notebook from 6.4.10 to 6.5.1

    Bumps notebook from 6.4.10 to 6.5.1.

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • build(deps): bump matplotlib from 3.5.1 to 3.6.0

    build(deps): bump matplotlib from 3.5.1 to 3.6.0

    Bumps matplotlib from 3.5.1 to 3.6.0.

    Release notes

    Sourced from matplotlib's releases.

    REL: v3.6.0

    Highlights of this release include:

    • Figure and Axes creation / management
      • subplots, subplot_mosaic accept height_ratios and width_ratios arguments
      • Constrained layout is no longer considered experimental
      • New layout_engine module
      • Compressed layout added for fixed-aspect ratio Axes
      • Layout engines may now be removed
      • Axes.inset_axes flexibility
      • WebP is now a supported output format
      • Garbage collection is no longer run on figure close
    • Plotting methods
      • Striped lines (experimental)
      • Custom cap widths in box and whisker plots in bxp and boxplot
      • Easier labelling of bars in bar plot
      • New style format string for colorbar ticks
      • Linestyles for negative contours may be set individually
      • Improved quad contour calculations via ContourPy
      • errorbar supports markerfacecoloralt
      • streamplot can disable streamline breaks
      • New axis scale asinh (experimental)
      • stairs(..., fill=True) hides patch edge by setting linewidth
      • Fix the dash offset of the Patch class
      • Rectangle patch rotation point
    • Colors and colormaps
      • Color sequence registry
      • Colormap method for creating a different lookup table size
      • Setting norms with strings
    • Titles, ticks, and labels
      • plt.xticks and plt.yticks support minor keyword argument
    • Legends
      • Legend can control alignment of title and handles
      • ncol keyword argument to legend renamed to ncols
    • Markers
      • marker can now be set to the string "none"
      • Customization of MarkerStyle join and cap style
    • Fonts and Text
      • Font fallback
      • List of available font names
      • math_to_image now has a color keyword argument
      • Active URL area rotates with link text
    • rcParams improvements
      • Allow setting figure label size and weight globally and separately from title
      • Mathtext parsing can be disabled globally
      • Double-quoted strings in matplotlibrc
    • 3D Axes improvements
      • Standardized views for primary plane viewing angles
      • Custom focal length for 3D camera
      • 3D plots gained a 3rd "roll" viewing angle

    ... (truncated)

    Commits
    • a302267 REL: v3.6.0
    • 5b20854 DOC: Fix a typo in the what's new
    • e676dfb DOC: Update version switcher
    • 9e6e8d3 Update security policy
    • d011a7d Pin to 3.6 version of mpl-sphinx-theme
    • 56a5db2 DOC: Update GitHub stats for 3.6.0
    • 8490024 Merge branch 'v3.5.x' into HEAD
    • e3a8c45 Merge branch 'v3.5.3-doc' into v3.5.x
    • df42410 Merge pull request #23814 from QuLogic/relnotes36
    • e467be4 Remove redundant behaviour changes in 3.6
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • build(deps): bump xgboost from 1.5.2 to 1.6.2

    build(deps): bump xgboost from 1.5.2 to 1.6.2

    Bumps xgboost from 1.5.2 to 1.6.2.

    Release notes

    Sourced from xgboost's releases.

    1.6.1 Patch Release

    v1.6.1 (2022 May 9)

    This is a patch release for bug fixes and Spark barrier mode support. The R package is unchanged.

    Experimental support for categorical data

    • Fix segfault when the number of samples is smaller than the number of categories. (dmlc/xgboost#7853)
    • Enable partition-based split for all model types. (dmlc/xgboost#7857)

    JVM packages

    We replaced the old parallelism tracker with spark barrier mode to improve the robustness of the JVM package and fix the GPU training pipeline.

    Artifacts

    You can verify the downloaded packages by running this on your Unix shell:

    echo "<hash> <artifact>" | shasum -a 256 --check
    
    2633f15e7be402bad0660d270e0b9a84ad6fcfd1c690a5d454efd6d55b4e395b  ./xgboost.tar.gz
    

    Release 1.6.0 stable

    v1.6.0 (2022 Apr 16)

    After a long period of development, XGBoost v1.6.0 is packed with many new features and improvements. We summarize them in the following sections starting with an introduction to some major new features, then moving on to language binding specific changes including new features and notable bug fixes for that binding.

    Development of categorical data support

    This version of XGBoost features new improvements and full coverage of experimental categorical data support in Python and C package with tree model. Both hist, approx and gpu_hist now support training with categorical data. Also, partition-based categorical split is introduced in this release. This split type is first available in LightGBM in the context of gradient boosting. The previous XGBoost release supported one-hot split where the splitting criteria is of form x \in {c}, i.e. the categorical feature x is tested against a single candidate. The new release allows for more expressive conditions: x \in S where the categorical feature x is tested against multiple candidates. Moreover, it is now possible to use any tree algorithms (hist, approx, gpu_hist) when creating categorical splits. For more information, please see our tutorial on categorical data, along with examples linked on that page. (#7380, #7708, #7695, #7330, #7307, #7322, #7705, #7652, #7592, #7666, #7576, #7569, #7529, #7575, #7393, #7465, #7385, #7371, #7745, #7810)

    In the future, we will continue to improve categorical data support with new features and optimizations. Also, we are looking forward to bringing the feature beyond Python binding, contributions and feedback are welcomed! Lastly, as a result of experimental status, the behavior might be subject to change, especially the default value of related

    ... (truncated)

    Changelog

    Sourced from xgboost's changelog.

    XGBoost Change Log

    This file records the changes in xgboost library in reverse chronological order.

    v1.6.1 (2022 May 9)

    This is a patch release for bug fixes and Spark barrier mode support. The R package is unchanged.

    Experimental support for categorical data

    • Fix segfault when the number of samples is smaller than the number of categories. (dmlc/xgboost#7853)
    • Enable partition-based split for all model types. (dmlc/xgboost#7857)

    JVM packages

    We replaced the old parallelism tracker with spark barrier mode to improve the robustness of the JVM package and fix the GPU training pipeline.

    v1.6.0 (2022 Apr 16)

    After a long period of development, XGBoost v1.6.0 is packed with many new features and improvements. We summarize them in the following sections starting with an introduction to some major new features, then moving on to language binding specific changes including new features and notable bug fixes for that binding.

    Development of categorical data support

    This version of XGBoost features new improvements and full coverage of experimental categorical data support in Python and C package with tree model. Both hist, approx and gpu_hist now support training with categorical data. Also, partition-based categorical split is introduced in this release. This split type is first available in LightGBM in the context of gradient boosting. The previous XGBoost release supported one-hot split where the splitting criteria is of form x \in {c}, i.e. the categorical feature x is tested against a single candidate. The new release allows for more expressive conditions: x \in S where the categorical feature x is tested against multiple candidates. Moreover, it is now possible to use any tree algorithms (hist, approx, gpu_hist) when creating categorical splits. For more information, please see our tutorial on categorical data, along with examples linked on that page. (#7380, #7708, #7695, #7330, #7307, #7322, #7705, #7652, #7592, #7666, #7576, #7569, #7529, #7575, #7393, #7465, #7385, #7371, #7745, #7810)

    In the future, we will continue to improve categorical data support with new features and optimizations. Also, we are looking forward to bringing the feature beyond Python binding, contributions and feedback are welcomed! Lastly, as a result of experimental status, the behavior might be subject to change, especially the default value of related hyper-parameters.

    Experimental support for multi-output model

    XGBoost 1.6 features initial support for the multi-output model, which includes multi-output regression and multi-label classification. Along with this, the XGBoost classifier has proper support for base margin without to need for the user to flatten the input. In this initial support, XGBoost builds one model for each target similar to the sklearn meta estimator, for more details, please see our quick introduction.

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • build(deps): bump matplotlib from 3.5.1 to 3.5.3

    build(deps): bump matplotlib from 3.5.1 to 3.5.3

    Bumps matplotlib from 3.5.1 to 3.5.3.

    Release notes

    Sourced from matplotlib's releases.

    REL: v3.5.3

    This is the third bugfix release of the 3.5.x series.

    This release contains several bug-fixes and adjustments:

    • Fix alignment of over/under symbols
    • Fix bugs in colorbars:
      • alpha of extensions
      • drawedges=True with extensions
      • handling of panchor=False
    • Fix builds on Cygwin and IBM i
    • Fix contour labels in SubFigures
    • Fix cursor output:
      • for imshow with all negative values
      • when using BoundaryNorm
    • Fix interactivity in IPython/Jupyter
    • Fix NaN handling in errorbar
    • Fix NumPy conversion from AstroPy unit arrays
    • Fix positional markerfmt passed to stem
    • Fix unpickling:
      • crash loading in a separate process
      • incorrect DPI when HiDPI screens

    REL: v3.5.2

    This is the second bugfix release of the 3.5.x series.

    This release contains several bug-fixes and adjustments:

    • Add support for Windows on ARM (source-only; no wheels provided yet)
    • Add year to concise date formatter when displaying less than 12 months
    • Disable QuadMesh mouse cursor to avoid severe performance regression in pcolormesh
    • Delay backend selection to allow choosing one in more cases
    • Fix automatic layout bugs in EPS output
    • Fix autoscaling of scatter plots
    • Fix clearing of subfigures
    • Fix colorbar exponents, inversion of extensions, and use on inset axes
    • Fix compatibility with various NumPy-like classes (e.g., Pandas, xarray, etc.)
    • Fix constrained layout bugs with mixed subgrids
    • Fix errorbar with dashes
    • Fix errors in conversion to GTK4 and Qt6
    • Fix figure options accidentally re-ordering data
    • Fix keyboard focus of TkAgg backend
    • Fix manual selection of contour labels
    • Fix path effects on text with whitespace
    • Fix quiver in subfigures
    • Fix RangeSlider.set_val displaying incorrectly
    • Fix regressions in collection data limits
    • Fix stairs with no edgecolor
    • Fix some leaks in Tk backends
    • Fix tight layout DPI confusion

    ... (truncated)

    Commits
    • d04c8de REL: v3.5.3
    • 318cacc DOC: Update release notes for 3.5.3
    • f4d4b47 Merge branch 'v3.5.2-doc' into v3.5.x
    • 071413e DOC: Update GitHub stats for 3.5.3
    • 0428306 Merge pull request #23591 from meeseeksmachine/auto-backport-of-pr-23549-on-v...
    • 2f3abfb Merge pull request #23593 from QuLogic/fix-flake8
    • 530457e STY: Fix whitespace error from new flake8
    • ab78318 Backport PR #23549: Don't clip colorbar dividers
    • 952227e Merge pull request #23528 from meeseeksmachine/auto-backport-of-pr-23523-on-v...
    • 632e4d7 Backport PR #23523: TST: Update Quantity test class
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump dvc from 2.9.5 to 2.12.1

    build(deps-dev): bump dvc from 2.9.5 to 2.12.1

    Bumps dvc from 2.9.5 to 2.12.1.

    Release notes

    Sourced from dvc's releases.

    2.12.1 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🐛 Bug Fixes

    🔨 Maintenance

    Thanks again to @​alexmojaki, @​daavoo, @​dberenbaum, @​dependabot, @​dependabot[bot], @​efiop, @​pared, @​pre-commit-ci, @​pre-commit-ci[bot] and @​skshetry for the contributions! 🎉

    2.12.0 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🐛 Bug Fixes

    🔨 Maintenance

    ... (truncated)

    Commits
    • bd93d85 deps: bump dvc-data
    • eed6a84 data_cloud: remove logger check
    • 954de0c api: params_show: Raise exception if no params found.
    • 9099413 parsing: Support dict unpacking in cmd.
    • c59f935 Switch uses of %r to %s with quotes to avoid escaped backslashes in logs, esp...
    • 03b789e logger: use lazy formatting
    • cacc2f1 [pre-commit.ci] pre-commit autoupdate
    • 0d71194 dvc: normalize targets before entering brancher
    • 683d22c build(deps): Bump dvc-data from 0.0.16 to 0.0.18
    • c687d1b cli: help text for dvc update
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump dvc from 2.9.5 to 2.12.0

    build(deps-dev): bump dvc from 2.9.5 to 2.12.0

    Bumps dvc from 2.9.5 to 2.12.0.

    Release notes

    Sourced from dvc's releases.

    2.12.0 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🐛 Bug Fixes

    🔨 Maintenance

    Thanks again to @​ap-kulkarni, @​daavoo, @​dberenbaum, @​dependabot, @​dependabot[bot], @​dtrifiro, @​efiop, @​jorgeorpinel, @​pmrowla, @​pre-commit-ci, @​pre-commit-ci[bot], @​skshetry, @​ykasimov and Yury for the contributions! 🎉

    2.11.0 🦉

    ... (truncated)

    Commits
    • c347d9f build(deps-dev): Bump pytest-mock from 3.7.0 to 3.8.1
    • efc4787 build(deps-dev): Bump pylint from 2.14.3 to 2.14.4
    • be3ec9d deps: bump dvc-data to 0.0.16
    • 685a2d5 build(deps): Bump styfle/cancel-workflow-action from 0.9.1 to 0.10.0
    • c336507 render: image_converter: Support slash in revision.
    • d448eb5 setup: bump dvc-data
    • 3dcd010 deps: bump dvc-data to 0.0.13
    • 2773e99 config: remove core.jobs dead code
    • 9a1a3a0 Merge pull request #7671 from ap-kulkarni/amey/6141
    • 286fab4 setup: bump dvc-data to 0.0.12
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump dvc from 2.9.5 to 2.11.0

    Bumps dvc from 2.9.5 to 2.11.0.

    Release notes

    Sourced from dvc's releases.

    2.11.0 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🏇 Optimizations

    🐛 Bug Fixes

    🔨 Maintenance

    ... (truncated)

    Commits
    • c9a3cb8 sshfs: bump min ver to 2022.6.0
    • 41ecd2e scm: fix clone
    • 33b3afa exp init: create output dirs
    • e849162 exp: speed up repro execution with untracked directories in workspace
    • c14f963 checkout: --relink show helpful message on completion
    • 88d3582 build(deps): Bump pre-commit/action from 2.0.3 to 3.0.0
    • 8fa0e40 setup: set upper bound on networkx
    • a3d6b12 brancher: use scm.root_dir to determine relative cwd
    • 20c7b0e plots: grouping: stop using dpath.util.search
    • 6794dd2 docs: update package installation
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump notebook from 6.4.10 to 6.4.12

    Bumps notebook from 6.4.10 to 6.4.12.

    dependencies 
    opened by dependabot[bot] 1
  • build(deps): bump xgboost from 1.5.2 to 1.6.1

    Bumps xgboost from 1.5.2 to 1.6.1.

    Release notes

    Sourced from xgboost's releases.

    1.6.1 Patch Release

    v1.6.1 (2022 May 9)

    This is a patch release for bug fixes and Spark barrier mode support. The R package is unchanged.

    Experimental support for categorical data

    • Fix segfault when the number of samples is smaller than the number of categories. (dmlc/xgboost#7853)
    • Enable partition-based split for all model types. (dmlc/xgboost#7857)

    JVM packages

    We replaced the old parallelism tracker with spark barrier mode to improve the robustness of the JVM package and fix the GPU training pipeline.

    Artifacts

    You can verify the downloaded packages by running this on your Unix shell:

    echo "<hash> <artifact>" | shasum -a 256 --check
    
    2633f15e7be402bad0660d270e0b9a84ad6fcfd1c690a5d454efd6d55b4e395b  ./xgboost.tar.gz
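
The same check can be scripted where `shasum` is not available (for example outside a Unix shell). This is a minimal sketch, assuming the archive was saved as `xgboost.tar.gz` in the current folder; the helper name `sha256_of` is made up for this illustration and is not part of XGBoost.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hash published in the release notes above.
EXPECTED = "2633f15e7be402bad0660d270e0b9a84ad6fcfd1c690a5d454efd6d55b4e395b"

archive = Path("xgboost.tar.gz")  # assumed download location
if archive.exists():
    print("OK" if sha256_of(archive) == EXPECTED else "MISMATCH")
```

Reading in chunks keeps memory use flat for large archives.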
    

    Changelog

    Sourced from xgboost's changelog.

    XGBoost Change Log

    This file records the changes in xgboost library in reverse chronological order.

    v1.6.0 (2022 Apr 16)

    After a long period of development, XGBoost v1.6.0 is packed with many new features and improvements. We summarize them in the following sections starting with an introduction to some major new features, then moving on to language binding specific changes including new features and notable bug fixes for that binding.

    Development of categorical data support

    This version of XGBoost features new improvements and full coverage of experimental categorical data support in the Python and C packages with the tree model. hist, approx and gpu_hist all now support training with categorical data. Also, partition-based categorical split is introduced in this release. This split type was first available in LightGBM in the context of gradient boosting. The previous XGBoost release supported one-hot split where the splitting criterion is of the form x \in {c}, i.e. the categorical feature x is tested against a single candidate. The new release allows for more expressive conditions: x \in S where the categorical feature x is tested against multiple candidates. Moreover, it is now possible to use any of the tree algorithms (hist, approx, gpu_hist) when creating categorical splits. For more information, please see our tutorial on categorical data, along with examples linked on that page. (#7380, #7708, #7695, #7330, #7307, #7322, #7705, #7652, #7592, #7666, #7576, #7569, #7529, #7575, #7393, #7465, #7385, #7371, #7745, #7810)

    In the future, we will continue to improve categorical data support with new features and optimizations. We are also looking forward to bringing the feature beyond the Python binding; contributions and feedback are welcome! Lastly, because of its experimental status, the behavior might be subject to change, especially the default values of related hyper-parameters.
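
The difference between the two split conditions can be sketched in plain Python. This is only an illustration of the predicates x \in {c} and x \in S described above, not XGBoost code; the function names and the colour data are invented for the example.

```python
def one_hot_split(value, candidate):
    """One-hot style: go left only when x equals a single category c."""
    return value == candidate


def partition_split(value, left_set):
    """Partition-based: go left when x belongs to a set S of categories."""
    return value in left_set


colors = ["red", "green", "blue", "green", "yellow"]

# x in {c}: one category against all the others.
left_one_hot = [c for c in colors if one_hot_split(c, "green")]

# x in S: several categories grouped on the same side of the split.
left_partition = [c for c in colors if partition_split(c, {"green", "blue"})]

print(left_one_hot)    # ['green', 'green']
print(left_partition)  # ['green', 'blue', 'green']
```

A single partition split can separate a group of categories that would otherwise need several one-hot splits.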

    Experimental support for multi-output model

    XGBoost 1.6 features initial support for the multi-output model, which includes multi-output regression and multi-label classification. Along with this, the XGBoost classifier has proper support for base margin without the need for the user to flatten the input. In this initial support, XGBoost builds one model for each target, similar to the sklearn meta estimator; for more details, please see our quick introduction.

    (#7365, #7736, #7607, #7574, #7521, #7514, #7456, #7453, #7455, #7434, #7429, #7405, #7381)
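
The "one model for each target" strategy can be sketched without any library. The classes below (`MeanRegressor`, `OnePerTarget`) are hypothetical stand-ins written for this illustration; they show the wrapping idea only and are not how XGBoost or scikit-learn implement it.

```python
class MeanRegressor:
    """Toy single-output model: always predicts the training mean."""

    def fit(self, rows, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, rows):
        return [self.mean_ for _ in rows]


class OnePerTarget:
    """Fit an independent single-output model for each target column."""

    def __init__(self, make_model):
        self.make_model = make_model

    def fit(self, rows, target_rows):
        n_targets = len(target_rows[0])
        self.models_ = [
            self.make_model().fit(rows, [t[k] for t in target_rows])
            for k in range(n_targets)
        ]
        return self

    def predict(self, rows):
        per_target = [m.predict(rows) for m in self.models_]
        # Transpose so each output row holds one value per target.
        return [list(values) for values in zip(*per_target)]


X = [[0.0], [1.0], [2.0]]
Y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
model = OnePerTarget(MeanRegressor).fit(X, Y)
print(model.predict([[5.0]]))  # [[2.0, 20.0]]
```

Because the models are independent, nothing is shared between targets; that is the main limitation of this approach.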

    External memory support

    External memory support for both the approx and hist tree methods is considered feature complete in XGBoost 1.6. Building upon the iterator-based interface introduced in the previous version, both hist and approx now iterate over each batch of data during training and prediction. In previous versions, hist concatenated all the batches into an internal representation, which is removed in this version. As a result, users can expect higher scalability in terms of data size but might experience lower performance due to disk IO. (#7531, #7320, #7638, #7372)
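
The trade-off described above (iterate over batches instead of concatenating them) can be illustrated with a small histogram builder. This is a conceptual sketch only: the `batches` generator and the bin boundaries are assumptions made for the example, and XGBoost's real iterator interface is not shown.

```python
from bisect import bisect_right


def batches():
    """Hypothetical data source yielding one small batch at a time."""
    yield [0.1, 0.4, 0.35]
    yield [0.8, 0.95]
    yield [0.5, 0.05, 0.7]


def histogram_over_batches(batch_iter, cuts):
    """Accumulate bin counts batch by batch; no batch is kept afterwards."""
    counts = [0] * (len(cuts) + 1)
    for batch in batch_iter:
        for value in batch:
            counts[bisect_right(cuts, value)] += 1
    return counts


cuts = [0.25, 0.5, 0.75]  # assumed bin boundaries
print(histogram_over_batches(batches(), cuts))  # [2, 2, 2, 2]
```

Only one batch is in memory at any time, which is what allows the data to exceed RAM at the cost of re-reading it from disk.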

    Rewritten approx

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump notebook from 6.4.10 to 6.4.11

    Bumps notebook from 6.4.10 to 6.4.11.

    dependencies 
    opened by dependabot[bot] 1
  • build(deps-dev): bump notebook from 6.4.10 to 6.5.2

    Bumps notebook from 6.4.10 to 6.5.2.

    dependencies 
    opened by dependabot[bot] 0
  • build(deps): bump xgboost from 1.5.2 to 1.7.0

    Bumps xgboost from 1.5.2 to 1.7.0.

    Changelog

    Sourced from xgboost's changelog.

    v1.7.0 (2022 Oct 20)

    We are excited to announce the feature-packed XGBoost 1.7 release. The release notes walk through some of the major new features first, then summarize other improvements and language-binding-specific changes.

    PySpark

    XGBoost 1.7 features initial support for PySpark integration. The new interface is adapted from the existing PySpark XGBoost interface developed by Databricks with additional features like QuantileDMatrix and the rapidsai plugin (GPU pipeline) support. The new Spark XGBoost Python estimators not only benefit from PySpark ml facilities for powerful distributed computing but also enjoy the rest of the Python ecosystem. Users can define a custom objective, callbacks, and metrics in Python and use them with this interface on distributed clusters. The support is labeled as experimental with more features to come in future releases. For a brief introduction, please visit the tutorial on XGBoost's documentation page. (#8355, #8344, #8335, #8284, #8271, #8283, #8250, #8231, #8219, #8245, #8217, #8200, #8173, #8172, #8145, #8117, #8131, #8088, #8082, #8085, #8066, #8068, #8067, #8020, #8385)

    Due to its initial support status, the new interface has some limitations; categorical features and multi-output models are not yet supported.

    Development of categorical data support

    More progress on the experimental support for categorical features. In 1.7, XGBoost can handle missing values in categorical features and features a new parameter max_cat_threshold, which limits the number of categories that can be used in the split evaluation. The parameter is enabled when the partitioning algorithm is used and helps prevent over-fitting. Also, the sklearn interface can now accept the feature_types parameter to use data types other than dataframe for categorical features. (#8280, #7821, #8285, #8080, #7948, #7858, #7853, #8212, #7957, #7937, #7934)

    Experimental support for federated learning and new communication collective

    An exciting addition to XGBoost is the experimental federated learning support. Federated learning is implemented with a gRPC federated server that aggregates allreduce calls, and federated clients that train on local data and use existing tree methods (approx, hist, gpu_hist). Currently, this only supports horizontal federated learning (samples are split across participants, and each participant has all the features and labels). Future plans include vertical federated learning (features split across participants), and stronger privacy guarantees with homomorphic encryption and differential privacy. See Demo with NVFlare integration for example usage with nvflare.

    As part of the work, XGBoost 1.7 has replaced the old rabit module with the new collective module as the network communication interface, with added support for runtime backend selection. In previous versions, the backend was defined at compile time and could not be changed once built. In this new release, users can choose between rabit and federated. (#8029, #8351, #8350, #8342, #8340, #8325, #8279, #8181, #8027, #7958, #7831, #7879, #8257, #8316, #8242, #8057, #8203, #8038, #7965, #7930, #7911)

    The feature is available in the public PyPI binary package for testing.
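
The aggregation step in horizontal federated learning can be pictured as an element-wise sum of per-client summaries. The sketch below is a toy model of that idea under stated assumptions: there is no gRPC, no XGBoost, and the client names and label data are invented for the illustration.

```python
def local_histogram(labels, n_bins=2):
    """Each client summarises its own rows; raw rows never leave the client."""
    counts = [0] * n_bins
    for label in labels:
        counts[label] += 1
    return counts


def allreduce_sum(contributions):
    """Server side: element-wise sum, returned to every participant."""
    return [sum(values) for values in zip(*contributions)]


# Horizontal split: each client holds different samples with the same columns.
client_labels = {
    "client_a": [0, 1, 1],
    "client_b": [1, 1],
    "client_c": [0, 0, 1],
}

local = [local_histogram(labels) for labels in client_labels.values()]
print(allreduce_sum(local))  # [3, 5]
```

Every participant ends up with the same global summary while only aggregated counts cross the network.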

    Quantile DMatrix

    Before 1.7, XGBoost had an internal data structure called DeviceQuantileDMatrix (and its distributed version). We have now extended its support to CPU and renamed it to QuantileDMatrix. This data structure is used for optimizing memory usage for the hist and gpu_hist tree methods. The new feature helps reduce CPU memory usage significantly, especially for dense data. The new QuantileDMatrix can be initialized from both CPU and GPU data, and regardless of where the data comes from, the constructed instance can be used by both the CPU algorithm and GPU algorithm including training and prediction (with some overhead of conversion if the device of data and training algorithm doesn't match). Also, a new parameter ref is added to QuantileDMatrix, which can be used to construct validation/test datasets. Lastly, it's set as default in the scikit-learn interface when a supported tree method is specified by users. (#7889, #7923, #8136, #8215, #8284, #8268, #8220, #8346, #8327, #8130, #8116, #8103, #8094, #8086, #7898, #8060, #8019, #8045, #7901, #7912, #7922)
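
The idea behind a quantile-binned matrix, and behind reusing a reference dataset for validation data, can be sketched with the standard library. This is a conceptual illustration only, assuming that the role of `ref` is to share the training cut points; `make_cuts` and `to_bins` are hypothetical helpers, not XGBoost functions.

```python
from bisect import bisect_right
from statistics import quantiles


def make_cuts(values, n_bins):
    """Cut points that split `values` into roughly equal-sized bins."""
    return quantiles(values, n=n_bins)


def to_bins(values, cuts):
    """Replace each raw value with the index of the bin it falls into."""
    return [bisect_right(cuts, v) for v in values]


train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
valid = [0.5, 4.5, 9.0]

cuts = make_cuts(train, n_bins=4)  # computed once, from the training data
train_bins = to_bins(train, cuts)
valid_bins = to_bins(valid, cuts)  # validation data reuses the training cuts

print(cuts)
print(valid_bins)
```

Storing small bin indices instead of raw floats is where the memory saving comes from, and sharing the cuts keeps training and validation data comparable.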

    Mean absolute error

    The mean absolute error is a new member of the collection of objectives in XGBoost. It's noteworthy since MAE has a zero Hessian, which is unusual for XGBoost as XGBoost relies on Newton optimization. Without valid Hessian values, the convergence speed can be slow. As part of the support for MAE, we added line searches into the XGBoost training algorithm to overcome the difficulty of training without valid Hessian values. In the future, we will extend the line search to other objectives where it's appropriate for faster convergence speed. (#8343, #8107, #7812, #8380)
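
Why a zero Hessian is a problem for a Newton step, and how searching over candidate values sidesteps it, can be shown on a tiny example. This is a toy sketch under stated assumptions: the functions are invented for the illustration and the search is a plain scan over candidates, not XGBoost's actual line search.

```python
def mae(preds, targets):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(targets)


def mae_gradient(pred, target):
    """Derivative of |pred - target| w.r.t. pred: the sign of the error."""
    return (pred > target) - (pred < target)


def mae_hessian(pred, target):
    """The second derivative is zero wherever it is defined."""
    return 0.0


def line_search_leaf(residuals, candidates):
    """Pick the constant update that minimises MAE on the residuals."""
    return min(candidates, key=lambda c: mae([c] * len(residuals), residuals))


residuals = [1.0, 2.0, 10.0]
grads = [mae_gradient(0.0, r) for r in residuals]
hess = [mae_hessian(0.0, r) for r in residuals]
print(grads, hess)  # [-1, -1, -1] [0.0, 0.0, 0.0]

# A Newton step would be -sum(grads) / sum(hess): a division by zero here.
# Searching over candidate values directly avoids needing the Hessian.
print(line_search_leaf(residuals, candidates=residuals))  # 2.0
```

The value chosen in this example is the median of the residuals, which is the constant that minimises absolute error.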

    XGBoost on Browser

    With the help of the pyodide project, you can now run XGBoost on browsers. (#7954, #8369)

    Experimental IPv6 Support for Dask

    With the growing adoption of the new internet protocol, XGBoost joined the club. In the latest release, the Dask interface can be used on IPv6 clusters; see XGBoost's Dask tutorial for details. (#8225, #8234)

    Optimizations

    We have new optimizations for both the hist and gpu_hist tree methods to make XGBoost's training even more efficient.

    • Hist: Hist now supports optional by-column histogram build, which is automatically configured based on various conditions of input data. This helps the XGBoost CPU hist algorithm to scale better with different shapes of training datasets. (#8233, #8259). Also, the build histogram kernel can now better utilize CPU registers (#8218)

    • GPU Hist: GPU hist performance is significantly improved for wide datasets. GPU hist now supports batched node build, which reduces kernel latency and increases throughput. The improvement is particularly significant when growing deep trees with the default depthwise policy. (#7919, #8073, #8051, #8118, #7867, #7964, #8026)

    Breaking Changes

    Breaking changes made in the 1.7 release are summarized below.

    • The grow_local_histmaker updater is removed. This updater is rarely used in practice and has no tests. We decided to remove it and have XGBoost focus on other more efficient algorithms. (#7992, #8091)
    • Single precision histogram is removed due to its lack of accuracy caused by significant floating point error. In some cases the error can be difficult to detect due to log-scale operations, which makes the parameter dangerous to use. (#7892, #7828)
    • Deprecated CUDA architectures are no longer supported in the release binaries. (#7774)
    • As part of the federated learning development, the rabit module is replaced with the new collective module. It's a drop-in replacement with added runtime backend selection; see the federated learning section for more details (#8257)

    ... (truncated)

    Commits

    dependencies 
    opened by dependabot[bot] 0
  • build(deps): bump matplotlib from 3.5.1 to 3.6.1

    Bumps matplotlib from 3.5.1 to 3.6.1.

    Release notes

    Sourced from matplotlib's releases.

    REL: v3.6.1

    This is the first bugfix release of the 3.6.x series.

    This release contains several bug-fixes and adjustments:

    • A warning is no longer raised when constrained layout explicitly disabled and tight layout is applied
    • Add missing get_cmap method to ColormapRegistry
    • Adding a colorbar on a ScalarMappable that is not attached to an Axes is now deprecated instead of raising a hard error
    • Fix barplot being empty when first element is NaN
    • Fix FigureManager.resize on GTK4
    • Fix fill_between compatibility with NumPy 1.24 development version
    • Fix hexbin with empty arrays and log scaling
    • Fix resize_event deprecation warnings when creating figure on macOS
    • Fix build in mingw
    • Fix compatibility with PyCharm's interagg backend
    • Fix crash on empty Text in PostScript backend
    • Fix generic font families in SVG exports
    • Fix horizontal colorbars with hatches
    • Fix misplaced mathtext using eqnarray
    • stackplot no longer changes the Axes cycler

    REL: v3.6.0

    Highlights of this release include:

    • Figure and Axes creation / management
      • subplots, subplot_mosaic accept height_ratios and width_ratios arguments
      • Constrained layout is no longer considered experimental
      • New layout_engine module
      • Compressed layout added for fixed-aspect ratio Axes
      • Layout engines may now be removed
      • Axes.inset_axes flexibility
      • WebP is now a supported output format
      • Garbage collection is no longer run on figure close
    • Plotting methods
      • Striped lines (experimental)
      • Custom cap widths in box and whisker plots in bxp and boxplot
      • Easier labelling of bars in bar plot
      • New style format string for colorbar ticks
      • Linestyles for negative contours may be set individually
      • Improved quad contour calculations via ContourPy
      • errorbar supports markerfacecoloralt
      • streamplot can disable streamline breaks
      • New axis scale asinh (experimental)
      • stairs(..., fill=True) hides patch edge by setting linewidth
      • Fix the dash offset of the Patch class
      • Rectangle patch rotation point

    ... (truncated)

    Commits
    • 318b234 REL: v3.6.1
    • b92cccc Update release notes for 3.6.1
    • 746f3ce DOC: Update GitHub stats for 3.6.1
    • 251e3ca Merge branch 'v3.6.0-doc' into v3.6.x
    • 4627a5e Merge pull request #24124 from meeseeksmachine/auto-backport-of-pr-24111-on-v...
    • 3863297 Backport PR #24111: FIX: add missing method to ColormapRegistry
    • 78bcd91 Merge pull request #24117 from meeseeksmachine/auto-backport-of-pr-24113-on-v...
    • d336a67 Merge pull request #24116 from meeseeksmachine/auto-backport-of-pr-24115-on-v...
    • 305a146 Backport PR #24113: Add exception class to pytest.warns calls
    • 0c248ac Backport PR #24115: Fix mask lookup in fill_between for NumPy 1.24+
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • build(deps-dev): bump dvc from 2.9.5 to 2.13.0

    Bumps dvc from 2.9.5 to 2.13.0.

    Release notes

    Sourced from dvc's releases.

    2.13.0 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🔨 Maintenance

    Thanks again to @alexmojaki, @dependabot, @dependabot[bot], @efiop and @skshetry for the contributions! 🎉

    2.12.1 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    🚀 New Features and Enhancements

    🐛 Bug Fixes

    🔨 Maintenance

    Thanks again to @alexmojaki, @daavoo, @dberenbaum, @dependabot, @dependabot[bot], @efiop, @pared, @pre-commit-ci, @pre-commit-ci[bot] and @skshetry for the contributions! 🎉

    2.12.0 🦉

    Refer to https://dvc.org/doc/install for installation instructions.

    Changes

    ... (truncated)

    Commits
    • d8be684 deps: bump dvc-data to 0.0.23
    • f6adc49 deps: bump dvc-data to 0.0.22
    • a4c0ae9 deps: bump dvc-data, install dvc-data cli deps in dev mode
    • 8b3020a deps: bump dvc-data to 0.0.20
    • 182a22b checkout: use dvcignore
    • 892b235 metrics: support TOML files
    • da93e9a build(deps-dev): Bump pytest-mock from 3.8.1 to 3.8.2
    • bd93d85 deps: bump dvc-data
    • eed6a84 data_cloud: remove logger check
    • 954de0c api: params_show: Raise exception if no params found.
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • build(deps): bump numpy from 1.21.5 to 1.21.6

    Bumps numpy from 1.21.5 to 1.21.6.

    Release notes

    Sourced from numpy's releases.

    v1.21.6

    NumPy 1.21.6 Release Notes

    NumPy 1.21.6 is a very small release that achieves two things:

    • Backs out the mistaken backport of C++ code into 1.21.5.
    • Provides a 32 bit Windows wheel for Python 3.10.

    The provision of the 32 bit wheel is intended to make life easier for oldest-supported-numpy.

    Checksums

    MD5

    5a3e5d7298056bcfbc3246597af474d4  numpy-1.21.6-cp310-cp310-macosx_10_9_universal2.whl
    d981d2859842e7b62dc93e24808c7bac  numpy-1.21.6-cp310-cp310-macosx_10_9_x86_64.whl
    171313893c26529404d09fadb3537ed3  numpy-1.21.6-cp310-cp310-macosx_11_0_arm64.whl
    5a7a6dfdd43069f9b29d3fe6b7f3a2ce  numpy-1.21.6-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
    a9e25375a72725c5d74442eda53af405  numpy-1.21.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
    6f9a782477380b2cdb7606f6f7634c00  numpy-1.21.6-cp310-cp310-win32.whl
    32a73a348864700a3fa510d2fc4350b7  numpy-1.21.6-cp310-cp310-win_amd64.whl
    0db8941ebeb0a02cd839d9cd3c5c20bb  numpy-1.21.6-cp37-cp37m-macosx_10_9_x86_64.whl
    67882155be9592850861f4ad8ba36623  numpy-1.21.6-cp37-cp37m-manylinux_2_12_i686.manylinux2010_i686.whl
    c70e30e1ff9ab49f898c19e7a6492ae6  numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    e32dbd291032c7554a742f1bb9b2f7a3  numpy-1.21.6-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
    689bf804c2cd16cb241fd943e3833ffd  numpy-1.21.6-cp37-cp37m-win32.whl
    0062a7b0231a07cb5b9f3d7c495e6fe4  numpy-1.21.6-cp37-cp37m-win_amd64.whl
    0d08809980ab497659e7aa0df9ce120e  numpy-1.21.6-cp38-cp38-macosx_10_9_universal2.whl
    3c67d14ea2009069844b27bfbf74304d  numpy-1.21.6-cp38-cp38-macosx_10_9_x86_64.whl
    5f0e773745cb817313232ac1bf4c7eee  numpy-1.21.6-cp38-cp38-macosx_11_0_arm64.whl
    fa8011e065f1964d3eb870bb3926fc99  numpy-1.21.6-cp38-cp38-manylinux_2_12_i686.manylinux2010_i686.whl
    486cf9d4daab59aad253aa5b84a5aa83  numpy-1.21.6-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    88509abab303c076dfb26f00e455180d  numpy-1.21.6-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
    f7234e2ef837f5f6ddbde8db246fd05b  numpy-1.21.6-cp38-cp38-win32.whl
    e1063e01fb44ea7a49adea0c33548217  numpy-1.21.6-cp38-cp38-win_amd64.whl
    61c4caad729e3e0e688accbc1424ed45  numpy-1.21.6-cp39-cp39-macosx_10_9_universal2.whl
    67488d8ccaeff798f2e314aae7c4c3d6  numpy-1.21.6-cp39-cp39-macosx_10_9_x86_64.whl
    128c3713b5d1de45a0f522562bac5263  numpy-1.21.6-cp39-cp39-macosx_11_0_arm64.whl
    50e79cd0610b4ed726b3bf08c3716dab  numpy-1.21.6-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.whl
    bd0c9e3c0e488faac61daf3227fb95af  numpy-1.21.6-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    aa5e9baf1dec16b15e481c23f8a23214  numpy-1.21.6-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
    a2405b0e5d3f775ad30177296a997092  numpy-1.21.6-cp39-cp39-win32.whl
    f0d20eda8c78f957ea70c5527954303e  numpy-1.21.6-cp39-cp39-win_amd64.whl
    9682abbcc38cccb7f56e48aacca7de23  numpy-1.21.6-pp37-pypy37_pp73-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
    6aa3c2e8ea2886bf593bd8e0a1425c64  numpy-1.21.6.tar.gz
    04aea95dcb1d256d13a45df42173aa1e  numpy-1.21.6.zip
    

    SHA256

    ... (truncated)

    Commits
    • ef0ec78 Merge pull request #21323 from charris/prepare-1.21.6-release
    • 24a8ec0 REL: Prepare for NumPy 1.21.6 release.
    • 68ff2d3 Merge pull request #21318 from charris/revert-20354
    • 30ba38c REV: Revert pull request #20464 from charris/backport-20354
    • 7cfef93 REL: prepare 1.21.x for further development
    • See full diff in compare view

    dependencies 
    opened by dependabot[bot] 0
  • Generalizing

    Make the repository general-purpose:

    • A clean version of the repository should be downloaded, without preset files or configurations.

    • Currently a classification problem is implemented. It needs to work for other types of problems, such as regression. The metrics should therefore behave generically and be configurable via .yaml.

    • make_dataset only downloads a tabular dataset in .csv format, and the whole repo is built on that. It needs to be guaranteed to work with other file formats (.xml, .txt). Keep in mind as well that not every problem is solved with tabular data: NLP problems need a specific implementation to transform text into a dataset, and that should be available.

    • Different forms of data validation should be possible. Which type of validation is used for the problem could be configurable via a file.
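
    As a rough illustration of the request for metrics configurable via .yaml, here is a minimal sketch, not the project's implementation. The registry `METRICS`, the function `evaluate`, and the config keys `task` and `metrics` are all hypothetical names; the dictionaries stand in for what a parsed .yaml file might contain.

```python
from typing import Callable, Dict, Sequence

Metric = Callable[[Sequence[float], Sequence[float]], float]


def accuracy(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between target and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical registry: metric names usable in the config file.
METRICS: Dict[str, Metric] = {
    "accuracy": accuracy,
    "mae": mean_absolute_error,
    "mse": mean_squared_error,
}


def evaluate(config: dict, y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute every metric named in config["metrics"]."""
    unknown = [name for name in config["metrics"] if name not in METRICS]
    if unknown:
        raise ValueError(f"Unknown metrics in config: {unknown}")
    return {name: METRICS[name](y_true, y_pred) for name in config["metrics"]}


# Stand-ins for parsed .yaml files (assumed layout, one per problem type).
classification_config = {"task": "classification", "metrics": ["accuracy"]}
regression_config = {"task": "regression", "metrics": ["mae", "mse"]}

classification_scores = evaluate(classification_config, [1, 0, 1, 1], [1, 0, 0, 1])
regression_scores = evaluate(regression_config, [3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
```

    With this layout, switching from classification to regression would only mean editing the list of metric names in the config, not the evaluation code.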

    opened by maikereis 3
Owner
Maykon Schots
ZenML 🙏: MLOps framework to create reproducible ML pipelines for production machine learning.

ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tool agnostic, and has interfaces/abstractions that are catered towards ML workflows.

ZenML 2.6k Jan 8, 2023
A data preprocessing package for time series data. Design for machine learning and deep learning.

Allen Chiang 152 Jan 7, 2023
To design and implement the Identification of Iris Flower species using machine learning using Python and the tool Scikit-Learn.

Astitva Veer Garg 1 Jan 11, 2022
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

ClearML - Auto-Magical Suite of tools to streamline your ML workflow Experiment Manager, MLOps and Data-Management ClearML Formerly known as Allegro T

ClearML 4k Jan 9, 2023
This repo implements a Topological SLAM: Deep Visual Odometry with Long Term Place Recognition (Loop Closure Detection)

This repo implements a topological SLAM system. Deep Visual Odometry (DF-VO) and Visual Place Recognition are combined to form the topological SLAM system.

Best of Australian Centre for Robotic Vision (ACRV) 32 Jun 23, 2022
Pragmatic AI Labs 421 Dec 31, 2022
End to End toy example of MLOps

churn_model MLOps Toy Example End to End You might find below links useful Connect VSCode to Git MLFlow Port Heroku App Project Organization ├── LICEN

Ashish Tele 6 Feb 6, 2022
MLOps pipeline project using Amazon SageMaker Pipelines

This project shows steps to build an end to end MLOps architecture that covers data prep, model training, realtime and batch inference, build model registry, track lineage of artifacts and model drift detection. It utilizes SageMaker Pipelines that offers machine learning (ML) to orchestrate SageMaker jobs and author reproducible ML pipelines.

AWS Samples 3 Sep 16, 2022
Azure MLOps (v2) solution accelerators.

Azure MLOps (v2) solution accelerator Welcome to the MLOps (v2) solution accelerator repository! This project is intended to serve as the starting poi

Microsoft Azure 233 Jan 1, 2023
Hypernets: A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

A General Automated Machine Learning framework to simplify the development of End-to-end AutoML toolkits in specific domains.

DataCanvas 216 Dec 23, 2022
machine learning model deployment project of Iris classification model in a minimal UI using flask web framework and deployed it in Azure cloud using Azure app service

This is a machine learning model deployment project of Iris classification model in a minimal UI using flask web framework and deployed it in Azure cloud using Azure app service. We initially made this project as a requirement for an internship at Indian Servers. We are now making it open to contribution.

Krishna Priyatham Potluri 73 Dec 1, 2022
A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Demand-Forecasting Business Problem A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Ayşe Nur Türkaslan 3 Mar 6, 2022
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Master status: Development status: Package information: TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assista

Epistasis Lab at UPenn 8.9k Jan 9, 2023
Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Python Extreme Learning Machine (ELM) Python Extreme Learning Machine (ELM) is a machine learning technique used for classification/regression tasks.

Augusto Almeida 84 Nov 25, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

Vowpal Wabbit 8.1k Dec 30, 2022
LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms

LILLIE: Information Extraction and Database Integration Using Linguistics and Learning-Based Algorithms Based on the work by Smith et al. (2021) Query

null 5 Aug 6, 2022
AutoOED: Automated Optimal Experiment Design Platform

AutoOED is an optimal experiment design platform powered with automated machine learning to accelerate the discovery of optimal solutions. Our platform solves multi-objective optimization problems and automatically guides the design of experiment to be evaluated.

Yunsheng Tian 107 Jan 3, 2023
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

Chao Ma 3k Jan 8, 2023
Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.

FINRA 25 Dec 28, 2022