Determined: Deep Learning Training Platform

Overview

Determined is an open-source deep learning training platform that makes building models fast and easy. Determined enables you to:

  • Train models faster using state-of-the-art distributed training, without changing your model code
  • Automatically find high-quality models with advanced hyperparameter tuning from the creators of Hyperband
  • Get more from your GPUs with smart scheduling and cut cloud GPU costs by seamlessly using preemptible instances
  • Track and reproduce your work with experiment tracking that works out-of-the-box, covering code versions, metrics, checkpoints, and hyperparameters

Determined integrates these features into an easy-to-use, high-performance deep learning environment — which means you can spend your time building models instead of managing infrastructure.

To use Determined, you can continue using popular DL frameworks such as TensorFlow and PyTorch; you just need to update your model code to integrate with the Determined API.


💥 💥 💥 Want to learn more? Join us on the 26th at 2pm PT for our Lunch & Learn where we'll walk through how to train Facebook's DEtection TRansformers (DETR) model with Determined! 💥 💥 💥


Try out Determined Locally

Follow these instructions to install and set up Docker.

# Start a Determined cluster locally
python3.7 -m venv ~/.virtualenvs/test
. ~/.virtualenvs/test/bin/activate
pip install determined-cli determined-deploy
det-deploy local cluster-up --no-gpu
## To start a cluster with GPUs, remove the --no-gpu flag
## Access the web UI at localhost:8080. By default, the "determined" user accepts a blank password.

# Navigate to a Determined example
git clone https://github.com/determined-ai/determined
cd determined/examples/computer_vision/cifar10_pytorch

# Submit job to train a single model on a single node
det experiment create const.yaml . 

Detailed Installation Guide

See our installation guide for details on how to install Determined, including on AWS and GCP.

Next Steps

For a brief introduction to using Determined, check out our Quick Start Guide.

To use an existing deep learning model with Determined, follow the tutorial for your preferred deep learning framework.

Documentation

The documentation for the latest version of Determined can always be found here.

Community

If you need help, want to file a bug report, or just want to keep up-to-date with the latest news about Determined, please join the Determined community!

Contributing

Contributor's Guide

License

Apache V2

Comments
  • fix: when cluster fails ensure end time of open allocation is set to the last cluster heartbeat [DET 6509]

    Description

    fix: when cluster fails ensure end time of open allocation is set to the last cluster heartbeat [DET 6509]

    Test Plan

    Added an integration test that checks that the cluster heartbeat is recorded accurately in the cluster id table and that open allocations get the last cluster heartbeat as their endTime. The test covers the new functions added to postgres_cluster.go and the change made to CloseOpenAllocations in postgres_tasks.go. Manually verify that the cluster heartbeat is updated every 10 minutes; this confirms that the goroutine added to core.go works correctly.
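The behavior this fix describes can be sketched in a few lines of Python. This is only an illustration of the idea, not the Go code in postgres_tasks.go: the `Allocation` class and the `close_open_allocations` function are hypothetical names, and an open allocation is assumed to be one whose end time is unset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Allocation:
    """Hypothetical stand-in for a row in an allocations table."""

    allocation_id: str
    start_time: datetime
    end_time: Optional[datetime] = None  # None means the allocation is still open


def close_open_allocations(allocations: List[Allocation], last_heartbeat: datetime) -> int:
    """Set end_time of every open allocation to the last recorded heartbeat.

    Allocations that already have an end_time are left untouched.
    Returns the number of allocations that were closed.
    """
    closed = 0
    for alloc in allocations:
        if alloc.end_time is None:
            alloc.end_time = last_heartbeat
            closed += 1
    return closed
```

Using the last heartbeat rather than the restart time means the recorded duration stops at the last moment the cluster was known to be alive.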

    Checklist

    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    cla-signed 
    opened by nrajanee 29
  • feat: useSettings improvements

    Description

    For some time now, we've been concerned with the performance and overall stability of our useSettings hook. It has several issues: state that is out of sync with the latest DB changes upon refresh, visual bugs caused by state issues, and lots of re-renders (although that is a deeper issue, as I've noticed while working on this hook, since we are constantly polling the DB in multiple locations...).

    As a proposed solution, we've been working on revamping useSettings into a new shape.

    The idea is to use a context provider to subscribe to the settings state from the DB, remove some transformations and checks that we had due to a weak type structure, use the DB as the source of truth (leaving localStorage behind for good), and, finally, have a leaner and easier-to-understand structure.

    Test Plan

    taken from the spec docs:

    key features:

    it has to be persisted on DB and initial values should be taken from BE (if present)

    in case that there’s nothing on DB (yet) it should initialize based on the config object passed as parameter

    it should keep track of the state based on the initial configurations and sync with the query parameters, if necessary

    settings should be scoped per project id → page, meaning that I can keep several settings state per project, depending on which page uses the hook (this is supposed to be available but actually isn’t, should be doable with key manipulation)

    settings can be “global”, meaning that each account can have certain settings configurations - have noticed a few settings that actually should be global state, like sidebar collapsed, etc.

    settings can be shared via URL, meaning that the hook should be able to collect navigation data and parse any possible query parameters into valid settings, which, in turn, should be active for said route. If the user then modifies these settings they should persist to the DB as usual.

    updating any setting shouldn’t block rendering (keep DB update async)

    settings should be self-contained, meaning that each subtree shouldn’t listen for settings changes other than its own settings

    useSettings should be able to reset the settings based on initial config

    useSettings should be able to give all active/valid scoped settings

    each entry of the config obj should be of BaseType, meaning no complex structures

    updateSettings should have a prop that allows you to specify whether you want the current state update to be a new entry in the browser history

    browser back navigation should navigate back through these updates and going forward should have a similar result.
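The "shared via URL" and "BaseType" requirements above can be illustrated with a small, language-neutral sketch. It is written in Python rather than as the actual React hook, and `settings_from_query` and the example keys are hypothetical. It assumes each setting's default value determines the type a query parameter is parsed into, and that invalid or unknown parameters fall back to the defaults.

```python
from typing import Dict, Union
from urllib.parse import parse_qs

# Only flat, primitive values are allowed, mirroring the "no complex structures" rule.
BaseType = Union[bool, int, float, str]


def settings_from_query(query: str, defaults: Dict[str, BaseType]) -> Dict[str, BaseType]:
    """Overlay valid query parameters on top of the default settings.

    Unknown keys and values that fail to parse are ignored, so a malformed
    URL falls back to the defaults instead of raising.
    """
    settings = dict(defaults)
    for key, values in parse_qs(query.lstrip("?")).items():
        if key not in defaults:
            continue  # not a known setting for this page
        raw = values[-1]  # if a key is repeated, the last occurrence wins
        default = defaults[key]
        try:
            if isinstance(default, bool):  # check bool first: bool is a subclass of int
                if raw.lower() not in ("true", "false"):
                    continue
                settings[key] = raw.lower() == "true"
            elif isinstance(default, int):
                settings[key] = int(raw)
            elif isinstance(default, float):
                settings[key] = float(raw)
            else:
                settings[key] = raw
        except ValueError:
            continue  # keep the default for unparsable values
    return settings
```

For example, with defaults `{"limit": 20, "archived": False}`, the query `?limit=50&archived=true` yields `{"limit": 50, "archived": True}`, while `?limit=abc` leaves `limit` at 20.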

    Commentary (optional)

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    • [ ] If modifying /webui/react/src/shared/ verify make -C webui/react test-shared passes.
    cla-signed 
    opened by thiagodallacqua-hpe 24
  • feat: non-training-specific checkpoints

    Description

    Checkpoints in Determined were originally designed exclusively for training. We would like in the future to support checkpointing for all tasks. To get there, we need to change what data we keep about checkpoints. This change focuses on migrating existing training-specific data into checkpoint metadata, modifying how checkpoints are uploaded so that the Core API checkpointing does not feel training-specific, and updating the WebUI to reflect the new schema.

    Test Plan

    Commentary (optional)

    This upstream feature branch will include backend work, webui client-side work, and python client-side work before it is complete.

    cla-signed 
    opened by rb-determined-ai 21
  • chore: setup prettier

    Description

    DET-8097

    Test Plan

    • Check that all rules are acceptable to everyone
    • Check that Prettier works locally
    • Check that CI catches Prettier errors

    Commentary (optional)

    • husky and lint-staged are not included in this PR
    • I adjusted some rules to standard rules, such as spacing in arrays: [ 1, 2, 3 ] -> [1, 2, 3]
    • I respected the current ESLint rules and avoided removing them as much as possible

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    • [ ] If modifying /webui/react/src/shared/ verify make -C webui/react test-shared passes.
    cla-signed shared-web 
    opened by keita-determined 20
  • fix: move deploy dependencies to `extras_require`

    Description

    Moves "deploy" dependencies to an extras section. This helps segment dependencies that may not be needed for every use case. In particular, this will help reduce the odds of Determined causing dependency conflicts with other Python dependencies that users need to install for their projects.
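As a rough illustration of the mechanism, setuptools lets a package declare optional dependency groups through the `extras_require` argument of `setup()`. The package names below are hypothetical placeholders, since the description does not list which dependencies were moved, and `requirements_for` is a helper invented here only to show what each install variant would pull in.

```python
# Dependencies every install needs (hypothetical example name).
INSTALL_REQUIRES = ["requests"]

# Optional groups, installed only on request (hypothetical example names).
EXTRAS_REQUIRE = {
    "deploy": ["boto3", "docker"],
}

# In a real setup.py these would be passed to setuptools:
#   setup(..., install_requires=INSTALL_REQUIRES, extras_require=EXTRAS_REQUIRE)


def requirements_for(extras=()):
    """Return the requirements an install with the given extras would request."""
    reqs = list(INSTALL_REQUIRES)
    for extra in extras:
        reqs.extend(EXTRAS_REQUIRE[extra])
    return reqs
```

With a layout like this, a plain `pip install example-package` would skip the deploy dependencies, and `pip install "example-package[deploy]"` would include them (`example-package` is a placeholder name).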

    Test Plan

    Update the existing tests to install the extra and run them.

    Commentary (optional)

    What pieces of the documentation need to be updated?

    Checklist

    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    opened by schmidt-jake 17
  • specification of default configs for notebooks created through the web UI

    Feature request: it would be really helpful to be able to specify one or more default configs for the notebook instances that are used when launching notebooks with the button in the web UI. In our situation, being able to specify default mounts and agent labels would be helpful, since we typically mount an NFS share for notebook persistence.

    It would be really nice if we could have several presets that are selectable from the dropdown when you click on that button, similarly to how you can select "cpu only" currently.

    Work in progress 
    opened by zjorgensenbits 16
  • test clabot.

    Test Plan

    • UI change:
      • [ ] add screenshots
      • [ ] React? build & check storybooks
    • [ ] user-facing api change: modify documentation and examples
    • [ ] user-facing api change: add the "User-facing API Change" label
    • [ ] bug fix: add regression test
    • [ ] bug fix: determine if there are other similar bugs in the codebase
    • [ ] new feature: add test coverage for any user-facing aspects
    • [ ] refactor: maintain existing code coverage
    Work in progress 
    opened by rb-determined-ai 16
  • chore: move task container ssh keygen to experiment

    When we have an experiment with thousands of trials, we generate an ssh key for each trial. This becomes very expensive, since ssh keygen is a CPU-intensive operation. To remedy this, we move key generation to the experiment level and have all trials share a key.
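The idea of generating one key per experiment instead of one per trial can be sketched as a small cache. This is not the actual master code: `ExperimentKeyCache` is a hypothetical name, and `secrets.token_hex` stands in for the expensive ssh keygen call, so the values produced are not real SSH keys.

```python
import secrets
from typing import Dict


class ExperimentKeyCache:
    """Generate a key once per experiment and share it across that experiment's trials."""

    def __init__(self) -> None:
        self._keys: Dict[int, str] = {}
        self.generated = 0  # counts how often the expensive step actually ran

    def _generate_key(self) -> str:
        # Stand-in for the CPU-intensive ssh keygen step; NOT a real SSH key.
        self.generated += 1
        return secrets.token_hex(32)

    def key_for_trial(self, experiment_id: int, trial_id: int) -> str:
        # trial_id is accepted only to mirror a per-trial call site;
        # the key depends on the experiment alone.
        if experiment_id not in self._keys:
            self._keys[experiment_id] = self._generate_key()
        return self._keys[experiment_id]
```

With this shape, requesting keys for a thousand trials of one experiment runs the expensive step once instead of a thousand times.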

    Test Plan

    • UI change:
      • [ ] add screenshots
      • [ ] React? build & check storybooks
    • [ ] user-facing api change: modify documentation and examples
    • [ ] user-facing api change: add the "User-facing API Change" label
    • [ ] bug fix: add regression test
    • [ ] bug fix: determine if there are other similar bugs in the codebase
    • [ ] new feature: add test coverage for any user-facing aspects
    • [ ] refactor: maintain existing code coverage
    opened by stoksc 16
  • fix: add support for float16 metrics serialization

    Description

    Adds support for float16 metrics serialization

    Test Plan

    Commentary (optional)

    When running an experiment using Keras with mixed precision enabled via tf.keras.mixed_precision.set_global_policy("mixed_float16") (TF 2.6), the metrics are serialized before being sent over a socket (to the master, I guess?). However, while float64 and float32 are supported, float16 is not; this patch hopefully fixes that.
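One general way to make serialization independent of the float width is to convert any real-number-like value to a built-in float before encoding. The sketch below is not the patch itself: `to_jsonable` and `serialize_metrics` are hypothetical names, and it assumes the metrics are sent as JSON and that the values behave as `numbers.Real` instances (NumPy registers its floating scalar types, including float16, with that abstract class).

```python
import json
import numbers
from typing import Any, Dict


def to_jsonable(value: Any) -> Any:
    """Fallback used by json.dumps for values it cannot encode natively."""
    if isinstance(value, numbers.Integral):
        return int(value)
    if isinstance(value, numbers.Real):
        # Covers any real-number type regardless of precision.
        return float(value)
    raise TypeError(f"cannot serialize value of type {type(value).__name__}")


def serialize_metrics(metrics: Dict[str, Any]) -> str:
    """Encode a metrics dict as JSON, widening exotic numeric types to float/int."""
    return json.dumps(metrics, default=to_jsonable)
```

Because the check is against the abstract numeric classes rather than a list of concrete types, a new precision does not need its own special case.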

    Checklist

    • [X] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [X] Licenses should be included for new code which was copied and/or modified from any external code.
    cla-signed 
    opened by BlueskyFR 15
  • feat: add notebook idle timeout [DET-5517, DET-5519]

    Description

    Idle Timeout for Notebooks.

    Test Plan

    Commentary (optional)

    Checklist

    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    Work in progress cla-signed 
    opened by naren-determined 15
  • chore: add pre-commit check for python

    Description

    Check .py files in new commits with Black.

    Test Plan

    Commentary (optional)

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.
    cla-signed shared-web 
    opened by hamidzr 14
  • docs: fix typo in rbac docs

    Description

    Test Plan

    Commentary (optional)

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.

    Ticket

    cla-signed 
    opened by hamidzr 1
  • fix: add missing key

    Description

    key was missing

    Test Plan

    • Check that there are no warnings in the browser console on the webhook page

    Commentary (optional)

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.

    Ticket

    cla-signed 
    opened by keita-determined 1
  • wip: cli error handling

    Description

    • Add consistent error handling for CLI methods.
    • Change some of the use cases as a demo.

    Test Plan

    Commentary (optional)

    would this make sense in Python?

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.

    Ticket

    cla-signed 
    opened by hamidzr 1
  • fix: validate workspace name

    Description

    WEB-746

    Test Plan

    • Go to a workspace
    • Check that creating/editing a workspace with an all-whitespace name in the WebUI modal causes validation errors
    • Check that creating/editing a workspace with an all-whitespace name through the API causes errors
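The check exercised by this test plan can be sketched as follows. `validate_workspace_name` is a hypothetical helper, not the function used in the WebUI or the API, and returning the stripped name is an assumption made here for illustration.

```python
def validate_workspace_name(name: str) -> str:
    """Reject names that are empty or consist only of whitespace.

    Returns the name with surrounding whitespace removed (an assumption of
    this sketch; the real implementation may keep the name unchanged).
    """
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("workspace name must contain at least one non-whitespace character")
    return cleaned
```

A name such as `"   "` raises `ValueError`, while `"  vision team "` is accepted and returned as `"vision team"`.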

    Commentary (optional)

    Checklist

    • [x] Changes have been manually QA'd
    • [x] User-facing API changes need the "User-facing API Change" label.
    • [x] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [x] Licenses should be included for new code which was copied and/or modified from any external code.
    cla-signed 
    opened by keita-determined 1
  • feat: return all validation metrics

    Description

    Augment the ValidationCompleted searcher event with all validation metrics (in addition to the searcher metric).

    Test Plan

    Commentary (optional)

    Checklist

    • [ ] Changes have been manually QA'd
    • [ ] User-facing API changes need the "User-facing API Change" label.
    • [ ] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [ ] Licenses should be included for new code which was copied and/or modified from any external code.

    Ticket

    cla-signed 
    opened by mpkouznetsov 0
  • chore: Move '+ New Workspace' button to nav item [WEB-698]

    Description

    As requested in the ticket, remove the small '+' button from Workspaces nav header, and add a more visible '+ New Workspace' button.

    Screen Shot 2022-12-22 at 9 33 42 AM

    An unmodified + icon is too big (screenshot below) so I needed to add iconSize prop and some CSS. Here's what it looks like with the + the same size as the regular nav items:

    Screen Shot 2022-12-22 at 9 40 27 AM

    Would be open to some cleaner flexbox css or something instead of these specific css changes

    Test Plan

    • While logged into Determined, confirm a new "+ New Workspace" is added below the list of pinned workspaces.
    • Click the "New Workspace" button to create a workspace.
    • Check in the code that the visibility of this button depends on canCreateWorkspace

    Checklist

    • [x] Changes have been manually QA'd
    • [x] User-facing API changes need the "User-facing API Change" label.
    • [x] Release notes should be added as a separate file under docs/release-notes/. See Release Note for details.
    • [x] Licenses should be included for new code which was copied and/or modified from any external code.
    cla-signed 
    opened by mapmeld 10
Releases(0.19.9)
  • 0.19.9(Dec 20, 2022)

    Release Notes

    0.19.9

    Changelog

    • 6bd5f074c chore: bump version: 0.19.9-rc3 -> 0.19.9
    • 04f9fc7e1 docs: add release notes for 0.19.9 (#5632)
    • b9c2752d3 chore: bump version: 0.19.9-rc2 -> 0.19.9-rc3
    • 2330c977c fix: WorkspaceProjects dropdown (#5628)
    • e329ed54e chore: bump version: 0.19.9-rc1 -> 0.19.9-rc2
    • 45ef5f92f fix: ensure current user available in store when reloading app (#5595)
    • 440880f6e chore: bump version: 0.19.9-rc0 -> 0.19.9-rc1
    • 72167da94 fix: Tensorboard launch from experiment multi-select (#5610)
    • 499665a99 fix: hide Active Command UI unless it can be populated from API (#5600)
    • 2babd446f fix: closing jupyterlab modal updates url (#5599)
    • 3978d6ff0 fix: Cannot close workspace modal (#5598)
    • 7955b5405 chore: bump version: 0.19.9-dev0 -> 0.19.9-rc0
    • 60e63d7b9 chore: lock api state for backward compatibility check
    • 28eaeb7be refactor: break out Kubernetes resource pool object [DET-8709] (#5589)
    • 909f8ff06 feat: Add link to fork / continued trial [WEB-296] (#5588)
    • 32a34bb3c feat: sharded checkpoints (#5489)
    • 2696b691e chore: split knownroles from store [WEB-589] (#5580)
    • 8c055dbf8 ci: Revert "ci: temporarily disable test results upload. (#5592)"
    • 514eb2c89 chore: check and format examples (#5587)
    • a17a251d5 chore: eliminate webui warnings [WEB-666] (#5586)
    • bac53c92f ci: temporarily disable test results upload. (#5592)
    • 96992d4d8 chore: docs for launcher-provided pools (#5566)
    • 8e7f30e84 ci(circle-ci): install gke auth plugin (#5575)
    • 0eb07b3ff fix: workspace member (#5579)
    • c3d10aeb1 chore: update Kubernetes fetch-creds.sh to handle cert file path (#5524)
    • bfc506734 docs: Minor Enroot doc fixes (#5585)
    • 9d3828920 feat: Display experiment total checkpoint size [WEB-298] (#5554)
    • c7d1e6241 chore: split omnibar state out of store (#5553)
    • 33c9d2c4d docs: Increase Slurm dependency to 20.02 [FOUNDENG-377] (#5583)
    • 4efa6a2e7 chore: Remove storybook (#5576)
    • 66de8c4b5 fix: Report incorrect configuration message to stderr [FOUNDENG-370] (#5578)
    • d60d1a047 chore: Pinned workspaces in stores/workspaces [WEB-586] (#5522)
    • e05d17dbc chore: add structured task and trial logs (#5569)
    • a038c8fa2 ci: comment out the Ruby-based markdownlint checks (#5577)
    • ead8c10d6 feat: fix spinner after logout (#5572)
    • e377d2fb3 chore: inherit PBS & Slurm defaults (#5560)
    • 0f18bb8f9 chore: Remove check for deprecated init permissions (#5571)
    • e9365d407 feat: moved users and auth away from the Store (#5539)
    • 485971b5e feat: copy changes (#5561)
    • d1e2e8ea7 ci: fix TestRWCoordinatorLayer flake (#5563)
    • 7c58a72f6 chore: Split UserRoles out of Store (#5549)
    • 9899398c9 refactor: rewrite fluent lib without actors [DET-8303] (#5385)
    • 1854b050d fix: ContainerLog.ToEvent honors Level [FOUNDENG-370] (#5558)
    • 05c27544a chore: fix mypy false positive (#5559)
    • f7a90e2d7 chore: remove unused file (#5557)
    • c32ac364d chore: Split determined info from Store [WEB-584] (#5541)
    • b18d7dd3d fix: jupyter modal stalling and garbling input [WEB-579] (#5556)
    • bba309647 chore: remove unused setAgents store action (#5551)
    • 59950b80d chore: fix potential named statement leak (#5320)
    • 131bfb20b feat: add activity table and post user activity api [WEB-665] (#5518)
    • b35d4707e fix: group management edit and delete options (#5550)
    • 27e86ccd0 docs: NVIDIA Enroot support [FOUNDENG-329] (#5546)
    • 1856fd00b chore: update gpt-neox example and improve deepspeed launcher (#5527)
    • cd0aac880 chore: filter workspaces by userId (#5529)
    • 6bb1b3555 fix: use bash in bash script (#5540)
    • 3736b2c1d chore: split resource pool from Store Context (#5543)
    • 023cd8b72 chore: Update resource pool resolution & validation (#5538)
    • a2cc1112c chore: Moving active experiment and task queries to separate stores [WEB-581] [WEB-582] (#5521)
    • 1ae77fd5d feat: Oauth2 in Python SDK [DET-8504] (#5422)
    • 737371385 chore: schemas implement generic-like behavior (#3763)
    • 336ea28e1 chore: delete try_reauth from authentication logic (#5542)
    • cd060ada4 ci: add markdownlint and json schema (#5545)
    • cec6348fc chore: transformer image installs scikit-learn (#5547)
    • b81da3b42 chore: add Session.with_retry() (#5535)
    • 7695e3590 refactor: rewrite container lib without actors [DET-8301] (#5384)
    • 9265ba104 fix: Enable tasks filter by user [WEB-675] (#5525)
    • af269e5c5 chore: bump version: 0.19.8-dev0 -> 0.19.9-dev0
    • 7f741f421 docs: add release notes for 0.19.8 (#5537)
    • e07330e8e ci: move useful weekly tests to master branch, remove the rest [DET-8725] (#5530)
    • 4161fc622 fix: workspace list pagination (#5534)
    • c1a237f84 fix: sandbox experiment list settings for each project (#5531)
    • ca31dcca6 chore: bump antd (minor version update) (#5475)
    • 6a392b9ce feat: open tasks in existing tab [WEB-420] (#5528)
    • f60e8f492 fix: avoid error converting user requested stop for custom searcher (#5520)
    • 602816400 feat: add icons in experiment and trial detail pages (#5512)
    • aeb35f695 feat: WebUI support user agent group when creating user [WEB-638] (#5446)
    • 67010f7e2 test: speed up nightly test_protein_pytorch_geometric. (#5513)
    • 58de4d410 fix: Workspaces list, prevent settings update loop (#5519)
    • 16227106d feat: migrate projects to use store/context (#5509)
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.19.9_checksums.txt(752 bytes)
    determined-agent_0.19.9_darwin_amd64.tar.gz(9.25 MB)
    determined-agent_0.19.9_linux_amd64.deb(8.89 MB)
    determined-agent_0.19.9_linux_amd64.rpm(8.85 MB)
    determined-agent_0.19.9_linux_amd64.tar.gz(8.84 MB)
    determined-agent_0.19.9_linux_ppc64.deb(7.78 MB)
    determined-agent_0.19.9_linux_ppc64.rpm(7.73 MB)
    determined-agent_0.19.9_linux_ppc64.tar.gz(7.72 MB)
    determined-helm-chart_0.19.9.tgz(10.95 KB)
    determined-master_0.19.9_checksums.txt(759 bytes)
    determined-master_0.19.9_darwin_amd64.tar.gz(42.19 MB)
    determined-master_0.19.9_linux_amd64.deb(86.79 MB)
    determined-master_0.19.9_linux_amd64.rpm(86.69 MB)
    determined-master_0.19.9_linux_amd64.tar.gz(42.14 MB)
    determined-master_0.19.9_linux_ppc64.deb(84.11 MB)
    determined-master_0.19.9_linux_ppc64.rpm(83.87 MB)
    determined-master_0.19.9_linux_ppc64.tar.gz(39.32 MB)
  • 0.19.8(Dec 3, 2022)

    Release Notes

    0.19.8

    Changelog

    • 5eb5b45a4 chore: bump version: 0.19.8-rc2 -> 0.19.8
    • 95c6ef40f docs: add release notes for 0.19.8 (#5537)
    • 5c4c6002b chore: bump version: 0.19.8-rc1 -> 0.19.8-rc2
    • e8ea6b648 fix: sandbox experiment list settings for each project (#5531)
    • 0e8d941f2 chore: bump version: 0.19.8-rc0 -> 0.19.8-rc1
    • bcfca6ae5 fix: Workspaces list, prevent settings update loop (#5519)
    • b8cb0c162 feat: migrate projects to use store/context (#5509)
    • b2e7c4ed8 chore: bump version: 0.19.8-dev0 -> 0.19.8-rc0
    • 682745a51 chore: lock api state for backward compatibility check
    • 6b88d7176 chore: add pre-commit check for python (#5392)
    • 1c0677440 feat: det deploy aws enterprise edition (#5516)
    • 17a65b3c0 feat: retry transient failures on det experiment wait (#5393)
    • b0a3f29a6 fix: nightly test for custom searcher (#5515)
    • 2c3a54ed6 fix: Create workspace as non-admin [WEB-691] (#5517)
    • fd98a7dfa docs: Document PBS CUDA_VISIBLE_DEVICES requirement [FOUNDENG-359] (#5514)
    • 63e502a12 docs: Document PBS CUDA_VISIBLE_DEVICES requirement [FOUNDENG-359] (#5511)
    • 41dc8c16c chore: add tensorboardTimeout to Helm chart [DET-8716] (#5500)
    • 533171996 chore: changes k8s unable to find exit status from 137 -> 1025 [DET-8717] (#5507)
    • c2d050c35 chore: minor fixups (#5510)
    • e930e2e7c chore: bump Prettier v2.8.0 (#5505)
    • 071b6ef13 fix: rbac describe role fails with --json flag (#5506)
    • 2e492cb9a test: fix model registry test. (#5504)
    • 1a62e3d2a ci: fix test_workspace_org (#5503)
    • 5febd9826 fix: cli and sdk logout works without login (#5493)
    • 90f18a26d fix: Increase experiment icon font weight (#5488)
    • 38ead9830 chore: remove unnecessary cleanup (#5498)
    • 1e3041cfe feat: Add warning for submissions for requests that cannot currently be fulfilled [DET-6410] (#5376)
    • 09f3781f6 fix: aws instance id under IMDSv2. (#5494)
    • dd761a68a docs: Slurm/PBS override of default resource pools (#5496)
    • 5b3d8aece feat: Add slurm.gpu_type to expconf [FOUNDENG-338] (#5492)
    • 57dc77c11 chore: container defaults for PBS/slurm (#5485)
    • 3da4a456e chore: configure HPC resource pool providers (#5479)
    • e5d547750 fix: GetModelVersion lookup by version number. (#5408)
    • 1d8cda210 ci: disable Tensorflow 2.5-2.7 tests. (#5487)
    • 65562d640 ci: bump openjdk version. (#5490)
    • 95f704a00 chore: Move workspaces to their own store [WEB-674] (#5484)
    • 1c883d9ec refactor: restructure rm package (#5457)
    • 8fc3ce35a fix: Ensure authorized_keys permisison [FOUNDENG-342] (#5454)
    • 9dd5829af ci: fix rounding that created invalid timestamppb.Timestamps (#5482)
    • a0652de19 fix: solve flakiness for the Spinner tests (#5483)
    • 7e02d2bea fix: Experiment listing should not have spurious state triggers (#5477)
    • ecdb07369 fix: metrics selection (#5480)
    • 17c444de2 chore: tighten py bindings to_json advertised type (#5450)
    • f2fe59348 ci: bump python 3.7.11 -> 3.7.15. (#5476)
    • 8957c026b feat: useSettings improvements (#5187)
    • d85ee6c57 chore: more pbsbatch_args usage (#5458)
    • 8e317cb3e fix: trial comparison modal style (#5467)
    • 8d349cc41 fix: handle 'null'::jsonb when aggegrating resource size in proto_get_trials_plus.sql (#5464)
    • c05a5e2ad perf: remediate issues with cast + proto_checkpoints_view (#5465)
    • f55aa8655 chore: replace deprecated scss (#5460)
    • 2384e5681 chore: remove skipLibCheck (#5459)
    • 13fa0c1b7 refactor: rewrite docker lib without actors [DET-8300] (#4943)
    • 7832a93ef style: remove wall of echos from shell script (#5456)
    • fb4240b36 refactor: rewrite websocket lib without actors [DET-8299] (#5382)
    • 118777a1a fix: WebUI project loading msg (#5439)
    • 662a023ab fix: note page button issues (#5453)
    • ec01546a4 feat: WebUI use icon for experiment/trial state [WEB-237] (#5373)
    • 81031757f string interpolation log (#5455)
    • e0b95727a fix: Jupyter notebook iframes (#5434)
    • 1a79b1a70 fix: Revert changes to rankId and other numeric filters (#5452)
    • 931f79c9e feat: enable limited core api usage in NTSC (#5451)
    • b279bb5b0 feat: user mgmt functions in SDK + new user API from old API + CLI uses SDK [DET-8495, DET-8496] (#5206)
    • 08f40afd2 chore: bump Typescript (#5444)
    • 781e7543d chore: Add expconf slurm.gpu_type [FOUNDENG-338] (#5448)
    • c7d6923d5 chore: FOUNDENG-296 Determined shows PULLING state while container is spinning up (#5443)
    • 516f2afcc fix: editing projects in table [WEB-641] (#5445)
    • 75e8214dc docs: Document Singularity cache managment script [FOUNDENG-333] (#5437)
    • 3c82434bb feat: remove checkpoints from multi trial tabs [WEB-639] (#5440)
    • 4710747ba fix: remove noImplicitAny (#5416)
    • c4fe41034 fix: experiment with null notes can't load (#5429)
    • aa94e65dc revert: Add webhooks_base_url to det deploy (#5432)
    • dc93b519b fix: timeAgo tooltip position (#5419)
    • 2aba2a468 fix: sync modal data when cancel (#5433)
    • df231ecad ci: bump resource class to avoid test-intg-master flakes (#5438)
    • cb7e0826b feat: Trial and Task log filters [WEB-239] [WEB-240] (#5420)
    • 4356faee0 Revert "chore: Revert content-security-policy change" (#5424)
    • c19e23f5f chore: make patch user atomic [DET-8659] (#5428)
    • bf02b05a4 feat: Add webhooks_base_url to det deploy (#5354)
    • 5a476ac60 feat: master and agent instances created with IMDSv2 support [DET-7987] (#5421)
    • 43e176e0a chore: bump version: 0.19.7-dev0 -> 0.19.8-dev0
    • 1b3cb5522 docs: add release notes for 0.19.7 (#5425)
    • ff7ad9e7d chore: Revert content-security-policy change
    • 63dc63a22 chore: Add content-security-policy via meta tag, webpack plugin [WEB-310] (#5414)
    • 5a3acc18b fix: fix copy button (#5417)
    • 75969d3ec fix: better name for hermesfilters; listen for onReset for filters (#5405)
    • bf0d2a8a5 fix: WebUI remove checkpoint storage config (#5409)
    • f29cfd0e0 fix: avoid panic in getTasks (#5412)
    • a453ed552 fix: clear table settings between users and groups pages (#5415)
    • e1270e5fd fix: adds user to group when create group through external token (#5413)
    • cbd92d332 fix: workspace header in mobile view (#5406)
    • 6fa66cabd chore: migration for adding newly created scim users to usergroups (#5410)
    • fb10b6fc9 build: lint and fmt shell scripts [DET-7566] (#5389)
    • 384a2093a fix: create personal groups for other ways a user can be created (#5407)
    • 9933a464e fix: Avoid writing to users known_hosts [FOUNDENG-314] (#5403)
    • 8ba901e1f chore: remove unused slurm options validation (#5401)
    • 87982d088 fix: handle webhook testing errors (#5404)
    • 3e5492646 fix: Skip role fetch when rbac not enabled (#5402)
    • e6d5f6d29 fix: sort by log severity level (#5390)
    • 5c7d5c7a3 fix: update webhook payload check in e2e test (#5377)
    • ccb1e9535 ci(test-e2e): add e2e test mark as cluster label (#5368) [INFENG-5]
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.8_checksums.txt (752 bytes)
    determined-agent_0.19.8_darwin_amd64.tar.gz (9.24 MB)
    determined-agent_0.19.8_linux_amd64.deb (8.87 MB)
    determined-agent_0.19.8_linux_amd64.rpm (8.84 MB)
    determined-agent_0.19.8_linux_amd64.tar.gz (8.84 MB)
    determined-agent_0.19.8_linux_ppc64.deb (7.78 MB)
    determined-agent_0.19.8_linux_ppc64.rpm (7.72 MB)
    determined-agent_0.19.8_linux_ppc64.tar.gz (7.71 MB)
    determined-helm-chart_0.19.8.tgz (10.95 KB)
    determined-master_0.19.8_checksums.txt (759 bytes)
    determined-master_0.19.8_darwin_amd64.tar.gz (42.11 MB)
    determined-master_0.19.8_linux_amd64.deb (86.63 MB)
    determined-master_0.19.8_linux_amd64.rpm (86.52 MB)
    determined-master_0.19.8_linux_amd64.tar.gz (42.05 MB)
    determined-master_0.19.8_linux_ppc64.deb (83.95 MB)
    determined-master_0.19.8_linux_ppc64.rpm (83.72 MB)
    determined-master_0.19.8_linux_ppc64.tar.gz (39.24 MB)
  • 0.19.7 (Nov 15, 2022)

    Release Notes

    0.19.7

    Changelog

    • b6c8b9490 chore: bump version: 0.19.7-rc2 -> 0.19.7
    • 806c03381 docs: add release notes for 0.19.7 (#5425)
    • 95c3c48a5 chore: bump version: 0.19.7-rc1 -> 0.19.7-rc2
    • a5227912d fix: fix copy button (#5417)
    • 6fe00372d fix: better name for hermesfilters; listen for onReset for filters (#5405)
    • 8c5a63ee5 fix: WebUI remove checkpoint storage config (#5409)
    • 3de9bdb87 fix: Skip role fetch when rbac not enabled (#5402)
    • 64ccb0a92 fix: avoid panic in getTasks (#5412)
    • 51f2e98cb fix: adds user to group when create group through external token (#5413)
    • 3d88e34b4 chore: bump version: 0.19.7-rc0 -> 0.19.7-rc1
    • 1d8f46152 chore: migration for adding newly created scim users to usergroups (#5410)
    • 707878667 fix: create personal groups for other ways a user can be created (#5407)
    • ef75386c7 fix: workspace header in mobile view (#5406)
    • 817abe3af fix: update webhook payload check in e2e test (#5377)
    • 3bdad4176 chore: bump version: 0.19.7-dev0 -> 0.19.7-rc0
    • b2746c04b chore: lock api state for backward compatibility check
    • e1b26a5f8 feat: WebUI expand workspace configuration [WEB-33] (#5396)
    • fb5ea8b22 chore: Rbac oss audit logging [DET-8309] (#5378)
    • 76fb4f72f fix: redirect url to login page (#5399)
    • 0075d014e chore: pause shared-web ci checks (#5388)
    • b149ee760 ci: increase RAM for master build (#5397)
    • 822692539 chore: Move Agents into their own context, introduce Loadable abstraction (#5358)
    • 20f548d19 fix: Workspaces member page refetch after update [WEB-621] (#5391)
    • 5db6ef49a chore: cli allow workspace --checkpoint-storage-config-file to use yaml (#5394)
    • cd6c3cd5e fix: typescript inside keys function (#5383)
    • 2c08b9986 fix: group together 4xx and 5xx HTTP responses for prometheus [DET-8336] (#4996)
    • b81c4b44e ci(remote-docker): pin version to 20.10.18 (#5369)
    • 85536a778 ci: add pre-commit execution action (#5379)
    • 264f48925 docs: RBAC v1. (#5298)
    • 892ef4afa ci: enable yamllint [INFENG-116] (#5359)
    • 282ddc05a fix: TorchWriter.reset() closes SummaryWriter (#5375)
    • f9d8cfe2f fix: enter to submit (#5374)
    • 6472abe6f chore: support out-of-k8s dev workflow (#5237)
    • bbebe348a chore: generate PATCH-friendly bindings (#5325)
    • 3221e4e39 fix: allow workspace admins to change role assignments in webui. (#5371)
    • cb7156a53 feat: checkpoint storage config per workspace [DET-8350] (#5309)
    • 83b201d69 ci: restrict token perms & change head ref
    • 288c69c2e fix: improve previous tag logic (#5367)
    • 44000a644 fix: validate username and display name in PostUser. (#5366)
    • 1475136fe ci: remove -f from remote
    • 62c6e3e17 ci: add throwaway user information for rebase
    • 23b34cbb7 ci: use head_ref for rebase if available; else ref
    • 752c7fa32 ci: switch rebase check to use https
    • 3a26aef52 ci: Move conditional down to steps
    • 7231c1e84 ci: Switch checkout method [INFENG-98] (#5364)
    • 2ebf33666 chore: Enable library checks for TypeScript; update types [WEB-191] (#5360)
    • 68b620533 docs: remove references to PBT (#5257)
    • c8cfcaa67 ci: add more credentials [INFENG-98] (#5363)
    • 65f6df309 fix: Shiyuan's suggestion to release note (#5362)
    • f16095d31 ci: fix action env vars
    • 342e5d8ca ci: Add rebase check on PR/main push [INFENG-98] (#5190)
    • 774275ca3 Revert "fix: shiyuan's change to custom searcher release note"
    • f3c6dbec7 Revert "lint"
    • b902fd46d lint
    • b2b3be5ff fix: shiyuan's change to custom searcher release note
    • 540d486c7 ci: Fix previous tag logic (#5103)
    • c82d0afb1 fix: Group management page role edit [WEB-544] (#5339)
    • 572924719 chore: rbac refactor authz iface (#5343)
    • 149f9d9b6 Add release notes header with link [INFENG-115] (#5327)
    • 1ad39c34a chore: bump version: 0.19.6-dev0 -> 0.19.7-dev0
    • fab75ccef docs: add release notes for 0.19.6 (#5357)
    • f2c051b7f feat: remove webhooks feature flag [WEB-560] (#5345)
    • 28c10ed71 docs: fix tutorial link (#5355)
    • 5adb2e27c refactor: rbac protos: move from is_global to scope type masks. (#5346)
    • e4a9e021c fix: signed payload generation (#5353)
    • 3c8b2f11e chore: Replace react notebook library with notebookjs [WEB-77] (#5284)
    • 6e19529b4 feat: add stable diffusion textual inversion example (#5280)
    • de0272d5c chore: replace enum with object as const (#5348)
    • 761cf7295 fix: overflow of long names without spaces [WEB-485] (#5351)
    • c9ddf7731 fix: fix roll polling in non rbac instances (#5349)
    • 781716fc2 chore: add field for internal web UI use to identify product context (#5342)
    • 0d4281e16 chore: introduce useui as a separate store [DET-8575] (#5338)
    • f3319c0e6 chore: add style lint fix (#5337)
    • f0dbef885 docs: Explicitly document Apptainer as supported [FOUNDENG-294] (#5344)
    • 2747638f0 feat: Create Webhook Sender [WEB-213] (#5258)
    • 623976d12 feat: when user can't create workspaces, show a disabled button [WEB-553] (#5341)
    • 2f5a5ab9c fix: duplicated messages logged at INFO and ERROR levels. (#5323)
    • 0a651a3af fix: add and edit workspace member UI [WEB-552] (#5335)
    • f41a90d8f chore: don't tell users to contact admin for wrong password or wrong username (#5336)
    • 3ce6cd777 chore: HPC job ID is logged in experiment log. (#5315)
    • aaba9069b feat: Non-global permissions assignable globally [WEB-539] (#5324)
    • 9f76ecbc3 feat: User can change their own username [WEB-238] (#5304)
    • 6a663c530 fix: improve chart interactivity (#5326)
    • 5f8c3e285 chore: simplify ptrs.Ptr (#5330)
    • 0370f546d fix: login loading (#5334)
    • 6ec84d105 chore: log error when experiment is unrestorable (#5333)
    • 02c5ab256 fix: HP Parallel Coordinates remembers filters as data comes in [WEB-279] (#5243)
    • 99827d79d fix: exclude allocations without start times from aggregation (#5329)
    • 9cf034b0c fix: account for incompatible pbs-related expconf change (#5332)
    • b646464a0 feat: support streaming logs in Python SDK [MLG-46] (#5174)
    • b70f8a483 chore: delete dead code to fix flake8 (#5331)
    • 078017823 fix: Poll roles, remove canGetPermissions permission (#5322)
    • 1f1fd81e5 chore: change list-users-roles and list-groups-roles to return assignment info (#5286)
    • 45d30bcaa feat: implement feature switch [WEB-535] (#5310)
    • 23d951a5d docs: Notification doc zapier [WEB-216] (#5305)
    • 039fd0d7e docs: slots_per_nodes for PBS & Slurm (#5314)
    • 9edc784db docs: Add PodMan requirements and known issues [FOUNDENG-289] (#5311)
    • ea333d6ce fix: move permission denied error to not encounter hash error (#5319)
    • 6c814bedc chore: Revert "fix: remove duplicate event message." (#5316)
    • e4b1494f4 chore: add FieldMask type to apiutils (#5247)
    • 677dbc956 test: fix flaky custom searcher test (#5317)
    • 2a76a70ed feat: actually support --device strings (#5287)
    • 722dfdb34 fix: dont trigger loading state on experiment selection (#5299)
    • 58d2a2793 chore: replace enum with object as const (#5308)
    • 1ef521062 fix: layout of logview (#5307)
    • c711c4321 ci: fix missing say command (#5313)
    • 13760a39f fix: tweak live docs server. (#5293)
    • b3ff34c8e fix: remove duplicate event message. (#5252)
    • 2d4eff4f4 feat: Changing and removing roles from workspace members list (#5283)
    • 4fae14f48 fix: remove compare action for experiments (#5302)
    • 7f33e1558 fix: dont show loading state when polling trial details (#5301)
    • 5ae91f179 chore: API and DB error for invalid input (#5212)
    • bdf5b3ec5 fix: UI improvement when no permission [WEB-532] (#5274)
    • 2dd8e71d9 fix: det rbac describe-role list global assignments for users. (#5297)
    • 10a866f8a fix: dont show scrollbars in table cells (#5290)
    • 02511f2e9 chore: take out manual shared testing instructions (#5270)
    • f40c99255 chore: Revert "chore: update to Stylelint v14 (#5238)" (#5296)
    • e0b4b5447 fix: Avatar text color change (DET-8237) (#5246)
    • 515d25b0d fix: account for incompatible expconf change (#5292)
    • ae4282eb4 docs: add documentation for setting up Slack Webhooks [WEB-215] (#5278)
    • 1a2acad37 fix: fetchMyRoles on login, only if rbac is on (#5291)
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.7_checksums.txt (752 bytes)
    determined-agent_0.19.7_darwin_amd64.tar.gz (9.24 MB)
    determined-agent_0.19.7_linux_amd64.deb (8.87 MB)
    determined-agent_0.19.7_linux_amd64.rpm (8.84 MB)
    determined-agent_0.19.7_linux_amd64.tar.gz (8.83 MB)
    determined-agent_0.19.7_linux_ppc64.deb (7.78 MB)
    determined-agent_0.19.7_linux_ppc64.rpm (7.72 MB)
    determined-agent_0.19.7_linux_ppc64.tar.gz (7.71 MB)
    determined-helm-chart_0.19.7.tgz (10.81 KB)
    determined-master_0.19.7_checksums.txt (759 bytes)
    determined-master_0.19.7_darwin_amd64.tar.gz (42.03 MB)
    determined-master_0.19.7_linux_amd64.deb (86.42 MB)
    determined-master_0.19.7_linux_amd64.rpm (86.31 MB)
    determined-master_0.19.7_linux_amd64.tar.gz (41.95 MB)
    determined-master_0.19.7_linux_ppc64.deb (83.76 MB)
    determined-master_0.19.7_linux_ppc64.rpm (83.53 MB)
    determined-master_0.19.7_linux_ppc64.tar.gz (39.17 MB)
  • 0.19.6 (Oct 28, 2022)

    Release Notes

    0.19.6

    Changelog

    • 12adc4752 chore: bump version: 0.19.6-rc6 -> 0.19.6
    • 2766fd1e7 docs: add release notes for 0.19.6 (#5357)
    • 4cb7414fa chore: bump version: 0.19.6-rc5 -> 0.19.6-rc6
    • ad4aca79c chore: bump version: 0.19.6-rc4 -> 0.19.6-rc5
    • 54be3a640 chore: bump version: 0.19.6-rc3 -> 0.19.6-rc4
    • 68013ba8a chore: bump version: 0.19.6-rc2 -> 0.19.6-rc3
    • 1d124b164 fix: account for incompatible pbs-related expconf change (#5332)
    • 5ecd13659 chore: bump version: 0.19.6-rc1 -> 0.19.6-rc2
    • a19bbf0b9 fix: dont trigger loading state on experiment selection (#5299)
    • 05ad8b6a0 fix: layout of logview (#5307)
    • 846e3e3c3 fix: remove compare action for experiments (#5302)
    • 96cde3940 fix: dont show loading state when polling trial details (#5301)
    • 89a3b60d8 chore: bump version: 0.19.6-rc0 -> 0.19.6-rc1
    • b42a431f9 chore: Revert "chore: update to Stylelint v14 (#5238)" (#5296)
    • 88f75e333 fix: dont show scrollbars in table cells (#5290)
    • 073d79a66 fix: fetchMyRoles on login, only if rbac is on (#5291)
    • 40dbeacf9 fix: account for incompatible expconf change (#5292)
    • 4f849d6d9 chore: bump version: 0.19.6-dev0 -> 0.19.6-rc0
    • ba7ae157f feat: Trials Comparison Frontend (#4820)
    • 7c20754b7 chore: update to Stylelint v14 (#5238)
    • d140af079 fix: keyboard scrolling in logs [WEB-463] (#5266)
    • fc05dfd80 fix: lockdown NTSC EE rebase import cycle (#5285)
    • 60e5fe145 feat: custom searcher (#4424)
    • 4b8c94f0b docs: Add PBS documentation [FOUNDENG-184] (#5232)
    • 013b3d9bc docs: external notification general doc [DET-219] (#5244)
    • 020d23aa0 fix: remove =true from sso url querystring (#5282)
    • 6c54d3809 chore: improve project not found error message in create experiment (#5277)
    • 36e4b3569 fix: position resize shadow under cursor at start (#5242)
    • dd3e39d80 feat: support configuring agent self-shutdown for det deploy aws (#5241)
    • 644e747df fix: Wire up user edit modal to rbac endpoints (#5261)
    • 340c8a1c1 chore: remove unnecessary getPath usage (#5272)
    • 5db81f6e1 refactor: Add ability to lockdown NTSC [DET-8276, DET-7377] (#5260)
    • a487eab7d refactor: authz checkpoints [DET-8533] (#5233)
    • 3ea55a7ef fix: add version check for protoc to prevent confusion (#5271)
    • 7ac9e7e1a fix: workspace page tab route (#5269)
    • c4b7b5018 chore: add to git blame ignore list (#5262)
    • 7b264a3c8 fix: Workspace member modal handling of groups and roles (#5268)
    • aa9f1ba86 chore: make web_lint_check.py executable (#5265)
    • 781bff654 fix: fix getPath (#5256)
    • 21355bf32 fix: check target user, allow \wadmin\w names but not admin itself [WEB-529] (#5251)
    • 2fee696da feat: page listing Webhooks [from main repo] [WEB-211] (#5259)
    • ad8e6ef71 fix: selector fixes in log viewer (#5192)
    • b3f253f3c docs: fix fypos, grammar edits for clarification, etc. (#5186)
    • 67890df6f fix: Load workspace members from RBAC API (#5253)
    • 81ac80478 chore: PBS & Slurm options for commands (#5219)
    • cef943026 fix: decorate PBS & slurm options with ,omitempty (#5230)
    • 8cf354c60 feat: support -i/--include [MLG-194] (#5193)
    • 77ff49a35 chore: bump version: 0.19.5-dev0 -> 0.19.6-dev0
    • 5ce358ca8 docs: add release notes for 0.19.5 (#5227)
    • 794742b55 build(deps): bump amannn/action-semantic-pull-request from 4 to 5 (#5239)
    • 79a163078 feat: ls for all applicable cli options [MLG-193]
    • c5b535b7d fix: trial workloads sort and filter (#5228)
    • 7096deb14 fix: Roles should be fetched immediately when logging in (#5229)
    • b32039d77 chore: update to React18 (#5226)
    • f88fddc03 fix: register checkpoint "new model" workflow [WEB-499] (#5215)
    • 8ee7e8c26 perf: remove unused getExpValHistory calls (#5172)
    • 7e7cd9643 refactor: Authz tasks [DET-8367] (#5209)
    • dc6160874 fix: hide uncategorized from nav, when it is not available to RBAC user (#5203)
    • 10f5d7bae feat: Add CAN_EDIT_WEBHOOKS permission to pre-canned admin role [WEB-218] (#5200)
    • 552bef9c4 chore: update to react-router-dom v6 (#5222)
    • fccaecac6 test: enables linting for master integration tests (#5223)
    • b35a55970 fix: correctly aggregate allocation resources by agent label. (#5214)
    • b9803c0d3 chore: Round Robin scheduler message (#5216)
    • cfdd7bb0d fix: custom error requires permission object (#5221)
    • bdb41a824 build: fix check if test-intg-downstream could be skipped (#5115)
    • 999eb800e feat: add echo authentication by default (#5008)
    • 906356ead chore: new react-router components (#5166)
    • d285e8fd7 chore: add custom permission [DET-8526] (#5195)
    • b5d018541 fix: replace shell script with python script for Pre-Commit (#5220)
    • 98ef84e97 fix: fix the error message for auto checkpoint download (#5201)
    • 4806f1f74 ci: only run lint in package-and-push for release workflows. (#5191)
    • c091de954 chore: revert "fix: use theme var in Avatar stylesheet [DET-8237] (#5109)" (#5211)
    • b8155bdda fix: table ui follow up (#5207)
    • 8f01284da fix: only call fetchKnownRoles in RBAC, closing testing issue (#5210)
    • 78b2b9dde refactor: authz allocations [DET-8366. DET-7971] (#5178)
    • 8a7e7999a chore: usePolling on fetchMyRoles (#5204)
    • 48ac9af20 fix: RBAC calls getPermissionsSummary, changes to admin's use of listRoles [WEB-517] (#5185)
    • cf2df493a chore: remove accidentally added attributions file (#5202)
    • 20fde081a fix: resolve table UI issues (#5199)
    • 08bab936f fix: remove unnecessary loading state (#5181)
    • 37fe428ef fix: page transition in multi trial page (#5198)
    • 106e2313e feat: rbac authz refactor for user groups [DET-8478] (#5136)
    • 522d192ed fix: avoid using lookbehind regex (#5197)
    • 40b3739bf fix: stop infinite rerender in move experiment modal (#5194)
    • 0d608e558 fix: check for horovod backend in PyTorchTrial (#5180)
    • 0429c648c ci: bump cache buster version (#5189)
    • 3cecf5014 feat: Add Webhook CRUD API [WEB-212] (#5175)
    • 8f01b4611 feat: enable configuring the agent to shut down on connection failure (#5044)
    • cad902c7f feat: support security.authz config option in helm chart. (#5183)
    • 5d5be28bd chore: refactor slurm/pbs options to expconf (#5150)
    • 1703adfeb fix: do not publish helm chart for *-rc releases [DPS-260] (#5182)
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.6_checksums.txt (752 bytes)
    determined-agent_0.19.6_darwin_amd64.tar.gz (9.24 MB)
    determined-agent_0.19.6_linux_amd64.deb (8.87 MB)
    determined-agent_0.19.6_linux_amd64.rpm (8.84 MB)
    determined-agent_0.19.6_linux_amd64.tar.gz (8.84 MB)
    determined-agent_0.19.6_linux_ppc64.deb (7.78 MB)
    determined-agent_0.19.6_linux_ppc64.rpm (7.72 MB)
    determined-agent_0.19.6_linux_ppc64.tar.gz (7.71 MB)
    determined-helm-chart_0.19.6.tgz (10.81 KB)
    determined-master_0.19.6_checksums.txt (759 bytes)
    determined-master_0.19.6_darwin_amd64.tar.gz (41.97 MB)
    determined-master_0.19.6_linux_amd64.deb (83.55 MB)
    determined-master_0.19.6_linux_amd64.rpm (83.43 MB)
    determined-master_0.19.6_linux_amd64.tar.gz (41.91 MB)
    determined-master_0.19.6_linux_ppc64.deb (80.88 MB)
    determined-master_0.19.6_linux_ppc64.rpm (80.62 MB)
    determined-master_0.19.6_linux_ppc64.tar.gz (39.10 MB)
  • 0.19.5 (Oct 11, 2022)

    Release Notes

    0.19.5

    Changelog

    • 1d0fe5906 chore: bump version: 0.19.5-rc2 -> 0.19.5
    • c8b90b294 docs: add release notes for 0.19.5 (#5227)
    • 56372bb8b fix: fix the error message for auto checkpoint download (#5201)
    • 0e9f57bda chore: bump version: 0.19.5-rc1 -> 0.19.5-rc2
    • c3248baef chore: revert "fix: use theme var in Avatar stylesheet [DET-8237] (#5109)" (#5211)
    • 23c2acb57 fix: table ui follow up (#5207)
    • da655cb79 chore: bump version: 0.19.5-rc0 -> 0.19.5-rc1
    • 1ce54657c fix: resolve table UI issues (#5199)
    • 453504907 fix: page transition in multi trial page (#5198)
    • 62b9fab6d fix: avoid using lookbehind regex (#5197)
    • e2a1ba80b feat: enable configuring the agent to shut down on connection failure (#5044)
    • 4a2ea91b6 fix: do not publish helm chart for *-rc releases [DPS-260] (#5182)
    • 52cceb9a3 chore: bump version: 0.19.5-dev0 -> 0.19.5-rc0
    • 2183501d6 chore: lock api state for backward compatibility check
    • e59380ab8 fix: streaming errors not being handled (#5171)
    • ee1195e75 feat: agent user group settings per workspace [DET-8472, DET-7547] (#5122)
    • 21e29b8db fix: use theme var in Avatar stylesheet [DET-8237] (#5109)
    • 07309d5ec chore: streamline local frontend against remote cluster (#4593)
    • 12b42efbe fix: Sort workloads by training and validation metrics [WEB-430] (#5167)
    • ec90ab1a7 feat: short options for RBAC CLI. (#5169)
    • 08e270ad6 test: fix test-k8-mount. (#5153)
    • 2dcbab9e9 fix: Table reordering fixed (#5159)
    • a70db1fa8 chore: update readme for web (#5168)
    • 11ed4c389 fix: dont show unknown error popups for telemetry
    • a2a66b719 chore: lint-staged for web (#5155)
    • f920bde0c fix: more correctly compute TLS cert hash in CLI (#5164)
    • 3b22f70eb chore: Convert permission.name string to permission.id enum [DET-8464] (#5121)
    • 316413a0c fix: new tab to open Jupyter (#5145)
    • 79f029609 fix: remove workspace members view mock special casing (#5138)
    • 6ac26317f fix: FOUNDENG-246 Task reports "RuntimeError: Dataset not found or corrupted" (#5160)
    • e54b33ad0 chore: replace router hooks for react-router-dom v6 (#5142)
    • 9a31b8585 fix: regression in master user auth error message. (#5152)
    • d7b9bc5e1 feat: groups and users to workspace proto (#5085)
    • 704ea4f19 feat: cli: support downloading checkpoints through master (#5083)
    • 80e880ba5 feat: support slots_per_trial=0 in Trial classes (#5035)
    • 8bd936038 build: setup Pre-Commit (#5116)
    • ef26d7fe5 fix: handle transient GCS errors in Tensorboard upload [DET-8491] (#5151)
    • cf54175c2 fix: shells fail if vars contain a newline [FOUNDENG-251] (#5148)
    • 8df93c5a3 fix: adjust elevation for dark mode (#5147)
    • 058713a20 chore: Placeholder when notes/markdown is disabled [DET-8409] (#5131)
    • f5b7c9915 feat: Table component performance enhancement (#5056)
    • 4d0adddef feat: New Project button/message checks permission [DET-8374] (#5114)
    • 1d0963c97 feat: checkpoint download through master (#4989)
    • 2a16f8138 perf: fix memory leak warnings (#5139)
    • 61103df3c fix: remove temporary text (#5141)
    • c82be8705 ci: cache more Go files (#5140)
    • 6d8a49575 fix: doc typos (#5127)
    • a72305f75 chore: only request listRoles with rbac on [DET-8419] (#5132)
    • c596cd963 docs: bring back analytics. (#5134)
    • 027ef1635 feat: Trials Comparison backend (#4543)
    • 41deb4d17 chore: replace use of XORTrials with OneVarTrials in TestPyTorchTrial [MLG-42] (#5107)
    • 91bd0e335 fix: null check (#5130)
    • 5320aec91 docs: fix user-reported errata. (#5133)
    • e896981ca feat: RBAC for experiment actions [DET-8372] (#5069)
    • 47dd822a5 fix: tensorboard inherits imagePullSecrets from experiment [DET-8458] (#5123)
    • 5fc93fd66 feat: Add "Members" view to workspace page [DET-8219] (#5113)
    • bd57ec482 ci: Switch to token w/ higher quota [INFENG-100] (#5125)
    • 8f2e29e4e chore: oss refactor for auto assign WorkspaceCreator role plugin (#5075)
    • 26bfdcf39 chore: move polling hook to shared (#5078)
    • fc778f30b feat: rbac CLI [DET-7868] (#5061)
    • de4b7bc01 chore: reuse created loggers (#5077)
    • 010382bc8 fix: prevent reattach deadlock upon container reattach (#5112)
    • e1f9502dd chore: bump version: 0.19.4-dev0 -> 0.19.5-dev0
    • 38813e85e docs: add release notes for 0.19.4 (#5110)
    • af8be294f build: skip more jobs for web-only prs (#5117)
    • 672eda983 fix: fix unshared dependency [DET-8462] (#5118)
    • 08c14723a chore: fix package-lock format compatibility (#5073)
    • 8d9bbf56e fix: remove as any (#5089)
    • f6faa5752 fix: No Permissions page vs. 404 page (#5088)
    • db3c53b39 feat: Reconnect to Slurm jobs on startup (FOUNDENG-215) (#5104)
    • a9c2a8804 chore: include running eslint fix in fmt target (#5096)
    • dd59244bf chore: remove check from pytorch [MLG-182] (#5071)
    • 02a8f45b2 fix: fix useSettings bug (#5102)
    • d74d5ac4a fix: test_efficientdet_coco_pytorch_const failing (#5097)
    • f7098e2a6 fix: Initial comm failure should provide explaination (FDN-217) (#5092)
    • ea5851451 fix: new states can be paused, canceled, killed [DET-8449] (#5090)
    • 9a0bcb79e fix: display primitive hps correctly in parallel coordinates plot (#5091)
    • 399435ccb fix: allow sort of metrics under avg_metrics [DET-8408] (#5086)
    • 810071118 ci: store docs output as a single file (#5081)
    • 80df08b9d refactor: improve storybook [DET-8099] (#5011)
    • 1734a848d fix: upload helm chart to GitHub release artifacts [INFENG-93] (#5064)
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.5_checksums.txt (752 bytes)
    determined-agent_0.19.5_darwin_amd64.tar.gz (9.22 MB)
    determined-agent_0.19.5_linux_amd64.deb (8.85 MB)
    determined-agent_0.19.5_linux_amd64.rpm (8.82 MB)
    determined-agent_0.19.5_linux_amd64.tar.gz (8.81 MB)
    determined-agent_0.19.5_linux_ppc64.deb (7.76 MB)
    determined-agent_0.19.5_linux_ppc64.rpm (7.70 MB)
    determined-agent_0.19.5_linux_ppc64.tar.gz (7.69 MB)
    determined-helm-chart_0.19.5.tgz (10.77 KB)
    determined-master_0.19.5_checksums.txt (759 bytes)
    determined-master_0.19.5_darwin_amd64.tar.gz (41.83 MB)
    determined-master_0.19.5_linux_amd64.deb (74.34 MB)
    determined-master_0.19.5_linux_amd64.rpm (74.22 MB)
    determined-master_0.19.5_linux_amd64.tar.gz (41.76 MB)
    determined-master_0.19.5_linux_ppc64.deb (71.67 MB)
    determined-master_0.19.5_linux_ppc64.rpm (71.42 MB)
    determined-master_0.19.5_linux_ppc64.tar.gz (38.96 MB)
  • 0.19.4 (Sep 23, 2022)

    Changelog

    • 1b50f1c2b chore: bump version: 0.19.4-rc3 -> 0.19.4
    • 424cbaf00 docs: add release notes for 0.19.4 (#5110)
    • 11b868894 chore: bump version: 0.19.4-rc2 -> 0.19.4-rc3
    • 3ce78d358 fix: consolidate useSettings changes (#5106)
    • 555b583ab chore: bump version: 0.19.4-rc1 -> 0.19.4-rc2
    • 25929ee6a fix: fix useSettings bug (#5102)
    • 2197af14b chore: bump version: 0.19.4-rc0 -> 0.19.4-rc1
    • ac8a69a58 fix: new states can be paused, canceled, killed [DET-8449] (#5090)
    • 375b7c9cb fix: display primitive hps correctly in parallel coordinates plot (#5091)
    • 7dffc938f fix: upload helm chart to GitHub release artifacts [INFENG-93] (#5064)
    • 56daac28d chore: bump version: 0.19.4-dev0 -> 0.19.4-rc0
    • 89bdccf9f chore: lock api state for backward compatibility check
    • f8dc2c863 fix: bumpenv again after moving pip install protobuf (#5082)
    • ad92bd882 chore: Avoid error from RBAC listRole endpoint [DET-8395] (#5079)
    • 64640fd70 chore: format code (#5080)
    • ae0207d40 revert: work around a bum pyzmq build (#5074)
    • 85ce9a133 feat: WebUI edit permission at user profile [DET-8224] (#5068)
    • 97408f36c feat: Add state wait to tensor board [DET-8273] (#5009)
    • 2bca6373d fix: support setting agent username and group in user APIs. (#5055)
    • 9fc986551 ci(weekly-vuln-scan): remove superseded workflow [INFENG-94] (#5051)
    • c29e3ce46 fix: Trial.TotalCheckpointSize incorrect [DET-8399] (#5063)
    • cef4492db docs: Slurm installation feedback from user installs (#5048)
    • bfbe14182 build: speed up parallel runs for react make (#5065)
    • f075bffcf chore: add lint for Prettier and Ignore format code commit (#5059)
    • ca9cc8163 ci(scan-docker-images): fix sarif upload job [INFENG-94] (#5050)
    • de9b47c39 fix: handle pin when its not defined (#5062)
    • 8b8c2c8b8 feat: permissions to create and view workspaces (#5045)
    • 85396ebc4 chore: format code (#5058)
    • 2890b3380 feat: add cluster_admin permission route (#5024)
    • 2b3cad5af chore: test PyTorch AMP with gradient aggregation [DET-6105] (#4987)
    • d8b6c471a feat: WebUI add RBAC feature switch [DET-8352] (#5036)
    • f2fe16374 fix: avoid pip executable when upgrading pip in circleci (#5057)
    • 7248ef8e6 fix: set auth cookie client side for external flows [DET-8310] (#4967)
    • 999d70b2e fix: WebUI codeview test mock [DET-8351] (#5047)
    • 079e8af7f feat: Modal to add role to group [DET-8220] (#5022)
    • f14e5be3c chore: bumpenvs for ROCm changes (#5026)
    • 62756f5d5 feat: setup Prettier (#5033)
    • d39b55d70 chore: add .prettierrc.js (#5034)
    • 161285b6e chore: personal groups get automatically created for users [DET-8363] (#5025)
    • d4ed2e6e0 ci: work around a bum pyzmq build (#5029)
    • 71805d1db chore: replacing globalOnly with isGlobal in web code (#5028)
    • cbac3594a fix: add alignment in tables (#4995)
    • 8fe208f8a fix: update shell and star icons (#5019)
    • 2e7458b97 fix: remove det deploy aws vpc deployment type, fix govcloud agent AMI. (#5023)
    • 2da9d1490 chore: permissions and permission summary proto (#5020)
    • 8abd939b2 feat: add jupyter notebook files support (#5004)
    • 86ffb0825 fix: duplicate checkpoints returned by listing checkpoint routes (#4894)
    • 273eb84e0 feat: add the useSettings to the view code (#4961)
    • 8636ad4e5 chore: update slurm-known-issues (#4892)
    • f7322ad6f feat: add stub RBAC API (#4990)
    • bbf59677f chore: split out ui store and its actions [DET-8218]
    • 1b907fb2f feat: Redesigning active-state for experiment, trial, and task [DET-7278] [DET-7801] (#4420)
    • 946be9eea fix: don't attempt to remove zero checkpoints (#4986)
    • 7fdf0e075 fix: .detignore interprets wildcards like .gitignore [DET-7094] (#4998)
    • be5471d9c test: WebUI add test for settings account (#4980)
    • 311cb1172 feat: pin experiments (#4925)
    • b67a11bd0 chore: bump version: 0.19.3-dev0 -> 0.19.4-dev0
    • b46792989 docs: add release notes for 0.19.3 (#4997)
    • d3e8fd5a9 docs: proto required version to 3.15 to support optionals (#4992)
    • 75cf3c51b chore: add streaming to bindings and use it (#4942)
    • f6f2f5416 feat: add cache control headers to static web assets [DET-7450] (#5005)
    • 373df82d6 chore: add expconf environment.pbs (#4982)
    • b39299944 feat: add encoding to the file path (#4981)
    • b1e91f002 docs: Updates for ROCm support with Slurm (FOUNDENG-128) (#4985)
    • 664b58ef1 fix: update icon codes (#5002)
    • e701ac036 perf: cache grpcutil.GetUser() result (#4991)
    • 37e3e1e54 feat: user group CLI and RBAC feature flag [DET-7889,DET-8210] (#4637)
    • ee0b595ee test: master test-intg TestDeleteCheckpoints stability. (#4988)
    • a5607bf60 feat: Add no permissions warning page[DET-8227] (#4950)
    • 067c410d6 feat: WebUI add view user profile [DET-8228] (#4960)
    • 350182322 refactor: rename usergroup.APIServer -> usergroup.UserGroupAPIServer (#4933)
    • d35e24e19 refactor: bunify grpcutil.GetUser. (#4976)
    • 3a873d2c1 ci: gke version bump. (#4983)
    • f9b9872bd chore: reduce and log effective store state changes (#4952)
    • 95f82b189 feat: can see users' permissions if view_permissions enabled [DET-8222] (#4984)
    • 580e5d905 feat: Frontend uses permission store to clear actions [DET-8215] (#4965)
    • a444c01a0 feat: use task names for interactive task page titles. (#4954)
    • d90c0ce5f fix: address early loading state resolution [DET-8320] (#4978)
    • f6d9ea6f0 fix: make hp search look good on mobile [DET-8321] (#4973)
    • 79c60fe94 fix: correct overflow action buttons [DET-8322] (#4979)
    • 2ca4ad2aa fix: remove duplicate Admin Guide tile (#4975)
    • c15db09af feat: make table row inline (#4962)
    • 449c19444 feat: allow specifying Fluent Bit container UID/GID on Kubernetes [DET-8012] (#4963)
    • f76c7ac38 chore: recursively unwrap caught exceptions for type checks (#4966)
    • 134151ed4 fix: WebUI config download [DET-8323] (#4974)
    • 829c30eef feat: add batch register and deletion of checkpoints from experiment [DET-8130] (#4931)
    • c3ef4bd2e chore: revert "chore: secure echo with default authentication [DET-7405] [DET-7378] (#4267)" (#4971)
    • e5252f1eb fix: reduce settings api calls [DET-8307] (#4970)
    • 2d0af46ab feat: add sorting to the tree and fix css for the tree (#4858)
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.4_checksums.txt (752 bytes)
    determined-agent_0.19.4_darwin_amd64.tar.gz (9.20 MB)
    determined-agent_0.19.4_linux_amd64.deb (8.83 MB)
    determined-agent_0.19.4_linux_amd64.rpm (8.80 MB)
    determined-agent_0.19.4_linux_amd64.tar.gz (8.79 MB)
    determined-agent_0.19.4_linux_ppc64.deb (7.74 MB)
    determined-agent_0.19.4_linux_ppc64.rpm (7.68 MB)
    determined-agent_0.19.4_linux_ppc64.tar.gz (7.67 MB)
    determined-helm-chart_0.19.4.tgz (10.77 KB)
    determined-master_0.19.4_checksums.txt (759 bytes)
    determined-master_0.19.4_darwin_amd64.tar.gz (40.74 MB)
    determined-master_0.19.4_linux_amd64.deb (73.17 MB)
    determined-master_0.19.4_linux_amd64.rpm (73.05 MB)
    determined-master_0.19.4_linux_amd64.tar.gz (40.68 MB)
    determined-master_0.19.4_linux_ppc64.deb (70.58 MB)
    determined-master_0.19.4_linux_ppc64.rpm (70.34 MB)
    determined-master_0.19.4_linux_ppc64.tar.gz (37.97 MB)
  • 0.19.3 (Sep 12, 2022)

    Changelog

    • 17f6d80b3 chore: bump version: 0.19.3-rc4 -> 0.19.3
    • ba4c8fb53 docs: add release notes for 0.19.3 (#4997)
    • 620cf6920 chore: bump version: 0.19.3-rc3 -> 0.19.3-rc4
    • 1cd716afd feat: allow specifying Fluent Bit container UID/GID on Kubernetes [DET-8012] (#4963)
    • 041cae931 chore: bump version: 0.19.3-rc2 -> 0.19.3-rc3
    • a7364af1a fix: correct overflow action buttons [DET-8322] (#4979)
    • 7557b065e chore: bump version: 0.19.3-rc1 -> 0.19.3-rc2
    • 83df8ccdc fix: remove duplicate Admin Guide tile (#4975)
    • f7824e4bc fix: WebUI config download [DET-8323] (#4974)
    • 11fe52c13 chore: bump version: 0.19.3-rc0 -> 0.19.3-rc1
    • 1af55aa57 chore: revert "chore: secure echo with default authentication [DET-7405] [DET-7378] (#4267)" (#4971)
    • dbe7008fb fix: reduce settings api calls [DET-8307] (#4970)
    • fa2a8251c chore: bump version: 0.19.3-dev0 -> 0.19.3-rc0
    • 05d713edd chore: lock api state for backward compatibility check
    • 0127d7d3c chore: secure echo with default authentication [DET-7405] [DET-7378] (#4267)
    • e2512cf68 feat: adjust scrollbar color by theme (#4964)
    • e1971c814 fix: associate allocation sessions with users (#4949)
    • 6c0ea8773 chore: fix a typo in py generator (#4938)
    • a5ea7e8df chore: add question issue (#4959)
    • c87cc1f4c ci(test-unit): remove debug code (#4947)
    • fedee52b2 test: remove ds test from p2 (#4951)
    • 4b565ac2c ci: run deepspeed on g4dn instances (#4946)
    • b2765f0ff feat: WebUI 404 not found page [DET-8226] (#4937)
    • f1f77c675 refactor: AuthZ for trials [DET-8211] (#4940)
    • 522f9f3a0 fix: allow forking an archived experiment [DET-8277] (#4944)
    • d346f3ffc chore: test apex checkpointing [DET-7886] (#4904)
    • 6a3a45574 chore: ensure isAuthError can see into wrapped exceptions (#4934)
    • c94a91ce0 ci(test-unit): accept only status events (#4941)
    • 5363d846d docs: slurm jobs do not require gres (#4911)
    • 7c12bd254 docs: update required python to 3.7 (#4939)
    • acd2ba92a feat: add programmatic download for the config files (#4907)
    • 8f1f2f099 ci(test-unit): flail productively (#4936)
    • bd2db37e6 chore: address low hanging security updates (#4872)
    • bf61b0839 fix: remove prevUser constraint (#4932)
    • 948f34a5c feat: WebUI create user with group info [DET-8221] (#4923)
    • a57c90910 refactor: AuthZ for experiments [DET-8003] (#4905)
    • a93903bb2 feat: helm chart: add OIDC and SCIM options [DPS-204] (#4897)
    • ab8e47172 test: update yaml file names (#4924)
    • 798fca680 docs: fix to hyperlink in release notes (#4895)
    • 0fa875c56 docs: Slurm support updates for 0.19.3 (#4919)
    • 99c8f3f23 chore: fix rebase error (#4922)
    • e1632c025 chore: add stream argument to Session._do_request (#4902)
    • c3b0fb652 fix: rbac-user-groups merge conflicts and lints.
    • f923e790a feat: WebUI group list page [DET-7921, DET-7976] (#4724)
    • 710f8f689 fix: rbac-user-groups merge conflicts.
    • e9a909d47 feat: WebUI edit user [DET-7846] (#4680)
    • e35fb59b2 chore: RBAC user groups crud (#4620)
    • 1933ef3a8 feat: migrate patch user logic to grpc server [DET-7909] (#4648)
    • e9ab25da2 feat: pluggable authorization for RBAC. (#4626)
    • 12cad9f7d chore: User Groups SQL (#4519) [DET-7803]
    • d551eb4dc fix: change /var/cache permissions to mode 775 (#4920)
    • 0164be026 fix: GetExperiments error on forked experiment (#4918)
    • 9b23d6f93 ci(test-unit): limit runs to only test-e2e updates (#4915)
    • 3dc86510d fix: race condition in agent container actor around missing containerInfo. (#4869)
    • b2caa1573 ci(test-unit): fix conditional check syntax (#4913)
    • bbf27db5f ci(test-unit): fix debug line to print payload (#4912)
    • c9fdcfa3c ci(link-artifacts): add initial workflow attempt (#4906)
    • 0306d6694 chore: resource pool support for PBS (#4884)
    • 51355e4af perf: improve getWorkspaceProjects api for Quick Search (#4896)
    • 4ad9c1d4d chore: change import path in generated bindings (#4900)
    • fc1aee26c chore: proto build should fail on first error. (#4802)
    • 3f68ac2ac fix: re-render issue (#4898)
    • 0e3c81eda feat: GetExperiments to bun (#4813)
    • 479beba8d feat: DeepSpeed CPU offloading (#4875)
    • b85c1b3b2 chore: replace PropsWithChildren with explicit children (#4890)
    • 3f9aacfcf chore: migrate python sdk to generated bindings [DET-8005] (#4844)
    • a3ad849a8 chore: bump version: 0.19.2-dev0 -> 0.19.3-dev0
    • c339e3402 docs: add release notes for 0.19.2 (#4877)
    • e066d3215 chore: set torch_geometric version in example to fix e2e test. (#4889)
    • f6580ddda perf: set memory cap to improve memory allocation (#4840)
    • 25019fa3f chore: fix limit 0 for /api/v1/trials/:id/workloads (#4886)
    • a5c6f79aa feat: experiment checkpoint list [DET-8201] [DET-8129] (#4870)
    • 95c5126ef feat: allow OrderBy in GetExperimentCheckpoints for SortBy SearcherMetric (#4885)
    • a5278b1d3 feat: create quick search to jump to workspace or project (#4837)
    • c0b98dbb1 build: enable storybook previews (#4874)
    • 116baf948 fix: det e describe with multiple trials (#4863)
    • cf31c477c ci: fix flakes in test_max_concurrent_trials (#4865)
    • 9f5306d1a chore: test AMP autocast and gradient scaling [DET-7885] (#4702)
    • 0f0f82e0d chore: some cli cleanup (#4859)
    • 5e8d8f2bb docs: remove misleading redirect (#4883)
    • 07e76508d feat: add security.default_task and openshift host options to helm chart [DPS-204] (#4843)
    • 30e339385 feat: add disabled prop to ActionDropdown (DET-7937) (#4867)
    • 2f0464f90 fix: downgrade fluentbit to fix tls.vhost issues (#4871)
    • 74dd27f39 build: avoid double testing via e2e-longrunning (#4850)
    • f008dcb07 chore: add controllable logging support [DET-8025] (#4826)
    • 4a7c03f57 fix: remove workloadCount from trial responses; single-trial view fix (#4857)
    • 945cd6a0d chore: document reasons for scaler.update() (#4845)
    • 70c0c6690 chore: add authz on moving experiments between projects [DET-7750] (#4806)
    • 9e132ed8e fix: remove subprocess import (#4856)
    • 64911159b chore: preserve failed action's error message (#4822)

    Docker images

    • docker pull determinedai/determined-master:0.19.3
    • docker pull determinedai/determined-master:17f6d80b3
    • docker pull determinedai/determined-master:17f6d80b349011a29f51210a7634806709f99472
    • docker pull determinedai/determined-dev:determined-master-17f6d80b3
    • docker pull determinedai/determined-dev:determined-master-17f6d80b349011a29f51210a7634806709f99472
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.19.3
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:17f6d80b3
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:17f6d80b349011a29f51210a7634806709f99472
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.3_checksums.txt (752 bytes)
    determined-agent_0.19.3_darwin_amd64.tar.gz (9.17 MB)
    determined-agent_0.19.3_linux_amd64.deb (8.81 MB)
    determined-agent_0.19.3_linux_amd64.rpm (8.78 MB)
    determined-agent_0.19.3_linux_amd64.tar.gz (8.77 MB)
    determined-agent_0.19.3_linux_ppc64.deb (7.72 MB)
    determined-agent_0.19.3_linux_ppc64.rpm (7.66 MB)
    determined-agent_0.19.3_linux_ppc64.tar.gz (7.65 MB)
    determined-helm-chart_0.19.3.tgz (10.77 KB)
    determined-master_0.19.3_checksums.txt (759 bytes)
    determined-master_0.19.3_darwin_amd64.tar.gz (40.56 MB)
    determined-master_0.19.3_linux_amd64.deb (69.83 MB)
    determined-master_0.19.3_linux_amd64.rpm (69.74 MB)
    determined-master_0.19.3_linux_amd64.tar.gz (40.50 MB)
    determined-master_0.19.3_linux_ppc64.deb (67.24 MB)
    determined-master_0.19.3_linux_ppc64.rpm (67.01 MB)
    determined-master_0.19.3_linux_ppc64.tar.gz (37.78 MB)
  • 0.19.2 (Aug 29, 2022)

    Changelog

    • 8abc3decd chore: bump version: 0.19.2-rc2 -> 0.19.2
    • 7db357223 docs: add release notes for 0.19.2 (#4877)
    • ea7abb899 chore: bump version: 0.19.2-rc1 -> 0.19.2-rc2
    • 02831bb64 fix: downgrade fluentbit to fix tls.vhost issues (#4871)
    • 950f1ce82 chore: bump version: 0.19.2-rc0 -> 0.19.2-rc1
    • 78b9c303d fix: remove workloadCount from trial responses; single-trial view fix (#4857)
    • d47ee91ab chore: bump version: 0.19.2-dev0 -> 0.19.2-rc0
    • 584448bbc fix: job queue experiment restore (#4797)
    • ef833743d fix: set gc policy [DET-8018] (#4812)
    • 5e4f50b72 fix: misc view code bug fixes
    • 90ca91760 chore: TrialContext is not an interface (#4851)
    • 7ae3a3841 fix: remove ds example (#4852)
    • ec7c8e3ce chore: fix random/grid searcher bug with max_concurrent_trials (#4836)
    • 27ba9cc7c fix: enable moving jobs around w/o assuming the full set [DET-8015] (#4766)
    • 4ca0fd48c fix: rps should correctly ignore other rps job msgs [DET-8214] (#4848)
    • 9a9e05f07 fix: fit long name (#4825)
    • d66d2359b fix: remove duplicate loading animation (#4839)
    • 80a55079a fix: remove non-model-hub mmdet tests (#4846)
    • 29c083ed7 fix: pass user ids for this user ids filter (#4842)
    • 5735ea345 fix: do not bypass torch.distributed.launch for single-slot trials (#4838)
    • 80735bcbb fix: handle avgMetrics response on individual trials of multi-trial experiment (#4821)
    • 83f5a1a75 fix: correctly display nested categorical hyperparameters [DET-8074] (#4818)
    • 7ffdc3d1e feat: rolling upgrades support for det deploy aws [DET-7853]. (#4829)
    • b161957ff fix: begin standardizing API pagination behavior in CLI. (#4833)
    • ab0df530b fix: make allocation saves idempotent (#4695)
    • ce376e95c build: set shared web to use xlarge resource class (#4824)
    • cf8ebdebc fix: canceling all experiment trials should cancel experiment (#4759)
    • 16e0cc155 ci: fix test-cli-win. (#4834)
    • 53316c6a5 chore: rename a file to prevent API breakages (#4831)
    • 345a3b128 ci: Fix publish_helm syntax errors [INFENG-1] (#4819)
    • dbccd3cd8 ci: add a checkbox to the PR template (#2969)
    • 726fe804c chore: promote Session to a first-class citizen (#4787)
    • 2cb0c9b39 test: webui user management unit test [DET-7968] (#4809)
    • 0f4bf8657 chore: deprecate mmdetection example in favor of model-hub version (#4816)
    • 6815ddf97 perf: reduce user settings api call (#4790)
    • 208afc613 fix: use fluent version 1.9.3 everywhere by default. (#4814)
    • 78f40f127 chore: UI fix and improvements (#4747)
    • 170000dff fix: get trial datapoints from trial comparison/summarization endpoint (#4796)
    • 5bda3fcfa fix: always override protobufAny description in openapi spec (#4811)
    • 4bd7c1641 fix: reconcile metrics proto, move det trial describe to the new API. [DET-7617] (#4746)
    • 69f5dfb9d chore: fix ListValue types in swagger spec (#4801)
    • 3d459fc14 ci: fix GHA syntax better [INFENG-1]
    • 399148b74 ci: fix GHA workflow boolean syntax [INFENG-1]
    • 6e3bb5347 ci: Remove unnecessary quotes [INFENG-1] (#4810)
    • 17773b945 chore: share copy to clipboard btn (#4799)
    • 7805d590f fix: correct job queue table bugs [DET-8069] (#4804)
    • bbbde9c29 fix: hide Delete button if user is not a creator or admin (#4805)
    • b445352f1 fix: speed up det deploy aws stack updates. (#4793)
    • 7bcb26e95 style: update number input error style for dark mode (#4772)
    • 7c867e759 refactor: authz interface for projects and workspaces [DET-8002] (#4721)
    • e96dca9d6 fix: grid view on Workspaces and Projects pages show all items [DET-8031] (#4794)
    • 441d1c6ec fix: job queue pagination (#4756)
    • 3191b4cfa chore: react-router-dom partial update part1 (#4788)
    • cde91c8e9 feat: Async deleting workspaces and projects (from CLI) [DET-7821] (#4675)
    • 003ddd8fe fix: directly return object-not-found errors instead of rewrapping them (#4791)
    • c0794f541 fix: JupyterLab modal popping issue (#4792)
    • efe0fe7e6 fix: use jupyter icon in navigation side bar (#4786)
    • 8cf151760 feat: deepspeed cpu offloading example (#4623)
    • 55542b725 fix: tab routing issue in resource pool (#4789)
    • 7cd0e51ba perf: improve too many user api call (#4763)
    • 69c2dfad0 test: Fix cluster utils cluster_slots() API (#4784)
    • 56f6469e6 fix: use correct experiment list offset when deleting an experiment [DET-7880] (#4754)
    • 0dd9e2aba chore: Use bindings.v1File instead of ContextItem (#4779)
    • e7e0ab21e fix: remove core external dependency from shared (#4782)
    • ea3f2570e feat: view code UI (#4473)
    • 550667b5a chore: disable positional args in bindings.py classes (#4777)
    • 0d5f3eb65 docs: Add release note for Slurm feature (#4778)
    • 047b6bab6 fix: lint-python ci test (#4774)
    • cbc0ba10c refactor: reduce unneeded api calls [DET-7451] (#4771)
    • ccf20c592 fix: gpt_neox deepspeed example (#4622)
    • f9274311a chore: update shared tester git url format (#4773)
    • e0ac8e89d fix: label filter in model registry (#4769)
    • 99878b7dc test: add tests for utils/service (#4749)
    • 228744ebd chore: expose Avatar props through AvatarCard (#4765)
    • ad7767cdc chore: upgrade swagger generator from 2.4.14 to 2.4.27 (#4738)
    • 9113e7eeb Fix a couple more helm action typos [INFENG-1]
    • e8944085d Fix helm workflow typos / indentation [INFENG-1]
    • 4e10f82e9 ci: add helm repo [infeng 1] (#4725)
    • 57842f996 chore: bump version: 0.19.1-dev0 -> 0.19.2-dev0
    • 1f5b0439e docs: add release notes for 0.19.1 (#4768)
    • c478c019e fix: tensorboard metrics step count [DET-8028] (#4761)
    • d78c8fc2a ci: re-enable gke shell logs test fixed d74ef5 (#4760)
    • 7d5506340 chore: Remove obsolete workloads from Trials API (#4703)
    • c6579efc5 refactor: solidify rm interface [DET-7852, DET-7984] (#4705)
    • b303b16c5 fix: allow changing max_length units in HP Search (#4755)
    • 7c98baab6 ci: remove trent from shared codeowners (#4757)
    • d74ef5af4 ci: shells should generate keys, even with empty 'data' field (#4744)

    Docker images

    • docker pull determinedai/determined-master:0.19.2
    • docker pull determinedai/determined-master:8abc3decd
    • docker pull determinedai/determined-master:8abc3decdc2c30813dcf674f19d1beb25eeb51e8
    • docker pull determinedai/determined-dev:determined-master-8abc3decd
    • docker pull determinedai/determined-dev:determined-master-8abc3decdc2c30813dcf674f19d1beb25eeb51e8
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.19.2
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:8abc3decd
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:8abc3decdc2c30813dcf674f19d1beb25eeb51e8
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.2_checksums.txt (752 bytes)
    determined-agent_0.19.2_darwin_amd64.tar.gz (9.16 MB)
    determined-agent_0.19.2_linux_amd64.deb (8.80 MB)
    determined-agent_0.19.2_linux_amd64.rpm (8.77 MB)
    determined-agent_0.19.2_linux_amd64.tar.gz (8.76 MB)
    determined-agent_0.19.2_linux_ppc64.deb (7.71 MB)
    determined-agent_0.19.2_linux_ppc64.rpm (7.65 MB)
    determined-agent_0.19.2_linux_ppc64.tar.gz (7.64 MB)
    determined-helm-chart_0.19.2.tgz (9.92 KB)
    determined-master_0.19.2_checksums.txt (759 bytes)
    determined-master_0.19.2_darwin_amd64.tar.gz (40.76 MB)
    determined-master_0.19.2_linux_amd64.deb (69.94 MB)
    determined-master_0.19.2_linux_amd64.rpm (69.83 MB)
    determined-master_0.19.2_linux_amd64.tar.gz (40.69 MB)
    determined-master_0.19.2_linux_ppc64.deb (67.33 MB)
    determined-master_0.19.2_linux_ppc64.rpm (67.10 MB)
    determined-master_0.19.2_linux_ppc64.tar.gz (37.96 MB)
  • 0.19.1 (Aug 11, 2022)

    Changelog

    • 7cc610754 chore: bump version: 0.19.1-rc2 -> 0.19.1
    • 5feb1dcc6 docs: add release notes for 0.19.1 (#4768)
    • 60b06ded7 chore: bump version: 0.19.1-rc1 -> 0.19.1-rc2
    • 6797dbadd fix: tensorboard metrics step count [DET-8028] (#4761)
    • b7ef72696 chore: bump version: 0.19.1-rc0 -> 0.19.1-rc1
    • 1706a0cfe fix: allow changing max_length units in HP Search (#4755)
    • 086942f0e chore: bump version: 0.19.1-dev0 -> 0.19.1-rc0
    • 14cff265d ci: unversion workflows (#4752)
    • f080a6334 chore: lock api state for backward compatibility check
    • f26428129 fix: Write change-password script in /tmp instead of CWD (#4677)
    • ebc79c28b fix: python sdk can parse master output again (#4745)
    • 2b415761b chore: update docker images names (#4727)
    • 286194d6c fix: deepspeedtrial validation batch size computation (#4743)
    • ce780672b chore: [Ant Design] replace old menu with new menu (#4741)
    • 6de74a507 docs: add release note for fix for searcher early termination bug (#4739)
    • bf6fff383 fix: url-encode description of notebook and tensorboard (#4718)
    • 933494528 fix: fix an issue with forbidden api actions causing logout (#4737)
    • e69576307 style: fix mobile exp header [DET-7975] (#4733)
    • 73fe5b6bd fix: hardcode pathname instead of using paths (#4740)
    • a071fc648 test: test cases for shared/utils/routes.ts [DET-7902] (#4706)
    • e5fa6fdbc style: remove styling that forced padding to be 0 (#4734)
    • d8d7e252d fix: remove the default theme from initialization (#4698)
    • 5adbf42db feat: add spinner to show trial fetching (#4683)
    • 464f68c4b test: add tests for experiment detail page [DET-7979] (#4723)
    • a09e81d9a fix: cursor in modal text field jumps to end of input (#4691)
    • 3fcb98531 chore: add regex in InlineEditor [DET-7518] (#4716)
    • 5ef83ba19 chore: share sort utilities [DET-7970] (#4711)
    • feabd4595 fix: breadcrumb text color (#4720)
    • bd542a038 fix: resolve issues around InlineEditor [DET-7914] (#4713)
    • 2ea79464d test: add tests for settings page [DET-7966] (#4717)
    • 2724b707b chore: Remove unnecessary imports and fields in proto (#4710)
    • eef660fd5 fix: WebUI workspace pagination [DET-7927] (#4700)
    • aad801115 refactor: authz provider implementation and authz users basic implementation (#4676)
    • 2c575fd94 fix: record operations at the right places around shutting down (#4719)
    • 9ab7dae72 chore: clear selected item when clear filters (#4714)
    • b089329ef feat: One-Click Hyperparameter Search [DET-7537] [DET-7538] (#4458)
    • 02be458d4 test: add wait utils test coverage [DET-7959] (#4701)
    • bd8664ad1 fix: fix low contrast issue for button styles [DET-7958] (#4692)
    • 05800e21b test: update path conditionals for gh workflows (#4708)
    • 332270af2 style: fix doc tile styling (dark mode support and responsive) [DET-7955] (#4709)
    • c78b1526a fix: mark all 4xx api failures as auth failure (#4690)
    • 972954d82 feat: Connect trial UI to workloads API; pass sort/filter to API (#4407)
    • b97198b44 test: add samlauth tests (#4685)
    • 2c26b4619 docs: add rest api reference link and rewrite rest api doc (#4688)
    • c2968d68e docs: port slurm deployment to oss docs (#4653)
    • 3623b0ce5 test: WebUI interaction test for page [DET-7894] (#4689)
    • 55aa326e7 refactor: test cases for ActionDropdown (#4699)
    • 938a48614 fix: push-shared target's directory change (#4672)
    • da5f7fe3f fix: keyboard doesn't show for inline editor in mobile [DET-7519] (#4659)
    • 0016fd7c5 fix: word break in description (#4697)
    • 6a8856c60 fix: move some libs in package.json (#4687)
    • 20d48be12 chore: support enum sizes for avatar (#4686)
    • 84631abff test: add test cases for string.ts (#4679)
    • 8d2a82103 test: create interaction tests for action dropdown [DET-7895] (#4684)
    • 5c1679da5 test: add test coverage for shared error utilities [DET-7900] (#4666)
    • ed65d20c9 ci(lint-python): migrate to gha workflow (#4639)
    • 4c3e9f2c4 test: add test cases for Image.tsx (#4667)
    • aedaa589a chore: bump version: 0.19.0-dev0 -> 0.19.1-dev0
    • 63b2dac9b docs: add release notes for 0.19.0 (#4671)
    • 4eeaa51cd chore: update live docs script for extension change (#4678)
    • 53ed638cf test: add test cases for Icon.tsx (#4664)
    • 2a115c494 test: add unit tests for logger class [DET-7901] (#4674)
    • 20b682afe feat: add new PyTorchCallbacks [DET-7760] (#4500)
    • a9f4a8724 refactor: remove unused code in model version detail page (#4670)
    • 725c74fd9 fix: persist task state update in interactive task view (#4662)
    • 31155099b feat: Create user UI [DET-7847] (#4665)
    • fac341c49 fix: Count only active tasks in cluster info board (#4658)
    • 9651e0dcb chore: update codecov badge to reflect web only (#4661)
    • 69a8668e4 fix: comment to gen swagger for model def API [DET-7926] (#4657)
    • 17f39260f test: utils/set unit tests [DET-7904] (#4655)
    • a42bd971a fix: description overflowing table cell (#4656)
    • 65ba5d652 test: add test cases for AvatarCard (#4650)
    • ad70ce8de feat: task specific actions to job overflow menu (#4638)
    • dc1ecfbce ci(lint-bindings): migrate to gha workflow (#4642)
    • 6b341f29e ci(lint-go): migrate to gha workflow (#4636)
    • cd5cbec95 docs: fix a typo in docs for Elasticsearch-backed logging (#4228)

    Docker images

    • docker pull determinedai/determined-master:0.19.1
    • docker pull determinedai/determined-master:7cc610754
    • docker pull determinedai/determined-master:7cc610754b2f6828240e07cb222a31da71df4f10
    • docker pull determinedai/determined-dev:determined-master-7cc610754
    • docker pull determinedai/determined-dev:determined-master-7cc610754b2f6828240e07cb222a31da71df4f10
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.19.1
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:7cc610754
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:7cc610754b2f6828240e07cb222a31da71df4f10
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.1_checksums.txt (752 bytes)
    determined-agent_0.19.1_darwin_amd64.tar.gz (9.16 MB)
    determined-agent_0.19.1_linux_amd64.deb (8.80 MB)
    determined-agent_0.19.1_linux_amd64.rpm (8.77 MB)
    determined-agent_0.19.1_linux_amd64.tar.gz (8.76 MB)
    determined-agent_0.19.1_linux_ppc64.deb (7.71 MB)
    determined-agent_0.19.1_linux_ppc64.rpm (7.65 MB)
    determined-agent_0.19.1_linux_ppc64.tar.gz (7.64 MB)
    determined-helm-chart_0.19.1.tgz (9.92 KB)
    determined-master_0.19.1_checksums.txt (759 bytes)
    determined-master_0.19.1_darwin_amd64.tar.gz (40.69 MB)
    determined-master_0.19.1_linux_amd64.deb (69.92 MB)
    determined-master_0.19.1_linux_amd64.rpm (69.81 MB)
    determined-master_0.19.1_linux_amd64.tar.gz (40.65 MB)
    determined-master_0.19.1_linux_ppc64.deb (67.29 MB)
    determined-master_0.19.1_linux_ppc64.rpm (67.06 MB)
    determined-master_0.19.1_linux_ppc64.tar.gz (37.89 MB)
  • 0.19.0 (Jul 29, 2022)

    Changelog

    • 5497b5114 chore: bump version: 0.19.0-rc2 -> 0.19.0
    • d4a101ce5 docs: add release notes for 0.19.0 (#4671)
    • 35c926f55 chore: bump version: 0.19.0-rc1 -> 0.19.0-rc2
    • 1f0a22ea2 fix: Count only active tasks in cluster info board (#4658)
    • 0785b6962 chore: bump version: 0.19.0-rc0 -> 0.19.0-rc1
    • 86ba506ea fix: description overflowing table cell (#4656)
    • 90f3c269a chore: bump version: 0.19.0-dev0 -> 0.19.0-rc0
    • f296b9e76 chore: lock api state for backward compatibility check
    • 7ae6674d0 chore: bump version: 0.18.5-dev0 -> 0.19.0-dev0
    • 704c36b78 chore: hide user management for now (#4652)
    • c1dc69b3d ci: long running det_deploy_local stress test to upstream (#4643)
    • 74349af83 fix: change average_training_metrics default to true [DET-7240] (#4646)
    • 801642625 fix: update resource pool job queues to paginate properly (#4579)
    • bc9ec2baa chore: test improvement for _pytorch_context and _pytorch_trial [DET-7761] [DET-7763] (#4494)
    • ffc1b34cc style: fix avatar card for long names (#4644)
    • 5d57eabc1 style: fix disabled primary button style (#4645)
    • c980fead0 fix: model detail bugs (#4640)
    • 9032f67c1 feat: enable open telemetry in agent [DET 7085] (#4276)
    • 562532351 chore: install ptl in startup hooks (#4380)
    • 69083b993 fix: minor bugs in test config creation (#4515)
    • 84a63b604 feat: allow DET_PASS and DET_USER to skip need for login [DET-7025] (#4597)
    • c4e32ba14 style: update account layout [ET-7829] (#4627)
    • f8c8c9954 fix: agent entrypoint should use exec. [DET-7834] (#4641)
    • ead85bb88 feat: support pagination in bun (#4634)
    • 331376e97 build: skip det-deploy-local for webui only changes (#4628)
    • 2acd1235d feat: Show active experiment and task count on resources page [DET-7680] (#4566)
    • 8bd2c8455 feat: User management list view [DET-7796] (#4607)
    • be07453df chore: fix docs/.gitignore to use .rst (#4633)
    • de18b887e chore: allow setting context type for useModal (#4612)
    • 177ae9606 chore: handle new type-pyopenssl package (#4632)
    • 2da2ed2a1 test: fix EditableMetadata test flake (#4610)
    • 96a3a15df chore: enhance GitHub issue templates (#4630)
    • b5b99f59d ci(lint-docs): migrate to gha workflow (#4621)
    • a89d7b350 docs: apply formatting again
    • b7d28bb68 docs: complete switch from txt to rst extensions
    • ad0c12998 chore: implement closeBar feature for omnibar (#3306)
    • 3358e9092 chore: remove getProjectExperiments and merge into getExperiments (#4606)
    • 271f5e929 Adding issues template
    • 27da5e8d1 ci: test downstream (#4604)
    • b1d36043c fix: reject reconnecting agents with different device configuration [DET-7568] (#4381)
    • 4336924cc ci(circleci): remove determined-ci context ref (#4613)
    • ce0f89da9 docs: model definition file cache (#4609)
    • 6742e1a4a chore: add attr to track api state has been initialized (#4599)
    • e409e7185 chore: share useModal hook [DET-7781] (#4571)
    • e1618399f fix: flaky test caused by compare stats [DET-7838] (#4576)
    • a17c043fd fix: rocm-smi workaround without product info (#4553)
    • 1d87dcf7b style: fix spacing for tabs and headers [DET-7823] (#4595)
    • 8e52a7988 docs: shorten landing page tile descriptions (#4605)
    • d149a54aa fix: Drop rendezvous interface warning (FOUNDENG-139) (#4594)
    • 066c11e3f fix: helm option "defaultPassword" caused the deployment to hang [DET-7814] (#4570)
    • 6362a3083 ci(lint-react): migrate to gha workflow (#4592)
    • 0d0f89ba0 ci(lint-secrets): migrate to gha workflow (#4591)
    • a5ca19e67 feat: text search for task/trial logs [DET-7446] (#4577)
    • fb4dba510 chore: add cache to tools/devcluster.yaml (#4584)
    • 0e0784e30 docs: restructure content (#4484)
    • db32a6434 fix: ignore order for Set isEqual [DET-7822] (#4589)
    • 9c79ea844 test: switch hamid with a one line check (#4596)
    • 881ef6859 chore: update relative import styles to absolute (#4581)
    • 10205cdaa docs: update description of telemetry reporting (#4175)
    • 0c6020972 feat: add sort and order to users api [DET-7828] (#4573)
    • 6d3065d2a feat: open tasks in embedded task view in CLI [DET-7686] (#4563)
    • 58fc5b11b fix: notebook ignores --template CLI param and notebooks still launch if --preview param is set [DET-7632] (#4476)
    • d81faa2ab fix: zero slot tasks k8s using wrong image and exposing all GPUs [DET-7808] (#4586)
    • deef8819a ci(test-cli): migrate to gha workflow (#4575)
    • 05827e122 ci(test-e2e): move optional jobs into own workflow (#4587)
    • 823e322ce fix: allow zero slots for JupyterLab in modal (#4582)
    • 52c5a0006 chore: tweak ilia's CODEOWNERS. (#4580)
    • 85531d1e9 chore: rbac query params [DET-7843] (#4583)
    • 75706262e refactor: move user settings to page [DET-7795] (#4550)
    • 31fff14f2 fix: double scroll bar (#4578)
    • 3897a5000 fix: storybook rendering issue (#4554)
    • b1818a472 fix: missing key for rows (#4564)
    • 91811cf1c refactor: replace antd Space with flexbox gap in SelectFilter (#4569)
    • facb23491 chore: add docs to codeowners (#4568)
    • 46888f001 fix: expconf copy timeout too short causing errors (#4567)
    • 9559b99d5 fix: race condition on experiment config for 'det e create --paused' [DET-7789] (#4533)
    • 09dfaf5d0 chore: eslint keyword spacing (#4561)
    • 011ca00e7 chore: update eslint array multiline and add eslint arrow paren (#4562)
    • d5f897715 feat: Place experiment in a project using CLI [DET-7720] (#4552)
    • cb92dd6a1 feat: add dynamic page title to embedded task page [DET-7681] (#4555)
    • 48b6c321a refactor: make sso a plugin [DET-7560] (#4559)
    • 2bb461ee7 feat: remove reset column widths button [DET-7675] (#4558)
    • 5a163d80e chore: bump version: 0.18.4-dev0 -> 0.18.5-dev0
    • d109ef4b0 docs: add release notes for 0.18.4 (#4547)
    • f519b39a4 style: add gofumpt (#4217)
    • 002d9502a ci(scan-docker-images): migrate to gha (#4546)
    • 495f40937 ci: switch to custom-built container image (#4535)
    • f13c2377e perf: improve deepspeed checkpointing for sharedfs (#3905)
    • 92bf7c5e2 feat: api to expose experiment model code [DET-7465] (#4374)
    • 00c290949 feat: Web UI request checkpoint deletion [DET-7113] (#4545)
    • 9907bc728 fix: increase time to wait for command in priority scheduler test (#4544)
    • 2d59d25ed chore: remove activemetric duplication [DET-7737] (#4469)
    • ff758fd3d fix: correct allocation end times when cluster heartbeat is before allocation start time (#4556)
    • 8ae44d4ca fix: hide description placeholder when archived (#4551)
    • 1e6da7b1c fix: rename window title (#4548)
    • 276088514 fix: polish cluster page UI [DET-7682] (#4508)
    • 85f72ede7 feat: update allocation bar legend logic [DET-7683] (#4510)
    • 5e1bd1a4a chore: update omnibar with new theme variables (#4528)
    • 9045e2169 fix: tensorboard redirecting due to missing rp (#4542)
    • d0d23beee fix: scroll bar overlap (#4538)
    • 20ac53809 feat: use skeleton component for rendering the table while fetching data (#4462)
    • 7ddc8ad88 chore: agent sends log level with agent added container logs (#4532)
    • 937964e64 chore: temporary-disable-dependabumps (#4541)
    • 6004b8992 fix: rename label restarts to auto restarts (#4536)
    • 633be08a1 fix: change short ID width (#4537)
    • 1e66cabc8 test: disable flaky test for now (#4530)
    • 7a54c9acc fix: persist whose workspaces/projects (#4525)
    • b673e34ed test: update avatar tests to remove warnings (#4523)
    • bdfa9fa64 fix: text color in Hyperparameters page (#4529)
    • e9ea762d0 fix: skip gzip only for proxy web assets [DET-7802] (#4517)
    • 82d1144d2 feat: docker auth improvements [DET-7633, DET-7636] (#4513)

    Docker images

    • docker pull determinedai/determined-master:0.19.0
    • docker pull determinedai/determined-master:5497b5114
    • docker pull determinedai/determined-master:5497b5114db5546f1ecaa2349dd6e8c4c3638fd5
    • docker pull determinedai/determined-dev:determined-master-5497b5114
    • docker pull determinedai/determined-dev:determined-master-5497b5114db5546f1ecaa2349dd6e8c4c3638fd5
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.19.0
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5497b5114
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5497b5114db5546f1ecaa2349dd6e8c4c3638fd5
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.19.0_checksums.txt (752 bytes)
    determined-agent_0.19.0_darwin_amd64.tar.gz (9.15 MB)
    determined-agent_0.19.0_linux_amd64.deb (8.80 MB)
    determined-agent_0.19.0_linux_amd64.rpm (8.76 MB)
    determined-agent_0.19.0_linux_amd64.tar.gz (8.75 MB)
    determined-agent_0.19.0_linux_ppc64.deb (7.71 MB)
    determined-agent_0.19.0_linux_ppc64.rpm (7.65 MB)
    determined-agent_0.19.0_linux_ppc64.tar.gz (7.64 MB)
    determined-helm-chart_0.19.0.tgz (9.91 KB)
    determined-master_0.19.0_checksums.txt (759 bytes)
    determined-master_0.19.0_darwin_amd64.tar.gz (40.66 MB)
    determined-master_0.19.0_linux_amd64.deb (69.57 MB)
    determined-master_0.19.0_linux_amd64.rpm (69.46 MB)
    determined-master_0.19.0_linux_amd64.tar.gz (40.59 MB)
    determined-master_0.19.0_linux_ppc64.deb (66.96 MB)
    determined-master_0.19.0_linux_ppc64.rpm (66.73 MB)
    determined-master_0.19.0_linux_ppc64.tar.gz (37.86 MB)
  • 0.18.4 (Jul 15, 2022)

    Changelog

    • 710d5575b chore: bump version: 0.18.4-rc4 -> 0.18.4
    • 990e62b04 docs: add release notes for 0.18.4 (#4547)
    • a12fd4520 chore: bump version: 0.18.4-rc3 -> 0.18.4-rc4
    • 57e4e7e8a chore: bump version: 0.18.4-rc2 -> 0.18.4-rc3
    • 06dcaacc0 chore: bump version: 0.18.4-rc1 -> 0.18.4-rc2
    • 09bf496ab fix: correct allocation end times when cluster heartbeat is before allocation start time (#4556)
    • 75d52378c chore: bump version: 0.18.4-rc0 -> 0.18.4-rc1
    • 47f920731 fix: tensorboard redirecting due to missing rp (#4542)
    • 56bf894de chore: bump version: 0.18.4-dev0 -> 0.18.4-rc0
    • 900b134cd chore: lock api state for backward compatibility check
    • a14986bbf test: set_priority should respect the master url. (#4527)
    • 8a3c07a39 test: bring back tests for generic commands on master restarts. (#4490)
    • 73268ac6b docs: add ClusterInfo docs (#4496)
    • 9a34b3d48 chore: bumpenvs (#4520)
    • b1a292731 fix: ensure correct master url is used for priority scheduler and other managed devcluster tests. (#4512)
    • 667e313c8 fix: deepspeed examples config (#4511)
    • 2f6373ce5 style: add uplot cursor styles for dark mode (#4502)
    • 7657c15c1 feat: add props to style avatar differently (#4514)
    • 5272c0916 test: set up shared web to test with local strategy (#4504)
    • 2f0950317 chore: bump version: 0.18.3-dev0 -> 0.18.4-dev0
    • 5f30b6714 docs: add release notes for 0.18.3 (#4491)
    • 758c2e929 test: fix local time tests (#4503)
    • 2a968bf7a fix: give a reasonable name to model def download (#4493)
    • 7e20b5d11 docs: minor improvements and Homebrew build instructions (#4489)
    • 38fe30220 chore: add helpers for working with shared-code (#4452)
    • 28c7b895d fix: make actor system GetOrElseTimeout actually timeout. (#4499)
    • 5a4198677 feat: environment_variables in task_container_defaults [DET-7638] (#4485)
    • 1184bb3dd test: fix editable metadata test flake (#4495)
    • 7898dc1a6 style: theme tuning [DET-7362, DET-7363, DET-7364, DET-7365, DET-7366, DET-7383, DET-7425, DET-7498, DET-7551] (#4378)
    • cd28b37a1 chore: fix react fmt (#4487)
    • 97d3e6029 refactor: update modals to better support contexts [DET-6297] (#4468)
    • 855e8324b fix: allocation errors on finished hp search [DET-7724] (#4467)
    • ce296c5f4 fix: correct path selector for tensorboard upload (#4474)
    • c3d650984 fix: add priority scheduler e2e tests [DET-7106] (#4429)
    • 7e50a0416 fix: close button should not show in non-embedded log viewer [DET-7723] (#4447)
    • 1d61f3a32 fix: adjust spinner for the InlineEditor (#4367)
    • a69813bfa feat: persist user web setting [DET-7501] (#4394)
    • be48f7654 feat: support non-root init container k8s [DET-7109] (#4460)
    • 2015f145e fix: disable pointer events on log viewer spinner block (#4470)
    • db65bf45a test: tweak test_agent_reconnect_keep_experiment test timeouts. (#4465)
    • 5f1340767 chore: bump the resources for react tests (#4459)
    • 60ec81f14 fix: sync agent actor init on master restart [DET-7746] (#4463)
    • 473f7b620 fix: loadingState cleanup for WorkspaceList (#4461)
    • e1fd06275 feat: add spinner to trial/experiment tabs to show data fetching (#4436)
    • d123d90c3 ci: fix autolabeling for shared (#4456)
    • 7347bb66b test: update ci job for testing shared-code (#4444)
    • 6450aa140 docs: workspaces docs (#4448)
    • a1a06c4ac fix: trim workspace and project name whitespace [DET-7390] [DET-7747] (#4451)
    • 908cdb19e docs: PyTorchTrialCallback.on_validation_end (#4457)
    • e9144916c ci: add codeowners for shared directory (#4455)
    • d28d1c245 chore: share AvatarCard and Avatar comps [DET-7714 DET-7713] (#4430)
    • efc5ffb06 fix: only call move API for permissioned experiments (#4445)
    • 62b9a2df3 fix: map name to name instead of description in getProjectExperiments (#4450)
    • 6475a19c7 fix: mac path for --agent-config-path (#4449)
    • df830a118 fix: long tag wrap (#4446)
    • 10022e8ed fix: various workspaces fixes (#4443)
    • 6790aee73 feat: master config option for additional fluent outputs [DET-7549] (#4415)
    • c0ddb8dbd chore: webui changes for slurm (#4427)
    • 4df29fcd1 chore: resolve warnings from yaml and numpy, fix supposed fstrings, black formatting (#4372)

    Docker images

    • docker pull determinedai/determined-master:0.18.4
    • docker pull determinedai/determined-master:710d5575b
    • docker pull determinedai/determined-master:710d5575b8565fc50b5e65143b0b27dd661b0d17
    • docker pull determinedai/determined-dev:determined-master-710d5575b
    • docker pull determinedai/determined-dev:determined-master-710d5575b8565fc50b5e65143b0b27dd661b0d17
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.4
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:710d5575b
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:710d5575b8565fc50b5e65143b0b27dd661b0d17
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.18.4_checksums.txt (752 bytes)
    determined-agent_0.18.4_darwin_amd64.tar.gz (8.66 MB)
    determined-agent_0.18.4_linux_amd64.deb (8.33 MB)
    determined-agent_0.18.4_linux_amd64.rpm (8.31 MB)
    determined-agent_0.18.4_linux_amd64.tar.gz (8.29 MB)
    determined-agent_0.18.4_linux_ppc64.deb (7.29 MB)
    determined-agent_0.18.4_linux_ppc64.rpm (7.24 MB)
    determined-agent_0.18.4_linux_ppc64.tar.gz (7.23 MB)
    determined-helm-chart_0.18.4.tgz (8.96 KB)
    determined-master_0.18.4_checksums.txt (759 bytes)
    determined-master_0.18.4_darwin_amd64.tar.gz (40.59 MB)
    determined-master_0.18.4_linux_amd64.deb (69.36 MB)
    determined-master_0.18.4_linux_amd64.rpm (69.25 MB)
    determined-master_0.18.4_linux_amd64.tar.gz (40.53 MB)
    determined-master_0.18.4_linux_ppc64.deb (66.73 MB)
    determined-master_0.18.4_linux_ppc64.rpm (66.50 MB)
    determined-master_0.18.4_linux_ppc64.tar.gz (37.78 MB)
  • 0.18.3 (Jul 11, 2022)

    Changelog

    • 5cf7e8b5a chore: bump version: 0.18.3-rc8 -> 0.18.3
    • 6d19e1795 docs: add release notes for 0.18.3 (#4491)
    • a7cc461ff chore: bump version: 0.18.3-rc7 -> 0.18.3-rc8
    • 6dba678bb fix: make actor system GetOrElseTimeout actually timeout. (#4499)
    • 250aca7b0 chore: bump version: 0.18.3-rc6 -> 0.18.3-rc7
    • f02911147 chore: bump version: 0.18.3-rc5 -> 0.18.3-rc6
    • 73fd9f09b fix: allocation errors on finished hp search [DET-7724] (#4467)
    • c85186be2 fix: correct path selector for tensorboard upload (#4474)
    • 2193001e0 feat: support non-root init container k8s [DET-7109] (#4460)
    • 40d4cd149 chore: bump version: 0.18.3-rc4 -> 0.18.3-rc5
    • d996a96a0 fix: sync agent actor init on master restart [DET-7746] (#4463)
    • 97b5a0c5e fix: loadingState cleanup for WorkspaceList (#4461)
    • e2fb0f865 docs: workspaces docs (#4448)
    • d835b1e8a fix: trim workspace and project name whitespace [DET-7390] [DET-7747] (#4451)
    • bcb8b3ea9 chore: bump version: 0.18.3-rc3 -> 0.18.3-rc4
    • e34bbe1b6 fix: map name to name instead of description in getProjectExperiments (#4450)
    • 9b86fc343 fix: mac path for --agent-config-path (#4449)
    • 0918a7123 fix: only call move API for permissioned experiments (#4445)
    • 5a8a4c462 chore: bump version: 0.18.3-rc2 -> 0.18.3-rc3
    • 054d089b3 fix: long tag wrap (#4446)
    • 29f224e42 fix: various workspaces fixes (#4443)
    • c7d6b68bf chore: bump version: 0.18.3-rc1 -> 0.18.3-rc2
    • 650cce517 chore: webui changes for slurm (#4427)
    • ca975c8b4 chore: bump version: 0.18.3-rc0 -> 0.18.3-rc1
    • 761e6ad52 feat: master config option for additional fluent outputs [DET-7549] (#4415)
    • 759e5d643 chore: bump version: 0.18.3-dev0 -> 0.18.3-rc0
    • ca0a7c116 chore: lock api state for backward compatibility check
    • 4010b330b feat: experiment comparison page (#4410)
    • 030de4a39 fix: default max length properly to undefined (#4437)
    • 6e6420ec8 feat: task container checks for occupied gpus [DET-5091] (#4323)
    • e414235c9 test: resolve useCustomizeColumnsModal test flakes [DET-7635] (#4433)
    • bd576678e chore: update deepspeed launcher to work with newer versions (#4156)
    • 4277e991d Split 'webui/react/src/shared/' into commit '3302ddf5d45c5a73cdcbbc750936bfb32e025c06'
    • 9fbf9d5d0 fix: hide continue trial for experiments with 0 or more than 1 trial (#4434)
    • 79b9d90b1 chore: Merge Slurm-infrastructure support changes (#4376)
    • 8b799e97d ci: send Docker image scans to the infrastructure engineering channel instead (#4425)
    • 7436c6c82 fix: properly close experiment create modal [DET-7553] (#4421)
    • 169c0d835 docs: clarify count requirement for grid search (#4428)
    • 47c393f5c test: tune web tests (#4409)
    • 2e8b92b27 feat: install a SIGUSR1 signal handler to print stacktraces (#4281)
    • bff6d1afc fix: registry auth in master.yaml (#4412)
    • 69d6b1e82 fix: Sync epoch_idx across all workers for PyTorchTrial [DET-7488] (#4303)
    • 3302ddf5d chore: make core styles and configs shareable [DET-7316] (#4398)
    • b8a7afcb0 chore: make core styles and configs shareable [DET-7316] (#4398)
    • 182af4956 fix: allocation state mismatch with proto [DET-7593] (#4408)
    • 2c4a77a57 chore: don't launch horovod for hypothetical zero slot trials (#4414)
    • e1c1eacf1 feat: shm_size can take a string with units like docker run allows (#4314)
    • c05ebd772 chore: migrate shared-web to subtree (#4402)
    • 050542264 chore: migrate shared-web to subtree (#4402)
    • cb1fc53fa fix: always flush tensorboard ready log (#4387)
    • c82d64c81 fix: End agent stats log to debug level (#4411)
    • b5e239714 fix: fix cluster resource pool selection [DET-7517 DET-7567 DET-7517] (#4391)
    • 6672fd8e1 chore: Merge Slurm-infrastructure support changes trial (#4396)
    • 6d0fe6bb1 fix: recover from permission error when deleting preexisting file or link (#4397)
    • 1e8bbac6d fix: move scale import (#4406)
    • 624d17904 chore: Add nolintlint to deadcode for unused oss method (#4404)
    • 8e50499b2 fix: Do not fail if no checkpoints to gc [FOUNDENG-83] (#4405)
    • e3ee40f8e fix: test agent config flake (#4386)
    • b701fcdf8 feat: Trial Metrics - Summary endpoints (#4392)
    • c9d9d5354 feat: add log scale to experiment viz (#4273)
    • 2d82f0148 chore: sync node version requirements with package-lock (#4395)
    • b2acc5daa chore: Update task logging setup [FOUNDENG-81] (#290) (#4389)
    • 0ad82fd5f chore: Merge Slurm-infrastructure support changes task_trial (#4393)
    • 9149c1a49 chore: Merge Slurm-infrastructure support changes job.go (#4388)
    • 3d306378e chore: Merge Slurm-infrastructure support changes archive (#4390)
    • e9f3b1a9b feat: rolling upgrades support for generic command types (notebooks etc.) [DET-7218] (#4371)
    • b130e3f32 fix: Add compute driver capabilities to determined-agent Dockerfile (#4385)
    • 6d8ea5304 chore: Restrict display names [DET-7356] (#4266)
    • 72d4bcb22 fix: update experiment list tag truncate rule [DET-7530] (#4373)
    • c78e92eb1 feat: stress test for agent enable/disable and reconnect spam [DET-6733] (#4269)
    • e4a275b26 feat: add --agent-config-path to det deploy local agent-up [DET-6278] (#4366)
    • 7f40e8642 chore: handle setResourcePool message in commands (#3571)
    • 9a1ab33a3 style: ban fmt.Println and fmt.Printf in go code. (#4369)
    • cbc37cd74 chore: upgrade mockery to latest (#4363)
    • 76423ba86 fix: don't GC registered checkpoints [DET-7418] (#4316)
    • 81fed29cd chore: idempotent container running messages [DET-7555] (#4364)
    • 19f311e9e feat: workspaces and projects [DET-6461] (#4203)
    • d257b95e9 chore: update resource pool cloud icons [DET-7160] (#4317)
    • c1da1f29f chore: restrict access to master.yaml and agent.yaml to the user/owner only. (#4299)
    • 8d91f0fb5 ci: Add dependabot for all the things (#4313)
    • 058cf5cd5 ci: make test-unit-storage run on any upstream branch (#4320)
    • 49966efa5 feat: Tensorboard profiles from all hosts (#4142)
    • 762cee189 ci: merge separate codeowners (#4322)
    • 5e6ba22a8 chore: bump version: 0.18.2-dev0 -> 0.18.3-dev0
    • bb5ccc123 docs: add release notes for 0.18.2 (#4318)
    • 13b6059c9 ci: create default codeowner file (#4321)
    • 0527c48dc fix: harness azure dependencies (#4319)
    • 869dea1f9 chore: upgrade to typescript 4.7 [DET-7499] (#4285)
    • 9c94d1fce ci: use parallelism for test-e2e-managed-devcluster. (#4193)
    • 98e4b0bd9 chore: delete dead code (#4312)
    • 3df0fb13c fix: provide unique port offsets to trials (#4315)
    • c5eb030e5 chore: drop pbt, adaptive and adaptive_simple from docs and web (#4311)
    • ca57ffcd6 chore: cleanup container terminations resent on agent exit (#4309) [DET-7533]
    • 11d17fe47 chore: add logging for e2e gpu test (#4232)
    • b14c28142 docs: more consistently use Determined AI vs. Determined (#4310)
    • 0a93e76dc fix: skip allocation check on checkpoint save to fix pulling preemption (#4308)
    • 4610d7071 ci: skip e2e_tests for webui
    • 2e9bae652 fix: remove limit of length 4 for task id (#4304)
    • ee6b118d5 chore: use vanilla markdown in examples (#4268)
    • db5cd90d9 fix: harness dependency class for azure (#4302)
    • 19aa55c61 chore: stamp out unexpected messages [DET-7492] (#4284)
    • 342236a33 chore: FOUNDENG-55 For deepspeed, set the NCCL_SOCKET_IFNAME env variable based on dtrain_network_interface (#4297)
    • 69ff97386 ci: fix log message grepped for by test_launch_layer_cifar (#4301)
    • 1299ef395 ci: remove overzealous assert that is a race with a quick noop task (#4296)
    • 5c2830431 docs: move description of agent client cert fields to the right place (#4293)

    Docker images

    • docker pull determinedai/determined-master:0.18.3
    • docker pull determinedai/determined-master:5cf7e8b5a
    • docker pull determinedai/determined-master:5cf7e8b5a6a8393b04c1d54a5363cbbe6e8792d2
    • docker pull determinedai/determined-dev:determined-master-5cf7e8b5a
    • docker pull determinedai/determined-dev:determined-master-5cf7e8b5a6a8393b04c1d54a5363cbbe6e8792d2
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.3
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5cf7e8b5a
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:5cf7e8b5a6a8393b04c1d54a5363cbbe6e8792d2
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.18.3_checksums.txt (752 bytes)
    determined-agent_0.18.3_darwin_amd64.tar.gz (8.66 MB)
    determined-agent_0.18.3_linux_amd64.deb (8.33 MB)
    determined-agent_0.18.3_linux_amd64.rpm (8.30 MB)
    determined-agent_0.18.3_linux_amd64.tar.gz (8.29 MB)
    determined-agent_0.18.3_linux_ppc64.deb (7.29 MB)
    determined-agent_0.18.3_linux_ppc64.rpm (7.24 MB)
    determined-agent_0.18.3_linux_ppc64.tar.gz (7.23 MB)
    determined-helm-chart_0.18.3.tgz (8.96 KB)
    determined-master_0.18.3_checksums.txt (759 bytes)
    determined-master_0.18.3_darwin_amd64.tar.gz (40.57 MB)
    determined-master_0.18.3_linux_amd64.deb (69.29 MB)
    determined-master_0.18.3_linux_amd64.rpm (69.18 MB)
    determined-master_0.18.3_linux_amd64.tar.gz (40.49 MB)
    determined-master_0.18.3_linux_ppc64.deb (66.70 MB)
    determined-master_0.18.3_linux_ppc64.rpm (66.46 MB)
    determined-master_0.18.3_linux_ppc64.tar.gz (37.78 MB)
  • 0.18.2 (Jun 14, 2022)

    Changelog

    • d214a34df chore: bump version: 0.18.2-rc2 -> 0.18.2
    • 10b172bcc docs: add release notes for 0.18.2 (#4318)
    • 38e0efb15 chore: bump version: 0.18.2-rc1 -> 0.18.2-rc2
    • c5cb23fc9 fix: harness azure dependencies (#4319)
    • f623c1c1e chore: bump version: 0.18.2-rc0 -> 0.18.2-rc1
    • 350de3838 chore: cleanup container terminations resent on agent exit (#4309) [DET-7533]
    • f02f11bc5 fix: skip allocation check on checkpoint save to fix pulling preemption (#4308)
    • cfc0f8108 fix: remove limit of length 4 for task id (#4304)
    • 0645ba968 fix: harness dependency class for azure (#4302)
    • 9991528a2 chore: bump version: 0.18.2-dev0 -> 0.18.2-rc0
    • 86b6c28a4 chore: lock api state for backward compatibility check
    • 271ae41c7 chore: ensure container terminations aren't lost in the event of network failures [DET-7440, DET-7441] (#4272)
    • 2d5a11865 fix: check for errors from actor asks in createResourcePoolSummary [DET-7494] (#4294)
    • daee03661 fix: stale agent state crashes resource pool [DET-7493] (#4295)
    • 4722caa95 fix: limit waiting for logger to 30 seconds (#4289)
    • b8f9463c6 fix: jupyterlab modal stale values, not sending values to api
    • 42615b4b1 feat: add cli and api for delete checkpoints [DET-7119], [DET-7120] (#4246)
    • ad9bcd809 chore: restore determined_version in Trial checkpoint metadata (#4288)
    • 0b7dcdbd2 chore: widen node version detection to 16.29 (#4290)
    • 118a850da fix: pass signals through wrapper processes (#4286)
    • 0089ede8c fix: enable Full Configuration in appropriate modals [DET-7495] (#4280)
    • 45c62f20c feat: show only selected trials in learning curve
    • 922b11349 chore: reduce use of any in InteractiveTable
    • 5b03672a4 chore: update docs to accurate node support (#4279)
    • 59f686c96 fix: make trials table sort properly (#4275)
    • 695b30117 feat: incrementally release resources (#4278)
    • c0f53fa26 perf: avoid Seq Scanning raw steps (#4244)
    • b36596f13 fix: backslashes in show_ssh_command for windows (#4260)
    • ec503e06d fix: correct error message when command list fails (#4235)
    • 15b0525fc feat: add master-side verification for agent mTLS (#4220)
    • c6e824937 fix: don't hardcode /bin/which in entrypoints (#4257)
    • 236c24117 refactor: migrating tables to use InteractiveTable component [DET-7382] (#4229)
    • 156ff504b fix: reshow drop targets for customize columns modal (#4199)
    • 5e9962686 docs: delete mnist_tf_layers example (#4263)
    • 9d8223113 Revert "add autosync action"
    • 8288461f5 add autosync action
    • 70e534f9e fix: det task logs unable to use trial task IDs and checkpoint GC task IDs [DET-7424] (#4258)
    • 12d21c2e0 fix: gcs storage upstream test failure (#4253)
    • b09ec407e feat: improve trial logs to have some system events [DET-5885] (#4215)
    • c20ad4334 ci: Lints migrations to ensure new migrations have higher timestamps than old ones [DET-7146] (#4250)
    • da83e6509 chore: small cleanups for slurm (#4211)
    • b71f179aa feat: det deploy now can use --yes to skip prompts [DET-7408] (#4255)
    • 7b3fee44c chore: better slurm option override support (#4254)
    • 7af3cfe48 feat: add google cloud storage (gcs) prefix support [DET-6883] (#4238)
    • b5c2b14f8 docs: add enhanced launcher user guide (#4248)
    • eb1d9aaac chore: add dev server support for embedded tasks view (#4243)
    • 9d19a8704 perf: dont repeatedly reprocess profiler data
    • b6582c02b ci: add check to prevent ssh git url (#4240)
    • 50bdbc3a8 chore: bumpenvs (#4239)
    • 1065314ea chore: add node_modules to eslintignore (#4237)
    • c74dee646 feat: Add theme toggle to user settings [DET-7321] (#4204)
    • 3016fc103 chore: remove legacy code/docs for NCCL/Gloo port range config (#4187)
    • f1e8d3cdc ci: check ulimit before 4x4 distributed test on macs (#4234)
    • 7d9e999cd fix: adjust page to preserve props.children (#4231)
    • 35258efd7 feat: Enable sending empty string for displayname with fallback to username [DET-7031] (#4140)
    • 2e77ec58c fix: det shell start/open, in windows (#4227)
    • 7217cfd94 feat: k8s detect non-det tasks (#4154)
    • a88f5c450 chore: share webui base page (#4218)
    • 15bd75895 fix: use custom image for tensor board [DET-7242] (#4123)
    • de926b1eb chore: fix rendezvous timeout logic (#4226)
    • c29e97a45 chore: base Dockerfile TensorFlow 2.6, 2.7, 2.8 security patches [DET-7325] (#4223)
    • 7882381cd fix: authenticate pprof endpoints [DET-7402]
    • 2b0ccb830 chore: bump version: 0.18.1-dev0 -> 0.18.2-dev0
    • bcbab4d92 docs: add release notes for 0.18.1 (#4216)
    • 034b9570c chore: revert rename of RestoreResourcesFailure -> ResourcesFailure. (#4210)
    • a7c4c2afe feat: enable agent-side mTLS for connection to master (#4212)
    • c9e13b64a feat: save connection in context (#4213)
    • 15f65ab37 feat: pix2pix example (#4125)
    • db987de7d chore: delete "conditional" json-schema extension (#4177)
    • 9b33f825c fix: use bigint for checkpoint size in proto_get_trials_plus (#4208)
    • 2c4c847d7 docs: update release note instructions with important admonition (#4207)
    • 138caf8f3 fix: pool detail page tab count when loading (#4200)
    • c5f685f18 feat: move task logs to embedded view [DET-7169] (#4179)
    • 54982c9be perf: tweak proto_get_trials_plus plan (#4206)
    • e585ebbb1 refactor: cleanup task logging shell scripts (#4113)
    • 4e2913f4a chore: update entrypoint in expconf docs (#4198)
    • fcff1c2aa fix: agent panic on commands with unusually formatted environment variables [DET-6649] (#4202)
    • 4172a4690 refactor: pull in user service code changes from EE (#4183)
    • 1c48fa618 docs: improve OpenTelemetry docs slightly (#4182)
    • 7f508e462 fix: allow internal: null for pre-0.15.6 experiments (#4197)
    • 55957fe06 fix: add restarts back to get_trial_ids for sorting
    • dd8d3f377 feat: add det experiment logs <EXP_ID> [DET-7145] (#4190)
    • 3ef36d56b chore: refactor action dropdown comp to be reused [DET-7171] (#4164)
    • 623e60d37 ci: bust circleci cache (#4189)
    • af56e0103 docs: document using AWS Load Balancer on EKS [DET-6669] (#4174)
    • 36e56670d feat: allow enabling Prometheus monitoring through helm [DET-6993] (#4158)
    • 3bb7bb1d9 style: minor theme fixes and style adjustments [DET-7349] (#4161)
    • 2166dfc88 docs: update screen shots for cluster UI (#4188)
    • b7a327819 style: address new flake8-comprehensions, pyzmq==23.0.0. (#4185)
    • 5e1a81cb2 feat: allow setting of checkpointStorage.prefix through helm [DET-7152] (#4152)
    • 526e1dcc1 feat: display trial restarts [DET-7347] (#4160)
    • 2fdc6d710 fix: agent can now be control-C while connecting to master [DET-6287] (#4178)
    • d339f7be2 chore: migrate det a list to new api and bindings (#4186)
    • ed257d38e refactor: rip out UseFluentLogging. (#4184)
    • 21b85907b docs: update fluent-bit version. (#4181)
    • 4b325a558 docs: document database SSL options (#4169)
    • 5d228f8a5 chore: make core-api tutorial Windows-friendly (#4176)
    • 187505148 fix: sync slot usage for k8s [DET-7350] (#4172)
    • 92a944f96 chore: add .dccache to .gitignore (#4173)
    • 5e7a30c43 docs: fix typo in release note (#4170)
    • a52210f9e feat: chart sync provider [DET-7309] (#4139)
    • a7fafbb9a fix: enable currently active side nav item (#4167)
    • 0a5d54d45 chore: fix hardcoded url in schema logic (#4171)
    • 7df89b293 chore: allow deleting delete failed experiments (#4141) [DET-7070]
    • 4ef2b6771 perf: fixup query for latest training per trial (#4166) [DET-7352]
    • 3d3fe1cd0 fix: include both old and new checkpoints in total checkpoint size (#4165)
    • 4d98cee3d fix: replace carriage returns with newlines in task output [DET-5302] (#3945)
    • a2f878aaf chore: only warn on invalid calls to daemonize resources for slurm (#4108)
    • c00ce0a4c chore: check git state in lock-api-state.sh (#4163)
    • ec743d754 ci: turn off github annotations. (#4146)
    • 0efd44d08 doc: fix a broken file reference (#4131)
    • b911d85ed fix: avoid potential race between AllocationReady and Running state (#4159)
    • cc59985dc revert: partial revert of 96e0e584 (#4162)
    • f07633b4e fix: port collisions for multiple shared-non distributed jobs (#4120) [HAL-2894]
    • 80d1bb152 feat: Add embedded experience for JupyterLab and TensorBoard [DET-7162] (#4134)
    • 83cc1ec8f fix: prevent experiment name in header from flowing entire vertical space of screen during resize (#4157)

    Docker images

    • docker pull determinedai/determined-master:0.18.2
    • docker pull determinedai/determined-master:d214a34df
    • docker pull determinedai/determined-master:d214a34df0c0eb2e5e38ae63d1359862fd2af8f1
    • docker pull determinedai/determined-dev:determined-master-d214a34df
    • docker pull determinedai/determined-dev:determined-master-d214a34df0c0eb2e5e38ae63d1359862fd2af8f1
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.2
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:d214a34df
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:d214a34df0c0eb2e5e38ae63d1359862fd2af8f1
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.18.2_checksums.txt (752 bytes)
    determined-agent_0.18.2_darwin_amd64.tar.gz (8.61 MB)
    determined-agent_0.18.2_linux_amd64.deb (8.29 MB)
    determined-agent_0.18.2_linux_amd64.rpm (8.26 MB)
    determined-agent_0.18.2_linux_amd64.tar.gz (8.25 MB)
    determined-agent_0.18.2_linux_ppc64.deb (7.25 MB)
    determined-agent_0.18.2_linux_ppc64.rpm (7.20 MB)
    determined-agent_0.18.2_linux_ppc64.tar.gz (7.19 MB)
    determined-helm-chart_0.18.2.tgz (8.96 KB)
    determined-master_0.18.2_checksums.txt (759 bytes)
    determined-master_0.18.2_darwin_amd64.tar.gz (40.36 MB)
    determined-master_0.18.2_linux_amd64.deb (69.84 MB)
    determined-master_0.18.2_linux_amd64.rpm (69.73 MB)
    determined-master_0.18.2_linux_amd64.tar.gz (40.30 MB)
    determined-master_0.18.2_linux_ppc64.deb (67.25 MB)
    determined-master_0.18.2_linux_ppc64.rpm (67.01 MB)
    determined-master_0.18.2_linux_ppc64.tar.gz (37.59 MB)
  • 0.18.1 (May 24, 2022)

    Changelog

    • 9284a3aa6 chore: bump version: 0.18.1-rc7 -> 0.18.1
    • eb09c3449 docs: add release notes for 0.18.1 (#4216)
    • bad9a336a chore: bump version: 0.18.1-rc6 -> 0.18.1-rc7
    • 6cff1b0a5 fix: use bigint for checkpoint size in proto_get_trials_plus (#4208)
    • 60291e934 chore: bump version: 0.18.1-rc5 -> 0.18.1-rc6
    • 8f4a7977d perf: tweak proto_get_trials_plus plan (#4206)
    • 414bcd2a4 chore: bump version: 0.18.1-rc4 -> 0.18.1-rc5
    • 789b39c9c fix: allow internal: null for pre-0.15.6 experiments (#4197)
    • fd27bac05 fix: add restarts back to get_trial_ids for sorting
    • 845f2f06e chore: bump version: 0.18.1-rc3 -> 0.18.1-rc4
    • eed09e911 docs: update screen shots for cluster UI (#4188)
    • 88271dd14 style: minor theme fixes and style adjustments [DET-7349] (#4161)
    • f2a4e5ee4 feat: display trial restarts [DET-7347] (#4160)
    • eaf84e681 chore: bump version: 0.18.1-rc2 -> 0.18.1-rc3
    • a8ddc8208 fix: sync slot usage for k8s [DET-7350] (#4172)
    • 1f13710b3 fix: enable currently active side nav item (#4167)
    • e9333d2f1 perf: fixup query for latest training per trial (#4166) [DET-7352]
    • 764ef2d06 fix: include both old and new checkpoints in total checkpoint size (#4165)
    • 4157c82a9 chore: bump version: 0.18.1-rc1 -> 0.18.1-rc2
    • e2f949ab2 chore: bump version: 0.18.1-rc0 -> 0.18.1-rc1
    • 26ede20fb chore: revert scheduling docs
    • 8ea2a5256 fix: prevent experiment name in header from flowing entire vertical space of screen during resize (#4157)
    • 1e244e048 chore: bump version: 0.18.1-dev0 -> 0.18.1-rc0
    • 96e0e5840 chore: lock api state for backward compatibility check
    • 82f0366dc feat: allow NaN validation metrics [DET-7177] (#4150)
    • 0bbeec1f7 feat: upload all tb files DET-7139 (#4155)
    • 90b918acf fix: adjust upscaling of column widths [DET-7220] (#4138)
    • 4b66bf07f feat: rolling upgrades v0 [DET-6548] (#4031)
    • beea2451a ci: disable most checks on ci-only changes (#4118)
    • 5f7e74a6c fix: upstream test failures due to config being admin protected (#4153)
    • da1dcd76a fix: return user data when new user is created [DET-7255] (#4149)
    • 1eba7a20c docs: a vain attempt to pass ci test on already approved pr4110 content changes (#4151)
    • 7aea015d0 fix: No redirecting url when model name is changed (#4127)
    • 00171f82b feat: Cluster UI improvement [DET-7072, DET-7073] (#4009)
    • 27e04e4d8 feat: require admin privileges for cluster management [DET-7186] (#4129)
    • 09a8ff698 ci: update gke version. (#4147)
    • fa3a95960 ci: Increase package-and-push-system-local resource class (#4143)
    • 0d4fe2385 chore: fix boolean urlparams for grpc (#4136)
    • b1829b8c8 feat: enable SLURM preemption (#4114) [FOUNDENG-21]
    • b4ef2735f chore: add a local docs server (#4117)
    • 5dea21112 refactor: theme architecture [DET-6211] (#4004)
    • aef66b088 build: make docs build incremental and idempotent (#4116)
    • ec7007cef ci: persist debs and rpms in circleci for dev, rc, and release builds (#4124)
    • b89b0a359 fix: user filter on dashboard [DET-7251] (#4132)
    • e3e50a916 chore: add job ID and experiment labels to prometheus endpoint mappings [DET-6964] (#4119)
    • b0d8a935e ci: make codecov information for sure now. (#4130)
    • b313503d0 chore: restructure shareable webui utils and types (#4112)
    • 78505d6e6 ci: turn off codecov bot PR comments (#4122)
    • a91866fdb chore: fix rank determination for horovod with mpi (#4109)
    • 3368a7727 fix: NCCL interface in distributed tests (#4111)
    • 18aadd0dd chore: bump version: 0.18.0-dev0 -> 0.18.1-dev0
    • 7500d6fc6 docs: add release notes for 0.18.0 (#4102)
    • 1f4a64260 chore: bump version: 0.17.16-dev0 -> 0.18.0-dev0
    • 59928ec4f chore: explicit naming of preemption and coscheduler resources [DET-7140] (#4101)
    • ac2564b87 fix: add missing task log teardown for trials (#4107)
    • c41e1dcc0 docs: rework quickstart for ml developers (#4091)
    • 2c7564d13 chore: use reported slots available for on prem deployments (#4095)
    • 62049aeab chore: change codecov to informational only (#4105)
    • 5061a8fd3 fix: mark distributed tests as parallel (#4093)
    • 87b8e5396 chore: enable codecov enforcement (#4084)
    • 15dd7e3d4 fix: bindings sessions in experiment apis. (#4096)
    • eb65b09ff chore: cleanup and fixes for "det deploy" (#4103)
    • d78a7120a perf: improve plan for proto_get_trial_plus.sql (#4073)
    • a980c560b build: update submodules on webui get-deps (#4082)
    • 78d4e8b7d chore: HAL-2879 Cleanly shutdown all sshd servers on exit (#176) (#4087)
    • 940f8f703 chore: Refactor JupyterLabModal pattern [DET-6276] (#4072)
    • ec5553ab4 chore: make container proxy support more flexible, for slurm (#3948)
    • 3614c83a6 chore: wait for process substitution log filters [DET-6712] (#3930)
    • 474742f9a chore: clean up useCallback dependency (#4092)
    • 39588ee25 feat: add det.LOG_FORMAT constant (#4090)
    • bcc50f97c fix: Support rendering rank of 0 (#4083)
    • c523523e8 fix: consistent total slot calculation for cluster overview [DET-7182] (#4080)
    • 9c163497c feat: add wrap_rank helper script (#4086)
    • ea4a94911 fix: dont show archived in column picker [DET-7187] (#4085)
    • a4e5f8410 feat: authenticate task proxies (#4071)
    • 5fab38417 fix: wrap torch.distributed launch in pid server/client (#4077)
    • e73f06309 fix: use displayNames in ClusterHistoricalUsage (#4059)
    • 3368dd86a chore: filter out NaN, +/- Infinity metric values for charts for now. (#4076)
    • f8b5bf50e chore: add and consolidate code coverage to codecov (#4064)
    • 487b04c21 docs: release note for core api (#4069)
    • 15a668be9 fix: add user column back to experiment list (#4070)
    • 880b769ec feat: break workload info from trial endpoint into a new endpoint [DET-6729] (#3635)
    • d703c968c fix: show notification when deleting an experiment fails [DET-6811] (#4051)

    Docker images

    • docker pull determinedai/determined-master:0.18.1
    • docker pull determinedai/determined-master:9284a3aa6
    • docker pull determinedai/determined-master:9284a3aa6e307c61426c93b5e09730c664725604
    • docker pull determinedai/determined-dev:determined-master-9284a3aa6
    • docker pull determinedai/determined-dev:determined-master-9284a3aa6e307c61426c93b5e09730c664725604
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.1
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9284a3aa6
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9284a3aa6e307c61426c93b5e09730c664725604
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.18.1_checksums.txt(752 bytes)
    determined-agent_0.18.1_darwin_amd64.tar.gz(8.61 MB)
    determined-agent_0.18.1_linux_amd64.deb(8.28 MB)
    determined-agent_0.18.1_linux_amd64.rpm(8.25 MB)
    determined-agent_0.18.1_linux_amd64.tar.gz(8.24 MB)
    determined-agent_0.18.1_linux_ppc64.deb(7.24 MB)
    determined-agent_0.18.1_linux_ppc64.rpm(7.19 MB)
    determined-agent_0.18.1_linux_ppc64.tar.gz(7.19 MB)
    determined-master_0.18.1_checksums.txt(759 bytes)
    determined-master_0.18.1_darwin_amd64.tar.gz(40.33 MB)
    determined-master_0.18.1_linux_amd64.deb(69.23 MB)
    determined-master_0.18.1_linux_amd64.rpm(69.12 MB)
    determined-master_0.18.1_linux_amd64.tar.gz(40.28 MB)
    determined-master_0.18.1_linux_ppc64.deb(66.62 MB)
    determined-master_0.18.1_linux_ppc64.rpm(66.39 MB)
    determined-master_0.18.1_linux_ppc64.tar.gz(37.55 MB)
  • 0.18.0 (May 9, 2022)

    Changelog

    • 3c00fc281 chore: bump version: 0.18.0-rc3 -> 0.18.0
    • 797ceeca4 docs: add release notes for 0.18.0 (#4102)
    • 505102457 chore: bump version: 0.18.0-rc2 -> 0.18.0-rc3
    • bd87faa20 perf: improve plan for proto_get_trial_plus.sql (#4073)
    • c2a3ba920 fix: Support rendering rank of 0 (#4083)
    • d46995f27 fix: consistent total slot calculation for cluster overview [DET-7182] (#4080)
    • 1b946122f feat: add det.LOG_FORMAT constant (#4090)
    • aacd23be7 feat: add wrap_rank helper script (#4086)
    • 8fbf0a78a fix: don't show archived in column picker [DET-7187] (#4085)
    • a190a1f88 chore: bump version: 0.18.0-rc1 -> 0.18.0-rc2
    • 88fc38b7f chore: bump version: 0.18.0-rc0 -> 0.18.0-rc1
    • 4d73ceaa6 feat: authenticate task proxies (#4071)
    • 3e594e47b chore: filter out NaN, +/- Infinity metric values for charts for now. (#4076)
    • 1038840af fix: wrap torch.distributed launch in pid server/client (#4077)
    • 7df0f0013 docs: release note for core api (#4069)
    • 7721553cc fix: add user column back to experiment list (#4070)
    • db5f434de chore: bump version: 0.17.16-dev0 -> 0.18.0-rc0
    • 02114cf37 chore: lock api state for backward compatibility check
    • 7be795af1 docs: Core API reference and cookbook docs (#4054)
    • 5e364d1a6 chore: ancient checkpoints for very old pytorch (#4068)
    • 51e80d8ea perf: minimize create/destroy of uPlots [DET-6972] [DET-6972] [DET-6796] [DET-6853] [DET-6672] (#3935)
    • 531af28cc chore: update removed reducer methods (#4067)
    • 6f63f13ec chore: remove det.pytorch.reset_parameters() (#4066)
    • 4fb1886b8 feat: generic checkpoints and making Core API public (#3859)
    • db0f8ef5c fix: correct the return type for readStream (#4063)
    • 33a1a6c66 chore: remove PBT searcher (#4058)
    • a35d49a77 feat: support for torch native dtrain (#3807)
    • 29385bc09 chore: update k8s scheduler to run latest image (#4061)
    • e7f328962 chore: remove remainder of native api (#4055)
    • 87008acad chore: deprecate data layer (#4056)
    • 75a6bf59d chore: remove deprecated experimental custom reducer methods (#4060)
    • 948518cee chore: remove unnecessary use of username in webui (DET-6922) (#4049)
    • d2cc8d425 refactor: simplify apiConfig to reduce redundancy (#4043)
    • 707dd2680 chore: Refactor CheckpointModal to be hook based [DET-7136] (#4034)
    • ea535b035 fix: make "Show full config" modal larger (#4053)
    • 625070edb chore: move webui codecov upload to use env var instead of hardcoded token (#4045)
    • 0c1821eed fix: remove dead shell start code [DET-7131] (#4016)
    • 59e5b9b9a chore: Recommend git clone --recurse-submodules for submodules (#4036)
    • ad06cd431 fix: move allocation resources migration to the top.
    • a6bf58b65 refactor: rewrite ndjson streamer [DET-7121] (#4014)
    • 6d018f09e fix: Alter Boolean arg default handling (#4038)
    • b278ef040 chore: store allocation resources and agent RM containers in DB. (#3946)
    • 7afc75f49 ci: Add build/coverage badge for webui/react [DET-6767] (#4028)
    • 628f07c96 move up profile-pics migration to assure it happens
    • 600d77d52 fix: put profile pic migrations in correct directory
    • 3569a6f80 build: set up shared-web submodule [DET-6961] (#4006)
    • 0686befc4 test: ensure that agent disabling doesn't count for experiment restarts [DET-5916] (#4029)
    • b61a83dec refactor: remove NewAllocationID. (#3959)
    • b605da88d fix: add another missing key fix (#4033)
    • 64e6f7d9e feat: allow position modification in k8s [DET-6967] [DET-6968] (#3938)
    • 24bbec93f feat: Creating a table for user profile pictures
    • b831dbd34 fix: Human-readable option for empty filters in logs [DET-6781, DET-6999] (#4017)
    • 5d04a258c fix: Prevent archived models from appearing in the Register Checkpoint modal [DET-7132] (#4015)
    • c1523dab1 fix: Ensure table offset does not exceed pagination total [DET-6829] (#4011)
    • 84401b26c fix: change timeout in e2e_tests/tests/cluster/test_logging.py/test_trial_logs (#4019)
    • ad6f3d54a fix: add key for cancel operation (#4023)
    • 9ba0f0459 chore: remove deprecated type provider for moment-timezone (#4018)
    • be3125b8d docs: update copyright year (#4005)
    • ea4ab8854 feat: add product feedback link [DET-5811]
    • 616241874 chore: fix documentation comment for /ws/data-layer
    • 482ecd61c docs: fix grammar in training-run index (#3988)
    • 5480903ae docs: fix indent in pytorch-porting-tutorial (#3989)
    • 9a6d75260 chore: warn about ambiguous enum params (#3997)
    • 4de96b955 chore: give the latest-master deploy job a name (#3900)
    • 64b46627c fix: pass task time not from logCtx (#3993)
    • 59796b027 chore: update node version to active LTS (DET-7046) (#3932)
    • 07de42689 chore: show appropriate severity level on job launch failures (#4002)
    • c68b7a429 test: add option to disable compare_stats. (#4008)
    • 915184c77 chore: Removing CODEOWNERS entry for release notes (#3978)
    • 93b98c269 fix: handle zeros correctly in HpTrialTable metricSorter (#4003)
    • cd0e53149 chore: bump version: 0.17.15-dev0 -> 0.17.16-dev0
    • 7d1493d00 docs: add release notes for 0.17.15 (#3986)
    • 63eb86c92 feat: add modal to explain why users cannot delete items [DET-6998] (#3994)
    • f6fbc05b6 fix: guard trial and allocation exit logic correctness (#3983)
    • 08218c128 fix: do not default to noverify for bindings sessions in CLI. (#3991)
    • 4c5dad8a1 chore: add expconf environment.slurm (#3966)
    • d2610c557 fix: handle infinite metrics in searcher snapshots [DET-7122] (#3999)
    • 57d1b38f9 chore: handle rank id in log entries (#3995)
    • f915a6952 fix: handle infinite validation metric values in more cases (#3992)
    • 8527e60c5 fix: Experiment columns filter is still applied after closing [DET-6837] (#3982)
    • c7f4610f0 fix: avoid permanent filtered state in model registry [DET-6946] (#3984)
    • 35ac5ff88 docs: update release note process (#3990)
    • d565f842e ci: bump profiling test timeout back up (#3981)
    • a43fcffed chore: handle errors from starting allocations [DET-5862] (#3975)
    • bd5484561 feat: add det support bundle [DET-5886] (#3904)
    • 899db1998 feat: add drag and drop functionality to experiment list column table [DET-7044] (#3956)
    • cf5c94fb6 chore: document usage of /ws/data-layer [DET-6685] (#3971)
    • 6f6e4a26a feat: Add overall allocation bar to new cluster page [DET-7074] (#3955)
    • ff78aa985 chore: add error type for non retry-able resource manager errors (#3947)
    • e9d26af42 chore: apply filtering to task logs (#3963)
    • 2a981a198 fix: add test warmup command e2e [DET-5803] (#3965)
    • a77b8c189 docs: fix tutorial link swap (#3979)
    • 5d65f6443 ci: remove old semantic PR app config. (#3980)
    • 896725bc7 fix: notebook logs filter by level (#3967)
    • f3660c2cb chore: remove old notebook logs endpoint (#3960)
    • 8a99e03c8 fix: task log level parsing (#3973)
    • 005d14402 fix: forward job.DeleteJob through agentRM (#3968)
    • 8e21bf0f8 docs: add release notes for PR #3914 (#3962)
    • 28ab2fa42 chore: Update list of false alarms in Docker image scanning (#3933)
    • 80a1dc7f6 ci: new semantic pull request check. (#3958)
    • 189f148f7 docs: update package versions, add ROCm, edit to style guide (#3950)
    • 0d180fe96 fix: update HPE logo sizes (#3953)
    • 393de717f chore: add message to cleanup external RM resources on delete, for slurm (#3902)

    Docker images

    • docker pull determinedai/determined-master:0.18.0
    • docker pull determinedai/determined-master:3c00fc281
    • docker pull determinedai/determined-master:3c00fc281542c272c1591c7d1c86eb53db8f230c
    • docker pull determinedai/determined-dev:determined-master-3c00fc281
    • docker pull determinedai/determined-dev:determined-master-3c00fc281542c272c1591c7d1c86eb53db8f230c
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.18.0
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3c00fc281
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:3c00fc281542c272c1591c7d1c86eb53db8f230c
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.18.0_checksums.txt(752 bytes)
    determined-agent_0.18.0_darwin_amd64.tar.gz(8.35 MB)
    determined-agent_0.18.0_linux_amd64.deb(8.04 MB)
    determined-agent_0.18.0_linux_amd64.rpm(8.01 MB)
    determined-agent_0.18.0_linux_amd64.tar.gz(8.00 MB)
    determined-agent_0.18.0_linux_ppc64.deb(7.02 MB)
    determined-agent_0.18.0_linux_ppc64.rpm(6.98 MB)
    determined-agent_0.18.0_linux_ppc64.tar.gz(6.97 MB)
    determined-master_0.18.0_checksums.txt(759 bytes)
    determined-master_0.18.0_darwin_amd64.tar.gz(40.23 MB)
    determined-master_0.18.0_linux_amd64.deb(66.06 MB)
    determined-master_0.18.0_linux_amd64.rpm(65.95 MB)
    determined-master_0.18.0_linux_amd64.tar.gz(40.17 MB)
    determined-master_0.18.0_linux_ppc64.deb(63.47 MB)
    determined-master_0.18.0_linux_ppc64.rpm(63.24 MB)
    determined-master_0.18.0_linux_ppc64.tar.gz(37.45 MB)
  • 0.17.15 (Apr 25, 2022)

    Changelog

    • 9b74e5444 chore: bump version: 0.17.15-rc4 -> 0.17.15
    • 931983ac6 chore: bump version: 0.17.15-rc3 -> 0.17.15-rc4
    • 3bd2b4afa fix: handle infinite metrics in searcher snapshots [DET-7122] (#3999)
    • 153221eda chore: bump version: 0.17.15-rc2 -> 0.17.15-rc3
    • 1e9f5c4b6 chore: handle rank id in log entries (#3995)
    • d12c8921e fix: handle infinite validation metric values in more cases (#3992)
    • 59b180d26 docs: add release notes for 0.17.15 (#3986)
    • 9e3e5419f chore: bump version: 0.17.15-rc1 -> 0.17.15-rc2
    • 1c97bc76a chore: apply filtering to task logs (#3963)
    • 8dd01c87f fix: task log level parsing (#3973)
    • 9c241218e docs: add release notes for PR #3914 (#3962)
    • 507cfe75d fix: update HPE logo sizes (#3953)
    • e1a3de646 chore: bump version: 0.17.15-rc0 -> 0.17.15-rc1
    • dc3ef476f chore: bump version: 0.17.15-dev0 -> 0.17.15-rc0
    • 6c38aad74 chore: lock api state for backward compatibility check
    • f206919e8 fix: take out job summary caching [DET-6695] (#3849)
    • c9941ac85 chore: add missing icons to mobile navbar [DET-7009]
    • 927ef9417 fix: parse different time format in compare stats script [DET-7039] (#3909)
    • 55fcf65f4 fix: checkpoint gc job should close (#3943)
    • b8f1073e1 feat: track task stats [DET-6872, DET-6926, DET-6927] (#3852)
    • b1a470d7c chore: add key attribute to avatar ActionCard action (#3939)
    • 7c7375c26 chore: give det job a default command to run (#3934)
    • 9028fe001 chore: mask registry auth password in harness [DET-6279] (#3867)
    • 7d0234b16 chore: add filtering by userIds to API endpoints (DET-7019) (#3898)
    • 180a79f47 chore: mask registry creds in webui [DET-7013] (#3881)
    • 2d4456439 replace username with userId for user API (#3914)
    • b6b2c7172 test: add interaction tests for spinner [DET-6665] (#3826)
    • d86d09ba5 chore: prevent InteractiveTable scroll from moving pagination or other controls [DET-7037] (#3923)
    • c9a24582f fix: crash on upgrade to InteractiveTable [DET-7036] (#3922)
    • 4c0ef956c hide archived experiments unless using --all (#3918)
    • 586bba4a2 docs: tweak docs for socket activation (#3926)
    • 64ae7ca2a perf: add infiniband-related libraries to environment (#3832)
    • bcc954a0d chore: disallow getting metadata from dummy checkpoint context (#3920)
    • 575cc21f7 chore: update dep requirements for react (#3901)
    • e359843b3 chore: demote _get_last_validation to internal (#3919)
    • 5c93cb68a chore: log checkpoint uuids in core, not in wlsq (#3924)
    • 7086df4b7 chore: add missing docstrings in core api (#3911)
    • ea0db53a9 feat: add slurm rendezvous (#3777)
    • d5e793ba1 chore: make store_path auto-create the directory (#3916)
    • 1f3c4b232 feat: add core.DownloadMode (#3910)
    • c334a5385 docs: fix install cli typo (#3917)
    • 069eb6f27 fix: run scheduling on agent connection/enable events, reconnectBacklog replay. (#3906)
    • b75afe6d9 chore: bump version: 0.17.15-dev0-dev0 -> 0.17.15-dev0
    • d06dc9429 refactor: remove dependency of settings in the updateSettings call (#3894)
    • 1b4c8e951 chore: remove pr preview cluster address [DET-7040] (#3907)
    • e5723c7ea docs: Release notes for 0.17.14 (#3912)
    • 43b9a7c06 chore: bump version: 0.17.14 -> 0.17.15-dev0
    • 5c4554437 fix: RM crashes when setting cmd priority (#3908)
    • 0c5236786 chore: fix deepspeed nightly tests (#3897)
    • 3e6267f0d chore: update StorageManager and extend CheckpointContext (#3829)
    • f76cc4599 docs: fix description of scheduling_unit behavior (#3890)
    • 1815ee3cb chore: sweeping rename of Core API components (#3896)
    • 0b7141b88 feat: deepspeed DCGAN example (#3758)
    • 015a8e07c feat: use enums instead of chief_only bool in Core API (#3888)
    • 129d841cb fix: use otel only if enabled (#3893)
    • 0ab3288d8 hide sizeChanger on RoutePagination (#3892)
    • ae50d3cec fix: match reported rp name for k8 across endpoints [DET-7006] (#3870)
    • c49561c6f docs: various fixes for master configuration and k8s docs (#3889)
    • c716759b1 chore: bump version: 0.17.13-dev0 -> 0.17.14-dev0
    • 2fdb2efc7 docs: add release notes for 0.17.13 (#3879)
    • a34962262 chore: Clean up tests with fewer Optional types and asserts (#3872)
    • b0e0c96ea feat: make core.Searcher multiworker-safe (#3871)
    • 50c9f669d feat: show agent version in /agents and det agent list [DET-6847] (#3873)
    • 954cf9714 fix: avoid double-timestamps in logs (#3876)
    • b00124a41 feat: Drag to Reorder and Resize Experiment List columns [DET-6438] [DET-6809] (#3765)
    • be5848588 chore: use better NCCL SOCKET setting for gpt-neox (#3874)
    • 65e1a97d3 chore: remove internal flag from det.launch.horovod --help (#3875)
    • 9b1fa3bd0 feat: Search models by name and description substring [DET-6939] (#3869)
    • 78ba4a33a feat: add opentel to determined master [DET-6775] (#3851)
    • 789f16dce chore: cleanup stray changes (#3868)

    Docker images

    • docker pull determinedai/determined-master:0.17.15
    • docker pull determinedai/determined-master:9b74e5444
    • docker pull determinedai/determined-master:9b74e54448d64009ce574e1a68b52149c1d00fe7
    • docker pull determinedai/determined-dev:determined-master-9b74e5444
    • docker pull determinedai/determined-dev:determined-master-9b74e54448d64009ce574e1a68b52149c1d00fe7
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.15
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9b74e5444
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9b74e54448d64009ce574e1a68b52149c1d00fe7
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.15_checksums.txt(759 bytes)
    determined-agent_0.17.15_darwin_amd64.tar.gz(8.35 MB)
    determined-agent_0.17.15_linux_amd64.deb(8.04 MB)
    determined-agent_0.17.15_linux_amd64.rpm(8.01 MB)
    determined-agent_0.17.15_linux_amd64.tar.gz(8.01 MB)
    determined-agent_0.17.15_linux_ppc64.deb(7.03 MB)
    determined-agent_0.17.15_linux_ppc64.rpm(6.98 MB)
    determined-agent_0.17.15_linux_ppc64.tar.gz(6.97 MB)
    determined-master_0.17.15_checksums.txt(766 bytes)
    determined-master_0.17.15_darwin_amd64.tar.gz(40.25 MB)
    determined-master_0.17.15_linux_amd64.deb(66.03 MB)
    determined-master_0.17.15_linux_amd64.rpm(65.92 MB)
    determined-master_0.17.15_linux_amd64.tar.gz(40.19 MB)
    determined-master_0.17.15_linux_ppc64.deb(63.43 MB)
    determined-master_0.17.15_linux_ppc64.rpm(63.21 MB)
    determined-master_0.17.15_linux_ppc64.tar.gz(37.46 MB)
  • 0.17.14 (Apr 14, 2022)

    Changelog

    • cf1cc00b1 chore: bump version: 0.17.14-rc0 -> 0.17.14
    • f06304521 docs: Release notes for 0.17.14 (#3912)
    • 83868bc34 chore: bump version: 0.17.14-dev0 -> 0.17.14-rc0
    • d64bbf762 chore: bump version: 0.17.13 -> 0.17.14-dev0
    • 4f42c1b4b fix: RM crashes when setting cmd priority (#3908)

    Docker images

    • docker pull determinedai/determined-master:0.17.14
    • docker pull determinedai/determined-master:cf1cc00b1
    • docker pull determinedai/determined-master:cf1cc00b17f8e5d59effbd0e2f25d32328ade8b5
    • docker pull determinedai/determined-dev:determined-master-cf1cc00b1
    • docker pull determinedai/determined-dev:determined-master-cf1cc00b17f8e5d59effbd0e2f25d32328ade8b5
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.14
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:cf1cc00b1
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:cf1cc00b17f8e5d59effbd0e2f25d32328ade8b5
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.14_checksums.txt(759 bytes)
    determined-agent_0.17.14_darwin_amd64.tar.gz(8.34 MB)
    determined-agent_0.17.14_linux_amd64.deb(8.03 MB)
    determined-agent_0.17.14_linux_amd64.rpm(8.01 MB)
    determined-agent_0.17.14_linux_amd64.tar.gz(8.00 MB)
    determined-agent_0.17.14_linux_ppc64.deb(7.02 MB)
    determined-agent_0.17.14_linux_ppc64.rpm(6.96 MB)
    determined-agent_0.17.14_linux_ppc64.tar.gz(6.95 MB)
    determined-master_0.17.14_checksums.txt(766 bytes)
    determined-master_0.17.14_darwin_amd64.tar.gz(39.71 MB)
    determined-master_0.17.14_linux_amd64.deb(65.37 MB)
    determined-master_0.17.14_linux_amd64.rpm(65.26 MB)
    determined-master_0.17.14_linux_amd64.tar.gz(39.66 MB)
    determined-master_0.17.14_linux_ppc64.deb(62.81 MB)
    determined-master_0.17.14_linux_ppc64.rpm(62.58 MB)
    determined-master_0.17.14_linux_ppc64.tar.gz(36.98 MB)
  • 0.17.13 (Apr 11, 2022)

    Changelog

    • 7629d330a chore: bump version: 0.17.13-rc3 -> 0.17.13
    • 0de91c795 docs: add release notes for 0.17.13 (#3879)
    • e02363582 chore: bump version: 0.17.13-rc2 -> 0.17.13-rc3
    • 8ac3a499d fix: avoid double-timestamps in logs (#3876)
    • 11e84f829 feat: show agent version in /agents and det agent list [DET-6847] (#3873)
    • e00ebb775 chore: bump version: 0.17.13-rc1 -> 0.17.13-rc2
    • 79d96c7a3 chore: remove internal flag from det.launch.horovod --help (#3875)
    • 6f705b7b2 chore: use better NCCL SOCKET setting for gpt-neox (#3874)
    • 6ebd3c796 chore: bump version: 0.17.13-rc0 -> 0.17.13-rc1
    • a95b892ad chore: cleanup stray changes (#3868)
    • 5d85dc806 chore: bump version: 0.17.13-dev0 -> 0.17.13-rc0
    • 4ffc52871 chore: lock api state for backward compatibility check
    • cde70d40b fix: e2e test stats for gke machine (#3865)
    • c6d557495 fix: deepspeed cifar moe example ds_config (#3864)
    • 9ff7aa86d feat: make launch layers generally useful (#3853)
    • 5c5c7b161 feat: allow the CLI to accept incomplete task UUIDs [DET-6706] (#3836)
    • bb8245dc4 Revert "chore: use userId as url param for user endpoints (#3806)" (#3855)
    • 4f5f3221e feat: add table to store unused GPU [DET-6697] (#3735)
    • 7f64b0ec3 fix: gpt-neox example multinode on aws (#3848)
    • be5232a10 refactor: simplify resourcemanagers.container. (#3840)
    • cc763f8af set ExperimentCreate modal width in AdvancedMode (#3831)
    • 3176e9e7c add conditionalRender prop to Spinner (#3837)
    • 733873899 fix: remove duplicated code in useModal (#3845)
    • 3444f4aab fix: outdated deepspeed launcher rest api call (#3834)
    • bb4050fe6 ci: fail building docs when sphinx-build fails (#3843)
    • 63d86d38e fix: docs build failure for deepspeed reference (#3844)
    • 998103d83 fix: remove python 3.10 union-type syntax (#3839)
    • 6751860fa chore: convert ptrs.* to generics-based ptrs.Ptr (#3792)
    • c2142e410 chore: handle race between experiment persistent state, actor existence in ask, take 2 (#3835)
    • 470f98b39 chore: add missing FK constraints for trials and allocation_sessions (#3810)
    • c6697a585 refactor: consolidate allocation state. (#3833)
    • 607b565ae chore: add generic math max/min (#3823)
    • 9777130a9 chore: handle race between experiment persistent state, actor existence in ask (#3821)
    • ba56d2410 use RoutePagination to navigate between trials (#3828)
    • 31ecb4a4d fix: Set default project for experiments.project_id [DET-6958] (#3825)
    • 91a528369 doc: nested hps (#3827)
    • 02c55857e chore: clean up get_experiment.sql query (#3822)
    • 8467f936f fix: ci failures for deepspeed and pytorch geometric (#3824)
    • 39e629d63 fix: keep page from crashing when trial/hyperparameters are not loaded yet [DET-6951]
    • 4175f96cc chore: bump version: 0.17.12-dev0 -> 0.17.13-dev0
    • 69e396056 docs: add release notes for 0.17.12 (#3815)
    • 7d64e11ae chore: update pytorch data loader args (#3775)
    • 3f9188667 ci: fix click/black conflict. (#3818)
    • fc56db09c docs: Add examples and documentation for AMP [DET-5170] (#3757)
    • 0639e8017 chore: add trialIds to GET experiment response (#3809)
    • 4f93158bf feat: Accept model.id for lookups, show model.id in UI [DPS-49] (#3723)
    • 95ea7a29f fix: use max slots provisioning agents [DET-5725] (#3788)
    • e35972d83 chore: use userId as url param for user endpoints (#3806)
    • 1c6c84894 fix: change menus for mobile resolution (DET-6869) (#3801)
    • 0ff423940 fix: remember originally requested page on login [DET-6928] (#3790)
    • 2c263bc2c chore: release notes for job queue (#3782)
    • d70550495 ci: delete useless and slow noop test (#3812)
    • 7f94c988a fix: bugs in PIDServer (#3800)
    • 9592e0417 chore: add bun.DB singleton in master (again) (#3798)
    • f7ac95506 fix: add pytorch as requirement before deepspeed is installed (#3811)
    • 78b50fb26 fix: update references to branding (DET-6870) (#3802)
    • 4146b021e chore: upgrade to support panoptic segmentation for coco [DET-6945] (#3545)
    • d920fed12 chore: allow cors by default in dev cluster. (#3794)
    • b4ded4b28 fix: add dependency for harness tests (#3804)
    • c4cfaa045 chore: API requests that return username should also return userId (#3796)
    • c3008e33b chore: bump deepspeed version to 0.6.0 (#3803)
    • 7bec5b39e fix: update HPE logos (DET-6868) (#3789)
    • c2904b32a fix: cli win tests (#3799)
    • 704c17384 chore: reorder jobs.q_position migrations [DET-6934] (#3797)
    • 476b12d2d revert: add bun.DB singleton in master (#3769)
    • db4062c82 test: turn off timeout for now on logs tests (#3793)
    • 9f15d2468 docs: fixes for concepts index page (#3756)
    • 03c68f36c chore: add bun.DB singleton in master (#3769)
    • bb8186955 feat: add deepspeed support (#3235)
    • 67f0ca409 restore getExperimentModelDefinition (#3791)
    • 21c2d2768 chore: Update required fields, remove repeat fields from model registry [DET-6861] (#3755)
    • a8d541bce fix: wl_output with zero trials (#3786)
    • 4297aa596 docs: update job queue docs to mention k8 (#3781)
    • 1542afff8 chore: add ProtoConverter (#3766)
    • 86e029652 fix: kubernetes job queue unexpected message (#3783)
    • e89162da7 chore: disable showing queue position controls for k8 [DET-6917] (#3784)
    • 6df7fe587 docs: fix typo (#3780)

    Docker images

    • docker pull determinedai/determined-master:0.17.13
    • docker pull determinedai/determined-master:7629d330a
    • docker pull determinedai/determined-master:7629d330abbbf74b45865d524bf9d14761d263af
    • docker pull determinedai/determined-dev:determined-master-7629d330a
    • docker pull determinedai/determined-dev:determined-master-7629d330abbbf74b45865d524bf9d14761d263af
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.13
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:7629d330a
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:7629d330abbbf74b45865d524bf9d14761d263af
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.13_checksums.txt(759 bytes)
    determined-agent_0.17.13_darwin_amd64.tar.gz(8.34 MB)
    determined-agent_0.17.13_linux_amd64.deb(8.03 MB)
    determined-agent_0.17.13_linux_amd64.rpm(8.00 MB)
    determined-agent_0.17.13_linux_amd64.tar.gz(8.00 MB)
    determined-agent_0.17.13_linux_ppc64.deb(7.02 MB)
    determined-agent_0.17.13_linux_ppc64.rpm(6.96 MB)
    determined-agent_0.17.13_linux_ppc64.tar.gz(6.95 MB)
    determined-master_0.17.13_checksums.txt(766 bytes)
    determined-master_0.17.13_darwin_amd64.tar.gz(39.71 MB)
    determined-master_0.17.13_linux_amd64.deb(65.37 MB)
    determined-master_0.17.13_linux_amd64.rpm(65.27 MB)
    determined-master_0.17.13_linux_amd64.tar.gz(39.67 MB)
    determined-master_0.17.13_linux_ppc64.deb(62.80 MB)
    determined-master_0.17.13_linux_ppc64.rpm(62.57 MB)
    determined-master_0.17.13_linux_ppc64.tar.gz(36.98 MB)
  • 0.17.12 (Mar 28, 2022)

    Changelog

    • fe3acb352 chore: bump version: 0.17.12-rc2 -> 0.17.12
    • 6e8693056 docs: add release notes for 0.17.12 (#3815)
    • 982686240 chore: release notes for job queue (#3782)
    • 6b4daa203 chore: bump version: 0.17.12-rc1 -> 0.17.12-rc2
    • 21fbfc780 restore getExperimentModelDefinition (#3791)
    • bdd52c13f chore: bump version: 0.17.12-rc0 -> 0.17.12-rc1
    • 988ae875e chore: reorder jobs.q_position migrations [DET-6934] (#3797)
    • 8fcb60001 fix: wl_output with zero trials (#3786)
    • 0cd2b681d docs: update job queue docs to mention k8 (#3781)
    • 75278ae6f chore: disable showing queue position controls for k8 [DET-6917] (#3784)
    • 6b59fa211 fix: kubernetes job queue unexpected message (#3783)
    • 487d791d6 chore: bump version: 0.17.12-dev0 -> 0.17.12-rc0
    • 1cc555594 chore: lock api state for backward compatibility check
    • 08bbd1522 fix: stop single trial tabs from breaking when trial is undefined (#3752)
    • b14e2be90 chore: fix up job queue issues after merge (#3779)
    • 46827b940 feat: add fine grained control for reordering jobs (#3347)
    • 8061937d4 chore: remove unused websockets (#3776)
    • 565255a40 refactor: make rendezvous less actor-y (#3754)
    • 12973dc68 chore: improve log metadata (#3748)
    • d553ae18d feat: tell tasks RM type to discern available features (#3713)
    • ea9c24b94 chore: bump proto to go 1.18 (#3772)
    • d02765b33 refactor: replace plotly with hermes [DET-6725] (#3689)
    • 8a5d00772 test: disable test_task_logs[command] on k8s (#3770)
    • eedab6ee7 fix: nil pointer exception when restoring experiments (#3771)
    • ca6b3b456 chore: bump to go 1.18 (#3745)
    • 1401d33f5 feat: add through-master all gather for v2 rendezvous (#3694)
    • 3b459a14e fix: update types for pytorch custom reducers (#3764)
    • 13cf79c1e fix: use displayName in tables (#3742)
    • 003888134 fix: check_idle.py is quiet when jupyter takes a while to start up (#3749)
    • 42c2764b0 build: use mockery v2.10.0 which is go1.18-friendly (#3751)
    • a32af6a52 fix: prevent pytorch geometric nightly test timeout (#3747)
    • 188537bbb refactor: pull out intermediary resources representation for RM (#3663)
    • b99377d28 docs: update release note process (#3727)
    • 3c17335e2 fix: replace loading skeleton with empty state [DET-6800] (#3743)
    • fe23d1f45 fix: set allocation start time accurately (specifically for Kubernetes) [DET-6713] (#3732)
    • abe5f3d0f docs: remove some outdated information. (#3744)
    • e5b4b891b feat: Add workspaces, projects, and related DB schema [DET-6815] (#3728)
    • 0693fa052 test: add testing for cross-version master/agent clusters (#3725)
    • 7d954ed2d fix: create user with password broken [DET-6852] (#3738)
    • f04827982 feat: update default env images to pytorch 1.10.2, tf 2.8. (#3734)
    • feff05457 chore: updates to prometheus docs (#3716)
    • df695ba19 fix: always show progress bar for experiments (#3733)
    • 5971e1ea7 chore: don't check google-cloud-storage with mypy. (#3736)
    • 1b3e76411 chore: Return trial workloads and metrics in place of ExperimentRaw [DET-6485] (#3539)
    • 6434f58c2 fix: allow ranks to report independent early exits [DET-6792, DET-6835] (#3717)
    • 7f1e169d5 fix: restore preview-search (#3737)
    • 740b79c26 fix: run all migrations in transactions (#3712)
    • 04cde60af feat: remove global_batch_size requirement (#3724)
    • d2b89d5c2 chore: bump version: 0.17.11-dev0 -> 0.17.12-dev0
    • eb8dcda03 docs: add release notes for 0.17.11 (#3731)
    • 2e0269ee4 test: add tests for user settings DET-6788 DET-6750 (#3708)
    • 1a1ad6eae fix: det deploy gcp bad resource name (#3726)
    • d7e18c45b chore: dev mode for update-bumpenvs-yaml. (#3684)
    • 375066df7 fix: download cifar10 only once per node in cifar10_pytorch example (#3718)
    • 2fcb8cb8a chore: Add interaction tests for RadioGroup [DET-6662] (#3715)
    • 889483714 ci: python/master/agent unit test coverage (#3701)
    • ddf757333 fix: accelerator field for GCP/AWS instance (#3714)
    • 3b735361a perf: rewrite get_experiments.sql for a better query plan (#3673)
    • 617966262 docs: fix css typo (#3710)

    Docker images

    • docker pull determinedai/determined-master:0.17.12
    • docker pull determinedai/determined-master:fe3acb352
    • docker pull determinedai/determined-master:fe3acb352d1a488b542fdfae8d73546205a2b751
    • docker pull determinedai/determined-dev:determined-master-fe3acb352
    • docker pull determinedai/determined-dev:determined-master-fe3acb352d1a488b542fdfae8d73546205a2b751
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.12
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:fe3acb352
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:fe3acb352d1a488b542fdfae8d73546205a2b751
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.12_checksums.txt(759 bytes)
    determined-agent_0.17.12_darwin_amd64.tar.gz(8.34 MB)
    determined-agent_0.17.12_linux_amd64.deb(8.03 MB)
    determined-agent_0.17.12_linux_amd64.rpm(8.00 MB)
    determined-agent_0.17.12_linux_amd64.tar.gz(7.99 MB)
    determined-agent_0.17.12_linux_ppc64.deb(7.01 MB)
    determined-agent_0.17.12_linux_ppc64.rpm(6.96 MB)
    determined-agent_0.17.12_linux_ppc64.tar.gz(6.95 MB)
    determined-master_0.17.12_checksums.txt(766 bytes)
    determined-master_0.17.12_darwin_amd64.tar.gz(39.10 MB)
    determined-master_0.17.12_linux_amd64.deb(64.60 MB)
    determined-master_0.17.12_linux_amd64.rpm(64.48 MB)
    determined-master_0.17.12_linux_amd64.tar.gz(39.02 MB)
    determined-master_0.17.12_linux_ppc64.deb(62.08 MB)
    determined-master_0.17.12_linux_ppc64.rpm(61.84 MB)
    determined-master_0.17.12_linux_ppc64.tar.gz(36.36 MB)
  • 0.17.11(Mar 14, 2022)

    Changelog

    • 54b5394de chore: bump version: 0.17.11-rc1 -> 0.17.11
    • f9e301694 docs: add release notes for 0.17.11 (#3731)
    • 3650339cc chore: bump version: 0.17.11-rc0 -> 0.17.11-rc1
    • 939661bcf fix: det deploy gcp bad resource name (#3726)
    • 9a74fd60d chore: bump version: 0.17.11-dev0 -> 0.17.11-rc0
    • 34db509af chore: lock api state for backward compatibility check
    • b165f284c fix: det user whoami to return current user in notebooks (#3661)
    • 40b0e4dfc feat: update resource pool cards [DET-6417] (#3698)
    • 6c946ea48 fix: trial rendezvous timeout should just warn (#3704)
    • 3145b18e7 fix: select filter input width [DET-6777] (#3681)
    • 195a5e2e4 fix: ensure allocation end time is always valid (#3705)
    • 8021e90a7 feat: add on_trial_{startup,shutdown} hooks to PyTorchCallback (#3643)
    • c8833e953 feat: context menu row background (#3683)
    • 978606f8e docs: fix broken link to k8s pod specs (#3707)
    • 2b21ea855 ci: temporarily avoid test_task_logs[command] test flakes (#3702)
    • 9c8f8127f fix: bypass backoff if download_gcs_blob_with_backoff called with None (#3623)
    • 42a56e0bb refactor: change oddly structured log message code (#3680)
    • 936dcb071 test: add tests for get experiment (#3672)
    • c30a21e00 docs: update docker install for macos (#3699)
    • ac4535bcd chore: add AWS EC2 instance types (#3693)
    • 9807236a7 fix: pin azure-core version due to incomp with TF2.4 [DET-6798] (#3700)
    • fa36e2d51 chore: bump version: 0.17.10-dev0 -> 0.17.11-dev0
    • 97250f625 docs: add release notes for 0.17.10 (#3690)
    • f08334f3e ci: deploy-release-party is no longer used releases (#3695)
    • d4410ac11 docs: fix broken link (#3696)
    • cd3556f5e chore: upgrade go-pg [DET-6363] (#3656)
    • 6abe9a2f5 test: log filter test [DET-6787] (#3688)
    • e5cf108f9 docs: restructure training debugging doc (#3687)
    • aa5ac5bc6 docs: add prometheus and grafana integration (#3676)
    • c8739ff7d feat: create modal to edit displayName in UI (DET-6484, DET-6581) (#3646)
    • f38be1f7c docs: change glob directive to explicit content map (#3685)
    • a72765d42 refactor: log filters [DET-6766, DET-6779, DET-6780] (#3678)
    • 1be744b89 docs: update copyright (#3662)
    • 1c351492b fix: add rank_id to task logs response (#3679)
    • ff2817140 chore: pin GKE less strictly (#3675)
    • b4dac7a78 fix: GET /api/v1/trial/:id/logs/fields should return all filters (#3677)
    • f429b37e3 refactor: centralize configuration; align user get methods (#3637)
    • f8973967e fix: Model/Experiment API and error codes [DET-6711] (#3626)
    • 9761b4c9e feat: make the forked-from column in experiments list present by default and sortable [DET-6727] (#3655)
    • 27fd74201 chore: remove storybook stories from test coverage (#3666)
    • 29c6aa734 chore: realign task log timestamps (#3664)
    • ef968c554 docs: invert getting started and quick start titles (#3647)
    • 6e2733f8e chore: interaction test for multiselect [DET-6666] (#3665)
    • ecea4414b fix: when cluster fails ensure end time of open allocation is set to the last cluster heartbeat in cluster_id table [DET-6509] (#3657)
    • 9828b0b6d fix: update select filter test to avoid console errors (#3667)
    • 6b05a261d refactor: migrate web logs [DET-4826, DET-6068, DET-6726, DET-6680, DET-6681] (#3619)
    • f84b823d6 fix: allow list hparams in local mode (#3622)
    • e8bb7b811 chore: add tests to make sure postgres enums are exhaustive (#3401)
    • 119877ba1 feat: add accelerator field for resource pool [DET-6755] (#3659)
    • acf080a7f fix: allow modal width to shrink on mobile [DET-6443] (#3642)
    • c5f42f349 fix: refreshing profiler results in blank screen [DET-6673] (#3658)
    • d1a362071 docs: add font packages and change to open sans (#3639)
    • b828db789 fix: nightly test errors for BYOL. (#3652)
    • 754964184 fix: stop experiment modal closing unexpectedly [DET-6746] (#3653)
    • 2b4571cfa fix: Use pagination in CLI to return all experiments, not the default limit (#3641)
    • a57fde2c3 fix: regression in det deploy gcp down. (#3654)
    • e36b50cda test: fix test_start_and_write_to_shell (#3650)
    • c43540610 test: add interaction test for inline editor [DET-6667] (#3645)

    Docker images

    • docker pull determinedai/determined-master:0.17.11
    • docker pull determinedai/determined-master:54b5394de
    • docker pull determinedai/determined-master:54b5394de90946626be0617904a9e962605a65c7
    • docker pull determinedai/determined-dev:determined-master-54b5394de
    • docker pull determinedai/determined-dev:determined-master-54b5394de90946626be0617904a9e962605a65c7
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.11
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:54b5394de
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:54b5394de90946626be0617904a9e962605a65c7
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.11_checksums.txt(759 bytes)
    determined-agent_0.17.11_darwin_amd64.tar.gz(8.06 MB)
    determined-agent_0.17.11_linux_amd64.deb(7.75 MB)
    determined-agent_0.17.11_linux_amd64.rpm(7.72 MB)
    determined-agent_0.17.11_linux_amd64.tar.gz(7.71 MB)
    determined-agent_0.17.11_linux_ppc64.deb(6.81 MB)
    determined-agent_0.17.11_linux_ppc64.rpm(6.75 MB)
    determined-agent_0.17.11_linux_ppc64.tar.gz(6.74 MB)
    determined-master_0.17.11_checksums.txt(766 bytes)
    determined-master_0.17.11_darwin_amd64.tar.gz(37.54 MB)
    determined-master_0.17.11_linux_amd64.deb(64.48 MB)
    determined-master_0.17.11_linux_amd64.rpm(64.37 MB)
    determined-master_0.17.11_linux_amd64.tar.gz(37.49 MB)
    determined-master_0.17.11_linux_ppc64.deb(61.82 MB)
    determined-master_0.17.11_linux_ppc64.rpm(61.55 MB)
    determined-master_0.17.11_linux_ppc64.tar.gz(34.67 MB)
  • 0.17.10(Mar 4, 2022)

    Changelog

    • fea84524b chore: bump version: 0.17.10-rc3 -> 0.17.10
    • b2a2ff0fe docs: add release notes for 0.17.10 (#3690)
    • ca72d68e2 chore: bump version: 0.17.10-rc2 -> 0.17.10-rc3
    • 467d12488 refactor: log filters [DET-6766, DET-6779, DET-6780] (#3678)
    • d70bb5f5e fix: add rank_id to task logs response (#3679)
    • 917b623dc chore: bump version: 0.17.10-rc1 -> 0.17.10-rc2
    • c7c0cd074 fix: GET /api/v1/trial/:id/logs/fields should return all filters (#3677)
    • 54d245167 chore: bump version: 0.17.10-rc0 -> 0.17.10-rc1
    • 52681b6b6 refactor: migrate web logs [DET-4826, DET-6068, DET-6726, DET-6680, DET-6681] (#3619)
    • 513d5f8a7 test: fix test_start_and_write_to_shell (#3650)
    • c885610ed test: add interaction test for inline editor [DET-6667] (#3645)
    • 1d719920e fix: regression in det deploy gcp down. (#3654)
    • f15b52e32 chore: bump version: 0.17.10-dev0 -> 0.17.10-rc0
    • 0f3404fbd chore: lock api state for backward compatibility check
    • 21a407387 chore: minor upgrades to ML frameworks (#3648)
    • f1c06a4c7 feat: support k8s db ssl [DET-6364] (#3624)
    • 219a3e67b chore: remove extra logging for credHelpers [DET-6728] (#3632)
    • 32e50799f test: byol pytorch should use epochs (#3640)
    • 911f16b61 fix: convert log strings to bytes in go, not sql (#3636)
    • 3b779c773 feat: create PATCH users endpoint (#3634)
    • 4aa4e15a5 refactor: agent RM state consolidation. [DET-6638] (#3552)
    • 829d7fd88 ci: Add technical writer as codeowner of release-notes (#3633)
    • 298bee0a2 chore: bump version: 0.17.9-dev0 -> 0.17.10-dev0
    • e4178489f docs: add release notes for 0.17.9 (#3613)
    • dce756b0d fix: remove duplicated header for new cluster page [DET-6722] (#3628)
    • a8724ab7a docs: add integrations section (#3631)
    • f5e49c3e8 feat: log records/second and batches/second in trials [DPS-1] (#3593)
    • 2c84269f6 chore: fix migration helper scripts (#3627)
    • 6420ccee4 chore: at least try to auto-negotiate docker client version (#3611)
    • e8ed6a2f7 chore: update PTL adapter to support latest PTL [DET-6421] (#3603)
    • 718e93471 chore: update jq error message for single-job updates (#3590)
    • 54a5bd224 feat: support gcs backend for terraform state in det deploy gcp. (#3615)
    • 59da336b3 chore: interactive test for Section [DET-6664] (#3621)
    • bf1191873 fix: log viewer bug (#3605)
    • 6d8d0b7e7 test: add more checks on interaction tests (#3616)
    • 7f789ad02 ci: upgrade test-unit-react from medium to large resource class (runtime 6m -> 1m) (#3625)
    • 4a11a2374 chore: fix typo (#3620)
    • 296c374db chore: adding a limit to task endpoints by type (#3604)
    • 58d1d0078 fix: make SelectFilter input wide enough to see what I type in MetricSelectFilter [DPS-30] [DET-6642] (#3618)
    • 0edf93b7e test: write SelectFilter tests [DET-6663] (#3596)
    • 9f21b2fc3 feat: add BYOL example (#3513)
    • e534e83ce chore: remove webui tests [DET-6610] (#3511)
    • f3b149d43 feat: Undo changes and warn user when InlineEditor fails to update db [DET-6659] (#3612)
    • 6dba19390 fix: dtrain + native api (#3614)
    • 679b7aa58 chore: check existence of correct actor to find which RM is in use (#3609)
    • 7a12f148a chore: test pytorch gradient aggregation DET-793 (#3602)
    • 9b528de70 chore: handle possible errors from docker daemon call /containers/cID/wait (#3592)
    • e1a00ea87 chore: add additional logging for creds helper [DET-6710] (#3607)
    • 7821d9020 chore: update get trial queries to not require aligned steps, vals and checkpoints (#3568)
    • 08888717a feat: unify task logs [DET-6062, DET-6063, DET-6064, DET-6065, DET-6066] (#3070)
    • ab72e1b0d fix: nil pointer error during switch RP (#3608)
    • 92b32d0de chore: containers should always be killed by their parent, remove incorrect exits with 0 (#3595)
    • d435513ce fix: update Resource Pool details in Manage Job modal (#3606)
    • 3f77957ad fix: add v4 shim due to experiment snapshot change (#3601)
    • a33747471 chore: fix flaky webui tests (#3599)
    • 5e5bf1097 fix: update enterprise helm chart [DET-6668] (#3578)
    • fdfb25562 fix: update K8s slotsUsed value [DET-6704] (#3600)
    • 85254394c fix: fix behavior for Change Password buttons (#3598)
    • 570c0e519 fix: remove icon for LOG_LEVEL_UNSPECIFIED log [DET-6436] (#3585)
    • 73c660c49 Notebook doc changes to conform to a style guide (#3587)
    • a75359d9b test: interaction test for Badge component (#3580)

    Docker images

    • docker pull determinedai/determined-master:0.17.10
    • docker pull determinedai/determined-master:fea84524b
    • docker pull determinedai/determined-master:fea84524b7a779e2ab74b42feeff952431e570ca
    • docker pull determinedai/determined-dev:determined-master-fea84524b
    • docker pull determinedai/determined-dev:determined-master-fea84524b7a779e2ab74b42feeff952431e570ca
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.10
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:fea84524b
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:fea84524b7a779e2ab74b42feeff952431e570ca
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.10_checksums.txt(759 bytes)
    determined-agent_0.17.10_darwin_amd64.tar.gz(8.06 MB)
    determined-agent_0.17.10_linux_amd64.deb(7.75 MB)
    determined-agent_0.17.10_linux_amd64.rpm(7.72 MB)
    determined-agent_0.17.10_linux_amd64.tar.gz(7.71 MB)
    determined-agent_0.17.10_linux_ppc64.deb(6.81 MB)
    determined-agent_0.17.10_linux_ppc64.rpm(6.75 MB)
    determined-agent_0.17.10_linux_ppc64.tar.gz(6.74 MB)
    determined-master_0.17.10_checksums.txt(766 bytes)
    determined-master_0.17.10_darwin_amd64.tar.gz(37.63 MB)
    determined-master_0.17.10_linux_amd64.deb(59.12 MB)
    determined-master_0.17.10_linux_amd64.rpm(58.99 MB)
    determined-master_0.17.10_linux_amd64.tar.gz(37.58 MB)
    determined-master_0.17.10_linux_ppc64.deb(56.51 MB)
    determined-master_0.17.10_linux_ppc64.rpm(56.22 MB)
    determined-master_0.17.10_linux_ppc64.tar.gz(34.82 MB)
  • 0.17.9(Feb 12, 2022)

    Changelog

    • e38e8847a chore: bump version: 0.17.9-rc1 -> 0.17.9
    • 0139481e4 docs: add release notes for 0.17.9 (#3613)
    • 701893664 fix: dtrain + native api (#3614)
    • f7e0e97d2 chore: bump version: 0.17.9-rc0 -> 0.17.9-rc1
    • be3da6b87 fix: nil pointer error during switch RP (#3608)
    • 5a5f534fb fix: update Resource Pool details in Manage Job modal (#3606)
    • b5d99d417 fix: add v4 shim due to experiment snapshot change (#3601)
    • 44be28b2b fix: update K8s slotsUsed value [DET-6704] (#3600)
    • 1409b5091 fix: fix behavior for Change Password buttons (#3598)
    • 6ec1281c8 chore: bump version: 0.17.9-dev0 -> 0.17.9-rc0
    • 11186a48c chore: resolve conflicting cli abbreviation (#3594)
    • 1e5317116 fix: check if data file exists before downloading (#3586)
    • 7b5c8fbda fix: use hidden prop to always include form items (#3582)
    • bf000be8a fix: use google.com instead of example.com as trusted domain (#3589)
    • 6ec67a046 chore: Remove old /checkpoints and searcher/preview endpoints [DET-6682] (#3588)
    • f1bd508b8 chore: unequate reservation existence and the fact they've been started (#3577)
    • 8a20a1628 test: cover checkpoint, model api due to invariant changes [DET-4980] (#3564)
    • cd931deaa chore: update hpviz to allow un-validation-aligned checkpoints (#3567)
    • b3840d153 feat: customize experiment columns [DET-6398] [DET-6401] (#3561)
    • 2e4e17827 fix: actually send the context directory for notebooks [DET-6634] (#3570)
    • dbb063f29 chore: bumpenvs on last image refresh (#3573)
    • d47db202a ci: fix gke version (#3583)
    • 976986cf7 feat: Tensorboard use local storage [DET-6029] (#3487)
    • b99505e4a fix: tests failing on pytest 7 (#3579)
    • b02ae127a chore: remove container_id from internal apis (#3556)
    • b1dd87424 test: interaction test for BadgeTag component [DET-6423] (#3572)
    • 508fba08a feat: webui support for moving jobs between resource pools (#3563)
    • a2d9ddecc feat: option to skip determined wheel installation [DET-6650] (#3562)
    • 87c7e8e50 chore: WebUI interaction test for Message component [DET-6425] (#3553)
    • 45144fba6 fix: add PyTorch Geometric example to the docs (#3569)
    • b426f870a fix: resolve badge related stories (#3566)
    • f5f578cbf chore: preserve try-for-flake script for react unit tests from e2e (#3560)
    • 7be66739d chore: add tensorboard timeout logging [DET-6317] (#3550)
    • 1bdb82719 style: shift the dropdown to be visible in collapsed nav state (#3549)
    • f9e469661 chore: apply logic to cancel streaming when page is not active and reactive when it is (#3547)
    • 94aeb54c0 feat: create Change Password modal (#3537)
    • c1e52c37d fix: Handle model name changes [DET-6648] (#3559)
    • 29ed1c88c feat: return actual resource pool info for K8s [DET-6411, DET-6383] (#3531)
    • cd500edc8 docs: fix several minor issues (#3558)
    • df41dbe9f fix: deflake unets-tf-keras test by avoiding redundant downloads (#3544)
    • fa2e073f7 chore: restore RP on set failure (#3557)
    • 063c54376 fix: Progress on completed experiment to be reported as 1 (#3502)
    • 8f3d3e7c5 ci: try using codeowners more. (#3555)
    • 908c697f9 chore: modify internal priority message (#3532)
    • dd78ccb3e fix: force login failure to not be silent (#3551)
    • 231e1ac1d Revert "chore: bump version: 0.17.9-dev0 -> 0.17.10-dev0"
    • e7b249f82 chore: bump version: 0.17.9-dev0 -> 0.17.10-dev0
    • 7562b5336 Revert "Merge branch 'master' of github.com:determined-ai/determined"
    • 3689a1180 docs: release-0.17.8 notes (#3554)
    • b45743127 Merge branch 'master' of github.com:determined-ai/determined
    • 6448ff181 fix: deflake unets-tf-keras test by avoiding redundant downloads (#3544)
    • daaef8875 chore: bump version: 0.17.8-dev0 -> 0.17.9-dev0
    • fe50b39e4 fix: correct tqdm log display and fix the flickering issue [DET-6470] (#3542)
    • 674866592 style: fix spinner placement for hp viz [DET-6631] (#3543)
    • 379c26b51 webui test on MetricBadgeTag (#3538)
    • 10e11cc2d test: update leftover det m config -o usage to --json (#3540)
    • e8752b031 fix: capture sys.exit code from launch (#3541)
    • 783e1e4a3 feat: configure k8s fluent image in helm chart (#3522) [DET-6380]
    • 1942e0705 feat: make notebook timeout more configurable (#3498)
    • 389c76307 chore: update polling functions to be async where applicable (#3499)
    • 18ed310a8 chore: change the output format args for master config cli (#3387)
    • c7baf31b5 chore: Use nullable types in PatchExperiment [DET-6486] (#3497)
    • b55a61bf6 fix: ensure randomly picked search id is unique (#3533)
    • d4b88dcd8 build: fix make -C harness clean (#3468)
    • cdc8cd90b chore: fix suggested command to install growforest (#3530)
    • 8bb24ebb6 chore: Add logging - checkpoint storage validation [DET-6623] (#3529)
    • ff1c063bc fix: update test:coverage script and makefile to enable coverage report to work locally (#3535)
    • 2e2df57b8 feat: create a new cluster page [DET-6403] (#3495)
    • 7c36ab1d5 DET-6620 Amazon S3 subheading missing from docs. (#3527)
    • 2f279350c fix: Patch issue with registering a checkpoint to a model version (#3528)
    • b8c9c85b7 fix: Escape model name (#3501)
    • 1c5eb9005 chore: update checkpoint export apis [DET-6037] (#3448)
    • 4d4fc1ef5 chore: bump version: 0.17.7-dev0 -> 0.17.8-dev0
    • 99f151513 docs: add release notes for 0.17.7 (#3503)
    • 3ebb534a7 fix: correct for out-of-order PR landing (#3523)
    • 417c48c48 feat: make k8s fluent bit sidecar image configurable (#3518)
    • 00befce7e fix: add abort mechanism to stop actor deadlocks, check resource staleness on receipt (#3519)
    • de619ec0a fix: clear out the error if there are no errors upon each keystroke (#3517)
    • 1793ee428 feat: basic request audit logging [DET-6370] (#3516)
    • 5dd366b94 feat: create user settings modal (#3508)
    • 5510d8f3f chore: unitless searcher [DET-6359] (#3343)
    • bfa6060fb fix: move python launch logger setup to not miss important log (#3520)
    • ae52c620f chore: simplify generated code for optional param handling (#3514)
    • 82dd7fa78 feat: additional pytorch callbacks [DET-5180] (#3479)
    • 2de93dfe0 chore: fix spacing in ui update message (#3515)
    • ed6246b2d feat: jq allow exp change RPs (#3470)
    • b659ccdfb chore: add docs for specifying priority behavior in k8s (#3512)
    • 7d43ad31c ci: remove test-e2e-webui from circleci (#3509)
    • a03d686f0 feat: stopping and starting polling based on page visibility [DET-6604] (#3500)
    • f46c8a6d3 fix: hide manage job for unmanageable command jobs in k8 (#3504)
    • 8812f5462 chore: print trial logs if trial ends with 0 steps (#3505)
    • 8c234ef9d chore: set up api errors to be handled close to their usage (#3482)
    • f13611f61 chore: Use model's name as its identifier (#3486)

    Docker images

    • docker pull determinedai/determined-master:0.17.9
    • docker pull determinedai/determined-master:e38e8847a
    • docker pull determinedai/determined-master:e38e8847a475bd18c71f56c58866634ba31109c4
    • docker pull determinedai/determined-dev:determined-master-e38e8847a
    • docker pull determinedai/determined-dev:determined-master-e38e8847a475bd18c71f56c58866634ba31109c4
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.9
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:e38e8847a
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:e38e8847a475bd18c71f56c58866634ba31109c4
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.9_checksums.txt(752 bytes)
    determined-agent_0.17.9_darwin_amd64.tar.gz(8.19 MB)
    determined-agent_0.17.9_linux_amd64.deb(7.87 MB)
    determined-agent_0.17.9_linux_amd64.rpm(7.84 MB)
    determined-agent_0.17.9_linux_amd64.tar.gz(7.83 MB)
    determined-agent_0.17.9_linux_ppc64.deb(6.92 MB)
    determined-agent_0.17.9_linux_ppc64.rpm(6.86 MB)
    determined-agent_0.17.9_linux_ppc64.tar.gz(6.85 MB)
    determined-master_0.17.9_checksums.txt(759 bytes)
    determined-master_0.17.9_darwin_amd64.tar.gz(37.55 MB)
    determined-master_0.17.9_linux_amd64.deb(59.06 MB)
    determined-master_0.17.9_linux_amd64.rpm(58.93 MB)
    determined-master_0.17.9_linux_amd64.tar.gz(37.51 MB)
    determined-master_0.17.9_linux_ppc64.deb(56.40 MB)
    determined-master_0.17.9_linux_ppc64.rpm(56.11 MB)
    determined-master_0.17.9_linux_ppc64.tar.gz(34.70 MB)
  • 0.17.8(Feb 3, 2022)

    Changelog

    • 9c8b8d380 chore: bump version: 0.17.8-rc0 -> 0.17.8
    • fbf281697 chore: bump version: 0.17.8-dev0 -> 0.17.8-rc0
    • 5b4db5352 fix: capture sys.exit code from launch (#3541)
    • 7039e7ec6 chore: bump version: 0.17.7 -> 0.17.8-dev0

    Docker images

    • docker pull determinedai/determined-master:0.17.8
    • docker pull determinedai/determined-master:9c8b8d380
    • docker pull determinedai/determined-master:9c8b8d3809a8fd1aa52d7acc05a1720349ba0458
    • docker pull determinedai/determined-dev:determined-master-9c8b8d380
    • docker pull determinedai/determined-dev:determined-master-9c8b8d3809a8fd1aa52d7acc05a1720349ba0458
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.8
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9c8b8d380
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:9c8b8d3809a8fd1aa52d7acc05a1720349ba0458
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.8_checksums.txt(752 bytes)
    determined-agent_0.17.8_darwin_amd64.tar.gz(8.19 MB)
    determined-agent_0.17.8_linux_amd64.deb(7.87 MB)
    determined-agent_0.17.8_linux_amd64.rpm(7.85 MB)
    determined-agent_0.17.8_linux_amd64.tar.gz(7.84 MB)
    determined-agent_0.17.8_linux_ppc64.deb(6.92 MB)
    determined-agent_0.17.8_linux_ppc64.rpm(6.86 MB)
    determined-agent_0.17.8_linux_ppc64.tar.gz(6.85 MB)
    determined-master_0.17.8_checksums.txt(759 bytes)
    determined-master_0.17.8_darwin_amd64.tar.gz(37.57 MB)
    determined-master_0.17.8_linux_amd64.deb(59.04 MB)
    determined-master_0.17.8_linux_amd64.rpm(58.91 MB)
    determined-master_0.17.8_linux_amd64.tar.gz(37.53 MB)
    determined-master_0.17.8_linux_ppc64.deb(56.38 MB)
    determined-master_0.17.8_linux_ppc64.rpm(56.10 MB)
    determined-master_0.17.8_linux_ppc64.tar.gz(34.72 MB)
  • 0.17.7(Jan 28, 2022)

    Changelog

    • 814043805 chore: bump version: 0.17.7-rc3 -> 0.17.7
    • 6c02e0624 chore: bump version: 0.17.7-rc2 -> 0.17.7-rc3
    • cb59cf471 fix: move python launch logger setup to not miss important log (#3520)
    • 1430a2524 chore: add docs for specifying priority behavior in k8s (#3512)
    • fb17a04e5 fix: add abort mechanism to stop actor deadlocks, check resource staleness on receipt (#3519)
    • 200e544bd fix: clear out the error if there are no errors upon each keystroke (#3517)
    • 23dba3529 chore: bump version: 0.17.7-rc1 -> 0.17.7-rc2
    • b464b1899 fix: hide manage job for unmanageable command jobs in k8 (#3504)
    • 274fcb577 chore: set up api errors to be handled close to their usage (#3482)
    • 13c7ba52a docs: add release notes for 0.17.7 (#3503)
    • 094acf117 chore: bump version: 0.17.7-rc0 -> 0.17.7-rc1
    • 96572d312 chore: Use model's name as its identifier (#3486)
    • 79093e692 chore: bump version: 0.17.7-dev0 -> 0.17.7-rc0
    • c7ce5fbbd chore: lock api state for backward compatibility check
    • 0201934de chore: refresh images (#3496)
    • 4b435c811 ci: increase resource_class for Docker image scan (#3406)
    • ca5e9e82b fix: increment runID during allocation (#3494)
    • 77b3433b0 Add telemetry reporting for data layer (#3493)
    • 819ea05bd chore: Remove deprecated routes from old API (#3483)
    • 3d84b6f0d fix: gcp vm image with google guest agent working [DET-6489] (#3489)
    • 96b1e462a refactor: migrate scatter plot to uplot [DET-6376m, DET-6377] (#3474)
    • c8f77b4a0 chore: resolve previous /agents fetch before polling for a new one (#3481)
    • c2d31771d test: support parallel running of auth test suite (#3443)
    • 2f4883f24 fix: running experiment log misbehave [DET-6339] (#3484)
    • f664db6bd chore: associate gc tasks with their job (#3464)
    • a29990acb chore: support concurrent agents with matching fluent names. (#3460)
    • 00be2a735 separate launch layer (#3371)
    • d23f1a2e3 feat: add columns to experiment list table [DET-6400] (#3445)
    • 68c12e242 fix: update task list Resource Pool column sorter to alphaNumericSorter [DET-6434] (#3471)
    • 417816f9d chore: address "key" prop warnings (#3476)
    • 32dae0aee chore: bump version: 0.17.6-dev0 -> 0.17.7-dev0
    • ba6b12398 docs: add release notes for 0.17.6 (#3475)
    • 22eeac453 chore: Move parts of Experiment CLI to GRPC API (#3418)
    • c29852e0c fix: jq error changing ntbcs priority (#3461)
    • cd8bf45c0 fix: visualization tab 'Apply' button style [DET-6442] (#3463)
    • 97b0aa5be fix: use different tf on macos (#3462)
    • 917364694 ci: increase timeouts for GKE cluster creation, move notifications from #ml-ag to #ci-bots. (#3469)
    • 25b5f6b1f docs: describe oidc integration for enterprise (#3400)
    • 768ee4ed2 build: make -C master doesn't call make fmt (#3467)
    • c7f5a7dc8 fix: make systemd socket activation actually work (#3459)
    • fe20d429a chore: report job entity id in the job queue list cli (#3429)
    • e890aa9be chore: update pr preview cluster address for Netlify (#3455)
    • ff3151375 fix: Allow allocations array to be empty without null value (#3465)
    • 057a25773 fix: Update sample models code for Determined client
    • b7a5a7937 fix: model-hub transformers ner example (#3456)
    • 05ff3a963 fix: avoid terminating profiler streaming [DET-6459] (#3453)
    • 9badb4bc6 chore: improve webui analytics [DET-4857, DET-4859] (#3286)
    • e9a3c18cb chore: remove accidentally introduced files from webui/tests (#3441)
    • 842f03497 fix: mobile view icons for experiment header [DET-6011] (#3446)
    • 3d450bd8e fix: update to show full version when nav sidebar is expanded [DET-6264] (#3439)
    • 9a17bf579 fix: update label reference to tag [DET-5969] (#3413)
    • f63726449 fix: revert to prior slot utilization logic for static agents (#3451)
    • 50e4d4a29 fix: check that login input is visible before logging in (#3450)
    • 4d534afdf chore: upgrade golangci-lint to latest (#3379)
    • ec0f1172c test: await focusing username element (#3438)
    • 3c8476abb chore: update tasklist comparator (#3440)
    • 52f9d03e1 fix: update timeago casing for taskcards on dashboard (#3434)
    • 01f7a44cd test: Create Avatar component test [DET-6404] (#3402)
    • 437270b33 chore: remove tools.go+go.mod tool dep tracking pattern, just go install versions of tools (#3383)
    • 375c53d54 refactor: error handler [DET-6298] (#3416)
    • 2e1b4b8d3 ci: update gauge version [DET-6432] (#3433)
    • 6f217e8b5 ci: run releases for tags with new proper SemVer format (#3435)
    • 36bb4d9be chore: better telemetry (#3271)
    • 845c3c47b fix: fix priority scheduler time-based preemption and prioritization issues (#3428)
    • 932660076 ci: log test boundary markers to managed devcluster logs. (#3426)
    • e312aa5a6 record parent_id from webui, return in api as forkedFrom (#3425)
    • 79f291cb6 chore: validate priority and weight ranges in job api requests (#3430)

    Docker images

    • docker pull determinedai/determined-master:0.17.7
    • docker pull determinedai/determined-master:814043805
    • docker pull determinedai/determined-master:814043805d83f5178dca05e2621e820ae9182402
    • docker pull determinedai/determined-dev:determined-master-814043805
    • docker pull determinedai/determined-dev:determined-master-814043805d83f5178dca05e2621e820ae9182402
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.7
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:814043805
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:814043805d83f5178dca05e2621e820ae9182402
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.7_checksums.txt(752 bytes)
    determined-agent_0.17.7_darwin_amd64.tar.gz(8.19 MB)
    determined-agent_0.17.7_linux_amd64.deb(7.87 MB)
    determined-agent_0.17.7_linux_amd64.rpm(7.85 MB)
    determined-agent_0.17.7_linux_amd64.tar.gz(7.84 MB)
    determined-agent_0.17.7_linux_ppc64.deb(6.92 MB)
    determined-agent_0.17.7_linux_ppc64.rpm(6.86 MB)
    determined-agent_0.17.7_linux_ppc64.tar.gz(6.85 MB)
    determined-master_0.17.7_checksums.txt(759 bytes)
    determined-master_0.17.7_darwin_amd64.tar.gz(37.57 MB)
    determined-master_0.17.7_linux_amd64.deb(59.04 MB)
    determined-master_0.17.7_linux_amd64.rpm(58.91 MB)
    determined-master_0.17.7_linux_amd64.tar.gz(37.53 MB)
    determined-master_0.17.7_linux_ppc64.deb(56.38 MB)
    determined-master_0.17.7_linux_ppc64.rpm(56.10 MB)
    determined-master_0.17.7_linux_ppc64.tar.gz(34.72 MB)
  • 0.17.6(Jan 20, 2022)

    Changelog

    • a7806b5a chore: bump version: 0.17.6-rc6 -> 0.17.6
    • 48451b62 chore: fix import ordering
    • d9a1257b docs: add release notes for 0.17.6 (#3475)
    • 47a17f67 chore: bump version: 0.17.6-rc5 -> 0.17.6-rc6
    • 7a8132a9 fix: make systemd socket activation actually work (#3459)
    • b6d82d13 chore: bump version: 0.17.6-rc4 -> 0.17.6-rc5
    • 6e65020c fix: Allow allocations array to be empty without null value (#3465)
    • 79ca9dde fix: update timeago casing for taskcards on dashboard (#3434)
    • f40314ef chore: bump version: 0.17.6-rc3 -> 0.17.6-rc4
    • 247c288a fix: avoid terminating profiler streaming [DET-6459] (#3453)
    • 90a385e3 chore: bump version: 0.17.6-rc2 -> 0.17.6-rc3
    • bb43d698 fix: revert to prior slot utilization logic for static agents (#3451)
    • 75972b34 chore: bump version: 0.17.6-rc1 -> 0.17.6-rc2
    • 34e2e31f chore: better telemetry (#3271)
    • bdd5c238 ci: run releases for tags with new proper SemVer format
    • c87dcf4b chore: bump version: 0.17.6-rc0 -> 0.17.6-rc1
    • 49c03b40 chore: bump version: 0.17.6-dev0 -> 0.17.6-rc0
    • e7ed8e97 chore: lock api state for backward compatibility check
    • 18859054 fix: Add allocation state to db test object (#3427)
    • 4dcf3250 feat: allow podSpec env variables (#3431)
    • 25616ac6 feat: adjust job priority and weight through job queue (#3411)
    • c3c4df9f feat: Add /tasks/:task_id endpoint to GRPC API [DET-6354] [DET-6355] (#3360)
    • db71ac48 docs: announce deprecation of pbt (#3407)
    • 01da9977 feat: pass metrics to simple reducer in original order (#3405)
    • 39568043 fix: show correct total gpu capacity [DET-3733] (#3385)
    • b0f5458c chore: bump env images for security. (#3415)
    • 4f2ced6f fix: address experiment name going out of sync with db (#3414)
    • dd733ef6 fix: avoid Can't pickle local object in TestPIDServer (#3393)
    • 3dcdbc8f fix: add missing fields to allocation query and tests to prevent future bugs (#3398)
    • c1d5db02 ci: fix flake in provisioner unit test (#3409)
    • f082d650 chore: update unreleased manage job modal (#3374)
    • cd64b92e chore: make mypy happy with requests wrapper (#3408)
    • d0da8697 fix: fix a conditional render loop (#3394)
    • c3e2d3a3 chore: bumpenvs for updated base AMIs (#3404)
    • c6f23186 chore: get gov images in refresh-ubuntu-amis.py (#3399)
    • 7f504e7d ci: make checkpoint gc tests actually wait for gc (#3403)
    • 142f5999 fix: set gc-policy broken [DET-6373] (#3391)
    • 47b2375d fix: fix forked experiments missing username in memory (#3392)
    • 9a1e8113 feat: add systemd socket activation support to the master (#3366)
    • 875d6732 DET-6361 - update docs (#3386)
    • e9ad0532 chore: force github.com/containerd/containerd upgrade (#3381)
    • 7e822f55 chore: fix default format selection and enum loading in cli (#3384)
    • dc511af9 chore: write our own swagger bindings (#3361)
    • 9a5c1f5e fix: Fix sphinx-build parsing bug (#3376)
    • 3bc09571 fix: stop re-rendering loops and throw the appropriate errors for continue trial modal [DET-6368] (#3378)
    • 66378a49 chore: bump github.com/labstack/echo/v4 dependencies to address dependabot (#3354)
    • 38f77750 fix: fix webui full config edit in notebook modal (#3373)
    • 2debbcf9 chore: bump docker and k8s dependencies (#3352)
    • b3a34baa docs: address onboarding gaps (#3122)
    • a996243e ci: stop testing EOL python (#3377)
    • e6ae62aa chore: update github pr template (#3365)
    • 0f45c4a7 fix: stop profiler spinner when terminal [DET-6326] (#3325)
    • ceb537d5 fix: negative slots per agent [DET-6357] (#3342)
    • 31549d8b fix: default shell/cmd slots should be 1. (#3369)
    • 268035be chore: try to sidestep race in use of check_if_string_present_in_trial_logs test helper (#3367)
    • eaeb658c chore: bump goreleaser (#3345)
    • 69965984 chore: image updates: bump all, add ROCm image. (#3363)
    • c81a7a9d ci: unflake master IdleTimeoutWatcher test. (#3364)
    • 4b76543d chore: fix small data race found by go build --race (#3359)
    • ddda7a45 fix: purge model.ExperimentConfig (#3362)
    • 1f4898c0 feat: experimental ROCm support. [DET-6285] (#3282)
    • 611947c1 ci: only install yq with snap. (#3355)
    • a0a5a8cf chore: fix trial log readability (#3356)
    • 34698e68 chore: AdvancedSearcher->Searcher (#3339)
    • 98c69adb chore: More info for test failure [DET-6347] (#3353)
    • c1a1e303 chore: add trial log dump for test assertion failure (#3336)
    • 5241759c fix: handle calls to old command endpoints, make it harder to crash cmd managers [DET-6336] (#3315)
    • 906ea4b3 fix: Convert all experiment and job states to labels (#3351)
    • c9d43f9c chore: bump test-e2e go version (#3344)
    • 95999f97 fix: fix incorrect preemption status report from Kubernetes RP (#3330)
    • 1f8eaa9a chore: rename Generic API to Core API [DET-6243] (#3310)
    • c266f8ee ci: print trial logs in more failure cases (#3333)
    • 61b309fc fix: don't allow allocations to take actions with unreceived cancellations (#3326)
    • 2e234990 fix: small bug in error log (#3332)
    • 377560e8 feat: support agent on Apple Silicon without Rosetta (#3328)
    • 9b31a1b6 feat: add config option for Tensorboard (#3319)
    • 46d56274 refactor: stop experiment modal [DET-6325] (#3307)
    • 12fce381 ci: update to python3.7. (#3316)
    • 30397f25 chore: unpin google-cloud dependencies. (#3320)
    • 4632cbe8 chore: simplify job queue state tracking (#3302)
    • 7ff1f8ec fix: collect system metrics from all agents (#3313) [DET-6332]
    • 355dcb86 chore: workaround upstream torch bug (#3321)
    • 0c12fe3d test: store and report webui test results (#3248)
    • a2cede94 feat: add prometheus endpoint for internal Determined state mappings [DET-5890] (#3258)
    • 7d9714e0 chore: update can-i-use browserlist (#3317)
    • f4fc7bb8 chore: remove hvd_config usage [DET-6220] (#3210)
    • cb139a16 fix: pull logs [DET-6335] (#3308)
    • ca96c776 feat: add wall clock time, tests to get trial API [DET-6226] (#3311)
    • 997dd8fb fix: preview search (#3309)
    • 43ccbeab fix: improve CPU core count parsing on agents with CPU slots. (#3304)
    • 9b80a576 chore: clean up code owners (#3312)
    • dd3edaa6 docs: fix image formatting in 0.17.5 release notes (#3305)
    • f9dd54ac chore: bump version: 0.17.5-dev0 -> 0.17.6-dev0
    • 09ca94f5 docs: add release notes for 0.17.5 (#3299)
    • 09c049cb chore: update job queue title and navigation entry (#3303)
    • c1890544 feat: Allow new AWS instances to be specified [DET-6327] (#3296)
    • 8152a90c chore: reuse ordering logic between k8 and priority schedulers (#3301)
    • 5716cf2e chore: reorganize how endpoints are queried in jobs page (#3298)
    • 50be69ed fix: update craco config to have webpack use the ify-loader for plotly imports (#3294)
    • 54a37e33 fix: fix experiment active state check in webui (#3295)
    • a2bdf72e fix: increase CircleCI resource class for React builds (#3297)
    • e06e307d fix: open job queue task links in a new tab (#3293)
    • 26b6d26b fix: pass get_trials sort parameters to REST (#3291)
    • 60b29812 feat: set up, read, and visualize job queue (#3231)
    • 6686b5f3 fix: update use in notebook code snippet [DET-6305] (#3288)
    • 9707b820 Fix: send activate param from WebUI to API (#3290)
    • 6432ca9f fix: det model list-versions --json (#3292)
    • ded1d768 chore: avoid progress rendering for tasks card in some cases (#3289)
    • f4302d87 fix: add visual indicators that you can't edit an archived model [DET-6280] (#3279)
    • f38996aa refactor: customized timeago [DET-6244] (#3283)

    Docker images

    • docker pull determinedai/determined-master:0.17.6
    • docker pull determinedai/determined-master:a7806b5a
    • docker pull determinedai/determined-master:a7806b5a6670a0c6a2d9126b004384d322930b73
    • docker pull determinedai/determined-dev:determined-master-a7806b5a
    • docker pull determinedai/determined-dev:determined-master-a7806b5a6670a0c6a2d9126b004384d322930b73
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.6
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:a7806b5a
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:a7806b5a6670a0c6a2d9126b004384d322930b73
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.6_checksums.txt(752 bytes)
    determined-agent_0.17.6_darwin_amd64.tar.gz(8.19 MB)
    determined-agent_0.17.6_linux_amd64.deb(7.87 MB)
    determined-agent_0.17.6_linux_amd64.rpm(7.84 MB)
    determined-agent_0.17.6_linux_amd64.tar.gz(7.83 MB)
    determined-agent_0.17.6_linux_ppc64.deb(6.92 MB)
    determined-agent_0.17.6_linux_ppc64.rpm(6.86 MB)
    determined-agent_0.17.6_linux_ppc64.tar.gz(6.85 MB)
    determined-master_0.17.6_checksums.txt(759 bytes)
    determined-master_0.17.6_darwin_amd64.tar.gz(37.54 MB)
    determined-master_0.17.6_linux_amd64.deb(58.97 MB)
    determined-master_0.17.6_linux_amd64.rpm(58.86 MB)
    determined-master_0.17.6_linux_amd64.tar.gz(37.50 MB)
    determined-master_0.17.6_linux_ppc64.deb(56.31 MB)
    determined-master_0.17.6_linux_ppc64.rpm(56.03 MB)
    determined-master_0.17.6_linux_ppc64.tar.gz(34.69 MB)
  • 0.17.5(Dec 10, 2021)

    Changelog

    • 439547c6 chore: bump version: 0.17.5-rc7 -> 0.17.5
    • 92feff10 docs: add release notes for 0.17.5 (#3299)
    • 15fa6549 chore: bump version: 0.17.5-rc6 -> 0.17.5-rc7
    • d4568d2c chore: update job queue title and navigation entry (#3303)
    • d777adc4 chore: reuse ordering logic between k8 and priority schedulers (#3301)
    • 000839f1 chore: bump version: 0.17.5-rc5 -> 0.17.5-rc6
    • b1030f06 fix: fix experiment active state check in webui (#3295)
    • b79a0eb9 fix: update craco config to have webpack use the ify-loader for plotly imports (#3294)
    • 228b9d64 chore: bump version: 0.17.5-rc4 -> 0.17.5-rc5
    • ec7d3491 chore: avoid progress rendering for tasks card in some cases (#3289)
    • 44bbbfc5 chore: bump version: 0.17.5-rc3 -> 0.17.5-rc4
    • b6420cde fix: open job queue task links in a new tab (#3293)
    • e44ac2bc fix: pass get_trials sort parameters to REST (#3291)
    • 4cbd0035 feat: set up, read, and visualize job queue (#3231)
    • 2126d3bb fix: update use in notebook code snippet [DET-6305] (#3288)
    • 1025947f Fix: send activate param from WebUI to API (#3290)
    • e251222b fix: det model list-versions --json (#3292)
    • 2c9725a7 fix: add visual indicators that you can't edit an archived model [DET-6280] (#3279)
    • e168d758 chore: bump version: 0.17.5-rc2 -> 0.17.5-rc3
    • f083b089 fix: increase CircleCI resource class for React builds
    • 22a2c8bc chore: bump version: 0.17.5-rc1 -> 0.17.5-rc2
    • 94163687 ci: run releases for tags with new proper SemVer format
    • bf1efefc chore: bump version: 0.17.5-rc0 -> 0.17.5-rc1
    • 02ce03d5 chore: bump version: 0.17.5-dev0 -> 0.17.5-rc0
    • e89861a9 fix: CLI keeps activate_experiment for old master servers (#3284)
    • 2561e2ab fix: model version last_updated_time (#3285)
    • 3b4dd314 chore: rename master to cluster in the webui [DET-6273] (#3264)
    • b923ef3e fix: various model registry tweaks (#3278)
    • c7a91b4c docs: updating model registry documentation [DET-6072] (#3276)
    • 8c2b9701 feat: Model Registry UX Phase 2 - Creation [DET-6257] [DET-6258] (#3244)
    • 84f69a4b chore: fix the spinner story (#3272)
    • d9579cab feat: Add python method for listing trials in an experiment [DET-5088] (#3270)
    • 9e91c01b fix: generate SSH keys for restored trials (#3249)
    • 5c37bfe9 fix: remove old build targets (#3268)
    • b9d0864b perf: optimizations to trial logs streaming [DET-6262] (#3254)
    • 8b822be9 chore: apply more jsx lint (#3259)
    • 4e79a7ac chore: update release note for MIG support (#3260)
    • e16ee27d chore: fix storybook assets (#3253)
    • d66f0cc1 chore: bump version: 0.17.4-dev0 -> 0.17.5-dev0
    • 2d299f34 docs: add release notes for 0.17.4 (#3242)
    • 327fe9f5 fix: update bumpversion config to account for new React config file (#3257)
    • 897329a1 fix: hp search progress to account for early-stopped trials [DET-6231] (#3247)
    • 4dd2281c docs: add an example for bind_mounts. (#3255)
    • e3c8723a chore: unit tests [DET-6245] (#3229)
    • b3a6911e fix: invalidhp handling for single trial experiments (#3246)
    • fca19635 feat: Activate forked experiments once created [DET-6190] (#3236)
    • ee0ce6d6 fix: add Validate methods for checkpoint_storage structs [DET-6281] (#3243)
    • 3666b751 Update behavior of insert model version and archived models (#3241)
    • ddce7de8 ci: docker image scan: update known tf1 vulns. (#3238)
    • c56f12dd fix: stop Model Registry from showing empty message while loading (#3240)
    • eefb0e6a fix: don't create filestore with no_filestore flag (#3233)
    • 4a3ffb73 feat: basic support for container recovery on agent restarts [DET-6061] (#3155)
    • 0f6a0f26 ci: lint-python: readjust to fresh flake8-bugbear==21.11.28 release. (#3239)
    • 518b06ea fix: make Helm deployment work with no OpenShift config present (#3230)
    • 3aa864a6 fix: upgrade to swagger-ui v4.1.0 to patch rest-api XSS vulnerability [DET-6210] (#3234)
    • 66759c00 refactor: customize-cra to craco [DET-5441] (#3221)
    • 93ca970f fix: clarify det deploy gke-experimental help text (#3232)

    Docker images

    • docker pull determinedai/determined-master:latest
    • docker pull determinedai/determined-master:0.17.5
    • docker pull determinedai/determined-master:439547c6
    • docker pull determinedai/determined-master:439547c65ddf8039569961d61baf02a6efac6281
    • docker pull determinedai/determined-dev:determined-master-439547c6
    • docker pull determinedai/determined-dev:determined-master-439547c65ddf8039569961d61baf02a6efac6281
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.5
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:439547c6
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:439547c65ddf8039569961d61baf02a6efac6281
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.5_checksums.txt(752 bytes)
    determined-agent_0.17.5_darwin_amd64.tar.gz(7.72 MB)
    determined-agent_0.17.5_linux_amd64.deb(7.43 MB)
    determined-agent_0.17.5_linux_amd64.rpm(7.41 MB)
    determined-agent_0.17.5_linux_amd64.tar.gz(7.40 MB)
    determined-agent_0.17.5_linux_ppc64.deb(6.53 MB)
    determined-agent_0.17.5_linux_ppc64.rpm(6.47 MB)
    determined-agent_0.17.5_linux_ppc64.tar.gz(6.46 MB)
    determined-master_0.17.5_checksums.txt(759 bytes)
    determined-master_0.17.5_darwin_amd64.tar.gz(35.23 MB)
    determined-master_0.17.5_linux_amd64.deb(56.67 MB)
    determined-master_0.17.5_linux_amd64.rpm(56.56 MB)
    determined-master_0.17.5_linux_amd64.tar.gz(35.19 MB)
    determined-master_0.17.5_linux_ppc64.deb(54.20 MB)
    determined-master_0.17.5_linux_ppc64.rpm(53.95 MB)
    determined-master_0.17.5_linux_ppc64.tar.gz(32.58 MB)
  • 0.17.4(Nov 30, 2021)

    Changelog

    • ce264345 chore: bump version: 0.17.4-rc1 -> 0.17.4
    • 56e5f615 docs: add release notes for 0.17.4 (#3242)
    • 4ccc5eae ci: lint-python: readjust to fresh flake8-bugbear==21.11.28 release. (#3239)
    • de48d074 fix: make Helm deployment work with no OpenShift config present (#3230)
    • 46211c7d chore: bump version: 0.17.4-rc0 -> 0.17.4-rc1
    • e1af55c6 chore: bump version: 0.17.4-dev0 -> 0.17.4-rc0
    • b63e9117 chore: lock api state for backward compatibility check
    • 1be80269 feat: Adding Openshift route support in Helm chart (#3214)
    • 5fa9e59f chore: prefer https over git for npm dependency (#3225)
    • b184eb97 fix: overflow buttons on small screens (#3227)
    • f15a1fcc feat: add link to docs in Model Registry empty state (#3226)
    • 3cc7d127 chore: reorder where Model Registry appears in navbar (#3224)
    • b8548bce feat: don't delete image before force_pull_image [DET-6145] (#3219)
    • b41ae1dd fix: reorder migrations so timestamps reflect commit order (#3223)
    • c4153dce feat: track historical allocation over users for all tasks (#3199) [DET-6247]
    • 939754b2 fix: update experiment state filter [DET-6217] (#3216)
    • 73917ee2 feat: read-only model registry UI [DET-5992] [DET-5993] [DET-5994] (#3172)
    • 168f2aa2 fix: experiment delete should work when trials have restarts (#3212)
    • 9ff9c638 feat: image updates: add tf 2.7; security tf 2.4, 2.5, 2.6; fix PTL. (#3215)
    • 2a6f3605 fix: typo in experiment state go to postgres enum mapping (#3211)
    • 78748225 feat: model registry API can update name, has Notes field (#3213)
    • f41d178d feat: detect MIG instances in agents (#3204)
    • 1aa22252 feat: add det deploy gke-experimental [DET-5752] (#3136)
    • 525777bd ci: manual image scanning with anchore script. (#3206)
    • bd747d3b docs: remove note about model.predict (#3202)
    • 1cb6e2a7 fix: Continue using model names on the CLI [DET-6152] (#3152)
    • 6dd7410f chore: enforce consistent spacing around operators (#3200)
    • 7311d66d chore: reusable action dropdown (#3194)
    • e1dbd091 ci: update gke version. (#3207)
    • d82530d3 Fix: add user_id to previous model_versions (#3208)
    • 612312f2 fix: use ListValue to receive empty list on labels field (#3198)
    • ae1d367c fix: display short version in sidebar properly (#3197)
    • f5939c4b style: remove text-shadow from selected antd tabs (#3201)
    • 6ef67655 chore: update string tests organization (#3192)
    • 290ab556 refactor: simplify api and api config imports [DET-6201] (#3193)
    • 4c675d3a docs: remove det exp from docs (#3183)
    • 0cf46177 chore: trial log settings query param [DET-6206] (#3196)
    • 1221690e chore: convert master logs to use new streaming api [DET-5267, DET-6187] (#3111)
    • 14a429cd refactor: condense master RM modules together. (#3173)
    • dde5892d chore: bump version: 0.17.3-dev0 -> 0.17.4-dev0
    • 193cea64 docs: add release notes for 0.17.3 (#3186)
    • 51b98895 docs: fix TFKerasTrial docstring to say we default to TF 2.x. (#3177)
    • 42ceb948 chore: fix race condition in test_login_as_non_active_user (#3182)
    • ca01d6a0 chore: move config into its own package (#3175)
    • 813a0ba4 fix: fluentbit should be able to run as nonroot [DET-5949] (#3160)
    • 97ccfcfd chore: handle different provider name cases (#3180)
    • 3e5b126b fix: update bad routing to trial actor (#3178)
    • a5abafa9 feat: add support for any SSO providers [DET-6184] (#3174)

    Docker images

    • docker pull determinedai/determined-master:latest
    • docker pull determinedai/determined-master:0.17.4
    • docker pull determinedai/determined-master:ce264345
    • docker pull determinedai/determined-master:ce264345f9c61c7496b8efaa106f26a35d55f48f
    • docker pull determinedai/determined-dev:determined-master-ce264345
    • docker pull determinedai/determined-dev:determined-master-ce264345f9c61c7496b8efaa106f26a35d55f48f
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.4
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:ce264345
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:ce264345f9c61c7496b8efaa106f26a35d55f48f
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.4_checksums.txt(752 bytes)
    determined-agent_0.17.4_darwin_amd64.tar.gz(7.71 MB)
    determined-agent_0.17.4_linux_amd64.deb(7.42 MB)
    determined-agent_0.17.4_linux_amd64.rpm(7.39 MB)
    determined-agent_0.17.4_linux_amd64.tar.gz(7.38 MB)
    determined-agent_0.17.4_linux_ppc64.deb(6.52 MB)
    determined-agent_0.17.4_linux_ppc64.rpm(6.46 MB)
    determined-agent_0.17.4_linux_ppc64.tar.gz(6.45 MB)
    determined-master_0.17.4_checksums.txt(759 bytes)
    determined-master_0.17.4_darwin_amd64.tar.gz(35.16 MB)
    determined-master_0.17.4_linux_amd64.deb(54.12 MB)
    determined-master_0.17.4_linux_amd64.rpm(54.02 MB)
    determined-master_0.17.4_linux_amd64.tar.gz(35.12 MB)
    determined-master_0.17.4_linux_ppc64.deb(51.66 MB)
    determined-master_0.17.4_linux_ppc64.rpm(51.43 MB)
    determined-master_0.17.4_linux_ppc64.tar.gz(32.53 MB)
  • 0.17.3(Nov 12, 2021)

    Changelog

    • 70d1142b chore: bump version: 0.17.3-rc4 -> 0.17.3
    • bd1a9a3a docs: add release notes for 0.17.3 (#3186)
    • f9d6bd58 chore: bump version: 0.17.3-rc3 -> 0.17.3-rc4
    • 06990f5b fix: fluentbit should be able to run as nonroot [DET-5949] (#3160)
    • 8ef70747 chore: handle different provider name cases (#3180)
    • ae5c9cf9 fix: update bad routing to trial actor (#3178)
    • 579b92f4 chore: bump version: 0.17.3-rc2 -> 0.17.3-rc3
    • 7c1f4a84 feat: add support for any SSO providers [DET-6184] (#3174)
    • 17f066f9 chore: bump version: 0.17.3-rc1 -> 0.17.3-rc2
    • a49befd2 ci: run releases for tags with new proper SemVer format
    • 9024e3ee chore: bump version: 0.17.3-rc0 -> 0.17.3-rc1
    • a8ef116b ci: run releases for tags with new proper SemVer format
    • 96d4d114 chore: bump version: 0.17.3-dev0 -> 0.17.3-rc0
    • 43a87f83 chore: lock api state for backward compatibility check
    • dc9f7001 fix: correct experiment snapshot shim error messages (#2999)
    • a580ed5d feat: Add Checkpoint Storage prefix for S3 (#2978)
    • b28caa27 style: add job queue icon to project (#3170)
    • 6d2f40eb fix: update fetch-cloud-forest.sh to be compatible with old versions of bash (#3115)
    • 00d64239 chore: remove simplejson (#3148)
    • ebfe7429 refactor: classmethods instead of staticmethods in trial controllers. (#3171)
    • a0e7a9ad feat: make determined-agent [ARGS] behave as determined-agent run [ARGS]. (#3157)
    • 47ca90f4 fix: remove old api kill trial (#3169)
    • f6c36b1d style: move master logs nav item to other internal nav items (#3168)
    • ce60bbfa chore: update manifest to use relative paths (#3131)
    • ea42cadb refactor: consolidate mechanisms for API actor requests (#3165) [DET-5700]
    • 580aa1ad chore: fixes to makefile and devcluster config (#3163)
    • a473af7c ci: add no-prompt to ignore safety checks in automation (#3166)
    • 32743e98 style: add model registry icon to project (#3158)
    • e471d61b refactor: rename pkg/agent to aproto, pkg/container to cproto. (#3161)
    • 18eac599 chore: advertise shorter openapi operation ids (#3154)
    • 4d65c15f feat: confirm before modifying an existing AWS stack in det deploy [DET-6128] (#3117)
    • 1a1c1c0c fix: hide secrets when printing EnvContext [DET-6046] (#3022)
    • cc7605c7 fix: adhere to proper SemVer format (#3153)
    • bc28eb66 feat: Add /labels to Model Registry API [DET-6178] (#3150)
    • 18531b92 feat: define jobs for experiments and NTbSC (#3135)
    • d7294ae3 fix: fail non-restartable tasks on allocation termination [DET-6158] (#3144)
    • 75e7dfa2 feat: Add DELETE to Model and Model Version API [DET-6118] (#3138)
    • 60730034 chore: bump version: 0.17.2.dev0 -> 0.17.3.dev0
    • e8011b82 docs: add release notes for 0.17.2 (#3146)
    • 1d2088e7 fix: non-scalar nan handling (#3143)
    • f3915f36 feat: Make SSH RSA key size configurable [DET-5983] (#3141)
    • 97abf3e4 chore: refactor load order [DET-6036] (#3046)
    • b845805b feat: load determined really fast (#3137)
    • 1d6bfaeb docs: normalize case of language names in code blocks (#3121)
    • 96f577ad feat: Implement /archive and /unarchive in Model API [DET-6148] (#3125)
    • 098d0123 fix: CLI fixes for model registry API (#3134)
    • e3253260 fix: webui links to new docs layouts (#3132)
    • 65508eb5 fix doc redirecting (#3133)
    • 7bd70102 feat: Implement PATCH in Model and Model Version API [DET-6051] (#3113)
    • 940b5ade docs: fix docker install docs [DET-5900] (#3129)
    • b344cac8 style: prefer optional chain (#3124)

    Docker images

    • docker pull determinedai/determined-master:latest
    • docker pull determinedai/determined-master:0.17.3
    • docker pull determinedai/determined-master:70d1142b
    • docker pull determinedai/determined-master:70d1142b8f9728ff6f22557de7dd778f01b04c7d
    • docker pull determinedai/determined-dev:determined-master-70d1142b
    • docker pull determinedai/determined-dev:determined-master-70d1142b8f9728ff6f22557de7dd778f01b04c7d
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.3
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:70d1142b
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:70d1142b8f9728ff6f22557de7dd778f01b04c7d
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.3_checksums.txt(752 bytes)
    determined-agent_0.17.3_darwin_amd64.tar.gz(7.71 MB)
    determined-agent_0.17.3_linux_amd64.deb(7.42 MB)
    determined-agent_0.17.3_linux_amd64.rpm(7.39 MB)
    determined-agent_0.17.3_linux_amd64.tar.gz(7.38 MB)
    determined-agent_0.17.3_linux_ppc64.deb(6.52 MB)
    determined-agent_0.17.3_linux_ppc64.rpm(6.46 MB)
    determined-agent_0.17.3_linux_ppc64.tar.gz(6.45 MB)
    determined-master_0.17.3_checksums.txt(759 bytes)
    determined-master_0.17.3_darwin_amd64.tar.gz(35.15 MB)
    determined-master_0.17.3_linux_amd64.deb(54.08 MB)
    determined-master_0.17.3_linux_amd64.rpm(53.98 MB)
    determined-master_0.17.3_linux_amd64.tar.gz(35.12 MB)
    determined-master_0.17.3_linux_ppc64.deb(51.66 MB)
    determined-master_0.17.3_linux_ppc64.rpm(51.42 MB)
    determined-master_0.17.3_linux_ppc64.tar.gz(32.56 MB)
  • 0.17.2(Oct 29, 2021)

    Changelog

    • f3f15e8d chore: bump version: 0.17.2rc7 -> 0.17.2
    • 0e73d67a docs: add release notes for 0.17.2 (#3146)
    • 000e41d8 chore: bump version: 0.17.2rc6 -> 0.17.2rc7
    • 35b9ae94 fix: non-scalar nan handling (#3143)
    • fbb80cf2 chore: bump version: 0.17.2rc5 -> 0.17.2rc6
    • 2badb07d chore: bump version: 0.17.2rc4 -> 0.17.2rc5
    • 20e5cb7f chore: bump version: 0.17.2rc3 -> 0.17.2rc4
    • c8dc28ad chore: bump version: 0.17.2rc2 -> 0.17.2rc3
    • 0b14aee9 fix: CLI fixes for model registry API (#3134)
    • 15189a79 fix: return m.id in get_model
    • 3d6c5eed chore: bump version: 0.17.2rc1 -> 0.17.2rc2
    • 8cee8415 chore: bump version: 0.17.2rc0 -> 0.17.2rc1
    • 4c0d94b9 fix: webui links to new docs layouts (#3132)
    • d7961a02 fix doc redirecting (#3133)
    • 885fce1e docs: fix docker install docs [DET-5900] (#3129)
    • 063f582a chore: bump version: 0.17.2.dev0 -> 0.17.2rc0
    • a0536eb2 chore: lock api state for backward compatibility check
    • 1f332d33 fix: store and display metrics with NaN or ±Infinity [DET-5944] (#3101)
    • ad0f5e0b ci: switch release ordering of Docker and PyPi publishing [DET-5195] (#3128)
    • acef5c21 chore: update docs with logo-free images (#3127)
    • c9d3da66 fix: update experiment state proto with missing enum value [DET-6115] (#3114)
    • 4f935254 chore: prevent NaNs from polluting HP importance [DET-6119] (#3126)
    • 509fdb7a chore: react component testing [DET-6048] (#3043)
    • c5f76f5c chore: add a helpful error message when fetching context fails (#3123)
    • 0b4d2b59 chore: Bump images for DET-5888 and DET-6134, and revert container runtime selection (#3119)
    • fd3d3f7a fix: fix doc static file path (#3120)
    • 91284792 chore: pin node/npm version (#3108)
    • 4bc7634b fix: master_config_path not being treated as Path (#3112)
    • 6b5d3de4 fix: upgrade pip on windows CI to get new resolver (#3118)
    • 1a51bcd1 docs: update sphinx theme (#3110)
    • fe6502a2 feat: Implement Model API POST to /models, support labels [DET-6053] (#3107)
    • 2e93116c chore: add test coverage report (#3109)
    • 1a2df6b4 chore: fix icon filters tooltip label for JupyterLab and TensorBoard (#3104)
    • d1f909bf feat: add timestamp, log level to master logs APIs (#3105)
    • 22727984 ci: vulnerability scanning of Docker images [DET-5931] (#3103)
    • 2ea2125c ci: wait for slot availability in e2e-cpu-double. (#3099)
    • cdd88fbc chore: rename notebook to jupyter lab [DET-6125] (#3097)
    • 31449294 style: fix issue of table headers overlapping (#3098)
    • 74840d57 feat: Implement Model + Model Version API Backend [DET-5995] [DET-5996] (#3096)
    • 98adbe7e style: update ee webui branding [DET-6090, DET-6098, DET-6099, DET-6100, DET-6102] (#3089)
    • d34ea10b test: remove unused variables in tests and other outdated references (#3095)
    • e14c28c2 chore: bump version: 0.17.1.dev0 -> 0.17.2.dev0
    • 93241551 docs: add release notes for 0.17.1 (#3091)
    • 403edfa5 fix: write cluster_info.json in all non-cmd task types (#3094)
    • 8c6b3e32 chore: update Docker images and AMIs (#3093)
    • 6a8eecde fix: prevent login redirect from being the login page itself (#3088)
    • 1497583b fix: make sure login failure is shown to the user (#3087)
    • 34bfcbd0 chore: update WebUI READMEs regarding testing [DET-4628] (#3047)
    • fdd60c07 fix: avoid race on schema cache load (#3081) [DET-6108]
    • bc5ae03d refactor: update mypy and get rid of type: ignore on all tests. (#3086)
    • 075d1795 fix: update harness to handle telemetry being off (#3085)
    • aa751100 fix: report progress correctly for searchers configured in epochs (#3084) [DET-6112]
    • 4fa78035 refactor: move cluster-wide tests into one dir, reorg, remove unused utils. (#3074)
    • ec654a7c chore: add branding field to master endpoint [DET-6067] (#3080)
    • a636b6d6 chore: add redirect to documents (#3054)
    • 4cd722d1 chore: update experiment and checkpoint imports for consistency (#3079)
    • 0fea0753 fix: fix an issue in some CLI aliases not working (#3078)
    • 8d08a42b chore: upgrade sphinx version (#3077)
    • 97201ccd feat: trigger alert when user tries to leave with unsaved notes [DET-6096] (#3071)
    • 09310094 fix: allow for non-integer timekeys (#3076)
    • ca59928f fix: update helm push command to helm cm-push (#3075)

    Docker images

    • docker pull determinedai/determined-master:latest
    • docker pull determinedai/determined-master:0.17.2
    • docker pull determinedai/determined-master:f3f15e8d
    • docker pull determinedai/determined-master:f3f15e8dcd4de1f745b762f2edba5b2568416d59
    • docker pull determinedai/determined-dev:determined-master-f3f15e8d
    • docker pull determinedai/determined-dev:determined-master-f3f15e8dcd4de1f745b762f2edba5b2568416d59
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.2
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f3f15e8d
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:f3f15e8dcd4de1f745b762f2edba5b2568416d59
    Source code(tar.gz)
    Source code(zip)
    determined-agent_0.17.2_checksums.txt(752 bytes)
    determined-agent_0.17.2_darwin_amd64.tar.gz(7.70 MB)
    determined-agent_0.17.2_linux_amd64.deb(7.41 MB)
    determined-agent_0.17.2_linux_amd64.rpm(7.39 MB)
    determined-agent_0.17.2_linux_amd64.tar.gz(7.38 MB)
    determined-agent_0.17.2_linux_ppc64.deb(6.51 MB)
    determined-agent_0.17.2_linux_ppc64.rpm(6.45 MB)
    determined-agent_0.17.2_linux_ppc64.tar.gz(6.44 MB)
    determined-master_0.17.2_checksums.txt(759 bytes)
    determined-master_0.17.2_darwin_amd64.tar.gz(35.10 MB)
    determined-master_0.17.2_linux_amd64.deb(54.00 MB)
    determined-master_0.17.2_linux_amd64.rpm(53.91 MB)
    determined-master_0.17.2_linux_amd64.tar.gz(35.07 MB)
    determined-master_0.17.2_linux_ppc64.deb(51.57 MB)
    determined-master_0.17.2_linux_ppc64.rpm(51.33 MB)
    determined-master_0.17.2_linux_ppc64.tar.gz(32.49 MB)
  • 0.17.1(Oct 18, 2021)

    Changelog

    • a2ac78ba chore: bump version: 0.17.1rc3 -> 0.17.1
    • 262a4ccc docs: add release notes for 0.17.1 (#3091)
    • a0fdf9dd fix: write cluster_info.json in all non-cmd task types (#3094)
    • cabc9558 chore: bump version: 0.17.1rc2 -> 0.17.1rc3
    • 6c5c02d8 fix: avoid race on schema cache load (#3081) [DET-6108]
    • 02308ca0 fix: report progress correctly for searchers configured in epochs (#3084) [DET-6112]
    • 6d2708fb chore: bump version: 0.17.1rc1 -> 0.17.1rc2
    • 114f5c9c fix: update harness to handle telemetry being off (#3085)
    • 82fff64c chore: bump version: 0.17.1rc0 -> 0.17.1rc1
    • f7f5860c chore: add redirect to documents (#3054)
    • e102ac8a chore: upgrade sphinx version (#3077)
    • f9ed8da6 chore: update experiment and checkpoint imports for consistency (#3079)
    • f05c32b6 fix: fix an issue in some CLI aliases not working (#3078)
    • 0ea9b0e1 fix: update helm push command to helm cm-push (#3075)
    • a29efdff chore: bump version: 0.17.1.dev0 -> 0.17.1rc0
    • 9b977b92 chore: lock api state for backward compatibility check
    • 6689e625 fix: mispelling [DET-6095] (#3073)
    • 8419fd1a chore: remove flaky tests (#3069)
    • 0e616f2e chore: speed up cli startup time (#3061)
    • f536460a feat: add Notes tab on experiment pages [DET-4691] (#3048)
    • cadb2f66 test: stop trying to close modal twice in a row (#3067)
    • c0e17570 fix: always mkdir default mounted checkpoint_storage host_path. (#3065)
    • c1aa0880 chore: rename cpu containers to aux (#3056)
    • d8def532 chore: bump version: 0.17.0.dev0 -> 0.17.1.dev0
    • ff4df832 docs: add release notes for 0.17.0 (#3024)
    • 0fda11fd ci: don't depend on badssl.com for test_custom_tls (#3062)
    • b916abf6 ci: update gke version (#3051)
    • 693ded3b feat: run db migrations in transactions [DET-5987] (#3025)
    • 6b4ff187 chore: environment bump analytics-python (#3057)
    • abb3250c feat: add segment tracking python package to harness (#3053)
    • bda42c52 test: update experiment row kill to handle modal confirmation (#3055)
    • 94fdc507 chore: remove deprecated io-ts any type (#3045)
    • 51453e44 chore: tweak samples_per_second metric to represent all workers (#3050)
    • 68d9aca6 docs: reorganize the document structure (#3034)
    • ca594d0b fix: gracefully handle prestart agent failures (#3049)
    • 2a956a1f chore: remove NativeContext and simplify Context inheritance (#3044)
    • cc26061e Revert "test mmdetection on p3.8xlarge"
    • 1a3036d0 test mmdetection on p3.8xlarge
    • e0129f52 fix: Make agent names unique for det deploy local agent-up (#3038)
    • f0273d1c chore: add server-side portion of external session handling (#3016)
    • 3b1df0ca feat: introduce ClusterInfo API (#2946)
    • 85aabd3b chore: added confirmation modal to task kill [DET-6049] (#3035)
    • ac31bac5 chore: adding markdown component (#3033)
    • b21d88ee fix: use str for FileLock (#3036)
    • 50eb8f7b chore: rewrite schemas package without typing internals (#3029)
    • 12f7427b feat: cross-compile for powerpc64 (#2828)
    • 9012a785 chore: prefer https to ssh for git dep (#3028)
    • 5058c60e chore: update release note guidelines (#3027)
    • aa0252f0 fix: fix nested hparams with grid (#3021)
    • 7b9fd713 test: fix flake from race in idle watcher tests (#3008)
    • e0e84bf2 chore: mark open allocs as closed on restart (#3019)
    • b7f1c3c9 chore: upgrade to Go 1.17 (#3015)
    • 87e791fc chore: lower e2e-webui resource class (#2935)
    • 7f40dabe fix: propagate podspec to gc (#3012)
    • b8afaa4d include load fast flag in 2.6 (#3007)
    • 18986727 feat: allow kubernetes to use priority from exp config (#2956)
    • 6fc5e7e4 refactor: move app queries and migrations out of internal/db/postgres. (#3014)
    • 2709f56a chore: handle query param jwt for external auth (#2992)
    • 22fcb05d chore: clean notebook readme (#3013)
    • 4d1734c8 fix: notebook idle check use master port, cert [DET-6013] (#3010)
    • 54e3e567 chore: menu items require keys for newer antd versions (4.x+) (#3011)
    • 3dfcc11a chore: update package json [DET-5846] (#2982)
    • e5caf86f chore: recover agent websocket flakes [DET-5935] (#2991)
    • 888eb5da feat: add CPU images for TF 2.5 and 2.6 [DET-5877] (#2981)
    • a3886178 chore: minor copy fix (#3009)
    • 44ebc982 fix: save task end times (#3006) [DET-6028]
    • cf68f173 chore: nit command exit message placeholder (#3003)
    • d3595e3d chore: add a few timings metrics when sync_required == True (#2996)
    • a0107855 fix: Inline editor for experiment description truncates placeholder text [DET-6024]
    • 70a089b4 fix: kill trial should send kill (#3001) [DET-6026]
    • 8d7a4519 fix: det t describe --metrics API and rendering [DET-6025] (#3002)
    • 76c17f32 fix: bug in notebook README (#3004)
    • 9f4e2702 fix: remove refs to workload start_time from det e describe (#2995)
    • 32c41e12 fix: notebook wait page updates for API changes (#2998)
    • c9b16e58 fix: rename trial job type to experiment (#2997)
    • e1f79257 chore: fix nil deref in idle timeout watcher (#2993)
    • 24043e6e fix: always helm push latest version (#2994)

    Docker images

    • docker pull determinedai/determined-master:latest
    • docker pull determinedai/determined-master:0.17.1
    • docker pull determinedai/determined-master:a2ac78ba
    • docker pull determinedai/determined-master:a2ac78ba1ecf397a2a156c9b9b3ed3bee057899d
    • docker pull determinedai/determined-dev:determined-master-a2ac78ba
    • docker pull determinedai/determined-dev:determined-master-a2ac78ba1ecf397a2a156c9b9b3ed3bee057899d
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:0.17.1
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:a2ac78ba
    • docker pull nvcr.io/isv-ngc-partner/determined/determined-master:a2ac78ba1ecf397a2a156c9b9b3ed3bee057899d
    Source code (tar.gz)
    Source code (zip)
    determined-agent_0.17.1_checksums.txt (752 bytes)
    determined-agent_0.17.1_darwin_amd64.tar.gz (7.69 MB)
    determined-agent_0.17.1_linux_amd64.deb (7.41 MB)
    determined-agent_0.17.1_linux_amd64.rpm (7.38 MB)
    determined-agent_0.17.1_linux_amd64.tar.gz (7.37 MB)
    determined-agent_0.17.1_linux_ppc64.deb (6.50 MB)
    determined-agent_0.17.1_linux_ppc64.rpm (6.45 MB)
    determined-agent_0.17.1_linux_ppc64.tar.gz (6.44 MB)
    determined-master_0.17.1_checksums.txt (759 bytes)
    determined-master_0.17.1_darwin_amd64.tar.gz (35.10 MB)
    determined-master_0.17.1_linux_amd64.deb (52.12 MB)
    determined-master_0.17.1_linux_amd64.rpm (52.04 MB)
    determined-master_0.17.1_linux_amd64.tar.gz (35.06 MB)
    determined-master_0.17.1_linux_ppc64.deb (49.71 MB)
    determined-master_0.17.1_linux_ppc64.rpm (49.49 MB)
    determined-master_0.17.1_linux_ppc64.tar.gz (32.50 MB)
Owner
Determined AI
Lighting the Darkness in the Deep Learning Era: A Survey, An Online Platform, A New Dataset

Lighting the Darkness in the Deep Learning Era: A Survey, An Online Platform, A New Dataset This repository provides a unified online platform, LoLi-P

Chongyi Li 457 Jan 3, 2023
A deep learning based semantic search platform that computes similarity scores between provided query and documents

semanticsearch This is a deep learning based semantic search platform that computes similarity scores between provided query and documents. Documents

null 1 Nov 30, 2021
Learning recognition/segmentation models without end-to-end training. 40%-60% less GPU memory footprint. Same training time. Better performance.

InfoPro-Pytorch The Information Propagation algorithm for training deep networks with local supervision. (ICLR 2021) Revisiting Locally Supervised Lea

null 78 Dec 27, 2022
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences an

Microsoft 8k Jan 4, 2023
Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases.

Ivy is a templated deep learning framework which maximizes the portability of deep learning codebases. Ivy wraps the functional APIs of existing frameworks. Framework-agnostic functions, libraries and layers can then be written using Ivy, with simultaneous support for all frameworks. Ivy currently supports Jax, TensorFlow, PyTorch, MXNet and Numpy. Check out the docs for more info!

Ivy 8.2k Jan 2, 2023
Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools

Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu

Terbe Dániel 138 Dec 17, 2022
deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

deep-table implements various state-of-the-art deep learning and self-supervised learning algorithms for tabular data using PyTorch.

null 63 Oct 17, 2022
Time-series-deep-learning - Developing Deep learning LSTM, BiLSTM models, and NeuralProphet for multi-step time-series forecasting of stock price.

Stock Price Prediction Using Deep Learning Univariate Time Series Predicting stock price using historical data of a company using Neural networks for

Abdultawwab Safarji 7 Nov 27, 2022
FTIR-Deep Learning - FTIR Deep Learning With Python

CANDIY-spectrum Human analysis of chemical spectra such as Mass Spectra (MS), Inf

Wei Mei 1 Jan 3, 2022
Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution

Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution Figure: Example visualization of the method and baseline as a

Oliver Hahn 16 Dec 23, 2022
Deep Learning GPU Training System

DIGITS DIGITS (the Deep Learning GPU Training System) is a webapp for training deep learning models. The currently supported frameworks are: Caffe, To

NVIDIA Corporation 4.1k Jan 3, 2023
Sandbox for training deep learning networks

Deep learning networks This repo is used to research convolutional networks primarily for computer vision tasks. For this purpose, the repo contains (

Oleg Sémery 2.7k Jan 1, 2023
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie_recs Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Coll

ShopRunner 97 Jan 3, 2023
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Microsoft 8.4k Jan 1, 2023
FluxTraining.jl gives you an endlessly extensible training loop for deep learning

A flexible neural net training library inspired by fast.ai

null 86 Dec 31, 2022
We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Multi-Modal Self-Supervision using GDT and StiCa This is an official pytorch implementation of papers: Multi-modal Self-Supervision from Generalized D

Facebook Research 42 Dec 9, 2022
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.

NVIDIA Merlin NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs. It enables data scientists, machine

null 419 Jan 3, 2023
A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

collie Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie do

ShopRunner 96 Dec 29, 2022
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training

ColossalAI An integrated large-scale model training system with efficient parallelization techniques Installation PyPI pip install colossalai Install

HPC-AI Tech 7.1k Jan 3, 2023