Dagster

An orchestration platform for the development, production, and observation of data assets.

Dagster lets you define jobs in terms of the data flow between reusable, logical components, then test locally and run anywhere. With a unified view of jobs and the assets they produce, Dagster can schedule and orchestrate Pandas, Spark, SQL, or anything else that Python can invoke.

Dagster is designed for data platform engineers, data engineers, and full-stack data scientists. Building a data platform with Dagster makes your stakeholders more independent and your systems more robust. Developing data pipelines with Dagster makes testing easier and deploying faster.

Develop and test locally, then deploy anywhere

With Dagster’s pluggable execution, the same computations can run in-process against your local file system, or on a distributed work queue against your production data lake. You can set up Dagster’s web interface in a minute on your laptop, or deploy it on-premises or in any cloud.

Model and type the data produced and consumed by each step

Dagster models data dependencies between steps in your orchestration graph and handles passing data between them. Optional typing on inputs and outputs helps catch bugs early.

Link data to computations

Dagster’s Asset Manager tracks the data sets and ML models produced by your jobs, so you can understand how they were generated and trace issues when they don’t look how you expect.

Build a self-service data platform

Dagster helps platform teams build systems for data practitioners. Jobs are built from shared, reusable, configurable data processing and infrastructure components. Dagit, Dagster’s web interface, lets anyone inspect these objects and discover how to use them.

Avoid dependency nightmares

Dagster’s repository model lets you isolate codebases so that problems in one job don’t bring down the rest. Each job can have its own package dependencies and Python version. Jobs are run in isolated processes so user code issues can't bring the system down.

Debug pipelines from a rich UI

Dagit, Dagster’s web interface, includes expansive facilities for understanding the jobs it orchestrates. When inspecting a run of your job, you can query over logs, discover the most time-consuming tasks via a Gantt chart, re-execute subsets of steps, and more.

Getting Started

Installation

Dagster is available on PyPI, and officially supports Python 3.6+.

$ pip install dagster dagit

This installs two modules:

  • Dagster: the core programming model and abstraction stack; stateless, single-node, single-process and multi-process execution engines; and a CLI tool for driving those engines.
  • Dagit: the UI for developing and operating Dagster pipelines, including a DAG browser, a type-aware config editor, and a live execution interface.

Learn

Next, jump right into our tutorial, or read our complete documentation. If you're actively using Dagster or have questions on getting started, we'd love to hear from you.


Contributing

For details on contributing or running the project for development, check out our contributing guide.

Integrations

Dagster works with the tools and systems that you're already using with your data, including:

Integration (Dagster library): description

  • Apache Airflow (dagster-airflow): Allows Dagster pipelines to be scheduled and executed, either containerized or uncontainerized, as Apache Airflow DAGs.
  • Apache Spark (dagster-spark · dagster-pyspark): Libraries for interacting with Apache Spark and PySpark.
  • Dask (dagster-dask): Provides a Dagster integration with Dask / Dask.Distributed.
  • Datadog (dagster-datadog): Provides a Dagster resource for publishing metrics to Datadog.
  • Jupyter / Papermill (dagstermill): Built on the papermill library, dagstermill is meant for integrating productionized Jupyter notebooks into Dagster pipelines.
  • PagerDuty (dagster-pagerduty): A library for creating PagerDuty alerts from Dagster workflows.
  • Snowflake (dagster-snowflake): A library for interacting with the Snowflake Data Warehouse.

Cloud Providers

  • AWS (dagster-aws): A library for interacting with Amazon Web Services. Provides integrations with Cloudwatch, S3, EMR, and Redshift.
  • Azure (dagster-azure): A library for interacting with Microsoft Azure.
  • GCP (dagster-gcp): A library for interacting with Google Cloud Platform. Provides integrations with GCS, BigQuery, and Cloud Dataproc.

This list is growing as we are actively building more integrations, and we welcome contributions!

Comments
  • Converting PickledObjectFilesystemIOManager to use UPathIOManager

    Summary & Motivation

    The default IO manager can now be built on top of the recently added UPathIOManager.

    The PickledObjectFilesystemIOManager class can now be used with any filesystem.

    The fs_io_manager object, however, is still meant to be used with the local filesystem.

    How I Tested These Changes

    Running existing tests

    opened by danielgafni 20
  • Sequence type (map/collect - dynamic DAG construction)

    When a ("mapper"?) solid has an output of Seq[T], with length N, the entire sub-DAG descending from that output, until a single terminal ("reducer"?) node with input(s) of types Seq[T], Seq[S], etc., is reached, should be executed N times, once for each member of the sequence. These mapped sub-DAGs should be nestable, provided the nested sub-DAG terminates before the outer sub-DAG. (I.e., parentheses must match.) I think that by requiring a single terminal node for the entire sub-DAG we don't lose expressiveness -- If there are any number of terminal nodes, nothing is lost by having them pass their outputs to a no-op reduction solid.
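The map/collect semantics proposed in the issue above can be sketched in plain Python. This is only an illustration of the idea, not Dagster code: `run_mapped`, `mapper`, and `square_then_increment` are hypothetical names, and a single function stands in for the whole sub-DAG between the mapper and the reducer.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
S = TypeVar("S")
R = TypeVar("R")


def run_mapped(
    items: Sequence[T],
    sub_dag: Callable[[T], S],
    reducer: Callable[[List[S]], R],
) -> R:
    """Run sub_dag once per sequence member, then hand all results to one reducer."""
    # The sub-DAG executes N times, once for each member of the Seq[T] output.
    mapped = [sub_dag(item) for item in items]
    # A single terminal node receives the collected Seq[S].
    return reducer(mapped)


def mapper() -> List[int]:
    # Hypothetical "mapper" solid with an output of Seq[int], length 3.
    return [1, 2, 3]


def square_then_increment(x: int) -> int:
    # Stand-in for a chain of solids descending from the mapped output.
    return x * x + 1


total = run_mapped(mapper(), square_then_increment, sum)

# Nesting: the inner mapped sub-DAG terminates (its reducer returns)
# before the outer reducer runs, so the "parentheses" match.
nested = run_mapped(
    [[1, 2], [3]],
    lambda inner: run_mapped(inner, square_then_increment, sum),
    max,
)
```

A no-op reducer such as `list` would play the role of the "no-op reduction solid" mentioned in the issue when the sub-DAG has several terminal nodes.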

    project core 
    opened by mgasner 20
  • Add dagster-azure package

    This PR adds the dagster-azure package which provides various storage component implementations using Azure Data Lake Storage. New components are:

    • ADLS2FileCache and adls2_file_cache
    • ADLS2FileManager
    • ADLS2IntermediateStore
    • ADLS2ObjectStore
    • the adls2_resource providing direct access to Azure Data Lake Storage
    • the adls2_system_storage system storage

    This is pretty similar to the S3 implementation, the main difference being configuration: Azure's SDK requires credentials to be passed explicitly, so the credential is expected in configuration.

    Tests currently require an access key to complete any tests marked 'nettest'. I guess this will need to be passed over to a new Azure storage account under Dagster's control at some point.

    One other issue I just remembered is a dependency version conflict with the snowflake-connector-python package, which is being tracked here. Not quite sure how to resolve that...

    opened by sd2k 19
  • Dagit doesn't work when hosted on a path

    I've set up a k8s LoadBalancer service for my dagit deployment.

    when I curl: curl https://k8s.foo.com/tim/dagster and curl http://<load-balancer-ip>/

    I get the exact same content, but in the browser I see a white page for the former url and a working dagit in the second url.

    bug 
    opened by zzztimbo 17
  • Make dagster a PEP-561 compliant typed library

    This PR aims to expose Dagster as a PEP-561-compliant typed Python library. This should significantly improve the development experience of users working with Dagster in the context of a language server like VSCode's Pyright/Pylance.

    While some users are already getting typing info from Dagster, not all of them are. For PyLance (VSCode), if useLibraryCodeForTypes setting is on, then PyLance will try to pull inline type info even from libraries that aren't PEP-561-compliant. So users with this setting on should already be getting Dagster type info during development. However, the UX can be bad since the server doesn't know what's private vs public. This leads to auto-imports resolving to the wrong location (e.g. from dagster.core... import instead of from dagster import) and the autocomplete menu getting noisy. This is a problem with all libraries, so many users probably turn off this setting. And other Python language servers may not expose a setting like this at all.

    This problem is solved by a nascent standard for declaring a library's public API. This standard is not in a PEP yet but has been implemented in PyRight/PyLance and documented in the official python/typing repo.

    This PR takes steps towards conforming Dagster's public API to this standard. It does four things:

    • Adds a py.typed marker file to dagster (no extension libs yet)
    • Marks all dagster submodules private by prefixing the top-level submodules with _ (there is unfortunately no way around this-- see discussion with Pyright maintainer linked below)
    • Renames some other internal modules to avoid naming collisions with exports. This is a subtle issue, but if a submodule has the same name as an export from its parent, then you end up with an unstable referent for that name that is particularly problematic for static analysis. There were several places in the Dagster codebase where this collision was occurring; I renamed the modules accordingly:
      • All submodules of dagster.core.definitions.decorators (e.g. dagster.core.decorators.solid conflicted with the solid export from dagster.core.decorators, and was renamed to dagster.core.decorators.solid_decorator)
      • dagster.core.definitions.reconstructable > dagster.core.definitions.reconstructable_definition
      • dagster.core.asset_defs.asset > dagster.core.asset_defs.assets
    • Adds module name mapping to ConfigurableClassData and dagster.yaml loading to support backcompatibility.

    Doing this allows us to run pyright --verifytypes dagster to analyze coverage:

    $ pyright --verifytypes dagster
    
    Symbols exported by "dagster": 263
      With known type: 212
      With ambiguous type: 9
      With unknown type: 42
      Functions without docstring: 25
      Functions without default param: 59
      Classes without docstring: 0
    
    Other symbols referenced but not exported by "dagster": 4751
      With known type: 3475
      With ambiguous type: 70
      With unknown type: 1206
    
    Type completeness score: 80.6%
    
    Completed in 3.699sec
    

    So we're most of the way to being fully typed. Because we aren't there yet, the py.typed file contains a single line, partial\n; this should be deleted when the library reaches 100%.
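As a quick check on the numbers in the pyright output above: the reported score appears to be the share of exported symbols that have a known type. That reading is inferred from the figures themselves, not stated in the output, so treat it as an assumption.

```python
# Figures copied from the "Symbols exported by dagster" section of the output.
exported = 263
known, ambiguous, unknown = 212, 9, 42

# The three categories account for every exported symbol.
assert known + ambiguous + unknown == exported

# Assumed definition: completeness = known-type symbols / exported symbols.
score = round(100 * known / exported, 1)
print(f"Type completeness score: {score}%")
```

Under this reading, resolving the 9 ambiguous and 42 unknown symbols is what would bring the score to 100%.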

    Relevant background:

    opened by smackesey 16
  • (Ver 0.10.2) dagit's "Tick History" chart no longer works, just shows constant spinning circle

    Actually, the spinning circle is running even before turning on the schedule. In other words, when viewing the page for the first time, the spinning circle animation is already running.

    My setup:

    • Windows 10
    • Local dagit and daemon instance
    • On-prem Postgresql ver 11.5
    • dagster_postgres ver 0.10.2
    • Chrome ver 87

    I will report back using default sqlite db and see if the problem persists.

    bug 
    opened by pybokeh 16
  • Putting is_str into dagster.utils #2779

    Hi all, just checked out the repository and wanted to take on a first-time contributor issue; closes #2779. Hope I didn't misinterpret what the issue was asking for.

    In this PR:

    • Puts is_str logic into dagster.utils near is_enum_value(value)
    • Imports is_str from dagster.utils in all files that need it

    Notes: I ran make black and make pylint but the pylint gave me lots of output that I overlooked, but I made sure it did not involve is_str

    opened by monicayao 16
  • RFC: Add register_run_asset event

    Adds a new Dagster event type REGISTER_RUN_ASSET.

    This event is yielded after an asset job run begins for every single asset that is selected in the job. The goal of this feature is to enable querying for runs that intended to generate certain assets (e.g. job that intends to materialize 3 assets fails during materialization of second asset).

    This will enable warnings in Dagit on the asset details page and the asset graph nodes. In the future, we can enable this for partitioned assets to generate views of all runs across a certain asset's partition. If this event becomes too noisy in run logs, we can consider ways to condense the logs.

    opened by clairelin135 15
  • Sensors that return many run requests result in heavy querying of the run storage by the daemon

    GCP profiling suggests that several runs table queries take 8-15ms and are responsible for a large portion of the load on our postgres DB.

    SELECT runs.run_body FROM runs WHERE runs.status IN ($1, $2, $3) ORDER BY runs.id DESC

    and

    SELECT runs.run_body FROM runs JOIN run_tags ON runs.run_id = run_tags.run_id WHERE run_tags.key = $1 AND run_tags.value = $2 GROUP BY runs.run_body, runs.id HAVING COUNT(runs.run_id) = $3 ORDER BY runs.id DESC

    bug platform 
    opened by gibsondan 15
  • Partition UI resets status of OPs upon step subset backfill

    Dagster version

    1.0.12

    What's the issue?

    Backfills that are launched with a step-subset for a specific partition in a partitioned job run successfully, but the corresponding UI appears to reset the status of the OPs that were not part of the step selection. Only the OP that was selected turns green. The partition itself still shows as successful.

    What did you expect to happen?

    Each row (OP) that did not run as part of the step-subset partitioned backfill should remain green.

    How to reproduce?

    1. Create a partitioned job
    2. Run a partition to success
    3. Click launch backfill to backfill the successful partition. In the step subset section, add an OP to re-run.

    Deployment type

    Other Docker-based deployment

    Deployment details

    Running 1.22.12-gke.2300.

    Deployments of:

    • Dagit
    • Daemon
    • Repository

    Additional information

    No response

    Message from the maintainers

    Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

    bug backfills dagit 
    opened by yeachan153 14
  • Partition functionality on Multi Asset Sensor Context

    Augments MultiAssetSensorEvaluationContext to contain functionalities to work with asset partitions.

    Adds a couple of methods:

    • latest_materialization_by_partition: Given an asset key, returns a dict mapping partition key to the corresponding materialization
    • get_cursor_partition: Gets the current partition the cursor is on
    • map_partition: Given a partition key and the partitions definition it comes from, returns a list of corresponding partitions it maps to in a downstream partitions definition
    • get_partition_from_event_log_record: Utility method to return the partition key (if exists) on an event log record

    Updates the following existing methods to contain an after_cursor_partition flag that allows for retrieving only event log entries whose partitions are after the cursor partition. By default, after_cursor_partition is set to True.

    • materialization_records_for_key
    • latest_materialization_records_by_key
    opened by clairelin135 14
  • define_asset_job accept AssetKey and AssetsDefinition selections

    Summary & Motivation

    Enable passing a list of AssetKeys or AssetsDefinitions to the selection argument of define_asset_job.

    How I Tested These Changes

    bk

    opened by sryza 1
  • monitored_asset_selection param for multi-asset sensor

    Summary & Motivation

    This takes advantage of the fact that multi-asset sensor is a relatively recent experimental API to do a breaking change. It replaces the asset_keys and asset_selection arguments to @multi_asset_sensor with a single monitored_asset_selection argument.

    The advantages:

    • We've received a number of user reports that, generally, asset selection parameters are finicky and difficult to get right. If we standardize everywhere on a single asset_selection parameter that accepts Union[AssetSelection, Sequence[AssetKey]], it would make life easier.
    • Avoids conflict with the asset_selection param on @sensor, which means something different.

    Fixes https://github.com/dagster-io/dagster/issues/11558.

    How I Tested These Changes

    bk

    opened by sryza 2
  • [dagit] Add GraphQL configuration for VSCode

    Summary & Motivation

    Picking this back up from https://github.com/dagster-io/dagster/pull/11500.

    Add a .graphqlrc.yml to the dagster root to allow the VSCode GraphQL language extension (https://marketplace.visualstudio.com/items?itemName=GraphQL.vscode-graphql) to find the dagit schema for autocompletion and validation.

    How I Tested These Changes

    Open VersionNumber.tsx, mess around with its query.

    opened by hellendag 4
  • [dynamic] fix how skips cascade

    resolves https://github.com/dagster-io/dagster/issues/10292 resolves https://github.com/dagster-io/dagster/issues/5948

    How I Tested These Changes

    Added test cases.

    opened by alangenfeld 2
  • replace asset_keys and asset_selection params to multi_asset_sensor with monitored_asset_selection

    A couple advantages here:

    • Avoids conflict with the asset_selection param on @sensor, which means something different
    • Less params is better

    Related:

    • https://github.com/dagster-io/dagster/issues/11557
    • https://dagster.slack.com/archives/C01U5LFUZJS/p1673028482635199?thread_ts=1671157035.783759&cid=C01U5LFUZJS
    opened by sryza 0
Releases (1.1.8)
  • 1.1.8 (Jan 6, 2023)

    New

    • Asset backfills launched from the asset graph now respect partition mappings. For example, if partition N of asset2 depends on partition N-1 of asset1, and both of those partitions are included in a backfill, asset2’s partition N won’t be backfilled until asset1’s partition N-1 has been materialized.
    • Asset backfills launched from the asset graph will now only materialize each non-partitioned asset once - after all upstream partitions within the backfill have been materialized.
    • Executors can now be configured with a tag_concurrency_limits key that allows you to specify limits on the number of ops with certain tags that can be executing at once within a single run. See the docs for more information.
    • ExecuteInProcessResult, the type returned by materialize, materialize_to_memory, and execute_in_process, now has an asset_value method that allows you to fetch output values by asset key.
    • AssetIns can now accept Nothing for their dagster_type, which allows omitting the input from the parameters of the @asset- or @multi_asset- decorated function. This is useful when you want to specify a partition mapping or metadata for a non-managed input.
    • The start_offset and end_offset arguments of TimeWindowPartitionMapping now work across TimeWindowPartitionsDefinitions with different start dates and times.
    • If add_output_metadata is called multiple times within an op, asset, or IO manager handle_output, the values will now be merged, instead of later dictionaries overwriting earlier ones.
    • materialize and materialize_to_memory now both accept a tags argument.
    • Added SingleDimensionDependencyMapping, a PartitionMapping object that defines a correspondence between an upstream single-dimensional partitions definition and a downstream MultiPartitionsDefinition.
    • The RUN_DEQUEUED event has been removed from the event log, since it was duplicative with the RUN_STARTING event.
    • When an Exception is raised during the execution of an op or asset, Dagit will now include the original Exception that was raised, even if it was caught and another Exception was raised instead. Previously, Dagit would only show exception chains if the Exception was included using the raise Exception() from e syntax.
    • [dagit] The Asset Catalog table in Dagit is now a virtualized infinite-scroll table. It is searchable and filterable just as before, and you can now choose assets for bulk materialization without having to select across pages.
    • [dagit] Restored some metadata to the Code Locations table, including image, python file, and module name.
    • [dagit] Viewing a partition on the asset details page now shows both the latest materialization and also all observations about that materialization.
    • [dagit] Improved the loading time of the backfills page.
    • [dagit] Improved performance when materializing assets with very large partition sets.
    • [dagit] Moving around asset and op graphs while selecting nodes is easier - drag gestures no longer clear your selection.
    • [dagster-k8s] The Dagster Helm chart now allows you to set an arbitrary kubernetes config dictionary to be included in the launched job and pod for each run, using the runK8sConfig key in the k8sRunLauncher section. See the docs for more information.
    • [dagster-k8s] securityContext can now be set in the k8sRunLauncher section of the Dagster Helm chart.
    • [dagster-aws] The EcsRunLauncher can now be configured with cpu and memory resources for each launched job. Previously, individual jobs needed to be tagged with CPU and memory resources. See the docs for more information.
    • [dagster-aws] The S3ComputeLogManager now takes in an argument upload_extra_args which are passed through as the ExtraArgs parameter to the file upload call.
    • [dagster-airflow] Added make_dagster_definitions_from_airflow_dags_path and make_dagster_definitions_from_airflow_dag_bag for creating Dagster definitions from a given Airflow DAG file path or DagBag.
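The first item in the list above (asset backfills respecting partition mappings) describes an ordering constraint that can be sketched with a toy model. This is not Dagster's implementation: `upstream_partitions` and `backfill_order` are hypothetical helpers, partitions are plain integers, and the standard-library `graphlib` module (Python 3.9+) stands in for the scheduler.

```python
from graphlib import TopologicalSorter


def upstream_partitions(asset, partition):
    """Hypothetical partition mapping: partition N of asset2 depends on N-1 of asset1."""
    if asset == "asset2":
        return [("asset1", partition - 1)]
    return []


def backfill_order(targets):
    """Return an order in which each target runs after its in-backfill upstream partitions."""
    targets = set(targets)
    sorter = TopologicalSorter()
    for asset, partition in sorted(targets):
        # Only upstream partitions that are part of the backfill impose an ordering.
        deps = [dep for dep in upstream_partitions(asset, partition) if dep in targets]
        sorter.add((asset, partition), *deps)
    return list(sorter.static_order())


order = backfill_order(
    [("asset2", 1), ("asset1", 0), ("asset2", 2), ("asset1", 1)]
)
for step in order:
    print(step)
```

In this sketch, `("asset2", 1)` is always emitted after `("asset1", 0)`, matching the example in the release note; an upstream partition that is not part of the backfill imposes no wait.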

    Bugfixes

    • Fixed a bug where ad-hoc materializations of assets were not correctly retrieving metadata of upstream assets.
    • Fixed a bug that caused ExperimentalWarnings related to LogicalVersions to appear even when version-based staleness was not in use.
    • Fixed a bug in the asset reconciliation sensor that caused multi-assets to be reconciled when some, but not all, of the assets they depended on, were reconciled.
    • Fixed a bug in the asset reconciliation sensor that caused it to only act on one materialization per asset per tick, even when multiple partitions of an asset were materialized.
    • Fixed a bug in the asset reconciliation sensor that caused it to never attempt to rematerialize assets which failed in their last execution. Now, it will launch the next materialization for a given asset at the same time that it would have if the original run had completed successfully.
    • The load_assets_from_modules and load_assets_from_package_module utilities now will also load cacheable assets from the specified modules.
    • The dequeue_num_workers config setting on QueuedRunCoordinator is now respected.
    • [dagit] Fixed a bug that caused a “Maximum recursion depth exceeded” error when viewing partitioned assets with self-dependencies.
    • [dagit] Fixed a bug where “Definitions loaded” notifications would constantly show up in cases where there were multiple dagit hosts running.
    • [dagit] Assets that are partitioned no longer erroneously appear "Stale" in the asset graph.
    • [dagit] Assets with a freshness policy no longer appear stale when they are still meeting their freshness policy.
    • [dagit] Viewing Dagit in Firefox no longer results in erroneous truncation of labels in the left sidebar.
    • [dagit] Timestamps on the asset graph are smaller and have an appropriate click target.
    • [dagster-databricks] The databricks_pyspark_step_launcher will now cancel the relevant Databricks job if the Dagster step execution is interrupted.
    • [dagster-databricks] Previously, the databricks_pyspark_step_launcher could exit with an unhelpful error after receiving an HTTPError from Databricks with an empty message. This has been fixed.
    • [dagster-snowflake] Fixed a bug where calling execute_queries or execute_query on a snowflake_resource would raise an error unless the parameters argument was explicitly set.
    • [dagster-aws] Fixed a bug in the EcsRunLauncher when launching many runs in parallel. Previously, each run risked hitting a ClientError in AWS for registering too many concurrent changes to the same task definition family. Now, the EcsRunLauncher recovers gracefully from this error by retrying it with backoff.
    • [dagster-airflow] Added make_dagster_definitions_from_airflow_dags_path and make_dagster_definitions_from_airflow_dag_bag for creating Dagster definitions from a given Airflow DAG file path or DagBag.
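The EcsRunLauncher fix in the list above mentions recovering from an error "by retrying it with backoff". A minimal sketch of that general pattern follows. It is not the launcher's actual code: `TransientError`, `retry_with_backoff`, and `register_task_definition` are hypothetical names, and the delay values are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a throttling error such as an AWS ClientError."""


def retry_with_backoff(fn, retries=5, base_delay=0.1, sleep=time.sleep):
    """Call fn, retrying on TransientError with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            if attempt == retries - 1:
                raise  # out of attempts: surface the original error
            # Delay doubles each attempt; jitter spreads out parallel callers.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)


calls = {"count": 0}


def register_task_definition():
    # Simulated operation that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("too many concurrent changes")
    return "registered"


delays = []  # record delays instead of sleeping, to keep the demo fast
result = retry_with_backoff(register_task_definition, sleep=delays.append)
```

Jitter matters in the scenario the note describes, since many runs launched in parallel would otherwise retry at the same moments and collide again.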

    Community Contributions

    • Fixed a metadata loading error in UPathIOManager, thanks @danielgafni!
    • [dagster-aws] FakeS3Session now includes additional functions and improvements to align with the boto3 S3 client API, thanks @asharov!
    • Typo fix from @vpicavet, thank you!
    • Repository license file year and company update, thanks @vwbusguy!

    Experimental

    • Added experimental BranchingIOManager to model the use case where you wish to read upstream assets from production environments and write them into a development environment.
    • Added create_repository_using_definitions_args to allow for the creation of named repositories.
    • Added the ability to use Python 3 typing to define and access op and asset config.
    • [dagster-dbt] Added DbtManifestAssetSelection, which allows you to define selections of assets loaded from a dbt manifest using dbt selection syntax (e.g. tag:foo,path:marts/finance).

    Documentation

    • There’s now only one Dagster Cloud Getting Started guide, which includes instructions for both Hybrid and Serverless deployment setups.
    • Lots of updates throughout the docs to clean up remaining references to @repository, replacing them with Definitions.
    • Lots of updates to the dagster-airflow documentation: a tutorial for getting started with Dagster from an Airflow background, a migration guide for moving from Airflow to Dagster, and a terminology/concept map from Airflow onto Dagster.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.1.7...1.1.8

    See All Contributors
    • 95950df - Adjust resources guide to be in a Definitions world by @schrockn
    • f245696 - add thread name prefix to run dequeue workers (#11155) by @alangenfeld
    • 45b4c26 - make schedules produced by build_schedule_from_partitioned_job more p… (#11147) by @sryza
    • 2d3c914 - [k8s launcher] security context (#9788) by @alangenfeld
    • b655ca6 - [run coordinator] fix threaded tests (#11139) by @alangenfeld
    • 1f7980e - [docs] - [definitions] Update Dagit + tutorial screenshots (#11089) by @erinkcochran87
    • c5c6eff - [docs] - [definitions] Update Partitions concept docs (#11030) by @erinkcochran87
    • 6ef79f0 - Port asset sensor guide to Definitions by @schrockn
    • 8cf4dba - Use buildkite_deps.txt to declare explicit buildkite deps (#11025) by @jmsanders
    • 52c726f - Trigger builds when .ini files change (#11161) by @jmsanders
    • 7480066 - Revert "keep track of max timestamps client side for code location up… (#11162) by @prha
    • 679d2e1 - [docs] - Fix links (#11163) by @erinkcochran87
    • b16f30c - Add create_repository_using_definitions_args by @schrockn
    • b4f8705 - Change Graph-backed asset guide code examples to be on Definitions by @schrockn
    • e71dacf - Delete repository unit testing using load_all_definitions in testing guide by @schrockn
    • 2ae2ed6 - 1.1.7 Changelog (#11165) by @jamiedemaria
    • c0d8a9a - Do a single pip install in tox suites [OSS] (#11164) by @gibsondan
    • a763ec9 - Update declarative scheduling guide to include Definitions by @schrockn
    • 4467bc7 - add thread name prefix to grpc server (#11158) by @alangenfeld
    • fb59fda - Automation: versioned docs for 1.1.7 by @elementl-devtools
    • 1518150 - Fixup rename of requirements.txt -> buildkite_deps.txt (#11160) by @jmsanders
    • 9a46201 - fix precedence ordering when merging dictionaries in container context (#11169) by @gibsondan
    • bf4222c - support in and len on PartitionsSubsets (#11172) by @sryza
    • d64b050 - Disable breaking azure tests in master (#11183) by @schrockn
    • 722c3b6 - [docs] Fix typo in title (#11156) by @vpicavet
    • 630d9c6 - Fix regression with passing in None to snowflake resource (#11182) by @gibsondan
    • b2bb257 - [dagit] Add tests for partition health data parsing / accessors (#11114) by @bengotow
    • 3922523 - Allow setting resources on EcsContainerContext and EcsRunLauncher (#11170) by @gibsondan
    • d8555aa - fix a small bug in UPathIOManager (#11110) by @danielgafni
    • 15b0f27 - [dagster-dbt] in tests, pin dbt rpc < 0.3.0 (#11196) by @OwenKephart
    • baadabe - Move core_tests/storage_tests to storage_tests (#11180) by @schrockn
    • bcd1229 - skip sqlite env var test on windows (#11195) by @gibsondan
    • 59faa75 - Move old_sqlalchemy_tests to only run on storage_tests (#11181) by @schrockn
    • fb3d4f9 - [dagster-airflow] airflow terminology mapping (#11015) by @Ramshackle-Jamathon
    • c4e398b - Move core_tests/definitions_tests to definitions_tests (#11184) by @schrockn
    • ab3498a - fix dequeue_num_workers setting (#11198) by @alangenfeld
    • 3db32c1 - Move core_tests/asset_defs_tests to asset_defs_tests (#11186) by @schrockn
    • 8ae492c - update timeout in sensor run tests (#11193) by @jamiedemaria
    • 80557d2 - Move core_tests/launcher_tests to launcher_tests (#11187) by @schrockn
    • 4961453 - add assets example to AssetSelection apidoc (#11194) by @sryza
    • b242ed7 - Move various logging tests into logging_tests (#11188) by @schrockn
    • ed912c3 - feat(dbt-cloud): add Dagster run id to dbt Cloud run (#11005) by @rexledesma
    • d521afa - [docs] snowflake reference page (#10985) by @jamiedemaria
    • 6acf983 - [declarative-scheduling] Fix bug with declarative scheduling where repeated calls to get_latest_materialization_record could return incorrect results (#11214) by @OwenKephart
    • 732ab85 - only reconcile multi-assets if all parents are reconciled (#11190) by @sryza
    • 1ed7d08 - [dagit] Fix dragging on the DAG clearing your selection / clicking links (#11202) by @bengotow
    • e446201 - [dagster-aws] Extend fake S3 resource (#11105) by @asharov
    • b99fd34 - [dagit] Optimizations to backfill UI for large partition key sets (#11201) by @bengotow
    • 4d79412 - do not compute projected logical versions of partitioned assets (#11204) by @sryza
    • ae3ed2d - factor out asset reconciliation graph traversal into util (#11206) by @sryza
    • 6ee425f - [dagit] Fresh + Stale should not show “Stale” on the Asset Graph (#11234) by @bengotow
    • c5cb42a - [dagit] Round middle truncate calculations for Firefox (#11203) by @bengotow
    • cbf21af - support tags argument on materialize and materialize_to_memory (#11225) by @sryza
    • 530e118 - docs: update license to include year and company (#11231) by @vwbusguy
    • 1895517 - [docs] - [definitions] Update OSS deployment overview (#11199) by @erinkcochran87
    • d94cff0 - split backfill table so partition status is fetched lazily (#11205) by @prha
    • 934f7c3 - fix tslint (#11242) by @alangenfeld
    • 76b00ba - [dagit] webpack-bundle-analyzer (#11230) by @hellendag
    • 8f72eb0 - [dagit] Virtualized Asset Catalog (#11168) by @hellendag
    • cd523d9 - Can remove hardcoded_resource from project_fully_featured because of Definitions (#11243) by @schrockn
    • ced6042 - [dagit] Replace moment-timezone (#11197) by @hellendag
    • 4cb9a82 - Remove unused backfillStatus, which loads individual run status (#11246) by @prha
    • e763c59 - Retry registering task definitions (#11192) by @jmsanders
    • 9e5df46 - Rerun dagster_tests with --snapshot-update (#11208) by @schrockn
    • 17f13ea - Add test case for binding assets before passing to Definitions (#11216) by @schrockn
    • 2c03dfd - [dagster-airflow] from airflow to dagster guide updates (#11218) by @Ramshackle-Jamathon
    • 86338fa - [graphql test] share schema instance (#11236) by @alangenfeld
    • 9460504 - Use FromSourceAsset instead of FromRootInputManager when loading assets with input managers (#11233) by @jamiedemaria
    • 8850c6a - Use bare objects for the hacker news resources (#11249) by @schrockn
    • 34a5f5a - Use bare I/O manager in fully featured (#11250) by @schrockn
    • 3f30430 - Delete unused fixed_s3_pickle_io_manager (#11251) by @schrockn
    • 5df60bf - Consolidate _resolve_bound_config and have logger and resource use the same one (#11209) by @schrockn
    • df13fc6 - Delete op version of _resolve_bound_config and call generic one (#11211) by @schrockn
    • 2ecec63 - Skip race condition tests (#11286) by @jmsanders
    • eca24df - Skip flaky test (#11292) by @jmsanders
    • 5d83868 - [dagit] Replace remaining moment usage (#11278) by @hellendag
    • a28f5d5 - Update op-retries.mdx - no solid (#11284) by @yuhan
    • 11c1640 - [docs] airbyte guide repository -> definitions (#11296) by @yuhan
    • 8be6dc3 - [docs] fivetran guide repository -> definitions (#11297) by @yuhan
    • 547a6cb - [docs] dbt guide repository -> definitions (#11298) by @yuhan
    • ed212ed - [docs] dbt cloud guide repository -> definitions (#11299) by @yuhan
    • 52336d7 - [declarative-scheduling] Rework constraint passing (#11229) by @OwenKephart
    • 58a31a0 - [dagster-airflow] basic airflow migration guide (#11012) by @Ramshackle-Jamathon
    • d46f969 - [dagstermill] add retries to flaky tests (#11291) by @jamiedemaria
    • c8f3a0c - Product tour component (#11227) by @salazarm
    • bd372fd - [dagit] Utility for timezone-aware date/time formatting (#11285) by @hellendag
    • 8ca865f - [dagit] Restore some metadata on Code Locations page (#11281) by @hellendag
    • d8b5c3b - [docs] - [definitions] Update Loggers Concept docs (#11171) by @erinkcochran87
    • b09e95a - [docs] - Update ECS deployment guide (#11289) by @erinkcochran87
    • d235619 - [docs] - [definitions] Update Dagster instance docs (#11241) by @erinkcochran87
    • be59c3e - Pass duckdb_path to __init__ rather than relying on context (#11300) by @schrockn
    • 40cd16a - [structured config] Base structured config implementation (#11268) by @benpankow
    • a0152b8 - [structured config] Add support for default values (#11272) by @benpankow
    • a21feb4 - Add gql pin (#11312) by @gibsondan
    • d1ae079 - [structured config] Add support for class, field descriptions (#11274) by @benpankow
    • ec47d7c - Move env var injection earlier in step command (#11239) by @gibsondan
    • 3e47f52 - accrete metadata with multiple calls to add_output_metadata (#9518) by @sryza
    • 534e38d - [docs] - [definitions] - Update Executors docs (#11247) by @erinkcochran87
    • 905ef7a - [docs] - [definitions] Update Helm guide (#11320) by @erinkcochran87
    • a3963b1 - Eliminating unnecessary output_context.resource_config check (#11313) by @schrockn
    • 49dd2c7 - [docs] - Consolidate sections in Resources concept doc (#11316) by @erinkcochran87
    • 0e670b6 - [docs] - [definitions] Update SDA Concept docs (#11018) by @erinkcochran87
    • b22c163 - [docs] - [definitions] Update Run launchers guide (#11200) by @erinkcochran87
    • 9632313 - [docs] - Re-do Cloud Getting Started guides (#10429) by @erinkcochran87
    • f099158 - [docs] - [definitions] Update Docker guide (#11295) by @erinkcochran87
    • 09c13a5 - backfill perf: swap backfill requested for num cancelable (#11304) by @prha
    • a3c0e71 - allow upload config to pass through s3 compute log manager (#11317) by @prha
    • b911f34 - [apidoc] define_asset_job repository -> Definitions (#11302) by @yuhan
    • dbca38a - Ignore stale timestamps from code location updates (#11173) by @prha
    • 43d4283 - [structured config] Fix usage with Assets (#11327) by @benpankow
    • e72a3f5 - Include parent exceptions in Dagster errors, even if they weren't explicitly raised (#11306) by @gibsondan
    • ec38705 - [docs] - [definitions] Update Dagster daemon docs (#11226) by @erinkcochran87
    • e8e3c57 - Docs release backfill 1.1.7 (#11328) by @yuhan
    • e92ce74 - Use dark mode logo on README.md (#11326) by @hellendag
    • a88a38e - AssetGraph.from_external_assets().get_required_multi_asset_keys() (#11318) by @sryza
    • 8f80959 - [declarative-scheduling] Update retry logic to attempt to retry failed materializations after some time has passed (#11294) by @OwenKephart
    • c90056e - Add instance property to InputContext (#11331) by @schrockn
    • dbfc06a - chore: add method to strip error stack trace (#11307) by @rexledesma
    • 661b4e4 - Hoist schema and database to DbIOManager constructor (#11301) by @schrockn
    • 3b0fe94 - Rename _resolve_bound_config to resolve_bound_config (#11287) by @schrockn
    • 7839ee3 - Add asset_materialization property to EventLogEntry (#11340) by @schrockn
    • 3a11f32 - store repository on external asset graph (#11332) by @sryza
    • e830ac6 - Revert "Use dark mode logo on README.md (#11326)" (#11345) by @hellendag
    • 6e8d060 - [dagster-io/eslint-config] v1.0.6 (#11324) by @hellendag
    • abfc7b5 - Add get_implicit_global_asset_job on Definitions and RepositoryDefinition (#11279) by @schrockn
    • 73e6a02 - feat: add retry number and url to integration api call failure (#11308) by @rexledesma
    • 538e5b3 - Fix dagster-graphql circular import (BK broken) (#11339) by @smackesey
    • ccf7468 - Skipping dbt rpc resource tests (#11357) by @schrockn
    • 7434f5e - skip test_threaded_ephemeral_instance (#11359) by @schrockn
    • 6eb8673 - Allow specifying resources in Op/Asset params list (#11322) by @benpankow
    • 945eb03 - change version placeholder 0+dev -> 1!0+dev (#11334) by @smackesey
    • 2975629 - [structured config] Structured-config-backed Resources (#11321) by @benpankow
    • a34a02d - [structured config] Add traditional resource wrapper base class (#11337) by @benpankow
    • a842b2b - [structured config] Fix test importing functools.cached_property on py37 tests (#11361) by @benpankow
    • b8a7b0c - [structured config] Structured-config-backed IO managers (#11343) by @benpankow
    • 62e1527 - fix some bugs in ExternalAssetGraph (#11350) by @sryza
    • 02ed6c2 - more refactors to reconciliation sensor (#11223) by @sryza
    • 9d25371 - [dagster-io/eslint-config] v1.0.7 (#11347) by @hellendag
    • 586f15b - Add Materialize Button hook (#11319) by @salazarm
    • 1a57141 - Allow setting raw k8s config at the run launcher / container context level (#11333) by @gibsondan
    • af2ddc6 - Make it more likely that we hit our lock (#11290) by @jmsanders
    • 860ffe6 - fix __contains__ of TimeWindowPartitionsSubset (#11380) by @sryza
    • 2394272 - [docs] - add note on pandas integration to redirect users interested in pandas w/out validation (#11342) by @slopp
    • 4dbf286 - fix asset reconciliation bug that ignores earlier partitions (#11336) by @sryza
    • 0263eab - fix typo on partitions concepts page (#11349) by @sryza
    • 9832545 - Clicking on top-level concept sections takes you to a page (#10140) by @sryza
    • 81b192d - Add single dimension -> multidimension partition mapping (#10910) by @clairelin135
    • d761f14 - [RFC] Update cached_status_data column 1/2 (#10821) by @clairelin135
    • b60b9a1 - Remove stray console.log (#11410) by @gibsondan
    • 1c5733d - ExecuteInProcessResult.asset_value (#11403) by @sryza
    • 9504c26 - Eliminate PIPELINE_DEQUEUED event (#11393) by @gibsondan
    • 551a532 - [dagit] Begin adding new graphql-codegen (#11411) by @hellendag
    • b88880c - Docs for new ECS resource options (#11391) by @gibsondan
    • 26e9853 - Docs for new k8s configuration options (#11390) by @gibsondan
    • 7854e62 - Fix malformed dagster_cloud.yaml in code location docs (#11416) by @gibsondan
    • be844f7 - [dagit] Refactor mutations to new GraphQL codegen (#11414) by @hellendag
    • ffe89bf - [dagit] Refactor queries in src/workspace (#11418) by @hellendag
    • f29e001 - [dagit] dedupeFragments (#11426) by @hellendag
    • 2bc5146 - relax matching criteria for TimeWindowPartitionMappings w/offsets (#11422) by @sryza
    • 5ba8f4e - [dagster-databricks] handle empty responses (#11430) by @OwenKephart
    • b5fbfed - Export RunRecord in the public API (#11427) by @gibsondan
    • 67c7650 - [dagit] Refactor queries in src/runs (#11419) by @hellendag
    • 03fcf9b - [dagit] Refactor queries in Assets (#11423) by @hellendag
    • b7b7d06 - [dagit] Refactor queries in instance, instigation, launchpad (#11429) by @hellendag
    • 384fe6b - Branching I/O Manager (#11315) by @schrockn
    • a4ea49f - Fix issues with a SerializableErrorInfo being coerced to a GraphenePythonError (#11437) by @gibsondan
    • 186e2c7 - export product tour component (#11314) by @salazarm
    • f12798f - [dagit] Refactor queries in remaining src (#11435) by @hellendag
    • ba3a381 - Eliminate unused config_or_config_fn arg in copy_for_configured (#11433) by @schrockn
    • a6dcbfe - [dagit] Don’t show assets as stale if projectedLogicalVersion is null (#11224) by @bengotow
    • 9ca91c5 - [dagit] Show observations about the latest materialization on Asset > Partitions (#11288) by @bengotow
    • 157d901 - asset backfill core logic (#11377) by @sryza
    • d72e34c - Allow AssetIn(dagster_type=Nothing) (#11436) by @sryza
    • 5ab885d - Rename and refactor in structured_config.py for clarity (#11372) by @schrockn
    • f00cec7 - Consolidate all logic and trickery to support private cached properties in pydantic in a single class (#11373) by @schrockn
    • 5a82bdc - deprecate job-level memoization in docs (#11392) by @sryza
    • 09a66b9 - Add StructuredIOManagerAdapter (#11383) by @schrockn
    • eb5e973 - [dagit] Fix line height and size of timestamps on the asset graph (#11465) by @bengotow
    • afaeb13 - Default new-style config mappings to *Source equivalents rather than raw scalars (#11386) by @schrockn
    • f2915c5 - Unexperimentalize LogicalVersion and quell warning messages whenever assets are materialized (#11407) by @sryza
    • 7d69c44 - Passthrough pydantic.ModelField.required to dagster.Field.is_required (#11388) by @schrockn
    • 8c7b92f - Support pydantic aliases in config schema mapping (#11389) by @schrockn
    • cdcfedb - Make conversion to *Source types work on direct annotations of the config parameter (#11469) by @schrockn
    • dafbf49 - [dagster-databricks] handle execution interrupts (#11421) by @OwenKephart
    • d74dcf4 - Skip flaky race condition test (#11471) by @jmsanders
    • 08d0b0a - remove AssetStoreHandle (#11412) by @sryza
    • ebd9fb2 - refactor: sequester job backfill code (#11252) by @sryza
    • 8ca3874 - Be more resilient to user code errors when dequeuing runs (#11406) by @gibsondan
    • 0ff7c7f - Make all arguments on DagsterInstance.create_run keyword-only (#11446) by @schrockn
    • 9dd7579 - Eliminate default value for asset_selection param on DagsterInstance.create_run (#11447) by @schrockn
    • 5f23b2f - Eliminate default values for solid_selection, external_pipeline_origin, and pipeline_code_origin on create_run (#11448) by @schrockn
    • 662077b - rename create_run to create_run_for_test on TestQueuedRunCoordinator to increase greppability of create_run (#11449) by @schrockn
    • 4d205f5 - [dagit] Remove old GraphQL Codegen (#11474) by @hellendag
    • fe1620e - [dagit] Delete apollo CLI dep (#11478) by @hellendag
    • 359e16f - Add tag_concurrency_limits config to executors (#11472) by @gibsondan
    • 8ca33ca - add run filter for updated before (#11481) by @prha
    • 8d50157 - docs(dagster-dbt): clarify that the integration supports arbitrary dbt profiles (#11351) by @rexledesma
    • 057b369 - [structured config] Add support for Permissive fields (#11275) by @benpankow
    • 7751e64 - handle pure asset backfills in backfill daemon (#11378) by @sryza
    • 5bdc212 - Fix missing callsite of tag_concurrency_limits (#11496) by @gibsondan
    • 936660e - fix(docs): format docs (#11501) by @rexledesma
    • d7e687e - [dagster-dbt] add DbtManifestAssetSelection (#11473) by @OwenKephart
    • 41b616b - Add parameter invariants around external_pipeline_origin and pipeline_code_origin arguments (#11450) by @schrockn
    • 6a6bb02 - supply solid_selection to fix submitting runs from pure asset backfills (#11502) by @sryza
    • 75b9084 - telemetry: add num_assets_in_repo in repo-level metadata (#11490) by @yuhan
    • 5144eab - fix(docs): run mdx-format again (#11504) by @rexledesma
    • 0e654f6 - Tighten solids_to_execute and solid_selection invariant (#11451) by @schrockn
    • 7d69dad - Bump json5 from 1.0.1 to 1.0.2 in /js_modules/dagit (#11485) by @dependabot[bot]
    • 39c0573 - Add invariant for asset_selection (#11452) by @schrockn
    • 72654df - Typehint pipeline_name, run_id, and mode (#11453) by @schrockn
    • 4b91684 - graphql for pure asset backfills (#11379) by @sryza
    • ba75a8f - docs for new tag_concurrency_limits feature on executor (#11499) by @gibsondan
    • 8e47dc2 - Add support for loading cacheable assets from module (#10389) by @benpankow
    • ea185ef - enable filtering by asset tag when asset tags table is not present (#11509) by @sryza
    • c578c55 - only count materializations within backfill (#11506) by @sryza
    • 862ca6d - telemetry: add num_assets_in_repo in log_repo_stats metadata (#11513) by @yuhan
    • e6a1a94 - telemetry: add location_name in repo-level metadata (#11514) by @yuhan
    • 8349be9 - [dagster-airflow] warn on airflow dataset (#11498) by @Ramshackle-Jamathon
    • 893ed12 - [dagster-airflow] add make_dagster_definitions_from_airflow_dags_path and make_dagster_definitions_from_airflow_dag_bag apis (#11441) by @Ramshackle-Jamathon
    • 07c5d7d - Changelog 1.1.8 (#11526) by @clairelin135
    • 88c01f1 - 1.1.8 by @elementl-devtools
  • 1.1.7(Dec 15, 2022)

    New

    • Definitions is no longer marked as experimental and is the preferred API over @repository for new users of Dagster. Examples, tutorials, and documentation have largely been ported to this new API. No migration is needed. Please see the GitHub discussion for more details.
    • The “Workspace” section of Dagit has been removed. All definitions for your code locations can be accessed via the “Deployment” section of the app. Just as in the old Workspace summary page, each code location will show counts of its available jobs, assets, schedules, and sensors. Additionally, the code locations page is now available at /locations.
    • Lagged / rolling window partition mappings: TimeWindowPartitionMapping now accepts start_offset and end_offset arguments that allow specifying that time partitions depend on earlier or later time partitions of upstream assets.
    • Asset partitions can now depend on earlier time partitions of the same asset. The asset reconciliation sensor will respect these dependencies when requesting runs.
    • dagit can now accept multiple arguments for the -m and -f flags. For each argument a new code location is loaded.
    • Schedules created by build_schedule_from_partitioned_job now execute faster: in constant time, rather than in time linear in the number of partitions.
    • The QueuedRunCoordinator now supports the dequeue_use_threads and dequeue_num_workers options, which enable concurrent run dequeue operations for greater throughput.
    • [dagster-dbt] load_assets_from_dbt_project, load_assets_from_dbt_manifest, and load_assets_from_dbt_cloud_job now support applying freshness policies to loaded nodes. To do so, apply dagster_freshness_policy config directly in your dbt project. For example, config(dagster_freshness_policy={"maximum_lag_minutes": 60}) results in the corresponding asset being assigned a FreshnessPolicy(maximum_lag_minutes=60).
    • The DAGSTER_RUN_JOB_NAME environment variable is now set in containerized environments spun up by our run launchers and executor.
    • [dagster-airflow] make_dagster_repo_from_airflow_dags_path, make_dagster_job_from_airflow_dag, and make_dagster_repo_from_airflow_dag_bag have a new connections parameter that allows configuring the Airflow connections used by migrated DAGs.
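
    To illustrate the lagged / rolling window mappings described above, here is a minimal sketch in plain Python. It is not Dagster's implementation: upstream_daily_partitions is a hypothetical helper, and it assumes daily partitions on both sides with offsets counted in whole partitions, which is how start_offset and end_offset on TimeWindowPartitionMapping are understood here.

```python
from datetime import date, timedelta
from typing import List


def upstream_daily_partitions(
    downstream_key: str, start_offset: int = 0, end_offset: int = 0
) -> List[str]:
    """Hypothetical helper, not part of Dagster: list the upstream daily
    partition keys that one downstream daily partition depends on."""
    if start_offset > end_offset:
        raise ValueError("start_offset must not exceed end_offset")
    day = date.fromisoformat(downstream_key)
    # Offsets are counted in whole partitions (days here); negative values
    # point at earlier partitions, positive values at later ones.
    return [
        (day + timedelta(days=offset)).isoformat()
        for offset in range(start_offset, end_offset + 1)
    ]


# Rolling two-day window: each partition reads the previous and the same day.
print(upstream_daily_partitions("2022-07-04", start_offset=-1, end_offset=0))
# Lagged dependency: each partition reads only the previous day's partition.
print(upstream_daily_partitions("2022-07-04", start_offset=-1, end_offset=-1))
```

    Under the same assumptions, the second call models an asset partition that depends on the previous time partition of the same asset: both offsets are -1, so only the earlier partition is selected.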

    Bugfixes

    • Fixed a bug where the log property was not available on the RunStatusSensorContext object that run status sensors receive for sensor logging.

    • Fixed a bug where the re-execute button on runs of asset jobs would incorrectly show a warning icon, indicating that the pipeline code may have changed since you last ran it.

    • Fixed an issue which would cause metadata supplied to graph-backed assets to not be viewable in the UI.

    • Fixed an issue where schedules often took up to 5 seconds to start after their tick time.

    • Fixed an issue where Dagster failed to load a dagster.yaml file that used an environment variable to specify the folder for sqlite storage.

    • Fixed an issue which would cause the k8s/docker executors to unnecessarily reload CacheableAssetsDefinitions (such as those created when using load_assets_from_dbt_cloud_job) on each step execution.

    • [dagster-airbyte] Fixed an issue where Python-defined Airbyte sources and destinations were occasionally recreated unnecessarily.

    • Fixed an issue with build_asset_reconciliation_sensor that would cause it to ignore in-progress runs in some cases.

    • Fixed a bug where GQL errors would be thrown in the asset explorer when a previously materialized asset had its dependencies changed.

    • [dagster-airbyte] Fixed an error when generating assets for normalization tables for connections with non-object streams.

    • [dagster-dbt] Fixed an error where dbt Cloud jobs with dbt run and dbt run-operation were incorrectly validated.

    • [dagster-airflow] use_ephemeral_airflow_db now works when running within a PEX deployment artifact.

    Documentation

    • New documentation for Code locations and how to define one using Definitions.
    • Lots of updates throughout the docs to reflect the recommended usage of Definitions. Any content not ported to Definitions in this release is in the process of being updated.
    • New documentation for dagster-airflow on how to start writing Dagster code from an Airflow background.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.1.6...1.1.7

    See All Contributors
    • 858b9d2 - Non isolated runs docs (#10860) by @johannkm
    • 24bff5f - [dagit] Fix Gantt chart rendering of per-step resource init log messages (#10943) by @bengotow
    • cde76d9 - [dagit] Fix the “Assets” label on large asset runs (#10932) by @bengotow
    • 6cd734a - [dagit] Fix “Job In” label regression in Chrome v109 (#10934) by @bengotow
    • 523edb0 - [dagit] Pass repository tag when loading runs for Partitions page (#10948) by @bengotow
    • 49f7c4f - [dagit] Updated asset DAG styles, added additional compute tags (#10931) by @bengotow
    • fb574c0 - [docs] - add a guide for scheduling assets (#10949) by @slopp
    • 66959b7 - cap packaging requirement at 22.0 (#10968) by @smackesey
    • 5a45dd1 - Execution result typing (#10919) by @smackesey
    • 7d18ed2 - solid -> node method renames (#10920) by @smackesey
    • f19c6ea - [dagster-airflow] re-enable airflow 2.5.0 tests (#10966) by @Ramshackle-Jamathon
    • 0d80225 - [dagster-airbyte][docs] Use dagster-airbyte CLI alias in docs (#10955) by @benpankow
    • 162e791 - [dagster-slack] create slack_on_freshness_policy_sensor (#10960) by @OwenKephart
    • faf4f30 - [docs ] - fix image dimensions in hello-dagster materialize (#10977) by @slopp
    • 4babd9d - 1.1.6 Changelog (#10978) by @OwenKephart
    • 577e3eb - Automation: versioned docs for 1.1.6 by @elementl-devtools
    • 0b59fa9 - [convert-environment-variables-and-secrets-guide-stack-2] Convert env vars and secrets guide from repository to Definitions by @schrockn
    • 352d077 - unexperimentalize PartitionMapping (#10980) by @sryza
    • 231ddfa - remove validation in AssetGraph.get_child_partition_keys_of_parent an… (#10981) by @sryza
    • 305f1a2 - [convert-development-to-production-1] Move repository to __init__.py by @schrockn
    • d1054f9 - [convert-development-to-production-2] Changing development to production to use snowflake_pandas_io_manager by @schrockn
    • 7d5fabc - [convert-deployment-to-production-3] Convert @repository to Definitions by @schrockn
    • 196c086 - [convert-development-to-production-4] Use base object instead of resource by @schrockn
    • 435ddcd - [convert-development-to-production-6] Use pyproject.toml instead of workspace.yaml by @schrockn
    • 888660d - [convert-development-to-production-7] Convert guide to use Definitions by @schrockn
    • ad41a9f - [dagit] New Code Locations table (#10975) by @hellendag
    • 3008420 - Pin graphene to <3.2 by @schrockn
    • 5373a74 - fix missing metadata in dagit on graph-backed assets (#10988) by @OwenKephart
    • 7bbcb8a - [dagit] Export a few Code Location components for Cloud (#11008) by @hellendag
    • 39c84bb - Temporarily disable some Azure test suites (#11007) by @jmsanders
    • 41101f7 - [graphql] fix for graphene 3.2 (#11011) by @alangenfeld
    • a0e5c3d - [code-location-selector-stack] Code location sensor tests 1/N. Rename workspace_load_target function to create_workspace_load_target by @schrockn
    • 160aba2 - [code-location-selector-stack] Code location sensor tests 2/N Make instance_with_multiple_repos_with_sensors workspace_load_target parameterizable by @schrockn
    • d6ad971 - [code-location-selector-stack] Code location sensor tests 3/N. Refactor instance_with_multiple_repos_with_sensors to handle multiple code locations by @schrockn
    • 6e305c1 - [code-location-selector-stack] Code location sensor tests 4/N Actually add test to test cross code location selector by @schrockn
    • 5fc415a - [code-location-selector-stack] Add CodeLocationSelector; Have run_status_sensor accept it by @schrockn
    • 3c2367f - Fix test_persistent by @schrockn
    • bf83710 - chore: auto-assign dependabot pull requests (#10953) by @rexledesma
    • a91525b - fix(dbt-cloud): parse command string to find materialization commands (#10989) by @rexledesma
    • 3be1547 - [code-location-selector-stack] Change typehint on make_slack_on_run_failure_sensor to accept CodeLocationSelector by @schrockn
    • c156ab3 - [docs] re-org snowflake integration guide (#10984) by @jamiedemaria
    • 33affcf - Check scheduler ticks right after each minute boundary instead of once every 5 seconds (#10886) by @gibsondan
    • 3b93947 - [docs] - [definitions] Update Configured API concept doc (#11020) by @erinkcochran87
    • 3cfc612 - Fix bug with re-execution snapshot ids (#10967) by @OwenKephart
    • 3fb76d9 - [bugfix] UPathIOManger load_input type checking (#11022) by @danielgafni
    • 546ece8 - Pathspec typing fix (#11036) by @smackesey
    • c729760 - [dagit] /code-locations -> /locations (#11024) by @hellendag
    • f606250 - Replace partition ranges with subsets (#10909) by @clairelin135
    • cec9f43 - [dagit] With multiple assets selected, backfill “missing” should include partially materialized partitions (#11027) by @bengotow
    • ae946c0 - [dagit] Add empty value string for invalid tag input in Runs filter (#11044) by @hellendag
    • 57e5def - [docs] - [definitions] Update Repository page for Definitions (#10986) by @erinkcochran87
    • 9300fc7 - [definitions-accessors] Add get_job_def to Definitions. by @schrockn
    • 1014add - 1/ definitions in create new project: update dagster project CLI (#10829) by @yuhan
    • 550ec43 - 2/ definitions in create new project: update create-new-project docs (#10830) by @yuhan
    • 7eb520d - [docs] - [definitions] - Update dbt tutorial to use Definitions (#10842) by @erinkcochran87
    • 891dc89 - 2.1/ definitions in create new project: remove unnecessary asset dir in scaffold + update docs (#10831) by @yuhan
    • ad1fc35 - update isolated run docs to say that it's enabled by default (#11039) by @gibsondan
    • a4cb91e - [docs] - remove repository from hello dagster as its not needed (#11048) by @slopp
    • 90b3e01 - [dagster-fivetran][ez] Add missing docstring entry for build_fivetran_assets (#11049) by @benpankow
    • b4e5a80 - Add start_offset and end_offset to TimeWindowPartitionMapping (#10979) by @sryza
    • ec1d974 - fix subsettable multi asset case (#10878) by @OwenKephart
    • e98e5fb - move ecs task tagging to run_task call (#11037) by @prha
    • c1d1224 - Add custom resource key to load_assets_from_dbt_project (#10827) by @dpeng817
    • cc1925f - [dagit] Highlight Assets in top nav for global asset graph (#11052) by @hellendag
    • fc1ab1f - [dagster-airbyte][docs][ez] Fix AirbyteConnection apidoc rendering (#11059) by @benpankow
    • 795ae3f - convert-examples 1.1/ assets_dbt_python defs + pyproject (#11060) by @yuhan
    • c4d6792 - convert-examples 2.1/ assets_modern_data_stack repo -> defs, pyproject (#11062) by @yuhan
    • 09ac5ed - convert-examples 5/ assets_smoke_test repository -> definitions (#11071) by @yuhan
    • 48c7ae3 - convert-examples 6.1/ feature_graph_backed_assets pyproject, @repository -> Definitions (#11072) by @yuhan
    • 8a12bf3 - convert-examples 7.1/ quickstart_etl workspace->pyproject, repo->defs (#11074) by @yuhan
    • d85d59f - convert-examples 8.1/ with_great_expectations workspace->pyproject, repo->defs (#11077) by @yuhan
    • e4099ec - convert-examples 1.2/ assets_dbt_python file renames (#11061) by @yuhan
    • 4d38088 - convert-examples 6.2/ feature_graph_backed_assets file rename and reorder (#11073) by @yuhan
    • 58e0d88 - convert-examples 2.2/ assets_modern_data_stack file renames (#11063) by @yuhan
    • 99a62a1 - Removes codecov commands from tox (#10635) by @dpeng817
    • 0a675c6 - [dagit] Disable Re-execute menu item in run action menu for perms (#11026) by @hellendag
    • db21352 - Support loading multiple modules in single command line invocation by @schrockn
    • e06b9ae - Enable multiple files in CLI tools by @schrockn
    • 4d4aecb - Add get_sensor_def and get_schedule_def to Definitions by @schrockn
    • ba87836 - convert-examples 10/ with_pyspark_emr repository.py -> defs.py, update [docs] (#11080) by @yuhan
    • 32c3742 - convert-examples 9/ with_pyspark repository.py -> defs.py, update [docs] (#11079) by @yuhan
    • 0234c90 - Add load_asset_value and get_asset_value_loader to Definitions by @schrockn
    • c3a962a - [dagster-airflow] reorg dagster airflow api docs (#11009) by @Ramshackle-Jamathon
    • f2a705c - convert-examples 3.1/ assets_pandas_pyspark definitions, pyproject (#11067) by @yuhan
    • e1a93e1 - convert-examples 8.2/ with_great_expectations file rename (#11078) by @yuhan
    • b0bcc14 - convert-examples 7.2/ quickstart_etl file rename + update README (#11075) by @yuhan
    • e9651fc - Support module_name in tool.dagster section of pyproject.toml by @schrockn
    • f0ed982 - Move all examples to use module_name in pyproject.toml by @schrockn
    • 0510adf - Pin tox on windows to < 4 (#11013) by @jmsanders
    • b023c47 - [docs] - [definitions] Code location concept page (#10843) by @erinkcochran87
    • 606ff79 - Add instance for test to top level API (#10709) by @dpeng817
    • 86a4fe1 - [example] recomment code in dagstermill tutorial (#11098) by @jamiedemaria
    • e43e653 - Make JobSelector work on cross code location by @schrockn
    • 3693ccf - [convert-branch-deployments-stack] Convert branch deployments guide to Definitions by @schrockn
    • a541020 - Eliminate package_name from pyproject.toml loading spec by @schrockn
    • e1f51b0 - [docs] - [definitions] Update Enriching with SDAs guide (#10849) by @erinkcochran87
    • c706db1 - [dagit] Don't show View link on code location toast if already there (#11095) by @hellendag
    • 559ebb6 - set DAGSTER_RUN_JOB_NAME env var (#10888) by @alangenfeld
    • 9e58100 - remove parameter tag from dagstermill notebook (#11109) by @jamiedemaria
    • 195f975 - sort task definition config secrets and env vars to ensure consistent ordering (#10982) by @gibsondan
    • b32e5cb - [dagster-airflow] from airflow to dagster guide (#10923) by @Ramshackle-Jamathon
    • aefd575 - [dagit] Improvements to partition range selection interactions (#11017) by @bengotow
    • 07b6d4e - add tests for legacy storage (#11047) by @prha
    • 8884177 - keep track of max timestamps client side for code location updates (#11100) by @prha
    • 5caeb9f - in default IO manager, handle PartitionMappings that return 0 partitions (#11065) by @sryza
    • 8552f8f - Fix bug with logical versions for assets with changed deps (#11083) by @smackesey
    • cbf36be - Adding apidocs for Definitions by @schrockn
    • 822f1e8 - convert-examples 3.2/ asset_pandas_pyspark re-arrange files + [docs] (#11068) by @yuhan
    • 5ebe03e - [dagit] Persist AssetPartitions state to URL, open Materialize panel with same selection (#11042) by @bengotow
    • ceb2050 - [dagit] Render assets with self-dependencies in Dagit (#11043) by @bengotow
    • ccb2b47 - run queue daemon refactor (#11090) by @alangenfeld
    • fa6b48b - Allow assets to depend on earlier partitions of themselves (#11066) by @sryza
    • 1811e25 - [docs] - [definitions] Update Schedules documentation (#11046) by @erinkcochran87
    • afc161c - Re-add section for customizing Docker images pre-fast-deploys (#11112) by @shalabhc
    • 8a5ade0 - Allow setting base_dir as a StringSource in sqlite storage (#11031) by @gibsondan
    • c1ca6ef - handle self-dependencies in reconciliation sensor (#11085) by @sryza
    • 621f9fb - [docs] - [definitions] Update Workspaces concept page (#10944) by @erinkcochran87
    • d5feb2b - Coerce IOManager objects into IOManagerDefinitions in Definitions by @schrockn
    • 1c7542a - Revert "[docs] - [definitions] Update Workspaces concept page (#10944)" by @alangenfeld
    • 1292df7 - Allow overriding the default max_concurrent for the default executor via env var (#11116) by @gibsondan
    • 2ec20a3 - convert-examples 4.1/ assets_pandas_type_metadata definitions, pyproject (#11069) by @yuhan
    • 1f9c37c - [docs] [definitions] convert snowflake guide to definitions (#11029) by @jamiedemaria
    • 132ce0d - [docs] - Fix workspace link (#11121) by @erinkcochran87
    • e64f0de - convert-examples 4.2/ assets_pandas_type_metadata repository.py -> __init__.py (new) (#11127) by @yuhan
    • d36a5e6 - Add executor and loggers to Definitions by @schrockn
    • ab743c4 - [docs] - [definitions] Update Sensors documentation (#11050) by @erinkcochran87
    • 4506c12 - threaded run queue daemon (#11113) by @alangenfeld
    • c87bf8d - convert-examples 11.1/ project_fully_featured defs, pyproject, no with_resources (#11081) by @yuhan
    • 9491833 - [dagster-airbyte] Fix normalization table generation, materializations for non-object Streams (#10899) by @benpankow
    • c221574 - update assets_modern_data_stack example readme (#11122) by @yuhan
    • f0ff38c - [CacheableAssets] Fix bug that would cause definitions to be recomputed when using a StepDelegatingExecutor (#11086) by @OwenKephart
    • bd0801b - convert-examples 11.2/ project_fully_featured repository.py -> __init__.py (#11082) by @yuhan
    • 4e72f19 - update assets_dbt_python example readme (#11126) by @yuhan
    • 7d92ffc - [dagster-dbt] Enable setting FreshnessPolicies on dbt assets (#11103) by @OwenKephart
    • c14d793 - Mark Definitions as not experimental by @schrockn
    • a0268ff - [declarative-scheduling] Fix issue where in-progress runs were not properly handled (#11118) by @OwenKephart
    • 2a66629 - [dagster-airbyte] Avoid unnecessarily recreating sources, destinations w/ managed ingestion (#11117) by @benpankow
    • 7ab1263 - [dagster-airflow] pex compatibility for ephemeral db (#11115) by @Ramshackle-Jamathon
    • 47e3c31 - [dagster-airflow] provide interface for passing connection models directly to dagster (#11006) by @Ramshackle-Jamathon
    • 372aae8 - [dagit] Left nav: show entire code location string for non-dunder repos (#11133) by @hellendag
    • 439a32a - move quickstart_aws back to mono repo (#11130) by @yuhan
    • 650f837 - move quickstart_gcp back to mono repo (#11131) by @yuhan
    • 30bd794 - move quickstart_snowflake back to mono repo (#11132) by @yuhan
    • 479ab1a - Update Definitions docs by @schrockn
    • a5a8c84 - Use wrap_resources_for_execution in Definitions by @schrockn
    • 8182da0 - Use i/o manager coercion in assets_pandas_pyspark example by @schrockn
    • 5fdecbd - convert-examples-repo-to-defs cloud nux quickstart_aws (#11135) by @yuhan
    • 2169bf8 - convert-examples-repo-to-defs cloud nux quickstart_gcp (#11136) by @yuhan
    • 96750d3 - convert-examples-repo-to-defs cloud nux quickstart_snowflake (#11137) by @yuhan
    • 841efdc - enable run status sensor logging (#11145) by @prha
    • defd785 - Move jobs concept guide to refer to Definitions by @schrockn
    • ef723a6 - Move I/O Manager Guide to be on Definitions by @schrockn
    • 2d9ff0b - add thread name prefix to run dequeue workers (#11155) by @alangenfeld
    • 1670918 - Adjust resources guide to be in a Definitions world by @schrockn
    • 61b43cb - make schedules produced by build_schedule_from_partitioned_job more p… (#11147) by @sryza
    • 4760e16 - [docs] - [definitions] Update Dagit + tutorial screenshots (#11089) by @erinkcochran87
    • 9df72f3 - [docs] - [definitions] Update Partitions concept docs (#11030) by @erinkcochran87
    • 799d333 - Port asset sensor guide to Definitions by @schrockn
    • 9290f54 - Revert "keep track of max timestamps client side for code location up… (#11162) by @prha
    • 468f39b - [docs] - Fix links (#11163) by @erinkcochran87
    • ecd4895 - Merge branch 'release-1.1.7' of github.com:dagster-io/dagster into release-1.1.7 by @jamiedemaria
    • de634ac - 1.1.7 Changelog (#11165) by @jamiedemaria
    • 9882250 - 1.1.7 by @elementl-devtools
  • 1.1.6(Dec 8, 2022)

    New

    • [dagit] Throughout Dagit, when the default repository name __repository__ is used for a repo, only the code location name will be shown. This change also applies to URL paths.
    • [dagster-dbt] When attempting to generate software-defined assets from a dbt Cloud job, an error is now raised if none are created.
    • [dagster-dbt] Software-defined assets can now be generated for dbt Cloud jobs that execute multiple commands.
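
    The labeling rule for the default repository name can be sketched roughly as follows. This is only an illustration of the behavior described above, not Dagit's actual code; the function name display_label and the "repo@location" fallback format are assumptions made for this example.

```python
# Hypothetical helper illustrating the labeling rule; not part of Dagit.
DEFAULT_REPOSITORY_NAME = "__repository__"


def display_label(repository_name, location_name):
    """Return the label shown for a repository in a code location."""
    if repository_name == DEFAULT_REPOSITORY_NAME:
        # Default repository name: show only the code location name.
        return location_name
    # Any other repository name: show the full repo@location string.
    return f"{repository_name}@{location_name}"


default_label = display_label("__repository__", "my_location")
custom_label = display_label("foo", "bar")
```

    With this sketch, a repository named __repository__ in my_location is labeled my_location, while foo in bar keeps the foo@bar form.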

    Bugfixes

    • Fixed a bug that caused load_asset_value to error with the default IO manager when a partition_key argument was provided.
    • Previously, trying to access context.partition_key or context.asset_partition_key_for_output when invoking an asset directly (e.g. in a unit test) would result in an error. This has been fixed.
    • Failure hooks now receive the original exception instead of RetryRequested when using a retry policy.
    • The LocationStateChange GraphQL subscription has been fixed (thanks @roeij!).
    • Fixed a bug where a sqlite3.ProgrammingError error was raised when creating an ephemeral DagsterInstance, most commonly when build_resources was called without passing in an instance parameter.
    • [dagstermill] Jupyter notebooks now correctly render in Dagit on Windows machines.
    • [dagster-duckdb-pyspark] New duckdb_pyspark_io_manager helper to automatically create a DuckDB I/O manager that can store and load PySpark DataFrames.
    • [dagster-mysql] Fixed a bug where versions of mysql < 8.0.31 would raise an error on some run queries.
    • [dagster-postgres] The connection URL parameter “options” is no longer overwritten in Dagit.
    • [dagit] Dagit now allows backfills to be launched for asset jobs that have partitions and required config.
    • [dagit] Dagit no longer renders the "Job in repo@location" label incorrectly in Chrome v109.
    • [dagit] Dagit's run list now shows improved labels on asset group runs of more than three assets.
    • [dagit] Dagit's run gantt chart now renders per-step resource initialization markers correctly.
    • [dagit] In op and asset descriptions in Dagit, rendered markdown no longer includes extraneous escape slashes.
    • Assorted typos and omissions fixed in the docs — thanks @C0DK and @akan72!

    Experimental

    • As an optional replacement of the workspace/repository concepts, a new Definitions entrypoint for tools and the UI has been added. A single Definitions object per code location may be instantiated, and accepts typed, named arguments, rather than the heterogeneous list of definitions returned from an @repository-decorated function. To learn more about this feature, and provide feedback, please refer to the GitHub Discussion.
    • [dagster-slack] A new make_slack_on_freshness_policy_status_change_sensor allows you to create a sensor to alert you when an asset is out of date with respect to its freshness policy (and when it’s back on time!)

  • 1.1.5(Dec 2, 2022)

  • 1.1.4(Dec 2, 2022)

    Community Contributions

    • Fixed a typo in GCSComputeLogManager docstring (thanks reidab!).
    • [dagster-airbyte] Job cancellation on run termination is now optional (thanks adam-bloom!).
    • [dagster-snowflake] You can now specify a Snowflake role in the config for the Snowflake I/O manager (thanks binhnefits!).
    • [dagster-aws] A new AWS Systems Manager resource (thanks zyd14!).
    • [dagstermill] Retry policy can now be set on dagstermill assets (thanks nickvazz!).
    • Corrected typo in docs on metadata (thanks C0DK!).

    New

    • Added a job_name parameter to InputContext.
    • Fixed inconsistent io manager behavior when using execute_in_process on a GraphDefinition (it would use the fs_io_manager instead of the in-memory io manager).
    • Compute logs will now load in Dagit even when websocket connections are not supported.
    • [dagit] A handful of changes have been made to our URLs:
      • The /instance URL path prefix has been removed. E.g. /instance/runs can now be found at /runs.
      • The /workspace URL path prefix has been changed to /locations. E.g. the URL for job my_job in repository foo@bar can now be found at /locations/foo@bar/jobs/my_job.
    • [dagit] The “Workspace” navigation item in the top nav has been moved to be a tab under the “Deployment” section of the app, and is renamed to “Definitions”.
    • [dagstermill] Dagster events can now be yielded from asset notebooks using dagstermill.yield_event.
    • [dagstermill] Failed notebooks can be saved for inspection and debugging using the new save_on_notebook_failure parameter.
    • [dagster-airflow] Added a new option, use_ephemeral_airflow_db, which creates a job-run-scoped Airflow database for Airflow DAGs running in Dagster.
    • [dagster-dbt] Materializing software-defined assets using dbt Cloud jobs now supports partitions.
    • [dagster-dbt] Materializing software-defined assets using dbt Cloud jobs now supports subsetting. Individual dbt Cloud models can be materialized, and the proper filters will be passed down to the dbt Cloud job.
    • [dagster-dbt] Software-defined assets from dbt Cloud jobs now support configurable group names.
    • [dagster-dbt] Software-defined assets from dbt Cloud jobs now support configurable AssetKeys.
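
    If you have bookmarks or scripts that reference the old Dagit URL paths, the two prefix changes listed above can be applied mechanically. The sketch below is a hypothetical helper (migrate_dagit_path is not part of Dagster) that covers only the two prefixes named in these notes.

```python
# Hypothetical helper; covers only the two prefix changes described above.
_RENAMED_PREFIXES = (
    ("/instance/", "/"),            # the /instance prefix was removed
    ("/workspace/", "/locations/"),  # the /workspace prefix became /locations
)


def migrate_dagit_path(path):
    """Rewrite an old Dagit URL path to its new form; leave other paths alone."""
    for old_prefix, new_prefix in _RENAMED_PREFIXES:
        if path.startswith(old_prefix):
            return new_prefix + path[len(old_prefix):]
    return path


runs_path = migrate_dagit_path("/instance/runs")
job_path = migrate_dagit_path("/workspace/foo@bar/jobs/my_job")
```

    Here runs_path becomes /runs and job_path becomes /locations/foo@bar/jobs/my_job, matching the two examples given in the list.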

    Bugfixes

    • Fixed regression starting in 1.0.16 for some compute log managers where an exception in the compute log manager setup/teardown would cause runs to fail.
    • The S3 / GCS / Azure compute log managers now sanitize the optional prefix argument to prevent badly constructed paths.
    • [dagit] The run filter typeahead no longer surfaces key-value pairs when searching for tag:. This resolves an issue where retrieving the available tags could cause significant performance problems. Tags can still be searched with freeform text, and by adding them via click on individual run rows.
    • [dagit] Fixed an issue in the Runs tab for job snapshots, where the query would fail and no runs were shown.
    • [dagit] Schedules defined with cron unions displayed “Invalid cron string” in Dagit. This has been resolved, and human-readable versions of all members of the union will now be shown.

    Breaking Changes

    • You can no longer set an output’s asset key by overriding get_output_asset_key on the IOManager handling the output. Previously, this was experimental and undocumented.

    Experimental

    • Sensor and schedule evaluation contexts now have an experimental log property, which logs events that can later be viewed in Dagit. To enable these log views in Dagit, navigate to the user settings and enable the Experimental schedule/sensor logging view option. Log links will now be available for sensor/schedule ticks where logs were emitted. Note: this feature is not available for users using the NoOpComputeLogManager.
  • 1.1.3(Nov 23, 2022)

    Bugfixes

    • Fixed a bug with the asset reconciliation sensor that caused duplicate runs to be submitted in situations where an asset has a different partitioning than its parents.
    • Fixed a bug with the asset reconciliation sensor that caused it to error on time-partitioned assets.
    • [dagster-snowflake] Fixed a bug when materializing partitions with the Snowflake I/O manager where sql BETWEEN was used to determine the section of the table to replace. BETWEEN included values from the next partition causing the I/O manager to erroneously delete those entries.
    • [dagster-duckdb] Fixed a bug when materializing partitions with the DuckDB I/O manager where sql BETWEEN was used to determine the section of the table to replace. BETWEEN included values from the next partition causing the I/O manager to erroneously delete those entries.
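
    The BETWEEN problem in the two fixes above is easy to reproduce in any SQL engine. The following is a minimal sketch using Python's built-in sqlite3 module rather than Snowflake or DuckDB; the events table, its columns, and the dates are invented for illustration, and this is not the I/O managers' actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2022-11-01", 1), ("2022-11-02", 2), ("2022-11-03", 3)],
)

# The daily partition for 2022-11-01 covers [start, end):
# the start is included, the end belongs to the next partition.
start, end = "2022-11-01", "2022-11-02"

# BETWEEN is inclusive on both ends, so it also matches the next partition's row.
inclusive = conn.execute(
    "SELECT COUNT(*) FROM events WHERE day BETWEEN ? AND ?", (start, end)
).fetchone()[0]

# A half-open range matches only the rows that belong to this partition.
half_open = conn.execute(
    "SELECT COUNT(*) FROM events WHERE day >= ? AND day < ?", (start, end)
).fetchone()[0]

print(inclusive, half_open)
```

    The inclusive query matches two rows while the half-open query matches one, which is why a DELETE built on BETWEEN would also remove entries from the following partition.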
  • 1.1.2(Nov 19, 2022)

    Bugfixes

    • In Dagit, assets that had been materialized prior to upgrading to 1.1.1 were showing as "Stale". This is now fixed.
    • Schedules that were constructed with a list of cron strings previously rendered with an error in Dagit. This is now fixed.
    • For users running dagit version >= 1.0.17 (or dagster-cloud) with dagster version < 1.0.17, errors could occur when hitting "Materialize All" and some other asset-related interactions. This has been fixed.
  • 1.1.1(Nov 19, 2022)

    Major Changes since 1.0.0 (core) / 0.16.0 (libraries)

    Core

    • You can now create multi-dimensional partitions definitions for software-defined assets, through the MultiPartitionsDefinition API. In Dagit, you can filter and materialize certain partitions by providing ranges per-dimension, and view your materializations by dimension.
    • The new asset reconciliation sensor automatically materializes assets that have never been materialized or whose upstream assets have changed since the last time they were materialized. It works with partitioned assets too. You can construct it using build_asset_reconciliation_sensor.
    • You can now add a FreshnessPolicy to any of your software-defined assets, to specify how up-to-date you expect that asset to be. You can view the freshness status of each asset in Dagit, alert when assets are missing their targets using the @freshness_policy_sensor, and use the build_asset_reconciliation_sensor to make a sensor that automatically kicks off runs to materialize assets based on their freshness policies.
    • You can now version your asset ops and source assets to help you track which of your assets are stale. You can do this by assigning op_versions to software-defined assets or observation_fns to SourceAssets. When a set of assets is versioned in this way, their “Upstream Changed” status will be based on whether upstream versions have changed, rather than on whether upstream assets have been re-materialized. You can launch runs that materialize only stale assets.
    • The new @multi_asset_sensor decorator enables defining custom sensors that trigger based on the materializations of multiple assets. The context object supplied to the decorated function has methods to fetch latest materializations by asset key, as well as built-in cursor management to mark specific materializations as “consumed”, so that they won’t be returned in future ticks. It can also fetch materializations by partition and mark individual partitions as consumed.
    • RepositoryDefinition now exposes a load_asset_value method, which accepts an asset key and invokes the asset’s I/O manager’s load_input function to load the asset as a Python object. This can be used in notebooks to do exploratory data analysis on assets.
    • With the new asset_selection parameter on @sensor and SensorDefinition, you can now define a sensor that directly targets a selection of assets, instead of targeting a job.
    • When running dagit or dagster-daemon locally, environment variables included in a .env file in the form KEY=value in the same folder as the command will be automatically included in the environment of any Dagster code that runs, allowing you to easily use environment variables during local development.
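
    To make the KEY=value format of the .env file concrete, here is a rough sketch of how such a file could be parsed and applied to the process environment. It uses only the standard library; load_env_file and the variable names are hypothetical, and Dagster's own loader may handle quoting, comments, and precedence differently.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse KEY=value lines from a file, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


with tempfile.TemporaryDirectory() as folder:
    env_path = Path(folder) / ".env"
    env_path.write_text(
        "# local settings\n"
        "WAREHOUSE_USER=dev\n"
        "WAREHOUSE_URL=postgres://localhost/db?sslmode=disable\n"
    )
    loaded = load_env_file(env_path)

# Assumption for this sketch: variables already set in the environment win.
for key, value in loaded.items():
    os.environ.setdefault(key, value)
```

    After this runs, code that reads os.environ sees WAREHOUSE_USER and WAREHOUSE_URL unless they were already set elsewhere.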

    Dagit

    • The Asset Graph has been redesigned to make better use of color to communicate asset health. New status indicators make it easy to spot missing and stale assets (even on large graphs!) and the UI updates in real-time as displayed assets are materialized.
    • The Asset Details page has been redesigned and features a new side-by-side UI that makes it easier to inspect event metadata. A color-coded timeline on the partitions view allows you to drag-select a time range and inspect the metadata and status quickly. The new view also supports assets that have been partitioned across multiple dimensions.
    • The new Workspace page helps you quickly find and navigate between all your Dagster definitions. It’s also been re-architected to load significantly faster when you have thousands of definitions.
    • The Overview page is the new home for the live run timeline and helps you understand the status of all the jobs, schedules, sensors, and backfills across your entire deployment. The timeline is now grouped by repository and shows a run status rollup for each group.

    Integrations

    • dagster-dbt now supports generating software-defined assets from your dbt Cloud jobs.
    • dagster-airbyte and dagster-fivetran now support automatically generating assets from your ETL connections using load_assets_from_airbyte_instance and load_assets_from_fivetran_instance.
    • New dagster-duckdb integration: build_duckdb_io_manager allows you to build an I/O manager that stores and loads Pandas and PySpark DataFrames in DuckDB.

    Database migration

    • Optional database schema migration, which can be run via dagster instance migrate:
      • Improves Dagit performance by adding database indexes which should speed up the run view as well as a range of asset-based queries.
      • Enables multi-dimensional asset partitions and asset versioning.

    Breaking Changes and Deprecations

    • define_dagstermill_solid, a legacy API, has been removed from dagstermill. Use define_dagstermill_op or define_dagstermill_asset instead to create an op or asset from a Jupyter notebook, respectively.
    • The internal ComputeLogManager API is marked as deprecated in favor of an updated interface: CapturedLogManager. It will be removed in 1.2.0. This should only affect dagster instances that have implemented a custom compute log manager.

    Dependency Changes

    • dagster-graphql and dagit now use version 3 of graphene

    Since 1.0.17

    New

    • The new UPathIOManager base class is now a top-level Dagster export. This enables you to write a custom I/O manager that stores data in any filesystem supported by universal-pathlib and uses a different serialization format than pickle (Thanks Daniel Gafni!).
    • The default fs_io_manager now inherits from the UPathIOManager, which means that its base_dir can be a path on any filesystem supported by universal-pathlib (Thanks Daniel Gafni!).
    • build_asset_reconciliation_sensor now supports partitioned assets.
    • build_asset_reconciliation_sensor now launches runs to keep assets in line with their defined FreshnessPolicies.
    • The FreshnessPolicy object is now exported from the top level dagster package.
    • For assets with a FreshnessPolicy defined, their current freshness status will be rendered in the asset graph and asset details pages.
    • The AWS, GCS, and Azure compute log managers now take an additional config argument, upload_interval, which specifies, in seconds, the interval at which partial logs will be uploaded to the respective cloud storage. This can be used to display compute logs for long-running compute steps.
    • When running dagit or dagster-daemon locally, environment variables included in a .env file in the form KEY=value in the same folder as the command will be automatically included in the environment of any Dagster code that runs, allowing you to easily test environment variables during local development.
    • The new observable_source_asset decorator creates a SourceAsset with an associated observation_fn that should return a LogicalVersion, a new class that wraps a string expressing a version of an asset’s data value.
    • [dagit] The asset graph now shows branded compute_kind tags for dbt, Airbyte, Fivetran, Python and more.
    • [dagit] The asset details page now features a redesigned event viewer, and separate tabs for Partitions, Events, and Plots. This UI was previously behind a feature flag and is now generally available.
    • [dagit] The asset graph UI has been revamped and makes better use of color to communicate asset status, especially in the zoomed-out view.
    • [dagit] The asset catalog now shows freshness policies in the “Latest Run” column when they are defined on your assets.
    • [dagit] The UI for launching backfills in Dagit has been simplified. Rather than selecting detailed ranges, the new UI allows you to select a large “range of interest” and materialize only the partitions of certain statuses within that range.
    • [dagit] The partitions page of asset jobs has been updated to show per-asset status rather than per-op status, so that it shares the same terminology and color coding as other asset health views.
    • [dagster-k8s] Added an execute_k8s_job function that can be called within any op to run an image within a Kubernetes job. The implementation is similar to the built-in k8s_job_op, but allows additional customization - for example, you can incorporate the output of a previous op into the launched Kubernetes job by passing it into execute_k8s_job. See the dagster-k8s API docs for more information.
    • [dagster-databricks] Environment variables used by dagster cloud are now automatically set when submitting databricks jobs if they exist, thank you @zyd14!
    • [dagstermill] define_dagstermill_asset now supports RetryPolicy. Thanks @nickvazz!
    • [dagster-airbyte] When loading assets from an Airbyte instance using load_assets_from_airbyte_instance, users can now optionally customize asset names using connector_to_asset_key_fn.
    • [dagster-fivetran] When loading assets from a Fivetran instance using load_assets_from_fivetran_instance, users can now alter the IO manager using io_manager_key or connector_to_io_manager_key_fn, and customize asset names using connector_to_asset_key_fn.

    Bugfixes

    • Fixed a bug where terminating runs from a backfill would fail without notice.
    • When executing a subset of ops within a job that specifies its config value directly on the job, Dagster no longer attempts to use that config value as the default. The default is still presented in the editable interface in Dagit.
    • [dagit] The partition step run matrix now reflects historical step status instead of just the last run’s step status for a particular partition.

  • 1.0.17(Nov 10, 2022)

    New

    • With the new asset_selection parameter on @sensor and SensorDefinition, you can now define a sensor that directly targets a selection of assets, instead of targeting a job.
    • materialize and materialize_to_memory now accept a raise_on_error argument, which allows you to determine whether to raise an Error if the run hits an error or just return as failed.
    • (experimental) Dagster now supports multi-dimensional asset partitions, through a new MultiPartitionsDefinition object. An optional schema migration enables support for this feature (run via dagster instance migrate). Users who are not using this feature do not need to run the migration.
    • You can now launch a run that targets a range of asset partitions, by supplying the "dagster/asset_partition_range_start" and "dagster/asset_partition_range_end" tags.
    • [dagit] Asset and op graphs in Dagit now show integration logos, making it easier to identify assets backed by notebooks, dbt, Airbyte, and more.
    • [dagit] A --db-pool-recycle CLI flag (and dbPoolRecycle Helm option) has been added to control how long a pooled connection used by Dagit persists before it is recycled. The default of 1 hour is now respected by Postgres (MySQL previously had a hard-coded 1-hour setting). Thanks @adam-bloom!
    • [dagster-airbyte] Introduced the ability to specify output IO managers when using load_assets_from_airbyte_instance and load_assets_from_airbyte_project.
    • [dagster-dbt] the dbt_cloud_resource resource configuration account_id can now be sourced from the environment. Thanks @sowusu-ba!
    • [dagster-duckdb] DuckDB integration improvements: PySpark DataFrames are now fully supported, “schema” can be specified via I/O manager config, and the API documentation has been improved to include more examples.
    • [dagster-fivetran] Introduced experimental load_assets_from_fivetran_instance helper which automatically pulls assets from a Fivetran instance.
    • [dagster-k8s] Fixed an issue where setting the securityContext configuration of the Dagit pod in the Helm chart didn’t apply to one of its containers. Thanks @jblawatt!
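
    For the partition-range tags mentioned in this list, a small helper can keep the two tag names in one place. The tag keys below are taken from the release note; the helper partition_range_tags and the example partition keys are hypothetical, and attaching the tags to an actual run is not shown.

```python
# Tag keys as named in the release note above.
RANGE_START_TAG = "dagster/asset_partition_range_start"
RANGE_END_TAG = "dagster/asset_partition_range_end"


def partition_range_tags(start_key, end_key):
    """Build the run tags that describe a range of asset partitions."""
    return {RANGE_START_TAG: start_key, RANGE_END_TAG: end_key}


# Example: target the daily partitions from Nov 1 through Nov 7, 2022.
tags = partition_range_tags("2022-11-01", "2022-11-07")
```

    The resulting dictionary can then be supplied wherever run tags are accepted when launching the run.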

    Bugfixes

    • Fixed a bug that caused the asset_selection parameter of RunRequest to not be respected when used inside a schedule.
    • Fixed a bug with health checks during delayed Op retries with the k8s_executor and docker_executor.
    • [dagit] The asset graph now live-updates when assets fail to materialize due to op failures.
    • [dagit] The "Materialize" button now respects the backfill permission for multi-run materializations.
    • [dagit] Materializations without metadata are padded correctly in the run logs.
    • [dagster-aws] Fixed an issue where setting the value of task_definition field in the EcsRunLauncher to an environment variable stopped working.
    • [dagster-dbt] Added exposures in load_assets_from_dbt_manifest. This fixes an error where load_assets_from_dbt_manifest failed to load from a dbt manifest that contains exposures. Thanks @sowusu-ba!
    • [dagster-duckdb] In some examples, the duckdb config was incorrectly specified. This has been fixed.

    Breaking Changes

    • The behavior of the experimental asset reconciliation sensor, which is accessible via build_asset_reconciliation_sensor, has changed to be more focused on reconciliation. It now materializes assets that have never been materialized before and avoids materializing assets that are “Upstream changed”. The build_asset_reconciliation_sensor API no longer accepts wait_for_in_progress_runs and wait_for_all_upstream arguments.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.16...1.0.17

    See All Contributors
    • 7e9d8f6 - [jog] host_representation_tests, instance_tests, selector_tests (#10256) by @dpeng817
    • ec07488 - [jog] resource_tests (#10257) by @dpeng817
    • 8e44cf5 - [dagstermill] notebook backed assets (#10277) by @jamiedemaria
    • ce52573 - notebook and assets example project (#10315) by @jamiedemaria
    • 10187df - [dagster-airbyte][managed-elements] Explicit handling of reconciling secret values (#10195) by @benpankow
    • faece2a - [duckdb] integration improvements (#10114) by @jamiedemaria
    • b094d26 - Fix Helm package skipping (#10313) by @jmsanders
    • 0500534 - Remove special casing for graphql tests (#10327) by @jmsanders
    • 0c89026 - Add client ID to dagit (#10316) by @dpeng817
    • 2cc3545 - [dagster-airflow] add airflow 2 support to make_dagster_job_from_airflow_dag + xcom mock option (#10337) by @Ramshackle-Jamathon
    • a403cfc - 1.0.16 changelog (#10340) by @alangenfeld
    • 3bbca03 - disentangle asset reconciliation sensor from MultiAssetSensorDefinition (#10258) by @sryza
    • 1130ec0 - Mock tqdm in tests to avoid segfaults (#10343) by @jmsanders
    • fa4a445 - Automation: versioned docs for 1.0.16 by @elementl-devtools
    • 3b07b06 - update 1.0.16 changelog (#10346) by @yuhan
    • 8b44cda - Better fix for tooltips around disabled buttons + gitlab icon (#10338) by @salazarm
    • 08f4e2d - Back-compat fix for setting the task definition in the EcsRunLauncher to an env var (#10341) by @gibsondan
    • 8289da9 - Mock each callsite of tqdm (#10368) by @gibsondan
    • 15f1b09 - Build integration steps before test project steps (#10367) by @jmsanders
    • 7fa824b - Packages with test changes only shouldn't run deps (#10309) by @jmsanders
    • a8d457a - Remove test-project-core (#10310) by @jmsanders
    • 369fb96 - chore: add project urls to dagster pypi (#10328) by @rexledesma
    • 5690af5 - fix: add comma (#10373) by @rexledesma
    • 22cf4db - add securityContext to dagit init-user-deployment initContainers (#10369) by @gibsondan
    • adf924f - JB - Docs Typography Follow-ups (#10344) by @braunjj
    • 49081bd - [dagit] Add more logos to tags (#10355) by @jamiedemaria
    • 713d873 - [Multi-dimensional partitions 1/n] Add schema migration for asset event tags table (#10001) by @clairelin135
    • 0244116 - Revert "Mock each callsite of tqdm" (#10379) by @jmsanders
    • a34ffc5 - [Multi-dimensional partitions 2/n] Read/write from asset event tags table (#10056) by @clairelin135
    • a8f8be0 - Prevent segfaults (#10372) by @jmsanders
    • d709c11 - [apidocs] clarify usage of create_databricks_job_op (#10299) by @sryza
    • cefab82 - Mark celery-k8s-test-suite as flaky (#10204) by @jmsanders
    • 6d1429c - [Multi-dimensional partitions 3/n] Output multi-dimensional materializations (#10082) by @clairelin135
    • 454a570 - fix duckdb configs in examples (#10378) by @jamiedemaria
    • 79d1f12 - Revert "Mark celery-k8s-test-suite as flaky" (#10384) by @jmsanders
    • b4ccac3 - Unpin grpc (#10386) by @gibsondan
    • 22d94cd - some refactorings to asset reconciliation sensor (#10233) by @sryza
    • 3065bad - fix mysql heading level + typo (#10336) by @domsj
    • 620d6cf - [dagster-fivetran] Enable loading of Fivetran connection assets from instance (#10290) by @benpankow
    • edcdcba - Resolver -> AssetGraph (#10401) by @sryza
    • a35654b - Enable accessing base jobs off of repository definitions (#10405) by @sryza
    • ef4a4bb - add pyrightconfig.json to .gitignore (#10403) by @sryza
    • e9fcb96 - [dagster-dbt] add exposures in load_assets_from_dbt_manifest (#10395) by @sowusu-ba
    • 339d14c - operate on keys instead of strings in AssetSelection and AssetGraph (#10402) by @sryza
    • 9607192 - Updated "Loaded with Errors" link (#10421) by @gibsondan
    • 7c205d5 - [dagster-io/ui] Add stroke color to TextArea (#10422) by @hellendag
    • 4900d22 - Install project-fully-featured (#10419) by @jmsanders
    • 353ed2a - More process cleanup in grpc tests (#10423) by @gibsondan
    • b6aea90 - Add methods to AssetGraph for traversing asset partitions graph (#10413) by @sryza
    • 29b24f1 - [dagit] Update partner compute_kind tags to latest designs (#10394) by @bengotow
    • 4b207e5 - [dagit] Fix materialization log padding when no metadata is present (#10350) by @bengotow
    • cd5af37 - [dagit] Respect backfill permission in asset backfill modal (#10387) by @bengotow
    • 174a958 - [freshness-policy] add latency freshness policy (#10393) by @OwenKephart
    • 50026a1 - Revert "Unpin grpc (#10386)" (#10428) by @gibsondan
    • 44c9ddb - support RunRequest.asset_selection in schedules (#10400) by @sryza
    • 4df96a7 - support raise_on_error in materialize and materialize_to_memory (#10397) by @sryza
    • 42c47a8 - docs(cacheable-assets): mark unimplemented methods as abstract (#10424) by @rexledesma
    • c694526 - [easy] fix incorrect header in docker guide (#10427) by @gibsondan
    • 36a8383 - [dagit] Fix bug in asset live reloading after step failures, clean up AssetView (#10349) by @bengotow
    • 3f75ae6 - allow sensors to target asset selections instead of jobs (#10417) by @sryza
    • 1365944 - Implement abstract methods (#10434) by @jmsanders
    • 46e41ad - [graphql] handle null int metadata val (#10436) by @alangenfeld
    • dbbfffb - [jog] convert runtime_types_tests (#10317) by @dpeng817
    • bbe1378 - [jog] convert core_test/selector_tests (#10318) by @dpeng817
    • 649ea0e - [easy][dagster-airbyte] Treat SSH key as secret for managed stack (#10390) by @benpankow
    • dba7b8b - [jog] Convert snap tests (#10319) by @dpeng817
    • 9e28855 - Expose sqlalchemy pool recycle option for mysql/postgres (#10416) by @adam-bloom
    • 1e7e9a7 - cached_method decorator (#10398) by @sryza
    • 7f6a458 - Enable action logging in dagit (#10342) by @dpeng817
    • 1e9af55 - Temporarily skip celery-k8s tests (#10438) by @jmsanders
    • b6237b6 - Run builds on changes to .txt files (#10433) by @jmsanders
    • ff77d5e - Hook for loading secrets in grpc/run/step entry points (#10089) by @gibsondan
    • f03a491 - Read account_id from an environment variable in dbt_cloud_resource (#10324) by @sowusu-ba
    • 7bfe9f1 - Add container_name to EcsContainerContext (#10446) by @gibsondan
    • 15f0f2d - Make asset reconciliation sensor more reconcile-y (#10435) by @sryza
    • 198d234 - [dagster-airbyte] Generate src/dst Python classes for managed stacks (#10272) by @benpankow
    • 28739e8 - [docs] - Guides for configuring environment variables (#10034) by @erinkcochran87
    • e579a21 - Revert "specify schema in pandas to_sql" (#10450) by @jamiedemaria
    • 79eaa72 - feat: bind command-K to search (#10449) by @rexledesma
    • 91487d7 - [dagster-airbyte] Add ability to specify output IO managers (#10217) by @benpankow
    • 7d0cc3e - [dagster-airbyte] When loading managed connections, ignore those which are not managed (#10388) by @benpankow
    • efab894 - enable launching a run that targets a range of asset partitions (#10441) by @sryza
    • 356f768 - chore: remove support for github dark mode (#10329) by @rexledesma
    • f7da04d - [dagit] Clarify notebook buttons (#10451) by @jamiedemaria
    • 2a3f8f2 - [dagstermill] allow custom io manager key specification (#10448) by @jamiedemaria
    • 87c8389 - [dagster-airbyte][docs] Reorganize Airbyte guide (#10215) by @benpankow
    • 00e4491 - Fix check_step_health with delayed op retries (#10458) by @johannkm
    • fd6411f - changelog 1.0.17 (#10464) by @yuhan
    • 4c51207 - 1.0.17 by @elementl-devtools
  • 1.0.16(Nov 3, 2022)

    New

    • [dagit] The new Overview and Workspace pages have been enabled for all users, after being gated with a feature flag for the last several releases. These changes include design updates, virtualized tables, and more performant querying.
      • The top navigation has been updated to improve space allocation, with main nav links moved to the left.
      • “Overview” is the new Dagit home page and “factory floor” view, where you can find the run timeline, which now offers time-based pagination. The Overview section also contains pages with all of your jobs, schedules, sensors, and backfills. You can filter objects by name, and collapse or expand repository sections.
      • “Workspace” has been redesigned to offer a better summary of your repositories, and to use the same performant table views, querying, and filtering as in the Overview pages.
    • @asset and @multi_asset now accept a retry_policy argument. (Thanks Adam Bloom!)
    • When loading an input that depends on multiple partitions of an upstream asset, the fs_io_manager will now return a dictionary that maps partition keys to the stored values for those partitions. (Thanks @andrewgryan!)
    • JobDefinition.execute_in_process now accepts a run_config argument even when the job is partitioned. If supplied, the run config will be used instead of any config provided by the job’s PartitionedConfig.
    • The run_request_for_partition method on jobs now accepts a run_config argument. If supplied, the run config will be used instead of any config provided by the job’s PartitionedConfig.
    • The new NotebookMetadataValue can be used to report the location of executed jupyter notebooks, and Dagit will be able to render the notebook.
    • Resolving asset dependencies within a group now works with multi-assets, as long as all the assets within the multi-asset are in the same group. (Thanks @peay!)
    • UPathIOManager, a filesystem-agnostic IOManager base class, has been added - (Thanks @danielgafni!)
    • A threadpool option has been added for the scheduler daemon. This can be enabled via your dagster.yaml file; check out the docs.
    • The default LocalComputeLogManager will capture compute logs by process instead of by step. This means that for the in_process executor, where all steps are executed in the same process, the compute logs for all steps in a run will be written to the same file.
    • [dagster-airflow] make_dagster_job_from_airflow_dag now supports Airflow 2. There is also a new mock_xcom parameter that mocks all calls made by operators to XCom.
    • [helm] volume and volumeMount sections have been added for the dagit and daemon sections of the helm chart.
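
    The fs_io_manager change above (loading several upstream partitions as one dictionary) can be pictured with a small standard-library sketch. This is not Dagster's implementation: the on-disk layout and the helper names store_partition and load_partitions are assumptions made purely for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def store_partition(base_dir: Path, asset_name: str, partition_key: str, value) -> None:
    """Pickle one partition's value to <base_dir>/<asset_name>/<partition_key> (hypothetical layout)."""
    path = base_dir / asset_name / partition_key
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(value, f)


def load_partitions(base_dir: Path, asset_name: str, partition_keys) -> dict:
    """Return a dict mapping each requested partition key to its stored value."""
    result = {}
    for key in partition_keys:
        with (base_dir / asset_name / key).open("rb") as f:
            result[key] = pickle.load(f)
    return result


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    store_partition(base, "daily_events", "2022-11-01", [1, 2])
    store_partition(base, "daily_events", "2022-11-02", [3])
    # A downstream input that depends on both days receives one dictionary.
    loaded = load_partitions(base, "daily_events", ["2022-11-01", "2022-11-02"])
```

    The point of the dictionary shape is that downstream code can tell which value came from which partition instead of receiving an unlabeled list.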

    Bugfixes

    • For partitioned asset jobs whose config is a hardcoded dictionary (rather than a PartitionedConfig), previously run_request_for_partition would produce a run with no config. Now, the run has the hardcoded dictionary as its config.
    • Previously, asset inputs would be resolved to upstream assets in the same group that had the same name, even if the asset input already had a key prefix. Now, asset inputs are only resolved to upstream assets in the same group if the input path only has a single component.
    • Previously, asset inputs could get resolved to outputs of the same AssetsDefinition, through group-based asset dependency resolution, which would later error because of a circular dependency. This has been fixed.
    • Previously, the “Partition Status” and “Backfill Status” fields on the Backfill page in dagit were always incomplete and showed missing partitions. This has been fixed to accurately show the status of the backfill runs.
    • [dagit] When viewing the config dialog for a run with a very long config, scrolling was broken and the “copy” button was not visible. This has been fixed.
    • Executors now compress step worker arguments to avoid CLI length limits with large DAGs.
    • [dagster-msteams] Longer messages can now be used in Teams HeroCard - (Thanks @jayhale!)
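
    To illustrate why compressing step worker arguments helps with CLI length limits, here is a minimal sketch using only the standard library. The encoding shown (JSON, then zlib, then base64) and the names pack_args and unpack_args are assumptions for illustration; the release note does not say which scheme Dagster uses.

```python
import base64
import json
import zlib


def pack_args(args: dict) -> str:
    """Serialize args to JSON, compress them, and encode as CLI-safe ASCII text."""
    raw = json.dumps(args).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def unpack_args(packed: str) -> dict:
    """Reverse pack_args: decode, decompress, and parse the JSON payload."""
    raw = zlib.decompress(base64.b64decode(packed))
    return json.loads(raw.decode("utf-8"))


# A large DAG produces a long, repetitive list of step keys.
args = {"run_id": "abc123", "step_keys": [f"step_{i}" for i in range(500)]}
packed = pack_args(args)
restored = unpack_args(packed)
```

    Because step keys in a large DAG tend to be repetitive, the packed string is much shorter than the plain JSON, while the worker can still recover the exact arguments.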

    Documentation

    • API docs for InputContext have been improved - (Thanks @peay!)
    • [dagster-snowflake] Improved documentation for the Snowflake IO manager

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.15...1.0.16

    See All Contributors
    • ee39fcd - rm all_types from config_context (#10203) by @alangenfeld
    • bad2e44 - Also skip docs next (#10178) by @jmsanders
    • bd6e269 - Make handleLaunchResult agnostic to the query that returned the data (#10179) by @salazarm
    • b5212c7 - Make PythonPackages.get() more flexible to _/- (#10184) by @jmsanders
    • 16bc4cf - [dagit] Invert stored state for expand/collapse in Overview pages (#10212) by @hellendag
    • af58bf9 - [config] memoize ConfigType snap creation (#10210) by @alangenfeld
    • 2eebc3c - [config] avoid double init on field cache objects (#10214) by @alangenfeld
    • fcda5f3 - Override default io manager in more places (#10202) by @johannkm
    • 59ae17e - Fix black (#10216) by @johannkm
    • d4334f1 - type annotations on backfill-related code paths (#9402) by @sryza
    • b56013e - Fix lint (#10218) by @johannkm
    • 3176dcc - Add docs for customizing the serverless base image (#9571) by @petehunt
    • a24298a - Skip test-project image builds (#10099) by @jmsanders
    • f7c058a - [dagit] Updated asset details event view (#10143) by @bengotow
    • 4ef9d27 - [dagit] Fix AssetView.test key warnings (#10226) by @hellendag
    • 2bbf998 - Also trigger builds when tox.ini changes (#10209) by @jmsanders
    • c1ff910 - 1.0.15 changelog (#10227) by @jamiedemaria
    • 688a5d8 - Automation: versioned docs for 1.0.15 by @elementl-devtools
    • 0478723 - add materialization property on dagster event (#10230) by @prha
    • cf8ad4a - Parse HEAD commit in addition to BUILDKITE_MESSAGE (#10208) by @jmsanders
    • 58ee17c - Allow default io_manager load_input to support partitions of differing frequencies (#10172) by @andrewgryan
    • 67313e8 - Install the correct dagster-buildkite CLI (#10211) by @jmsanders
    • cf84896 - Type annotations in dagster-graphql (#10005) by @smackesey
    • 174dd99 - fix black, mypy (#10235) by @prha
    • 3c89b70 - [freshness-policy] [1/n] FreshnessPolicy object (#10024) by @OwenKephart
    • 281595b - Break up sensor_definition.py (#10181) by @sryza
    • ffa01e1 - Highlight config entry when being hovered in yaml editor (#10239) by @salazarm
    • a5f0fa9 - Resolve multi-asset deps when they have the same group (#10222) by @peay
    • 77faa3c - update apidoc for postgres (#10241) by @prha
    • 35b8f8b - fix schedules threading config (#10247) by @alangenfeld
    • 8296cf7 - [dagit] Ship Overview/Workspace (#10245) by @hellendag
    • 24af700 - Fix stray references to define_assets_job (#10199) by @bengotow
    • 631c3ef - captured log manager (#9429) by @prha
    • 65b47a0 - Allow for longer messages in Teams HeroCard (#10234) by @jayhale
    • ee43f93 - [dagit] Clean up code post-Overview changes (#10249) by @hellendag
    • a097c43 - [dagit] Move the refresh indicator on asset details pages to avoid flicker (#10253) by @bengotow
    • ce7afb7 - [dagit] Fix issues with embedding fonts in downloaded DAG SVGs (#10252) by @bengotow
    • 37f2c95 - [dagit] Fix daylight savings issue in humanCronString.test (#10263) by @hellendag
    • 4ca8da0 - [dagit] Make 50-level colors opaque (#10186) by @hellendag
    • 997fdbb - [dagit] Add query countdown/refresh to Timeline (#10250) by @hellendag
    • da7555c - Support overriding run config for partitions with execute_in_process (#10246) by @sryza
    • a2bb0cf - Add calendar icon (#10267) by @salazarm
    • 5de98f5 - UPathIOManager - filesystem-agnostic IOManager base (#10193) by @danielgafni
    • 40bf6c7 - rename logKey to fileKey / logFileKey for disambiguation with new API (#9956) by @prha
    • dc667ef - fix some cases in group-based asset dep resolution (#10266) by @sryza
    • ef678e6 - dagster-io/ui release notes (#10269) by @salazarm
    • 966bfb7 - fix UnresolvedAssetJobDefinition.run_request_for_partition when confi… (#10238) by @sryza
    • 36f7c38 - Improve doc for InputContext.{dagster_type,metadata} (#10242) by @peay
    • 69bfd0c - Fetch asset materialization planned event from index shard in Sqlite (#10248) by @clairelin135
    • cb62bc4 - fix #9193: add retry policy to @asset and @multi_asset (#10150) by @adam-bloom
    • ca18259 - adds dagster-aws install to dockerfile in docker deployment guide (#10225) by @jamiedemaria
    • e6444be - keep storage name for serialized event backcompat (#10280) by @prha
    • 16b3cc3 - graphql for captured log subscription (#9957) by @prha
    • 9ccfe7c - Add NotebookMetadataValue (#10278) by @jamiedemaria
    • b88f309 - Add volumes and volumeMounts to dagit and daemon in OSS helm chart (#10285) by @gibsondan
    • 1830880 - support new captured log API for process-based execution (#9958) by @prha
    • a0a1340 - add new capture APIs in frontend queries (#9959) by @prha
    • 621b57a - specify schema in pandas to_sql (#10289) by @jamiedemaria
    • 7dd241f - [dagit] Fix AssetView flakiness (#10293) by @hellendag
    • fc6ea04 - [dagit] Middle-truncate asset key path in virtualized list (#10275) by @hellendag
    • b4c0930 - Snowflake IO Manager API docs (#10175) by @jamiedemaria
    • 2b3280f - Compress execute_step args (#10244) by @johannkm
    • 1dd602a - fix black formatting (#10298) by @prha
    • 9a3b5bc - fix backfill table status (#10295) by @prha
    • f7f1e13 - run_config argument on run_request_for_partition (#10279) by @sryza
    • 0e5386b - [dagit] NotebookMetadataValue support (#10287) by @jamiedemaria
    • f966490 - [dagit] Fix very tall configs in run config dialog (#10301) by @hellendag
    • f466eb7 - Observable source asset decorator (#9899) by @smackesey
    • 70c5f8d - convert NoOpComputeLogManager to support captured log API (#9960) by @prha
    • 4df6e55 - convert S3 compute log manager to support new captured log API (#9961) by @prha
    • c8dee2d - Test createdBefore run filter (#10270) by @benpankow
    • 7c38a8c - fix build; isort; black (#10303) by @prha
    • 082b407 - Fix example scaffold (#10003) by @smackesey
    • 6459b0f - Add version to asset decorator (#10167) by @smackesey
    • 3f99335 - Fix logic to run lints on builds (#10304) by @jmsanders
    • 98fe688 - Don't repeat skip logs (#10307) by @jmsanders
    • 0fc9f84 - [jog] Add a utility method to execute op inside of graph (#10255) by @dpeng817
    • 6b333d5 - [jog] execution_tests,hook_tests (#10254) by @dpeng817
    • 1283e63 - Revert "convert S3 compute log manager to support new captured log API (#9961)" (#10311) by @prha
    • 834abba - [docs] Automatically toggle tab components to display URL hash/anchor (#10231) by @benpankow
    • 5b0d959 - [dagstermill] notebook backed assets (#10277) by @jamiedemaria
    • 345f0b3 - notebook and assets example project (#10315) by @jamiedemaria
    • 9157ebb - [dagster-airbyte][managed-elements] Explicit handling of reconciling secret values (#10195) by @benpankow
    • 251b666 - [dagster-airflow] add airflow 2 support to make_dagster_job_from_airflow_dag + xcom mock option (#10337) by @Ramshackle-Jamathon
    • 89948a3 - 1.0.16 changelog (#10340) by @alangenfeld
    • afb20aa - 1.0.16 by @elementl-devtools
  • 1.0.15(Oct 27, 2022)

    New

    • [dagit] The run timeline now shows all future schedule ticks for the visible time window, not just the next ten ticks.
    • [dagit] Asset graph views in Dagit refresh as materialization events arrive, making it easier to watch your assets update in real-time.
    • [dagster-airbyte] Added support for basic auth login to the Airbyte resource.
    • Configuring a Python Log Level will now also apply to system logs created by Dagster during a run.

    Bugfixes

    • Fixed a bug that broke asset partition mappings when using the key_prefix with methods like load_assets_from_modules.
    • [dagster-dbt] When running dbt Cloud jobs with the dbt_cloud_run_op, the op would emit a failure if the targeted job did not create a run_results.json artifact, even if this was the expected behavior. This has been fixed.
    • Improved performance by adding database indexes which should speed up the run view as well as a range of asset-based queries. These migrations can be applied by running dagster instance migrate.
    • An issue that would cause schedule/sensor latency in the daemon during workspace refreshes has been resolved.
    • [dagit] Shift-clicking Materialize for partitioned assets now shows the asset launchpad, allowing you to launch execution of a partition with config.

    Community Contributions

    • Fixed a bug where asset keys with - were not being properly sanitized in some situations. Thanks @peay!
    • [dagster-airbyte] A list of connection directories can now be specified in load_assets_from_airbyte_project. Thanks @adam-bloom!
    • [dagster-gcp] Dagster will now retry connecting to GCS if it gets a ServiceUnavailable error. Thanks @cavila-evoliq!
    • [dagster-postgres] A SQLAlchemy engine is now used instead of psycopg2 when subscribing to PostgreSQL events. Thanks @peay!

    Experimental

    • [dagster-dbt] Added a display_raw_sql flag to the dbt asset loading functions. If set to False, this will remove the raw sql blobs from the asset descriptions. For large dbt projects, this can significantly reduce the size of the generated workspace snapshots.
    • [dagit] A “New asset detail pages” feature flag available in Dagit’s settings allows you to preview some upcoming changes to the way historical materializations and partitions are viewed.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.14...1.0.15

    See All Contributors
    • 899e8d5 - [dagit] Add actions to run timeline empty state (#10092) by @hellendag
    • 2e8fae8 - fix future ticks cursor when no ticks are available (#10095) by @prha
    • 2d8ada6 - Revert "Make the default_job_io_manager overridable via env (#9950)" by @johannkm
    • 80f1022 - Add helm vals for scheduler threading (#10094) by @dpeng817
    • e0b61df - [dagit] Shift-clicking Materialize should show asset launchpad even if assets are partitioned (#10047) by @bengotow
    • bcf581a - Remove references to services in docker-compose down command test fixture (#10042) by @dpeng817
    • 426fc52 - 1.0.14 changelog (#10106) by @dpeng817
    • 4a22317 - 1.0.14 changelog addendum (#10107) by @gibsondan
    • 2af0b8a - [dagit] Show all future ticks in the timeline window (#10097) by @hellendag
    • e8d8525 - [dagster-postgres] Use SQLAlchemy engine in pynotify instead of psycopg2 directly (#10090) by @peay
    • c76d049 - Jb/Docs UI Cleanup (#10050) by @braunjj
    • a0489f9 - [mypy] fixes for uvicorn types (#10108) by @alangenfeld
    • ec3c509 - Revert "Revert "Make the default_job_io_manager overridable via env (#9950)"" by @johannkm
    • 07c02c5 - [dagit] Use gray tags in repo headers (#10078) by @hellendag
    • 927054f - Run make mdx-format (#10113) by @jmsanders
    • 565105a - Upgrade kind (#9992) by @jmsanders
    • a06a8db - rebuild event log indices to include id (#10105) by @prha
    • 05a8aa0 - [dagit] Fix repo bucket sorting in virtualized tables (#10101) by @hellendag
    • bd690a6 - Automation: versioned docs for 1.0.14 by @elementl-devtools
    • 288f692 - docs(slack-alert-policies): add slack bot name to setup instructions (#10116) by @rexledesma
    • ff07bf9 - [dagit] Some typographic and spacing tweaks on virtualized tables (#10118) by @hellendag
    • 328f4a1 - Don't gate on elementl emails (#10119) by @jmsanders
    • 71e0129 - Skip docs changes (#10030) by @jmsanders
    • bc284fc - [docs] - improve partitioned asset examples on partitions concepts page (#10084) by @sryza
    • ab42f31 - Walk the Python dependency tree (#10096) by @jmsanders
    • 54dbe2d - [dagit] Add some right padding to clearable text input (#10129) by @hellendag
    • 76c9674 - [dagster-airbyte] Docs for how to use basic auth in current release (#10131) by @benpankow
    • 876dfab - Cache docstring check (#10127) by @schrockn
    • a8e67fc - [dagster-airbyte] Docs fix using with_resources with cacheable asset (#10132) by @benpankow
    • 6614290 - Retry on Service Unavailable (#10110) by @cavila-evoliq
    • a90ea64 - Make AssetKey.__eq__ more efficient (#10139) by @schrockn
    • 6af2ba9 - [docs formatter] Drop mypy ignore lines in docs (#10138) by @benpankow
    • 0772a12 - [refactor] take logic for base asset jobs out of AssetGroup (#10071) by @sryza
    • 9df49da - [dagit] Subscribe to in progress asset runs, refresh on relevant events (#10028) by @bengotow
    • 38f5291 - docs: add h1 header to Dagster readme (#10156) by @rexledesma
    • 3a8f4e1 - docs: update @dagsterio -> @dagster (#10158) by @rexledesma
    • 85cc0ca - Use instance.python_log_level when determining default system logger level (#10073) by @gibsondan
    • a20c2b7 - in hacker news demo, fix column filtering with partitions and snowfla… (#10141) by @sryza
    • 30c0751 - Allow overriding svg with color (#10163) by @salazarm
    • fb2b580 - [dagster-airbyte] Add top-level support for basic auth username & password (#10130) by @benpankow
    • c9dca01 - improve api doc examples for cloud object store IO managers (#10153) by @sryza
    • eeb6c8f - switch order of assets and jobs in partitions doc (#10142) by @sryza
    • 99f3579 - Fix AssetsDefinition.with_prefix_or_group to update partition mappings (#10164) by @peay
    • 05a08b8 - Fix assets with dashes in their path and io_manager_def (#10087) by @peay
    • 9694525 - Refactor db io manager to core (#10128) by @jamiedemaria
    • 0f2e30c - docs design: fix icons. make icon changes backcompat (#10174) by @yuhan
    • bf3e941 - Support multi-repo builds (#10135) by @jmsanders
    • cb8c7d6 - [docs] Out -> AssetOut docs tweak (#10146) by @benpankow
    • ab0e946 - Conditionally skip dagit builds (#10165) by @jmsanders
    • 0556f09 - [dagster-managed-elements] Dagster-managed-elements CLI, APIs (#10011) by @benpankow
    • ab92934 - EventLogRecord.partition_key and EventLogRecord.asset_key (#10180) by @sryza
    • 01edf1d - [dagster-airbyte] Airbyte managed elements impl (#10013) by @benpankow
    • a4b045d - [dagster-airbyte] allow filtering connections by directory names (#10151) by @adam-bloom
    • 0781a28 - make_airflow_dag airflow 2 compatibility (#10115) by @Ramshackle-Jamathon
    • 60e8f6f - [dagit] Bump TypeScript (#10133) by @hellendag
    • 8bfb2e3 - [@dagster-io/ui] Release v1.0.6 (#10188) by @hellendag
    • c08c0d9 - [dagster-dbt] allow missing run_results.json (#10187) by @OwenKephart
    • 5e74834 - docs(dagster-cloud-agent): update IAM role link (#10189) by @rexledesma
    • 32c316d - [dagit] Expand run config dialog (#10173) by @hellendag
    • 767946a - Replace custom in-memory logic with in-memory sqlite connection (#10154) by @prha
    • e4a2357 - add cpu and memory to DagsterEcsTaskDefinitionConfig (#10198) by @gibsondan
    • 4938f9b - Allow silencing failures with default io manager override (#10067) by @johannkm
    • 5e32b41 - [dagster-airbyte] Managed elements typo (#10196) by @benpankow
    • 385c28a - [dagit] Consolidate expand/collapse state in Overview tables (#10194) by @hellendag
    • 4bd995f - reduce locking in workspace reload (#10192) by @alangenfeld
    • 6d0f696 - [dagster-dbt] add flag to make asset snapshots smaller (#10213) by @OwenKephart
    • 63d00fd - Extend default timeout (#10205) by @jmsanders
    • 886da61 - [dagster-airbyte] Fix recreating source/destination logic (#10197) by @benpankow
    • a5db43e - Revert 4938f9beb9 and ec3c5099c2 by @johannkm
    • 6bdd409 - Make handleLaunchResult agnostic to the query that returned the data (#10179) by @salazarm
    • 4caf010 - rm all_types from config_context (#10203) by @alangenfeld
    • 8e2aec7 - [config] memoize ConfigType snap creation (#10210) by @alangenfeld
    • 3f8002d - [config] avoid double init on field cache objects (#10214) by @alangenfeld
    • b8b973c - Fix black (#10216) by @johannkm
    • 5f6a5ea - [dagit] Invert stored state for expand/collapse in Overview pages (#10212) by @hellendag
    • cb3bb61 - [dagit] Updated asset details event view (#10143) by @bengotow
    • 124098c - 1.0.15 changelog (#10227) by @jamiedemaria
    • f816bb9 - 1.0.15 by @elementl-devtools
  • 1.0.14(Oct 20, 2022)

    New

    • Tags can now be provided to an asset reconciliation sensor and will be applied to all RunRequests returned by the sensor.
    • If you don’t explicitly specify a DagsterType on a graph input, but all the inner inputs that the graph input maps to have the same DagsterType, the graph input’s DagsterType will be set to the DagsterType of the inner inputs.
    • [dagster-airbyte] load_assets_from_airbyte_project now caches the project data generated at repo load time so it does not have to be regenerated in subprocesses.
    • [dagster-airbyte] Output table schema metadata is now generated at asset definition time when using load_assets_from_airbyte_instance or load_assets_from_airbyte_project.
    • [dagit] The run timeline now groups all jobs by repository. You can collapse or expand each repository in this view by clicking the repository name. This state will be preserved locally. You can also hold Shift while clicking the repository name, and all repository groups will be collapsed or expanded accordingly.
    • [dagit] In the launchpad view, a “Remove all” button is now available once you have accrued three or more tabs for that job, to make it easier to clear stale configuration tabs from view.
    • [dagit] When scrolling through the asset catalog, the toolbar is now sticky. This makes it simpler to select multiple assets and materialize them without requiring you to scroll back to the top of the page.
    • [dagit] A “Materialize” option has been added to the action menu on individual rows in the asset catalog view.
    • [dagster-aws] The EcsRunLauncher now allows you to pass in a dictionary in the task_definition config field that specifies configuration for the task definition of the launched run, including role ARNs and a list of sidecar containers to include. Previously, the task definition could only be configured by passing in a task definition ARN or by basing the task definition off of the task definition of the ECS task launching the run. See the docs for the full set of available config.
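
    The graph input type inference rule described above can be sketched in a few lines. This is only an illustration of the stated rule, not Dagster's code: the function name infer_graph_input_type is hypothetical, types are represented as plain strings, and None stands for "no type could be inferred".

```python
from typing import Optional, Sequence


def infer_graph_input_type(
    explicit_type: Optional[str], inner_input_types: Sequence[str]
) -> Optional[str]:
    """Pick a type for a graph input from the inner inputs it maps to."""
    # An explicitly specified type always wins.
    if explicit_type is not None:
        return explicit_type
    # Otherwise, infer only when every inner input agrees on a single type.
    unique_types = set(inner_input_types)
    if len(unique_types) == 1:
        return unique_types.pop()
    # Mixed (or missing) inner types: nothing is inferred in this sketch.
    return None


same = infer_graph_input_type(None, ["Int", "Int"])
mixed = infer_graph_input_type(None, ["Int", "String"])
explicit = infer_graph_input_type("Float", ["Int", "Int"])
```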

    Bugfixes

    • Previously, yielding a SkipReason within a multi-asset sensor (experimental) would raise an error. This has been fixed.
    • [dagit] Previously, if you had a partitioned asset job and supplied a hardcoded dictionary of config to define_asset_job, you would run into a CheckError when launching the job from Dagit. This has been fixed.
    • [dagit] When viewing the Runs section of Dagit, the counts displayed in the tabs (e.g. “In progress”, “Queued”, etc.) were not updating on a poll interval. This has been fixed.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.13...1.0.14

    See All Contributors
    • 350b61d - Fix materialization count by partition (#9979) by @clairelin135
    • e297cc0 - [dagit] Turn on timeline run bucketing for everyone (#9993) by @hellendag
    • c132917 - [dagit] "Remove all" tab button to clear Launchpad tabs (#9981) by @hellendag
    • 80e10e5 - [dagit] Add polling to Overview pages (#9996) by @hellendag
    • dc4fe9b - Return each Python PackageSpec's distribution (#9889) by @jmsanders
    • 57b8c64 - [dagster-airbyte] Support union types while generating normalization tables (#9937) by @benpankow
    • b745984 - [dagster-airbyte] Add option to specify custom API request params to Airbyte resource (#10000) by @benpankow
    • 38dfd67 - View Notebook button opens notebook as link if it's a url (#9894) by @jamiedemaria
    • e3a57ec - IO manager concept doc improvements (#9987) by @sryza
    • c35332b - [dagster-airbyte] Add optional connection name filter when generating assets (#9975) by @benpankow
    • 0f09fa1 - [docs] Update deployment settings reference with SSO default role (#9984) by @benpankow
    • 865026b - [docs] - dbt-focused intro tutorial (#9853) by @jamiedemaria
    • bbc2139 - Configure tokens -> Tokens (#9999) by @salazarm
    • 2a2de19 - add a link to discuss.dagster.io on README (#10002) by @yuhan
    • 1443f57 - 1.0.13 changelog (#10010) by @yuhan
    • b289ba6 - [fix] fix docs bk error (#10014) by @benpankow
    • d0402ba - Only add buildkite steps for affected changes (#9897) by @jmsanders
    • c2538db - [dagit] Add daemon alerts to Overview schedules/sensors pages (#9972) by @hellendag
    • 661f3e7 - [dagit] Make asset catalog toolbar sticky (#9974) by @hellendag
    • 4064b6d - [dagit] Update workspace flag label (#9994) by @hellendag
    • 7b8ada8 - fix code example in create_databricks_job_op (#10012) by @sryza
    • f30c4c3 - Fixup setup paths (#10019) by @jmsanders
    • 8bbbcca - Skip Python checks if no Python files change (#10016) by @jmsanders
    • 3c493a5 - Skip package steps instead of excluding them (#10018) by @jmsanders
    • ef0ffae - Don't raise an exception when changing the EcsRunLauncher's container name to a new name (#10026) by @gibsondan
    • f25658a - [event log tests] add origin to run (#10017) by @alangenfeld
    • ca177b6 - move the big honkin asset graph to latest asset APIs (#10025) by @sryza
    • 0a80c4b - Conditionally skip helm steps (#10031) by @jmsanders
    • 1373823 - [dagit] Fix run tab counts not updating with poll interval (#9929) by @hellendag
    • 8c63dc6 - Automation: versioned docs for 1.0.13 by @elementl-devtools
    • ca87aa3 - Also run helm on merges to main (#10038) by @jmsanders
    • 2347024 - [dagit] Split asset graphs onto a new “Plots” tab (#9735) by @bengotow
    • 96ce2b6 - [dagit] Fetch isAssetJob in WorkspaceContext, rm Launchpad tab flicker (#9770) by @bengotow
    • 1f68be0 - Workaround for upstream snowflake-sqlalchemy issue (#10049) by @gibsondan
    • 1d93edc - [dagster-airbyte] Generate schema metadata at load time when loading from project or instance (#9939) by @benpankow
    • e3e0e61 - [cacheable-assets] Enable use of with_resources, with_prefix_or_group on cacheable assets (#9978) by @benpankow
    • dd73d98 - [docs] Fix for asset preview image next to code block (#9881) by @benpankow
    • 9bdd9ab - fix credential helper links and volume path (#10046) by @gibsondan
    • 3343c9d - Refactor build python package skipping (#10037) by @jmsanders
    • 389134a - replace tag joins with subquery over tag intersection queries (#10036) by @prha
    • 23d5bb1 - Also correctly skip dagster_buildkite (#10053) by @jmsanders
    • 0b0ac95 - pass tags to asset reconciliation sensor (#10032) by @jamiedemaria
    • 04ba619 - infer graph input types from inner input types (#9658) by @sryza
    • c648f67 - Add hook for launch pad root execution button (#9976) by @salazarm
    • 064a376 - Make the default_job_io_manager overridable via env (#9950) by @johannkm
    • bc59e18 - [dagit] Jest cleanup for missing canvas context (#10035) by @hellendag
    • e4b3498 - [easy] fix incorrect cli link (#10058) by @gibsondan
    • f84dc66 - Skip integration tests (#10051) by @jmsanders
    • 55b6903 - consistent capitalization in LaunchAssetChoosePartitionsDialog (#10033) by @sryza
    • e4c8a86 - Fix package skipping for dagster-test (#10063) by @jmsanders
    • 0694f75 - fix graph type inference with fan-in (#10064) by @sryza
    • 7ed03bd - Fix runs yielded error for multi asset sensor (#10059) by @clairelin135
    • f1e1c94 - [dagit] Add "Materialize" action item to asset action menu (#10065) by @hellendag
    • e2235c3 - [dagit] Update Backfill table styles (#10070) by @hellendag
    • a94dc9a - Raise if we can't infer distribution (#10066) by @jmsanders
    • 40df711 - fix Dagit execution of partitioned asset jobs with hardcoded config (#10057) by @sryza
    • 5727e7d - Skip grapqhl and mysql checks (#10054) by @jmsanders
    • 8c468c2 - [windows tests] fix start time == end time issue (#10021) by @alangenfeld
    • 5f9879b - [dagster-io/ui] Calculate available width for MiddleTruncate (#10052) by @hellendag
    • 149fcd2 - [dagit] Add Shift+click to expand/collapse all repos in run timeline and tables (#10076) by @hellendag
    • a016e84 - [dagit] Fix repo row expanded icon (#10077) by @hellendag
    • f6f58a0 - Allow passing in task definition config to the EcsRunLauncher instead of just a task definition ARN (#10044) by @gibsondan
    • b695b9b - bump test_execute_schedule_on_celery_k8s to 3m (#10088) by @alangenfeld
    • e97c876 - Expose permissions loading state (#10093) by @salazarm
    • 65a3bbe - Revert "Make the default_job_io_manager overridable via env (#9950)" by @johannkm
    • c43b595 - Add helm vals for scheduler threading (#10094) by @dpeng817
    • 5ca8694 - [dagit] Shift-clicking Materialize should show asset launchpad even if assets are partitioned (#10047) by @bengotow
    • bdf7468 - 1.0.14 changelog (#10106) by @dpeng817
    • e7b7848 - 1.0.14 by @elementl-devtools
  • 1.0.13(Oct 14, 2022)

    New

    • AssetMaterialization now has a metadata property, which allows accessing the materialization’s metadata as a dictionary.
    • DagsterInstance now has a get_latest_materialization_event method, which allows fetching the most recent materialization event for a particular asset key.
    • RepositoryDefinition.load_asset_value and AssetValueLoader.load_asset_value now work with IO managers whose load_input implementation accesses the op_def and name attributes on the InputContext.
    • RepositoryDefinition.load_asset_value and AssetValueLoader.load_asset_value now respect the DAGSTER_HOME environment variable.
    • InMemoryIOManager, the IOManager that backs mem_io_manager, has been added to the public API.
    • The multi_asset_sensor (experimental) now supports marking individual partitioned materializations as “consumed”. Unconsumed materializations will appear in future calls to partitioned context methods.
    • The build_multi_asset_sensor_context testing method (experimental) now contains a flag to set the cursor to the newest events in the Dagster instance.
    • TableSchema now has a static constructor that enables building it from a dictionary of column names to column types.
    • Added a new CLI command dagster run migrate-repository which lets you migrate the run history for a given job from one repository to another. This is useful to preserve run history for a job when you have renamed a repository, for example.
    • [dagit] The run timeline view now shows jobs grouped by repository, with each repository section collapsible. This feature was previously gated by a feature flag, and is now turned on for everyone.
    • [dagster-airbyte] Added option to specify custom request params to the Airbyte resource, which can be used for auth purposes.
    • [dagster-airbyte] When loading Airbyte assets from an instance or from YAML, a filter function can be specified to ignore certain connections.
    • [dagster-airflow] DagsterCloudOperator and DagsterOperator now support Airflow 2. Previously, installing the library on Airflow 2 would break due to an import error.
    • [dagster-duckdb] A new integration with DuckDB allows you to store op outputs and assets in an in-process database.

    Bugfixes

    • Previously, if retries were exceeded when running with execute_in_process, no error would be raised. Now, a DagsterMaxRetriesExceededError will be raised.
    • [dagster-airbyte] Fixed generating assets for Airbyte normalization tables corresponding with nested union types.
    • [dagster-dbt] When running assets with load_assets_from_...(..., use_build=True), AssetObservation events would be emitted for each test. These events would have metadata fields which shared names with the fields added to the AssetMaterialization events, causing confusing historical graphs for fields such as Compilation Time. This has been fixed.
    • [dagster-dbt] The name for the underlying op for load_assets_from_... was generated in a way which was non-deterministic for dbt projects which pulled in external packages, leading to errors when executing across multiple processes. This has been fixed.
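
    The retry fix above changes behavior from "fail silently once retries run out" to "raise an error". The sketch below illustrates that pattern in plain Python. It is not Dagster's implementation: MaxRetriesExceeded and run_with_retries are hypothetical names standing in for DagsterMaxRetriesExceededError and the framework's retry handling.

```python
class MaxRetriesExceeded(Exception):
    """Hypothetical stand-in for DagsterMaxRetriesExceededError."""


def run_with_retries(step, max_retries):
    """Call `step`, retrying on failure up to `max_retries` times.

    Once the retries are used up, raise instead of returning silently,
    so the caller cannot mistake an exhausted step for a success.
    """
    attempts = 0
    while True:
        try:
            return step()
        except Exception as err:
            if attempts >= max_retries:
                raise MaxRetriesExceeded(
                    f"step still failing after {attempts} retries"
                ) from err
            attempts += 1


def flaky_step():
    # Always fails, to show what happens when retries are exhausted.
    raise ValueError("boom")


try:
    run_with_retries(flaky_step, max_retries=2)
except MaxRetriesExceeded as err:
    outcome = str(err)
```

    Chaining with "from err" keeps the original failure attached to the raised error, which makes the root cause visible in the traceback.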

    Dependency changes

    • [dagster-dbt] The package no longer depends on pandas and dagster-pandas.

    Community Contributions

    • [dagster-airbyte] Added possibility to change request timeout value when calling Airbyte. Thanks @FransDel!
    • [dagster-airflow] Fixed an import error in dagster_airflow.hooks. Thanks @bollwyvl!
    • [dagster-gcp] Unpin Google dependencies. dagster-gcp now supports google-api-python-client 2.x. Thanks @amarrella!
    • [dagstermill] Fixed an issue where DagsterTranslator was missing an argument required by newer versions of papermill. Thanks @tizz98!

    Documentation

    • Added an example, underneath examples/assets_smoke_test, that shows how to write a smoke test that feeds empty data to all the transformations in a data pipeline.
    • Added documentation for build_asset_reconciliation_sensor.
    • Added documentation for monitoring partitioned materializations using the multi_asset_sensor and kicking off subsequent partitioned runs.
    • [dagster-cloud] Added documentation for running the Dagster Cloud Docker agent with Docker credential helpers.
    • [dagster-dbt] The class methods of the dbt_cli_resource are now visible in the API docs for the dagster-dbt library.
    • [dagster-dbt] Added a step-by-step tutorial for using dbt models with Dagster software-defined assets.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.12...1.0.13

    See All Contributors
    • ab4ae1d - [dagit] Run timeline: reduce scheduled tick width, reduce chunk min width (#9913) by @hellendag
    • 421997f - [dagster-airflow] Add DagsterOperator and associated airflow abstractions (#9780) by @Ramshackle-Jamathon
    • 43538db - partition_key and upstream_output.asset_key in load_asset_value (#9914) by @sryza
    • 76df23d - Pin dask-kubernetes until we switch KubeCluster (#9918) by @jmsanders
    • 5c1f889 - Fix executor test (#9919) by @gibsondan
    • f9a9bd7 - Automation: versioned docs for 1.0.12 by @elementl-devtools
    • 72b3b9c - change log for 1.0.12 (#9923) by @sryza
    • 1c26868 - respect DAGSTER_HOME in AssetValueLoader (#9922) by @sryza
    • 448a776 - [dagit] Create Overview root (#9907) by @hellendag
    • d619ab6 - fix 1.0.12 changelog (#9925) by @sryza
    • 3dd81f0 - [dagit] Bucketed virtualized tables for Jobs, Schedules, Sensors (#9909) by @hellendag
    • c88933e - [Feature] Add possibility to change request timeout value when calling airbyte (#9906) by @FransDel
    • ebd90c1 - asset selection diff apidoc (#9917) by @sryza
    • b5ba488 - add InMemoryIOManager to public API (#9882) by @sryza
    • 185769c - add SDAs to fs_io_manager docstring examples (#9872) by @sryza
    • 350e03f - Explicitly error when retries are exceeded if raise_on_error is set (#9934) by @dpeng817
    • bb3bf95 - add cli command to migrate job runs from one repo to another (#9376) by @prha
    • 82a7cb6 - add empty __init__.py to make dagster_airflow.hooks, .links importable (#9932) by @bollwyvl
    • f71cfc1 - Add metadata hooks for making grpc client calls (#9825) by @gibsondan
    • c68397d - TableSchema.from_name_type_dict (#9926) by @sryza
    • fb847aa - linting fixups for dagster-airflow (#9948) by @Ramshackle-Jamathon
    • 0afa38c - add name and op_def to asset value load context (#9942) by @sryza
    • eaaf0e0 - [dagster-io/ui] Middle truncation (#9933) by @hellendag
    • d17e1b2 - Unpin google dependencies (#9319) by @amarrella
    • ccc82ff - [dagster-dbt] Fix dbt asset op name (#9963) by @OwenKephart
    • a05ac4c - Add asset_selection arg to execute_job (#9876) by @dpeng817
    • 3b200e5 - Convert definitions tests to use graph/job/op APIs (#9736) by @dpeng817
    • 08932da - make it easier to fetch asset materialization metadata (#9951) by @sryza
    • 0035ebc - Data pipeline smoke test example (#9945) by @sryza
    • 4cf2f60 - remove dagster-dbt deps on pandas and dagster-pandas (#9953) by @sryza
    • c65b6be - Add skipped events to multi asset sensor context (#9903) by @clairelin135
    • 384e2be - [dagstermill] update DagsterTranslator to support newer versions of papermill (#9901) by @tizz98
    • bf04ebd - [dagster-io/ui] Test and comments for middle truncation search (#9955) by @hellendag
    • fca13bc - [dagster-dbt] fix dbt test metadata (#9965) by @OwenKephart
    • f6c2341 - Show noteable logo on noteable backed assets/ops (#9916) by @jamiedemaria
    • 1d3e016 - Flag to set cursor to latest materializations on build_multi_asset_sensor_context (#9814) by @clairelin135
    • 3505b55 - [docs] asset reconciliation sensor concept page (#9912) by @jamiedemaria
    • f714be6 - support asset keys and asset selection for multi asset sensors (#9954) by @jamiedemaria
    • f91dc3b - [docs] - Partitioned multi asset sensor examples (#9722) by @clairelin135
    • 4a2fb05 - Update multi asset sensor docstring (#9971) by @jamiedemaria
    • 2a8855c - Clarify with_resources error and update docs (#9784) by @clairelin135
    • 0edd7c3 - Add threading to scheduler daemon (#9885) by @dpeng817
    • cf8d611 - Ignore warnings sent from the dagster module (#9577) by @dpeng817
    • cab3946 - [dagster-airflow] remove default param for parent init (#9966) by @Ramshackle-Jamathon
    • 345ceb1 - Fix incorrect secrets_tags docs (#9980) by @gibsondan
    • e42c37d - Document cloud Docker credential helpers (#9982) by @jmsanders
    • ee1a1de - [dagster-dbt] boldly ignore type hints (#9989) by @OwenKephart
    • c449d95 - Increase timeout in asset sensor tests (#9990) by @jamiedemaria
    • 132252b - [dagster-dbt] make dbt cli resource methods public (#9973) by @OwenKephart
    • c480f17 - duckdb integration library (#9869) by @jamiedemaria
    • ddea744 - Fix materialization count by partition (#9979) by @clairelin135
    • ab26eaa - [dagit] Turn on timeline run bucketing for everyone (#9993) by @hellendag
    • ab8c21d - [dagster-airbyte] Support union types while generating normalization tables (#9937) by @benpankow
    • 25d0917 - [dagster-airbyte] Add option to specify custom API request params to Airbyte resource (#10000) by @benpankow
    • d8617f6 - [dagster-airbyte] Add optional connection name filter when generating assets (#9975) by @benpankow
    • ad4943a - [docs] Update deployment settings reference with SSO default role (#9984) by @benpankow
    • 0abdbb8 - [docs] - dbt-focused intro tutorial (#9853) by @jamiedemaria
    • 4244cea - 1.0.13 changelog (#10010) by @yuhan
    • 6aedfe1 - Don't raise an exception when changing the EcsRunLauncher's container name to a new name (#10026) by @gibsondan
    • 141abe9 - 1.0.13 by @elementl-devtools
  • 1.0.12(Oct 7, 2022)

    New

    • The multi_asset_sensor (experimental) now accepts an AssetSelection of assets to monitor. There are also minor API updates for the multi-asset sensor context.
    • AssetValueLoader, the type returned by RepositoryDefinition.get_asset_value_loader is now part of Dagster’s public API.
    • RepositoryDefinition.load_asset_value and AssetValueLoader.load_asset_value now support a partition_key argument.
    • RepositoryDefinition.load_asset_value and AssetValueLoader.load_asset_value now work with I/O managers that invoke context.upstream_output.asset_key.
    • When running Dagster locally, the default amount of time that the system waits when importing user code has been increased from 60 seconds to 180 seconds, to avoid false positives when importing code with heavy dependencies or large numbers of assets. This timeout can be configured in dagster.yaml as follows:
    code_servers:
      local_startup_timeout: 120
    
    • [dagit] The “Status” section has been renamed to “Deployment”, to better reflect that this section of the app shows deployment-wide information.
    • [dagit] When viewing the compute logs for a run and choosing a step to filter on, there is now a search input to make it easier to find the step you’re looking for.
    • [dagster-aws] The EcsRunLauncher can now launch runs in ECS clusters using both Fargate and EC2 capacity providers. See the Deploying to ECS docs for more information.
    • [dagster-airbyte] Added the load_assets_from_airbyte_instance function which automatically generates asset definitions from an Airbyte instance. For more details, see the new Airbyte integration guide.
    • [dagster-airflow] Added the DagsterCloudOperator and DagsterOperator, which are Airflow operators that enable orchestrating Dagster jobs, running on either Cloud or OSS Dagit instances, from Apache Airflow.

    Bugfixes

    • Fixed a bug where if resource initialization failed for a dynamic op, causing other dynamic steps to be skipped, those skipped dynamic steps would be ignored when retrying from failure.
    • Previously, some invocations within the Dagster framework would result in warnings about deprecated metadata APIs. Now, users should only see warnings if their code uses deprecated metadata APIs.
    • How the daemon process manages its understanding of user code artifacts has been reworked to improve memory consumption.
    • [dagit] The partition selection UI in the Asset Materialization modal now allows for mouse selection and matches the UI used for partitioned op jobs.
    • [dagit] Sidebars in Dagit shrink more gracefully on small screens where headers and labels need to be truncated.
    • [dagit] Improved performance for loading runs with more than 10,000 logs.
    • [dagster-airbyte] Previously, the port configuration in the airbyte_resource was marked as not required, but if it was not supplied, an error would occur. It is now marked as required.
    • [dagster-dbt] A change made to the manifest.json schema in dbt 1.3 would result in an error when using load_assets_from_dbt_project or load_assets_from_manifest_json. This has been fixed.
    • [dagster-postgres] Connections that fail due to sqlalchemy.exc.TimeoutError are now retried.

    Breaking Changes

    • [dagster-aws] The redshift_resource no longer accepts a schema configuration parameter. Previously, this parameter would error whenever used, because Redshift connections do not support this parameter.

    Community Contributions

    • We now reference the correct method in the "loading asset values outside of Dagster runs" example (thank you Peter A. I. Forsyth!)
    • We now reference the correct test directory in the “Create a New Project” documentation (thank you Peter A. I. Forsyth!)
    • [dagster-pyspark] dagster-pyspark now contains a LazyPysparkResource that only initializes a spark session once it’s accessed (thank you @zyd14!)

    Experimental

    • The new build_asset_reconciliation_sensor function accepts a set of software-defined assets and returns a sensor that automatically materializes those assets after their parents are materialized.
    • [dagit] A new "groups-only" asset graph feature flag allows you to zoom way out on the global asset graph, collapsing asset groups into smaller nodes you can double-click to expand.
  • 1.0.11(Sep 29, 2022)

    New

    • RepositoryDefinition now exposes a load_asset_value method, which accepts an asset key and invokes the asset’s I/O manager’s load_input function to load the asset as a Python object. This can be used in notebooks to do exploratory data analysis on assets.
    • Methods to fetch a list of partition keys from an input/output PartitionKeyRange now exist on the op execution context and input/output context.
    • [dagit] On the Instance Overview page, batched runs in the run timeline view will now proportionally reflect the status of the runs in the batch instead of reducing all run statuses to a single color.
    • [dagster-dbt] [dagster-snowflake] You can now use the Snowflake IO manager with dbt assets, which allows them to be loaded from Snowflake into Pandas DataFrames in downstream steps.
    • The dagster package’s pin of the alembic package is now much less restrictive.

    Bugfixes

    • The sensor daemon when using threads will no longer evaluate the next tick for a sensor if the previous one is still in flight. This resolves a memory leak in the daemon process.
    • The scheduler will no longer remove tracked state for automatically running schedules when they are absent due to a workspace load error.
    • The way user code servers manage repository definitions has been changed to more efficiently serve requests.
    • The @multi_asset decorator now respects its config_schema parameter.
    • [dagit] Config supplied to define_asset_job is now prefilled in the modal that pops up when you click the Materialize button on an asset job page, so you can quickly adjust the defaults.
    • [dagster-dbt] Previously, DagsterDbtCliErrors produced from the dagster-dbt library would contain large serialized objects representing the raw unparsed logs from the relevant cli command. Now, these messages will contain only the parsed version of these messages.
    • Fixed an issue where the deploy_ecs example didn’t work when built and deployed on an M1 Mac.
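
    The sensor daemon fix above describes a "skip the tick if the previous one is still in flight" rule. A minimal sketch of that rule using only the standard library follows. TickGuard and its methods are hypothetical names for illustration and do not reflect how the Dagster daemon is actually written.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class TickGuard:
    """Submits a sensor tick only if that sensor's previous tick has finished."""

    def __init__(self, executor: ThreadPoolExecutor) -> None:
        self._executor = executor
        self._in_flight: Dict[str, Future] = {}  # sensor name -> most recent tick

    def maybe_submit(self, sensor_name: str, evaluate: Callable[[], object]) -> bool:
        previous = self._in_flight.get(sensor_name)
        if previous is not None and not previous.done():
            # Previous tick still running: skip rather than queue more work.
            return False
        self._in_flight[sensor_name] = self._executor.submit(evaluate)
        return True

    def wait(self, sensor_name: str) -> None:
        """Block until the most recent tick for the sensor has finished."""
        self._in_flight[sensor_name].result()


release = threading.Event()  # holds the first tick "in flight" until set
with ThreadPoolExecutor(max_workers=2) as pool:
    guard = TickGuard(pool)
    first = guard.maybe_submit("my_sensor", release.wait)   # submitted
    second = guard.maybe_submit("my_sensor", release.wait)  # skipped
    release.set()
    guard.wait("my_sensor")
    third = guard.maybe_submit("my_sensor", release.wait)   # submitted again
```

    Skipping instead of queueing keeps the number of pending evaluations per sensor bounded at one, which is why this kind of guard prevents unbounded growth when a tick runs longer than the polling interval.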

    Community Contributions

    • [dagster-fivetran] The resync_parameters configuration on the fivetran_resync_op is now optional, enabling triggering historical resyncs for connectors. Thanks @dwallace0723!

    Documentation

    • Improved API documentation for the Snowflake resource.

    All Changes

    https://github.com/dagster-io/dagster/compare/1.0.10...1.0.11

    See All Contributors
    • d7d5879 - [dagster-dbt] fix star issue for asset input names (#9763) by @OwenKephart
    • d6f08a1 - api docs for snowflake resource (#9717) by @jamiedemaria
    • 233a298 - [dagit] Hide hidden asset group job from Instance Overview (#9742) by @hellendag
    • fdd2ff4 - [dagit] Workspace Overview page (#9744) by @hellendag
    • de43e65 - [dagit] Virtualized table for assets (#9759) by @hellendag
    • 4ef2c08 - 1.0.10 Changelog (#9771) by @OwenKephart
    • 3b4d8f4 - add api doc for run_request_for_partition (#9645) by @sryza
    • 380aadb - [fixit] Add library versions to docs (#9768) by @smackesey
    • 19f8d74 - Automation: versioned docs for 1.0.10 by @elementl-devtools
    • 3d88c86 - Remove unused workspace args from schedule wipe command (#9753) by @gibsondan
    • 882f506 - [dagster-fivetran] Optional Fivetran Historical Resync Parameters (#9774) by @dwallace0723
    • ffae039 - Helm template tests support raw dict (#9773) by @johannkm
    • c8ddaa8 - [dagit] Prefill default job config when materializing assets (#9769) by @bengotow
    • 3cf2492 - [dagit] Multi-colored run batch backgrounds (#9775) by @hellendag
    • 1d823b6 - [dagit] Virtualized graphs table, reuse ops view (#9778) by @hellendag
    • a6efcaa - [dagster-dbt] remove logs and raw output from dbt errors (#9779) by @OwenKephart
    • a833f28 - starlette TestClient deps (#9791) by @alangenfeld
    • 5b66025 - Extend ECS stub to include new cpu/mem (#9786) by @jmsanders
    • 1e8f629 - Get asset partition keys from IOManager and Op Contexts (#9776) by @clairelin135
    • 49a2c57 - [dagit] Fix lint (#9798) by @hellendag
    • a33935b - [dagit] Top tooltip for backfill segments (#9783) by @hellendag
    • 273e820 - [dagit] Add reload button to WorkspaceHeader (#9789) by @hellendag
    • cb01ddb - [dagit] Add top-level polling to virtualized table pages (#9797) by @hellendag
    • d8596e4 - fix 4xx in docs.dagster.io CLOUD-1843 (#9802) by @yuhan
    • 26257e7 - [scheduler] dont clean auto running states for error locations (#9805) by @alangenfeld
    • 586c037 - [perf] resolve lru_cache usage issues (#9782) by @alangenfeld
    • 5b3d15b - A few tweaks to concept docs (#9793) by @sryza
    • 462db9e - Specify platform to build in deploy_ecs example (#9815) by @gibsondan
    • a3c3743 - Add way to build and deploy the deploy_ecs project locally (#9760) by @gibsondan
    • 2e0be96 - rm dead code in GrpcServerRepositoryLocation (#9811) by @alangenfeld
    • c23c5d8 - Revert "remove dagster cli api subcommand from docs (#9165)" (#9740) by @gibsondan
    • 8752ee7 - load asset values outside of a run (#9792) by @sryza
    • 5a928ad - [dagit] Make asset group row clickable (#9810) by @hellendag
    • 2d9ad46 - [dagit] Update homepage and top nav for new workspace flag (#9819) by @hellendag
    • 0af6f5e - [dagit] Align virtualized table cells to top (#9809) by @hellendag
    • a2d3ad1 - [dagit] Hover/active state for top nav items (#9816) by @hellendag
    • 158269f - handle None outputs in snowflake IO manager (#9818) by @sryza
    • 4e9e8b4 - [docs] - simplify multi-asset sensor examples and fix formatting (#9754) by @sryza
    • 1f2633d - mark dbt outputs as Nothing (#9822) by @sryza
    • 888935b - make methods of PartitionMapping public (#9747) by @sryza
    • f43d582 - Make alembic pin much less restrictive (#9830) by @gibsondan
    • 246f8f2 - [dagit] Export the new Workspace grid for Cloud use (#9824) by @hellendag
    • 3d931ac - fix config schema on multi-asset (#9828) by @sryza
    • 30db4d3 - 1.0.11 by @elementl-devtools
  • 1.0.10(Sep 22, 2022)

    New

    • Run status sensors can now monitor all runs in a Dagster Instance, rather than just runs from jobs within a single repository. You can enable this behavior by setting monitor_all_repositories=True in the run status sensor decorator.
    • The run_key argument on RunRequest and run_request_for_partition is now optional.
    • [dagster-databricks] A new “verbose_logs” config option on the databricks_pyspark_step_launcher makes it possible to silence non-critical logs from your external steps, which can be helpful for long-running or highly parallel operations (thanks @zyd14!).
    • [dagit] It is now possible to delete a run in Dagit directly from the run page. The option is available in the dropdown menu on the top right of the page.
    • [dagit] The run timeline on the Workspace Overview page in Dagit now includes ad hoc asset materialization runs.

    Bugfixes

    • Fixed a set of bugs in multi_asset_sensor where the cursor would fail to update, and materializations would be returned out of order for latest_materialization_records_by_partition.
    • Fixed a bug that caused failures in runs with time-partitioned asset dependencies when the PartitionsDefinition had an offset that wasn’t included in the date format. E.g. a daily-partitioned asset with an hour offset, whose date format was %Y-%m-%d.
    • An issue causing code loaded by file path to import repeatedly has been resolved.
    • To align with best practices, singleton comparisons throughout the codebase have been converted from (e.g.) foo == None to foo is None (thanks @chrisRedwine!).
    • [dagit] In backfill jobs, the “Partition Set” column would sometimes show an internal __ASSET_JOB name, rather than a comprehensible set of asset keys. This has been fixed.
    • [dagit] It is now possible to collapse all Asset Observation rows on the AssetDetails page.
    • [dagster-dbt] Fixed issue that would cause an error when loading assets from dbt projects in which a source had a “*” character in its name (e.g. BigQuery sharded tables)
    • [dagster-k8s] Fixed an issue where the k8s_job_op would sometimes fail if the Kubernetes job that it creates takes a long time to create a pod.
    • Fixed an issue where links to the compute logs for a run would sometimes fail to load.
    • [dagster-k8s] The k8s_job_executor now uses environment variables in place of CLI arguments to avoid limits on argument size with large dynamic jobs.
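
    The time-partition bug above involved a daily partition with an hour offset whose date format was %Y-%m-%d. The sketch below shows why such a format cannot carry the offset on its own: formatting the window start drops the hour, so parsing the key back yields midnight. The three-hour offset and the helper names are assumptions chosen for illustration, not Dagster internals.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d"          # date format from the changelog example
HOUR_OFFSET = timedelta(hours=3)  # hypothetical offset, for illustration only


def partition_key(window_start: datetime) -> str:
    """Format a partition's window start as its key (the hour is dropped)."""
    return window_start.strftime(DATE_FORMAT)


def window_start_from_key(key: str, offset: timedelta = timedelta(0)) -> datetime:
    """Parse a key back into a window start, re-applying a known offset."""
    return datetime.strptime(key, DATE_FORMAT) + offset


start = datetime(2022, 9, 22) + HOUR_OFFSET          # 2022-09-22 03:00
key = partition_key(start)                           # "2022-09-22"
naive = window_start_from_key(key)                   # midnight: offset lost
corrected = window_start_from_key(key, HOUR_OFFSET)  # 03:00: offset restored
```

    Any code that round-trips through the key therefore has to re-apply the offset explicitly, as the corrected call does.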

    Documentation

    • Docs added to explain subsetting graph-backed assets. You can use this feature following the documentation here.
    • UI updated to reflect separate version schemes for mature core Dagster packages and less mature integration libraries
  • 1.0.9(Sep 15, 2022)

    New

    • The multi_asset_sensor (experimental) now has improved capabilities to monitor asset partitions via a latest_materialization_records_by_partition method.
    • Performance improvements for the Partitions page in Dagit.

    Bugfixes

    • Fixed a bug that caused the op_config argument of dagstermill.get_context to be ignored
    • Fixed a bug that caused errors when loading the asset details page for assets with time window partitions definitions
    • Fixed a bug where assets sometimes didn’t appear in the Asset Catalog while in Folder view.
    • [dagit] Opening the asset lineage tab no longer scrolls the page header off screen in some scenarios
    • [dagit] The asset lineage tab no longer attempts to materialize source assets included in the upstream / downstream views.
    • [dagit] The Instance page Run Timeline no longer commingles runs with the same job name in different repositories
    • [dagit] Emitting materializations with JSON metadata that cannot be parsed as JSON no longer crashes the run details page
    • [dagit] Viewing the assets related to a run no longer shows the same assets multiple times in some scenarios
    • [dagster-k8s] Fixed a bug with timeouts causing errors in k8s_job_op
    • [dagster-docker] Fixed a bug with Op retries causing errors with the docker_executor

    Community Contributions

    • [dagster-aws] Thanks @Vivanov98 for adding the list_objects method to S3FakeSession!

    Experimental

    • [dagster-airbyte] Added an experimental function to automatically generate Airbyte assets from project YAML files. For more information, see the dagster-airbyte docs.
    • [dagster-airbyte] Added the forward_logs option to AirbyteResource, allowing users to disable forwarding of Airbyte logs to the compute log, which can be expensive for long-running syncs.
    • [dagster-airbyte] Added the ability to generate Airbyte assets for basic normalization tables generated as part of a sync.

  • 1.0.8(Sep 8, 2022)

    New

    • With the new cron_schedule argument to TimeWindowPartitionsDefinition, you can now supply arbitrary cron expressions to define time window-based partition sets.
    • Graph-backed assets can now be subsetted for execution via AssetsDefinition.from_graph(my_graph, can_subset=True).
    • RunsFilter is now exported in the public API.
    • [dagster-k8s] The dagster-user-deployments.deployments[].schedulerName Helm value for specifying custom Kubernetes schedulers will now also apply to run and step workers launched for the given user deployment. Previously it would only apply to the grpc server.

    Bugfixes

    • In some situations, default asset config was ignored when a subset of assets were selected for execution. This has been fixed.
    • Added a pin to grpcio in dagster to address an issue with the recent 1.48.1 grpcio release that was sometimes causing Dagster code servers to hang.
    • Fixed an issue where the “Latest run” column on the Instance Status page sometimes displayed an older run instead of the most recent run.

    Community Contributions

    • In addition to a single cron string, cron_schedule now also accepts a sequence of cron strings. If a sequence is provided, the schedule will run for the union of all execution times for the provided cron strings, e.g., ['45 23 * * 6', '30 9 * * 0'] for a schedule that runs at 11:45 PM every Saturday and 9:30 AM every Sunday. Thanks @erinov1!
    • Added an optional boolean config install_default_libraries to databricks_pyspark_step_launcher. It allows Databricks jobs to run without installing the default Dagster libraries. Thanks @nvinhphuc!
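
    To make the "union of all execution times" wording above concrete, here is a small standard-library sketch that enumerates the ticks produced by two cron strings over one week. It is not how Dagster evaluates cron expressions: matches and union_ticks are hypothetical helpers, and the matcher only understands plain integers and "*" in each field.

```python
from datetime import datetime, timedelta


def matches(cron: str, moment: datetime) -> bool:
    """Return True if `moment` matches a five-field cron string.

    Simplified for illustration: each field is either "*" or one integer.
    """
    fields = cron.split()  # minute, hour, day of month, month, day of week
    # Cron counts Sunday as 0; isoweekday() counts Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    actual = (moment.minute, moment.hour, moment.day, moment.month, cron_weekday)
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def union_ticks(crons, start: datetime, end: datetime):
    """List every minute in [start, end) matched by at least one cron string."""
    ticks = []
    moment = start
    while moment < end:
        if any(matches(cron, moment) for cron in crons):
            ticks.append(moment)
        moment += timedelta(minutes=1)
    return ticks


crons = ["45 23 * * 6", "30 9 * * 0"]
# Monday 2022-09-05 up to, but not including, Monday 2022-09-12.
ticks = union_ticks(crons, datetime(2022, 9, 5), datetime(2022, 9, 12))
# ticks holds Saturday 2022-09-10 23:45 and Sunday 2022-09-11 09:30.
```

    Each tick comes from exactly one of the two strings here; a moment matched by both would still appear only once, since the loop tests "any" per minute.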

    Experimental

    • [dagster-k8s] Added additional configuration fields (container_config, pod_template_spec_metadata, pod_spec_config, job_metadata, and job_spec_config) to the experimental k8s_job_op that can be used to add additional configuration to the Kubernetes pod that is launched within the op.
  • 1.0.7(Sep 1, 2022)

    New

    • Several updates to the Dagit run timeline view: your time window preference will now be preserved locally, there is a clearer “Now” label to delineate the current time, and upcoming scheduled ticks will no longer be batched with existing runs.
    • [dagster-k8s] ingress.labels is now available in the Helm chart. Any provided labels are appended to the default labels on each object (helm.sh/chart, app.kubernetes.io/version, and app.kubernetes.io/managed-by).
    • [dagster-dbt] Added support for two types of dbt nodes: metrics and ephemeral models.
    • When constructing a GraphDefinition manually, InputMapping and OutputMapping objects should be directly constructed.
    • [dagit] The launchpad tab is no longer shown for Asset jobs. Asset jobs can be launched via the “Materialize All” button shown on the Overview tab. To provide optional configuration, hold shift when clicking “Materialize”.

    Bugfixes

    • [dagster-snowflake] Pandas is no longer imported when dagster_snowflake is imported. Instead, it’s only imported when using functionality inside dagster-snowflake that depends on pandas.
    • Recent changes to run status sensors caused sensors that only monitored jobs in external repositories to also monitor all jobs in the current repository. This has been fixed.
    • Fixed an issue where "unhashable type" errors could be raised during sensor executions.
    • [dagit] Clicking between assets in different repositories from asset groups and asset jobs now works as expected.
    • [dagit] The DAG rendering of composite ops with more than one input/output mapping has been fixed.
    • [dagit] Selecting a source asset in Dagit no longer produces a GraphQL error
    • [dagit] Viewing “Related Assets” for an asset run now shows the full set of assets included in the run, regardless of whether they were materialized successfully.
    • [dagit] The Asset Lineage view has been simplified and lets you know if the view is being clipped and more distant upstream/downstream assets exist.
    • Fixed erroneous experimental warnings being thrown when using with_resources alongside source assets.

    Breaking Changes

    • The arguments to the (internal) InputMapping and OutputMapping constructors have changed.

    Community Contributions

    • The ssh_resource can now accept configuration from environment variables. Thanks @cbini!
    • Spelling corrections in migrations.md. Thanks @gogi2811!
  • 1.0.6(Aug 26, 2022)

    New

    • [dagit] nbconvert is now installed as an extra in Dagit.
    • Multiple assets can be monitored for materialization using the multi_asset_sensor (experimental).
    • Run status sensors can now monitor jobs in external repositories.
    • The config argument of define_asset_job now works if the job contains partitioned assets.
    • When configuring sqlite-based storages in dagster.yaml, you can now point to environment variables.
    • When emitting RunRequests from sensors, you can now optionally supply an asset_selection argument, which accepts a list of AssetKeys to materialize from the larger job.
    • [dagster-dbt] load_assets_from_dbt_project and load_assets_from_dbt_manifest now support the exclude parameter, allowing you to specify more precisely which resources to load from your dbt project (thanks @flvndh!).

    Bugfixes

    • Previously, types for multi-assets would display incorrectly in Dagit when specified. This has been fixed.
    • In some circumstances, viewing nested asset paths in Dagit could lead to unexpected empty states. This was due to incorrect slicing of the asset list, and has been fixed.
    • Fixed an issue in Dagit where the dialog used to wipe materializations displayed broken text for assets with long paths.
    • [dagit] Fixed the Job page to change the latest run tag and the related assets to bucket repository-specific jobs. Previously, runs from jobs with the same name in different repositories would be intermingled.
    • Previously, if you launched a backfill for a subset of a multi-asset (e.g. dbt assets), all assets would be executed on each run, instead of just the selected ones. This has been fixed.
    • [dagster-dbt] Previously, if you configured a select parameter on your dbt_cli_resource, this would not get passed into the corresponding invocations of certain context.resources.dbt.x() commands. This has been fixed.
  • 1.0.4(Aug 19, 2022)

    New

    • Assets can now be materialized to storage conditionally by setting output_required=False. If this is set and no result is yielded from the asset, Dagster will not create an asset materialization event, the I/O manager will not be invoked, downstream assets will not be materialized, and asset sensors monitoring the asset will not trigger.
    • JobDefinition.run_request_for_partition can now be used inside sensors that target multiple jobs (Thanks Metin Senturk!)
    • The environment variable DAGSTER_GRPC_TIMEOUT_SECONDS now allows for overriding the default timeout for communications between host processes like dagit and the daemon and user code servers.
    • Import time for the dagster module has been reduced, by approximately 50% in initial measurements.
    • AssetIn now accepts a dagster_type argument, for specifying runtime checks on asset input values.
    • [dagit] The column names on the Activity tab of the asset details page no longer reference the legacy term “Pipeline”.
    • [dagster-snowflake] The execute_query method of the snowflake resource now accepts a use_pandas_result argument, which fetches the result of the query as a Pandas dataframe. (Thanks @swotai!)
    • [dagster-shell] Made the execute and execute_script_file utilities in dagster_shell part of the public API (Thanks Fahad Khan!)
    • [dagster-dbt] load_assets_from_dbt_project and load_assets_from_dbt_manifest now support the exclude parameter. (Thanks @flvndh!)

    Bugfixes

    • [dagit] Removed the x-frame-options response header from Dagit, allowing the Dagit UI to be rendered in an iframe.
    • [fully-featured project example] Fixed the duckdb IO manager so the comment_stories step can load data successfully.
    • [dagster-dbt] Previously, if a select parameter was configured on the dbt_cli_resource, it would not be passed into invocations of context.resources.dbt.run() (and other similar commands). This has been fixed.
    • [dagster-ge] An incompatibility between dagster_ge_validation_factory and dagster 1.0 has been fixed.
    • [dagstermill] Previously, updated arguments and properties to DagstermillExecutionContext were not exposed. This has since been fixed.

    Documentation

    • The integrations page on the docs site now has a section for links to community-hosted integrations. The first linked integration is @silentsokolov’s Vault integration.
  • 1.0.3(Aug 11, 2022)

    New

    • Failure now has an allow_retries argument, allowing a means to manually bypass retry policies.
    • dagstermill.get_context and dagstermill.DagstermillExecutionContext have been updated to reflect stable dagster-1.0 APIs. pipeline/solid referencing arguments / properties will be removed in the next major version bump of dagstermill.
    • TimeWindowPartitionsDefinition now exposes a get_cron_schedule method.

    Bugfixes

    • In some situations where an asset was materialized that depended on a partitioned asset, and that upstream partitioned asset wasn’t part of the run, the partition-related methods of InputContext returned incorrect values or failed erroneously. This has been fixed.
    • Schedules and sensors with the same names but in different repositories no longer affect each other’s idempotence checks.
    • In some circumstances, reloading a repository in Dagit could lead to an error that would crash the page. This has been fixed.

    Community Contributions

    • @will-holley added an optional key argument to GCSFileManager methods to set the GCS blob key, thank you!
    • Fix for sensors in fully featured example, thanks @pwachira!

    Documentation

  • 1.0.2(Aug 8, 2022)

    New

    • When the workspace is updated, a notification will appear in Dagit, and the Workspace tab will automatically refresh.

    Bugfixes

    • Restored the correct version mismatch warnings between dagster core and dagster integration libraries
    • Field.__init__ has been typed, which resolves an error that pylance would raise about default_value
    • Previously, dagster_type_materializer and dagster_type_loader expected functions to take a context argument from an internal dagster import. We’ve added DagsterTypeMaterializerContext and DagsterTypeLoaderContext so that functions annotated with these decorators can annotate their arguments properly.
    • Previously, a single-output op with a return description would not pick up the description of the return. This has been rectified.

    Community Contributions

    • Fixed the dagster_slack documentation examples. Thanks @ssingh13-rms!

    Documentation

  • 1.0.1(Aug 5, 2022)

    Bugfixes

    • Fixed an issue where Dagster libraries would sometimes log warnings about mismatched versions despite having the correct version loaded.

    Documentation

    • The Dagster Cloud docs now live alongside all the other Dagster docs! Check them out by navigating to Deployment > Cloud.
  • 1.0.0(Aug 5, 2022)

    Major Changes

    • A docs site overhaul! Along with tons of additional content, the existing pages have been significantly edited and reorganized to improve readability.
    • All Dagster examples are revamped with a consistent project layout, descriptive names, and more helpful README files.
    • A new dagster project CLI contains commands for bootstrapping new Dagster projects and repositories:
      • dagster project scaffold creates a folder structure with a single Dagster repository and other files such as workspace.yaml. This CLI enables you to quickly start building a new Dagster project with everything set up.
      • dagster project from-example downloads one of the Dagster examples. This CLI helps you to quickly bootstrap your project with an officially maintained example. You can find the available examples via dagster project list-examples.
      • Check out Create a New Project for more details.
    • A default_executor_def argument has been added to the @repository decorator. If specified, this will be used for any jobs (asset or op) which do not explicitly set an executor_def.
    • A default_logger_defs argument has been added to the @repository decorator, which works in the same way as default_executor_def.
    • A new execute_job function presents a Python API for kicking off runs of your jobs.
    • Run status sensors may now yield RunRequests, allowing you to kick off a job in response to the status of another job.
    • When loading an upstream asset or op output as an input, you can now set custom loading behavior using the input_manager_key argument to AssetIn and In.
    • In the UI, the global lineage graph has been brought back and reworked! The graph keeps assets in the same group visually clustered together, and the query bar allows you to visualize a custom slice of your asset graph.
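
The precedence described for default_executor_def (a job's explicit executor wins, otherwise the repository default applies) can be modeled in a few lines of plain Python. This is only a sketch: resolve_executor is a hypothetical helper, executors are represented by strings, and the built-in fallback name is an assumption rather than something these notes state.

```python
def resolve_executor(job_executor=None, repository_default=None,
                     builtin_default="multi_or_in_process_executor"):
    """Pick the executor for one job.

    Order: the job's explicit setting, then the repository-level default,
    then an assumed built-in fallback.
    """
    if job_executor is not None:
        return job_executor
    if repository_default is not None:
        return repository_default
    return builtin_default
```

For example, resolve_executor(None, "in_process_executor") returns the repository default, while resolve_executor("k8s_job_executor", "in_process_executor") keeps the job's own choice. The description of default_logger_defs suggests the same lookup order applies to loggers.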

    Breaking Changes and Deprecations

    Legacy API Removals

    In 1.0.0, a large number of previously-deprecated APIs have been fully removed. A full list of breaking changes and deprecations, alongside instructions on how to migrate older code, can be found in MIGRATION.md. At a high level:

    • The solid and pipeline APIs have been removed, along with references to them in extension libraries, arguments, and the CLI (deprecated in 0.13.0).
    • The AssetGroup and build_asset_job APIs, and a host of deprecated arguments to asset-related functions, have been removed (deprecated in 0.15.0).
    • The EventMetadata and EventMetadataEntryData APIs have been removed (deprecated in 0.15.0).

    Deprecations

    • dagster_type_materializer and DagsterTypeMaterializer have been marked experimental and will likely be removed within a 1.x release. Instead, use an IOManager.
    • FileManager and FileHandle have been marked experimental and will likely be removed within a 1.x release.

    Other Changes

    • As of 1.0.0, Dagster no longer guarantees support for python 3.6. This is in line with PEP 494, which outlines that 3.6 has reached end of life.
    • [planned] In an upcoming 1.x release, we plan to make a change that renders values supplied to configured in Dagit. Up through this point, values provided to configured have not been sent anywhere outside the process where they were used. This change will mean that, like other places you can supply configuration, configured is not a good place to put secrets: You should not include any values in configuration that you don't want to be stored in the Dagster database and displayed inside Dagit.
    • fs_io_manager, s3_pickle_io_manager, gcs_pickle_io_manager, and adls_pickle_io_manager no longer write out a file or object when handling an output with the None or Nothing type.
    • The custom_path_fs_io_manager has been removed, as its functionality is entirely subsumed by the fs_io_manager, where a custom path can be specified via config.
    • The default typing_type of a DagsterType is now typing.Any instead of None.
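
The IO manager change above (no file is written for outputs typed None or Nothing) can be pictured with a small stand-alone model. The handle_output function below is a hypothetical simplification that uses pickle and the local file system. It is not the code of fs_io_manager or any other Dagster IO manager.

```python
import os
import pickle
import tempfile

NoneType = type(None)


def handle_output(path, value, declared_type):
    """Pickle `value` to `path` unless the output is declared as None.

    Returns True if a file was written, False if the write was skipped.
    """
    if declared_type is NoneType:
        # Nothing meaningful to store, so leave the file system untouched.
        return False
    with open(path, "wb") as f:
        pickle.dump(value, f)
    return True


# Usage: only the second call creates a file.
workdir = tempfile.mkdtemp()
skipped_path = os.path.join(workdir, "side_effect_only")
written_path = os.path.join(workdir, "row_counts")
handle_output(skipped_path, None, NoneType)
handle_output(written_path, [10, 20], list)
```

One practical consequence is that downstream code should not rely on a placeholder file existing for steps that only perform side effects.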

    New since 0.15.8

    • [dagster-databricks] When using the databricks_pyspark_step_launcher the events sent back to the host process are now compressed before sending, resulting in significantly better performance for steps which produce a large number of events.
    • [dagster-dbt] If an error occurs in load_assets_from_dbt_project while loading your repository, the error message in Dagit will now display additional context from the dbt logs, instead of just DagsterDbtCliFatalRuntimeError.

    Bugfixes

    • Fixed a bug that caused Dagster to ignore the group_name argument to AssetsDefinition.from_graph when a key_prefix argument is also present.
    • Fixed a bug which could cause GraphQL errors in Dagit when loading repositories that contained multiple assets created from the same graph.
    • Ops and software-defined assets with the None return type annotation are now given the Nothing type instead of the Any type.
    • Fixed a bug that caused AssetsDefinition.from_graph and from_op to fail when invoked on a configured op.
    • The materialize function, which is not experimental, no longer emits an experimental warning.
    • Fixed a bug where runs from different repositories would be intermingled when viewing the runs for a specific repository-scoped job/schedule/sensor.
    • [dagster-dbt] A regression was introduced in 0.15.8 that would cause dbt logs to show up in json format in the UI. This has been fixed.
    • [dagster-databricks] Previously, if you were using the databricks_pyspark_step_launcher, and the external step failed to start, a RESOURCE_DOES_NOT_EXIST error would be surfaced, without helpful context. Now, in most cases, the root error causing the step to fail will be surfaced instead.

    Documentation

    • New guide that walks through seamlessly transitioning code from development to production environments.
    • New guide that demonstrates using Branch Deployments to test Dagster code in your cloud environment without impacting your production data.
  • 0.15.8(Jul 28, 2022)

    New

    • Software-defined asset config schemas are no longer restricted to dicts.
    • The OpDefinition constructor now accepts ins and outs arguments, to make direct construction easier.
    • define_dagstermill_op accepts ins and outs in order to make direct construction easier.

    Bugfixes

    • Fixed a bug where default configuration was not applied when assets were selected for materialization in Dagit.
    • Fixed a bug where RunRequests returned from run_status_sensors caused the sensor to error.
    • When supplying config to define_asset_job, an error would occur when selecting most asset subsets. This has been fixed.
    • Fixed an error introduced in 0.15.7 that would prevent viewing the execution plan for a job re-execution from 0.15.0 → 0.15.6
    • [dagit] The Dagit server now returns 500 http status codes for GraphQL requests that encountered an unexpected server error.
    • [dagit] Fixed a bug that made it impossible to kick off materializations of partitioned assets if the day_offset, hour_offset, or minute_offset parameters were set on the asset’s partitions definition.
    • [dagster-k8s] Fixed a bug where setting dagster-k8s/config to override the Kubernetes command used to run a Dagster job didn’t actually override the command.
    • [dagster-datahub] Pinned version of acryl-datahub to avoid build error.
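
For readers unfamiliar with the day_offset, hour_offset, and minute_offset parameters mentioned in the Dagit fix above, the sketch below shows one plausible way such offsets translate into a five-field cron string marking partition boundaries. The function name, the mapping, and the defaults (Sunday for weekly, day 1 for monthly) are assumptions made for illustration. They are not taken from the Dagster source.

```python
def cron_for_partitions(kind, minute_offset=0, hour_offset=0, day_offset=None):
    """Build a five-field cron string for the start of each partition.

    `kind` is one of "hourly", "daily", "weekly", or "monthly".
    """
    if kind == "hourly":
        return f"{minute_offset} * * * *"
    if kind == "daily":
        return f"{minute_offset} {hour_offset} * * *"
    if kind == "weekly":
        # day_offset is a day of the week here; 0 means Sunday.
        day = 0 if day_offset is None else day_offset
        return f"{minute_offset} {hour_offset} * * {day}"
    if kind == "monthly":
        # day_offset is a day of the month here; 1 means the first.
        day = 1 if day_offset is None else day_offset
        return f"{minute_offset} {hour_offset} {day} * *"
    raise ValueError(f"unknown partition kind: {kind!r}")
```

Under these assumptions, a daily partition with hour_offset=2 and minute_offset=30 starts at "30 2 * * *", that is, 02:30 each day.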

    Breaking Changes

    • The constructor of JobDefinition objects now accepts a config argument, and the preset_defs argument has been removed.

    Deprecations

    • DagsterPipelineRunMetadataValue has been renamed to DagsterRunMetadataValue. DagsterPipelineRunMetadataValue will be removed in 1.0.

    Community Contributions

    • Thanks to @hassen-io for fixing a broken link in the docs!

    Documentation

    • MetadataEntry static methods are now marked as deprecated in the docs.
    • PartitionMappings are now included in the API reference.
    • A dbt example and memoization example using legacy APIs have been removed from the docs site.
  • 0.15.7(Jul 27, 2022)

    New

    • DagsterRun now has a job_name property, which should be used instead of pipeline_name.
    • TimeWindowPartitionsDefinition now has a get_partition_keys_in_range method which returns a sequence of all the partition keys between two partition keys.
    • OpExecutionContext now has asset_partitions_def_for_output and asset_partitions_def_for_input methods.
    • Dagster now errors immediately with an informative message when two AssetsDefinition objects with the same key are provided to the same repository.
    • build_output_context now accepts a partition_key argument that can be used when testing the handle_output method of an IO manager.
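
As a rough picture of what a method like get_partition_keys_in_range returns, the following stand-alone function lists daily partition keys between two keys. It is a sketch only: the function name is hypothetical, keys are assumed to be YYYY-MM-DD strings, and treating both endpoints as inclusive is an assumption.

```python
from datetime import date, timedelta


def daily_partition_keys_in_range(start_key, end_key):
    """Return every YYYY-MM-DD key from start_key through end_key, inclusive."""
    start = date.fromisoformat(start_key)
    end = date.fromisoformat(end_key)
    if end < start:
        raise ValueError("end_key precedes start_key")
    span = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(span + 1)]
```

Calling daily_partition_keys_in_range("2022-07-30", "2022-08-02") yields four keys and crosses the month boundary correctly because the arithmetic is done on dates rather than on strings.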

    Bugfixes

    • Fixed a bug that made it impossible to load inputs using a DagsterTypeLoader if the InputDefinition had an asset_key set.
    • Ops created with the @asset and @multi_asset decorators no longer have a top-level “assets” entry in their config schema. This entry was unused.
    • In 0.15.6, a bug was introduced that made it impossible to load repositories if assets that had non-standard metadata attached to them were present. This has been fixed.
    • [dagster-dbt] In some cases, using load_assets_from_dbt_manifest with a select parameter that included sources would result in an error. This has been fixed.
    • [dagit] Fixed an error where a race condition of a sensor/schedule page load and the sensor/schedule removal caused a GraphQL exception to be raised.
    • [dagit] The “Materialize” button no longer changes to “Rematerialize” in some scenarios
    • [dagit] The live overlays on asset views, showing latest materialization and run info, now load faster
    • [dagit] Typing whitespace into the launchpad Yaml editor no longer causes execution to fail to start
    • [dagit] The explorer sidebar no longer displays “mode” label and description for jobs, since modes are deprecated.

    Community Contributions

    • An error will now be raised if a @repository decorated function expects parameters. Thanks @roeij!
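
A check like the one described above can be written with the standard library's inspect module. The decorator below is a hypothetical illustration named repository_like. It is not Dagster's @repository, and the choice of TypeError is an assumption.

```python
import inspect


def repository_like(fn):
    """Hypothetical decorator that rejects functions declaring any parameters."""
    params = inspect.signature(fn).parameters
    if params:
        names = ", ".join(params)
        raise TypeError(
            f"{fn.__name__} must take no arguments, but declares: {names}"
        )
    return fn


@repository_like
def my_repository():
    # A zero-argument function passes the check and is returned unchanged.
    return ["job_a", "job_b"]
```

Failing at decoration time surfaces the mistake when the module is imported, rather than later when the function is called without the arguments it expects.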

    Documentation

    • The non-asset version of the Hacker News example, which lived inside examples/hacker_news/, has been removed, because it hadn’t received updates in a long time and had drifted from best practices. The asset version is still there and has an updated README. Check it out here

    All Changes

    https://github.com/dagster-io/dagster/compare/0.15.6...0.15.7

    See All Contributors
    • 57acdd5 - Correct check for pickle s3 io manager (#8834) by @ripplekhera
    • 4114910 - [dagit] Always show “Materialize” instead of “Rematerialize” based on status (#8711) by @bengotow
    • 972274c - [dagster-dbt] make group configurable for load_assets_from_dbt (#8863) by @OwenKephart
    • 05fc596 - asset_partitions_def on InputContext and OutputContext (#8858) by @sryza
    • 2d86c76 - [dagster-dbt] refactor the dbt asset integration (#8793) by @OwenKephart
    • 150bc3e - PartitionMappings when non-partitioned assets depend on partitioned assets (#8866) by @sryza
    • 70a7dbf - [dagster-dbt] seeds and snapshots are assets when using dbt build (#8794) by @OwenKephart
    • e46b9a0 - Document valid names for asset keys (#8765) by @jamiedemaria
    • 2357a02 - [docs] Dagster + dbt guide (#8714) by @OwenKephart
    • dbaed58 - 0.15.6 changelog (#8876) by @yuhan
    • 5bb50c0 - provide description for MAP ConfigType (#8824) by @Jiafi
    • b1aa83a - Retrieve minimal set of asset records for assetsLatestInfo (#8835) by @bengotow
    • b8493f3 - error when duplicate asset keys on a repository (#8874) by @sryza
    • 084c66c - [docs] - Add Airflow Operator to Op Docs (#8875) by @clairelin135
    • 2f15fbf - dagster-datahub Rest and Kafka Emitter integration (#8764) by @Jiafi
    • 0988274 - Automation: versioned docs for 0.15.6 by @elementl-devtools
    • 0e83834 - [1.0] move solid to dagster.legacy (#8843) by @dpeng817
    • bc5e502 - Extract ECS task overrides (#8842) by @jmsanders
    • e3ea175 - [graphql] tolerate empty runConfigData (#8886) by @alangenfeld
    • 56c7023 - [dagit] Fix edge case where “ “ launchpad config is not coerced to an empty object (#8895) by @bengotow
    • ee2e977 - Fix ScheduleRootQuery typo (#8903) by @johannkm
    • 61c1c20 - unloadable shit (#8887) by @prha
    • 711b323 - Change base image for OSS release builds (#8902) by @gibsondan
    • c85e158 - change deprecation warnings to 1.0 (#8892) by @dpeng817
    • cd779b1 - update README for hacker news assets example (#8904) by @sryza
    • e657abd - [hacker news] add missing key prefix to activity analytics python assets (#8817) by @sryza
    • 4da2a9e - [buildkite] Specify internal branch used for compatibility pipeline (#8881) by @smackesey
    • 6c97c75 - [dagit] Remove “mode” label and description in explorer sidebar (#8907) by @bengotow
    • 4cefd84 - remove the non-asset version of the hacker news example (#8883) by @sryza
    • 23a9997 - Error when @repository-decorated function has arguments (#8913) by @roeij
    • f787d6d - [docs] - Correct snippets for dbt (#8923) by @erinkcochran87
    • 1961e51 - [bug] fix input loading regression (#8885) by @OwenKephart
    • ff87738 - [docs] - graph backed assets doc fix (#8927) by @jamiedemaria
    • 18f254d - silence system-originated experimental warning for PartitionMapping (#8931) by @sryza
    • a2df1de - Add partition key to build_output_context, add documentation for partition_key on build_op_context (#8774) by @dpeng817
    • 53287b9 - fix dimensions of screenshot on connecting ops tutorial page (#8908) by @sryza
    • c00de5b - ttv: remove undocumented/legacy example - user in loop (#8934) by @yuhan
    • 31f3283 - [docs] - Clean up graph-backed asset example, put under test (#8893) by @dpeng817
    • 7c60a46 - [docs] - Fix garbled sentence in ops.mdx (#8935) by @schrockn
    • c554461 - enable getting asset partitions def from op context (#8924) by @sryza
    • 7c13e28 - Increase test_docker_monitoring timeout (#8906) by @johannkm
    • 6365996 - PartitionsDefinition.get_partition_keys_in_range (#8933) by @sryza
    • b58d711 - Move pipeline to dagster.legacy (#8888) by @dpeng817
    • 7e11df2 - [dagit] Rename search open event (#8954) by @hellendag
    • f3caeae - [dagit] Adjust shift-selection behavior in asset graphs (#8950) by @bengotow
    • 65caf79 - [dagit] Clean up code around the graph sidebar (#8914) by @bengotow
    • 5bd5c8b - add a job_name property to PipelineRun (#8928) by @sryza
    • 9421f73 - remove partition entries from asset op config schema (#8951) by @sryza
    • 000d37a - avoid pipelines in run status sensor doc snippets (#8929) by @sryza
    • a9b25dd - [bug] Fix issue where 'invalid' asset metadata resulted in an error (#8947) by @OwenKephart
    • 4dadcd4 - [dagster-dbt] fix tagged source asset loading (#8943) by @OwenKephart
    • 065adbd - fix black in run status sensor docs example (#8974) by @sryza
    • 164c585 - [known state] fix build_for_reexecution bug (#8975) by @alangenfeld
    • cc70c88 - Document deprecation of MetadataEntry static constructors (#8984) by @smackesey
    • eed9277 - changelog (#8986) by @jamiedemaria
    • b283b8a - 0.15.7 by @elementl-devtools
  • 0.15.6(Jul 15, 2022)

    New

    • When an exception is wrapped by another exception and raised within an op, Dagit will now display the full chain of exceptions, instead of stopping after a single exception level.
    • A default_logger_defs argument has been added to the @repository decorator. Check out the docs on specifying default loggers to learn more.
    • AssetsDefinition.from_graph and AssetsDefinition.from_op now both accept a partition_mappings argument.
    • AssetsDefinition.from_graph and AssetsDefinition.from_op now both accept a metadata_by_output_name argument.
    • define_asset_job now accepts an executor_def argument.
    • Removed package pin for gql in dagster-graphql.
    • You can now apply a group name to assets produced with the @multi_asset decorator, either by supplying a group_name argument (which will apply to all of the output assets), or by setting the group_name argument on individual AssetOuts.
    • InputContext and OutputContext now each have an asset_partitions_def property, which returns the PartitionsDefinition of the asset that’s being loaded or stored.
    • build_schedule_from_partitioned_job now raises a more informative error when provided a non-partitioned asset job
    • PartitionMapping, IdentityPartitionMapping, AllPartitionMapping, and LastPartitionMapping are exposed at the top-level dagster package. They're currently marked experimental.
    • When a non-partitioned asset depends on a partitioned asset, you can now control which partitions of the upstream asset are used by the downstream asset, by supplying a PartitionMapping.
    • You can now set PartitionMappings on AssetIn.
    • [dagit] Made performance improvements to the loading of the partitions and backfill pages.
    • [dagit] The Global Asset Graph is back by popular demand, and can be reached via a new “View global asset lineage” link on asset group and asset catalog pages! The global graph keeps assets in the same group visually clustered together, and the query bar allows you to visualize a custom slice of your asset graph.
    • [dagit] Simplified the Content Security Policy and removed frame-ancestors restriction.
    • [dagster-dbt] load_assets_from_dbt_project and load_assets_from_dbt_manifest now support a node_info_to_group_name_fn parameter, allowing you to customize which group Dagster will assign each dbt asset to.
    • [dagster-dbt] When you supply a runtime_metadata_fn when loading dbt assets, this metadata is added to the default metadata that dagster-dbt generates, rather than replacing it entirely.
    • [dagster-dbt] When you load dbt assets with use_build_command=True, seeds and snapshots will now be represented as Dagster assets. Previously, only models would be loaded as assets.
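
The idea behind the partition mappings named above can be shown with two small functions: when a non-partitioned asset depends on a partitioned one, the mapping decides which upstream partitions it reads. These are plain-Python models with hypothetical names. They are not the AllPartitionMapping or LastPartitionMapping classes themselves, and their exact semantics are assumed from the class names.

```python
def all_partitions(upstream_keys):
    """Model of an 'all partitions' mapping: read every upstream partition."""
    return list(upstream_keys)


def last_partition(upstream_keys):
    """Model of a 'last partition' mapping: read only the newest partition."""
    # Slicing keeps this safe when there are no upstream partitions yet.
    return list(upstream_keys[-1:])


upstream = ["2022-07-13", "2022-07-14", "2022-07-15"]
```

Here all_partitions(upstream) returns all three keys, which suits a downstream rollup over full history, while last_partition(upstream) returns only "2022-07-15", which suits a report on the latest day.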

    Bugfixes

    • Fixed an issue where runs that were launched using the DockerRunLauncher would sometimes use Dagit’s Python environment as the entrypoint to launch the run, even if that environment did not exist in the container.
    • Dagster no longer raises a “Duplicate definition found” error when a schedule definition targets a partitioned asset job.
    • Silenced some erroneous warnings that arose when using software-defined assets.
    • When returning multiple outputs as a tuple, empty list values no longer cause unexpected exceptions.
    • [dagit] Fixed an issue with graph-backed assets causing a GraphQL error when graph inputs were type-annotated.
    • [dagit] Fixed an issue where attempting to materialize graph-backed assets caused a graphql error.
    • [dagit] Fixed an issue where partitions could not be selected when materializing partitioned assets with associated resources.
    • [dagit] Attempting to materialize assets with required resources now only presents the launchpad modal if at least one resource defines a config schema.

    Breaking Changes

    • An op with a non-optional DynamicOutput will now error if no outputs are returned or yielded for that dynamic output.
    • If an Output object is used to type annotate the return of an op, an Output object must be returned or an error will result.
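
The second breaking change can be pictured as a check on the function's return annotation. The sketch below uses a hypothetical Output class and a hypothetical call_checked helper. It models the rule in plain Python and does not reproduce Dagster's op machinery or its error types.

```python
class Output:
    """Hypothetical stand-in for an output wrapper: a value plus optional metadata."""

    def __init__(self, value, metadata=None):
        self.value = value
        self.metadata = metadata or {}


def call_checked(fn):
    """Call fn and enforce that an Output return annotation is honored."""
    result = fn()
    annotated_as_output = fn.__annotations__.get("return") is Output
    if annotated_as_output and not isinstance(result, Output):
        raise TypeError(
            f"{fn.__name__} is annotated to return Output but returned "
            f"{type(result).__name__}"
        )
    return result


def wrapped() -> Output:
    return Output(5, metadata={"rows": 5})


def unwrapped() -> Output:
    # Violates the annotation: returns a bare int.
    return 5
```

call_checked(wrapped) succeeds and exposes both the value and its metadata, whereas call_checked(unwrapped) raises, which mirrors the stricter behavior described in the note.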

    Community Contributions

    • Dagit now displays the path of the output handled by PickledObjectS3IOManager in run logs and Asset view. Thanks @danielgafni

    Documentation

    • The Hacker News example now uses stable 0.15+ asset APIs, instead of the deprecated 0.14.x asset APIs.
    • Fixed the build command in the instructions for contributing docs changes.
    • [dagster-dbt] The dagster-dbt integration guide now contains information on using dbt with Software-Defined Assets.

    All Changes

    https://github.com/dagster-io/dagster/compare/0.15.5...0.15.6

    See All Contributors
    • 583ce34 - Fold asset_defs submodule into main dagster structure (#8446) by @smackesey
    • b4fa57e - Op that runs a kubernetes job (#8161) by @gibsondan
    • a3a5ccb - add validation for graph backed asset graphs (#8754) by @OwenKephart
    • 5d5dd71 - chore: mark snapshots as generated (#8758) by @rexledesma
    • 4a9718d - [op] fix empty list output bug (#8763) by @alangenfeld
    • 397ad03 - [dagster-dbt] Allow SDAs generated with load_assets_from_dbt* to be partitioned (#8725) by @OwenKephart
    • b007a0c - docs: update config schema descriptions for default executors (#8757) by @rexledesma
    • 58518d9 - Restore sensor daemon yielding when evaluating sensors synchronously (#8756) by @prha
    • c449c0f - bypass bucketed queries for mysql versions that do not support it (#8753) by @prha
    • b1ac8d6 - [dagster-dbt] Fix error that occurs when generating events for tests that depend on sources (#8775) by @OwenKephart
    • 2a237a0 - Specifying executor docs examples (#8530) by @dpeng817
    • eccdac2 - prevent multiple sensor evaluations from multithreaded race conditions (#8720) by @prha
    • 519d77f - Fix config case for default executor (#8777) by @dpeng817
    • daadb5a - Ensure graph inputs/outputs are included in all_dagster_types (#8736) by @smackesey
    • a94210a - improve error for build_schedule_from_partitioned_job with non-partitioned asset job (#8776) by @sryza
    • 0bab6fd - [dagit] Bring back the global asset graph as an “all asset groups” view (#8709) by @bengotow
    • f6987da - fix source asset regression (#8784) by @smackesey
    • 530c321 - fix issue with repos and partitioned scheduled asset jobs (#8779) by @sryza
    • b3ba40c - 0.15.5 Changelog (#8781) by @prha
    • 0e32054 - changelog (#8788) by @prha
    • 1452280 - Option to hide daemon heartbeat timestamp in Dagit (#8785) by @johannkm
    • 1e7691d - Fix bug with how resources are applied in materialize (#8790) by @dpeng817
    • ccd1893 - Add default_logger_defs arg to repository (#8512) by @dpeng817
    • e10b6f3 - update hackernews tests to use asset invocation and materialize_to_memory (#8592) by @dpeng817
    • 5bdd6cf - Add MetaDataEntry.path to PickledObjectS3IOManager (#8732) by @danielgafni
    • 6d19b1d - Automation: versioned docs for 0.15.5 by @elementl-devtools
    • f459da0 - add define_asset_job to __all__ (#8791) by @sryza
    • f1a4612 - eliminate incorrect SDA warnings (#8769) by @sryza
    • 738e7eb - update hacker news assets example for post-0.15.0 APIs (#7904) by @sryza
    • 5938830 - partition mappings on graph-backed assets (#8768) by @sryza
    • b721c70 - Snowflake IO Manager handles pandas timestamps (#8760) by @jamiedemaria
    • cb178b1 - Add Python 3.10 testing to BK and other image adjustments (#7700) by @smackesey
    • f4415aa - Option to skip daemon heartbeats with no errors (#8670) by @johannkm
    • 2d35b84 - Assorted type annotations (#8356) by @smackesey
    • 3e1d539 - Bump urllib3 (#8808) by @dependabot[bot]
    • de366db - Bump rsa (#8807) by @dependabot[bot]
    • 6771e78 - Bump pyyaml (#8806) by @dependabot[bot]
    • 44492db - Change default local grpc behavior to send back "dagster" as the entry point to use, rather than dagit's python environment (#8571) by @gibsondan
    • 776d701 - move Metadata and Tags concept page under jobs section (#8813) by @sryza
    • c142bd9 - updates to multi-assets docs page (#8814) by @sryza
    • 3eb0afd - Update asset ID to contain repository location and name (#8762) by @clairelin135
    • a05f9b1 - unpin gql (#8822) by @prha
    • c89d01e - remove unused old partitions ui (#8796) by @prha
    • 1539cfb - fix broken asset deps in hacker news example (#8809) by @sryza
    • 7f5cd8e - [dagit] Bump TS version (#8704) by @hellendag
    • 8da4644 - fix contributing docs (#8789) by @prha
    • 1714bc9 - Better support for nested causes in dagit (#8823) by @gibsondan
    • 641c707 - [easy] Fix docs link for RetryPolicy (#8830) by @gibsondan
    • 04fb5c1 - Remove the "cronjobs" permission from the helm chart (#8827) by @gibsondan
    • e7111f9 - fix gql resolver for graph-backed assets resources (#8825) by @smackesey
    • 418952d - refactor run storage to enable backfill status queries (#8695) by @prha
    • 3eae463 - refactor backfill / partition pages to stop run fetching (#8696) by @prha
    • 47238c2 - add multi_or_in_process_executor to __all__ (#8831) by @smackesey
    • a3ec60b - avoid apollo cache collision for partition/backfill status (#8841) by @prha
    • f741443 - distinguish between [] and None for asset queries (#8838) by @prha
    • 9906c4b - override batch loader to use asset records instead of legacy event materialization method (#8839) by @prha
    • e523dad - [dagit] Add analytics.group (#8832) by @hellendag
    • 52d4fdf - [dagster-io/ui] Fix disabled Button (#8844) by @hellendag
    • 633d6b4 - Fix issue where partitioned assets with resources fail materialization in dagit (#8837) by @smackesey
    • 652d12e - [dagit] Tweak analytics function sigs (#8851) by @hellendag
    • 7cd7de8 - [asset-defs] allow multi assets to have group names (#8847) by @OwenKephart
    • acf8c4d - Refactor op return checking code (#8755) by @dpeng817
    • df5833e - [dagit] Remove frame-ancestors restriction (#8850) by @hellendag
    • e5aca1b - adjust error messages (#8853) by @dpeng817
    • fb89b1f - [dagit] Update CRA, simplify CSP (#8854) by @hellendag
    • 50dbd6a - key_prefix for AssetsDefinition from_graph and from_op (#8859) by @jamiedemaria
    • 105ad88 - easy: fix dagster pandas link (#8862) by @yuhan
    • 1d33b3f - executor_definition on define_asset_job (#8856) by @sryza
    • c9cc22b - include airflow_operator_to_op in apidoc (#8860) by @sryza
    • 0fc21f5 - add metdata_by_output_name (#8861) by @OwenKephart
    • 3916741 - [dagster-dbt] make group configurable for load_assets_from_dbt (#8863) by @OwenKephart
    • 7eded6b - asset_partitions_def on InputContext and OutputContext (#8858) by @sryza
    • 2dfcff7 - [dagster-dbt] refactor the dbt asset integration (#8793) by @OwenKephart
    • 915948e - [dagster-dbt] seeds and snapshots are assets when using dbt build (#8794) by @OwenKephart
    • e7a82d0 - PartitionMappings when non-partitioned assets depend on partitioned assets (#8866) by @sryza
    • a555b22 - [docs] Dagster + dbt guide (#8714) by @OwenKephart
    • 6e4dfcb - 0.15.6 by @elementl-devtools
  • 0.15.5(Jul 7, 2022)

    New

    • Added documentation and helm chart configuration for threaded sensor evaluations.
    • Added documentation and helm chart configuration for tick retention policies.
    • Added descriptions for default config schema. Fields like execution, loggers, ops, and resources are now documented.
    • UnresolvedAssetJob objects can now be passed to run status sensors.
    • [dagit] A new global asset lineage view, linked from the Asset Catalog and Asset Group pages, allows you to view a graph of assets in all loaded asset groups and filter by query selector and repo.
    • [dagit] A new option on Asset Lineage pages allows you to choose how many layers of the upstream / downstream graph to display.
    • [dagit] Dagit's DAG view now collapses large sets of edges between the same ops for improved readability and rendering performance.

    Bugfixes

    • Fixed a bug with materialize that would cause required resources to not be applied correctly.
    • Fixed issue that caused repositories to fail to load when build_schedule_from_partitioned_job and define_asset_job were used together.
    • Fixed a bug that caused auto run retries to always use the FROM_FAILURE strategy
    • Previously, it was possible to construct Software-Defined Assets from graphs whose leaf ops were not mapped to assets. This is invalid, as these ops are not required for the production of any assets, and would cause confusing behavior or errors on execution. This will now result in an error at definition time, as intended.
    • Fixed issue where the run monitoring daemon could mark completed runs as failed if they transitioned quickly between STARTING and SUCCESS status.
    • Fixed stability issues with the sensor daemon introduced in 0.15.3 that caused the daemon to fail heartbeat checks if the sensor evaluation took too long.
    • Fixed issues with the thread pool implementation of the sensor daemon where race conditions caused the sensor to fire more frequently than the minimum interval.
    • Fixed an issue with storage implementations using MySQL server version 5.6 which caused SQL syntax exceptions to surface when rendering the Instance overview pages in Dagit.
    • Fixed a bug with the default_executor_def argument on repository where asset jobs that defined executor config would result in errors.
    • Fixed a bug where an erroneous exception would be raised if an empty list was returned for a list output of an op.
    • [dagit] Clicking the "Materialize" button for assets with configurable resources will now present the asset launchpad.
    • [dagit] If you have an asset group and no jobs, Dagit will display it by default rather than directing you to the asset catalog.
    • [dagit] DAG renderings of software-defined assets now display only the last component of the asset's key for improved readability.
    • [dagit] Fixed a regression where clicking on a source asset would trigger a GraphQL error.
    • [dagit] Fixed an issue where the “Unloadable” section on the sensors / schedules pages in Dagit was populated erroneously with loadable sensors and schedules.
    • [dagster-dbt] Fixed an issue where an exception would be raised when using the dbt build command with Software-Defined Assets if a test was defined on a source.
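
The definition-time check for graph-backed assets described above (every leaf op must map to an asset) amounts to a small graph validation. The sketch below assumes the graph is given as a dict from each op name to the ops that consume its output. All names are hypothetical and the error type is an assumption.

```python
def unmapped_leaf_ops(downstream, op_to_asset):
    """Return leaf ops (ops with no consumers) that are not mapped to an asset."""
    leaves = [op for op, consumers in downstream.items() if not consumers]
    return sorted(op for op in leaves if op not in op_to_asset)


def validate_graph_backed_asset(downstream, op_to_asset):
    """Raise at definition time if any leaf op produces no asset."""
    missing = unmapped_leaf_ops(downstream, op_to_asset)
    if missing:
        raise ValueError(f"leaf ops not mapped to assets: {', '.join(missing)}")


# "log_stats" is a leaf but produces no asset, so this graph is invalid.
graph = {"fetch": ["clean", "log_stats"], "clean": [], "log_stats": []}
mapping = {"clean": "cleaned_table"}
```

With this input, unmapped_leaf_ops(graph, mapping) reports log_stats. Such an op is never needed to produce an asset, so rejecting the graph up front avoids the confusing runtime behavior the note mentions.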

    Deprecations

    • Removed the deprecated dagster-daemon health-check CLI command

    Community Contributions

    • TimeWindow is now exported from the dagster package (Thanks @nvinhphuc!)
    • Added a fix to allow customization of Slack messages (Thanks @solarisa21!)
    • [dagster-databricks] The databricks_pyspark_step_launcher now allows you to configure the following (Thanks @Phazure!):
      • the aws_attributes of the cluster that will be spun up for the step.
      • arbitrary environment variables to be copied over to Databricks from the host machine, rather than requiring these variables to be stored as secrets.
      • job and cluster permissions, allowing users to view the completed runs through the Databricks console, even if they’re kicked off by a service account.

    Experimental

    • [dagster-k8s] Added k8s_job_op to launch a Kubernetes Job with an arbitrary image and CLI command. This is in contrast with the k8s_job_executor, which runs each Dagster op in a Dagster job in its own k8s job. This op may be useful when you need to orchestrate a command that isn't a Dagster op (or isn't written in Python). Usage:

      from dagster_k8s import k8s_job_op

      # Configure the op to run a single shell command in a busybox container.
      my_k8s_op = k8s_job_op.configured(
          {
              "image": "busybox",
              "command": ["/bin/sh", "-c"],
              "args": ["echo HELLO"],
          },
          name="my_k8s_op",
      )
      
    • [dagster-dbt] The dbt asset-loading functions now support partitions_def and partition_key_to_vars_fn parameters, adding preliminary support for partitioned dbt assets. To learn more, check out the GitHub issue!
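
    As a rough illustration of the partition_key_to_vars_fn parameter mentioned above, the sketch below shows one possible mapping function. It assumes the function receives a partition key string and returns a dict of dbt vars; the daily "YYYY-MM-DD" key format, the var names run_date and next_date, and the project path in the commented wiring are hypothetical and not taken from the release notes.

```python
from datetime import date, timedelta
from typing import Dict


def partition_key_to_vars(partition_key: str) -> Dict[str, str]:
    """Map a daily partition key such as "2022-07-06" to a dict of dbt vars.

    The var names `run_date` and `next_date` are hypothetical; use whatever
    names your dbt models read through var().
    """
    day = date.fromisoformat(partition_key)  # raises ValueError on a malformed key
    return {
        "run_date": day.isoformat(),
        "next_date": (day + timedelta(days=1)).isoformat(),
    }


# Assumed wiring (requires dagster, dagster-dbt, and a dbt project;
# the project path is a placeholder):
#
#   from dagster import DailyPartitionsDefinition
#   from dagster_dbt import load_assets_from_dbt_project
#
#   dbt_assets = load_assets_from_dbt_project(
#       project_dir="path/to/dbt_project",
#       partitions_def=DailyPartitionsDefinition(start_date="2022-01-01"),
#       partition_key_to_vars_fn=partition_key_to_vars,
#   )

if __name__ == "__main__":
    print(partition_key_to_vars("2022-07-06"))
```

    Keeping the mapping as a plain function makes it easy to unit test separately from dbt and Dagster.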

    All Changes

    https://github.com/dagster-io/dagster/compare/0.15.4...0.15.5

    See All Contributors
    • 5192aa0 - Remove unused check_heartbeats arg (#8673) by @johannkm
    • 45e5019 - docs: use dagster brand colors for README (#8660) by @rexledesma
    • a09234a - fix: use absolute url for README images (#8676) by @rexledesma
    • 7a34e49 - Expose TimeWindow in dagster package (#8643) by @nvinhphuc
    • 9842ba8 - hold shift to force asset config modal (#8668) by @smackesey
    • d0ffb1e - add helm values for configuring instance sensor config (#8657) by @prha
    • b335d19 - [dagit] If you have no jobs, prefer routing to asset group over asset catalog (#8613) by @bengotow
    • 44bac29 - [dagit] Add control for graph depth on the Asset Lineage page, default to 5 (#8531) by @bengotow
    • 593cfc4 - [assets] Fix issue with graph backed assets + partitions (#8682) by @OwenKephart
    • eeb23e6 - [dagster-dbt] rework dagster dbt logging, cleanup (#8681) by @OwenKephart
    • b6396aa - Yield run requests from run status sensors (#8635) by @jamiedemaria
    • f53d5f4 - fix docs build (#8688) by @OwenKephart
    • 62801c8 - apidoc for AssetSelection and AssetsDefinition (#8618) by @sryza
    • 17b5233 - Materialize has resources arg, materialize_to_memory sets mem_io_manager for all io managers (#8659) by @dpeng817
    • 44d95b1 - Load asset launchpad for assets with configurable resources (#8677) by @smackesey
    • 01cfe05 - docs: update external links for github issues (#8661) by @rexledesma
    • c1cb62b - docs: convert github issue templates to forms (#8663) by @rexledesma
    • d338a0e - switch unloadable states to dedupe by selector id (#8656) by @prha
    • d3678bc - 0.15.3 changelog (#8690) by @dpeng817
    • e0d5dcb - Automation: versioned docs for 0.15.3 by @elementl-devtools
    • dc91a2d - docs: remove extra cruft from issue form (#8701) by @rexledesma
    • ca432f7 - docs: improve left nav for items with children DREL-359 (#8693) by @yuhan
    • b88b33e - docs: fix prev/next pagination (#8697) by @yuhan
    • 47118bb - docs: docs test should capture Next Image broken links and fix broken links (#8702) by @yuhan
    • 293db00 - [dagit] Display last asset key component on DAG rather than truncated full path (#8692) by @bengotow
    • 0d65304 - [dagit] Collapsed DAG rendering of multiple edges between the same ops (#8479) by @bengotow
    • 7d5bcc8 - [dagit] Add Analytics context (#8674) by @hellendag
    • 3e2fe7e - order the backfill partitions before creating (#8703) by @prha
    • 0533455 - skip threadpool sensor daemon tests (#8717) by @prha
    • 863ba80 - skip threaded sensor tests in py36 (#8726) by @prha
    • dc0fbe8 - Allow run status sensors to support unresolved asset jobs (#8689) by @smackesey
    • 979d644 - [dagster_databricks] - support configuration of job / cluster permissions (#8683) by @Phazure
    • a9ff7ca - [dagster_databricks] support aws_attributes (#8684) by @Phazure
    • 150a06b - [Job Log perf] Use rAF to call throttleSetNodes (#8735) by @salazarm
    • 33f059f - [dagster_databricks] support arbitrary env variables (#8685) by @Phazure
    • c72634c - minor changes to make dev_install (#8745) by @smackesey
    • bc227bb - Remove deprecated daemon health-check cli (#8751) by @johannkm
    • a57caf3 - Add reexecution strategy to auto run retries (#8718) by @johannkm
    • d90fc4a - Fix for blocks_fn option (#8448) by @clairelin135
    • 2d9a9e7 - Automation: versioned docs for 0.15.4 by @elementl-devtools
    • d4735dc - Improve race condition in run monitor (#8729) by @brad-alexander
    • 6f4b300 - Add get_daemon_statuses instance method (#8752) by @johannkm
    • b974184 - 0.15.4 changelog here (#8766) by @prha
    • bdca087 - add retention helm values (#8724) by @prha
    • 823a5d8 - Op that runs a kubernetes job (#8161) by @gibsondan
    • 62f41c4 - [op] fix empty list output bug (#8763) by @alangenfeld
    • 4d88332 - [dagster-dbt] Allow SDAs generated with load_assets_from_dbt* to be partitioned (#8725) by @OwenKephart
    • 874a368 - add validation for graph backed asset graphs (#8754) by @OwenKephart
    • 38ae84b - docs: update config schema descriptions for default executors (#8757) by @rexledesma
    • 0779ab7 - [dagster-dbt] Fix error that occurs when generating events for tests that depend on sources (#8775) by @OwenKephart
    • 5f92ca1 - Restore sensor daemon yielding when evaluating sensors synchronously (#8756) by @prha
    • 4176e42 - bypass bucketed queries for mysql versions that do not support it (#8753) by @prha
    • fd1a3d9 - prevent multiple sensor evaluations from multithreaded race conditions (#8720) by @prha
    • 4058bb5 - Fold asset_defs submodule into main dagster structure (#8446) by @smackesey
    • 87e291e - Fix config case for default executor (#8777) by @dpeng817
    • 1544643 - Ensure graph inputs/outputs are included in all_dagster_types (#8736) by @smackesey
    • eeeb82c - [dagit] Bring back the global asset graph as an “all asset groups” view (#8709) by @bengotow
    • e40122e - fix source asset regression (#8784) by @smackesey
    • 2fe13df - improve error for build_schedule_from_partitioned_job with non-partitioned asset job (#8776) by @sryza
    • a42146f - fix issue with repos and partitioned scheduled asset jobs (#8779) by @sryza
    • 798e9c1 - 0.15.5 Changelog (#8781) by @prha
    • 39fecd4 - Fix bug with how resources are applied in materialize (#8790) by @dpeng817
    • 8f9f0d0 - 0.15.5 by @elementl-devtools
  • 0.15.4(Jul 6, 2022)

    Bugfixes

    • Reverted sensor threadpool changes from 0.15.3 to address daemon stability issues.

    All Changes

    https://github.com/dagster-io/dagster/compare/0.15.3...0.15.4

    See All Contributors
    • d0bc0b4 - Automation: versioned docs for 0.15.3 by @elementl-devtools
    • 73ca70a - Revert "bump timeout for large sensor test (#8671)" by @johannkm
    • c988915 - Revert "Add a threadpool to the sensor daemon (#8642)" by @johannkm
    • 9441ce3 - 0.15.4 by @elementl-devtools