Open-source data observability for modern data teams

Overview

Use cases

Monitor your data warehouse in minutes:

  • Data anomalies monitoring as dbt tests
  • Data lineage made simple, reliable, and automated
  • dbt operational monitoring
  • Slack alerts

⭐ Support us with a ⭐

Quick start

Quick start: Data monitoring as dbt tests in minutes.

Quick start: Data lineage.

Our full documentation is available here.

Join our Slack to learn more about Elementary.

(Not a dbt user? You can still use Elementary data monitoring. Reach out to us on Slack and we will help.)

Data anomalies monitoring as dbt tests

Elementary delivers data monitoring and anomaly detection as dbt tests.

Elementary dbt tests are data monitors that collect metrics and metadata over time. On each execution, the tests analyze the new data, compare it to historical metrics, and alert on anomalies and outliers.

Elementary data monitors as tests are configured and executed like native tests in your project!
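
As a rough illustration of the compare-to-history step described above, here is a minimal Python sketch that flags a new metric value whose z-score against past values exceeds a threshold. The function name, the threshold of 3, and the sample row counts are assumptions made for illustration; they are not taken from Elementary's implementation, which runs as dbt tests.

```python
from statistics import mean, stdev


def is_anomalous(history, new_value, threshold=3.0):
    """Return True if new_value is more than `threshold` standard
    deviations away from the mean of the historical values."""
    if len(history) < 2:
        # Not enough history to estimate a spread; do not alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All historical values are identical: any change is an outlier.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Hypothetical daily row counts collected on previous test runs.
daily_row_counts = [1000, 1020, 990, 1010, 1005]

print(is_anomalous(daily_row_counts, 1008))  # close to the usual volume
print(is_anomalous(daily_row_counts, 120))   # a sharp drop in volume
```

A fixed z-score threshold is only one possible rule; it is used here because it is easy to reason about.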

Demo & sandbox

Data anomalies monitoring as dbt tests demo video.
Try out our live lineage sandbox here.

Slack configuration

Community & Support

For additional information and help, you can use one of these channels:

  • Slack (Live chat with the team, support, discussions, etc.)
  • GitHub issues (Bug reports, feature requests)
  • Roadmap (Vote for features and add your inputs)
  • Twitter (Updates on new releases and stuff)

Integrations

  • Snowflake
  • BigQuery
  • Redshift - Data monitoring

Ask us for integrations on Slack or as a GitHub issue.

License

Elementary is licensed under Apache License 2.0. See the LICENSE file for licensing information.

Comments
  • How to use days back in report and anomaly detection tests?

    We had two questions about this in #support on Slack recently. We need to add it to the FAQ and see if there are other parts of the flow where we can better address this.

    documentation good first issue 
    opened by Maayan-s 9
  • Parse failed rows as part of the package

    Failed dbt test results return an error message that includes the number of failed rows. Today we load the message as is in the package, and parse the number in the CLI using Python. If we move the parsing to the package, users could run their analysis on these data.

    enhancement dbt package 
    opened by Maayan-s 7
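
The parsing step mentioned in the issue above could look roughly like the following Python sketch. The message shape ("Got N results, configured to fail if != 0") and the function name are assumptions made for illustration; the actual messages and the CLI's own parsing code may differ.

```python
import re
from typing import Optional

# Assumed message shape, e.g. "Got 3 results, configured to fail if != 0".
_FAILED_ROWS_RE = re.compile(r"Got (\d+) results?")


def parse_failed_rows(message: Optional[str]) -> Optional[int]:
    """Extract the number of failed rows from a test failure message.

    Returns None when the message is empty or has an unexpected shape.
    """
    if not message:
        return None
    match = _FAILED_ROWS_RE.search(message)
    return int(match.group(1)) if match else None


print(parse_failed_rows("Got 3 results, configured to fail if != 0"))
```

Returning None for unrecognized messages keeps the raw message usable even when the number cannot be extracted.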
  • Integrate lineage visualization with the report UI

    Motivation

    The report UI is awesome. When we find failures on a table, we would like to investigate affected downstream tables. So, it would be great to support the lineage feature in the report UI.

    feature 
    opened by yu-iskw 7
  • [Question] How can we prepare `table_monitors_config`?

    Overview

    I tried to monitor dbt tests with jaffle_shop by following the documentation, but I was not able to upload artifacts because the destination table is missing. If I am correct, we have to create the table table_monitors_config ahead of time, but the documentation doesn't describe table_monitors_config. How can we prepare the table?

    Environments

    • Python 3.8
    • dbt 1.0.3
    • elementary 0.3.2.

    Error message

    09:22:03  Running 2 on-run-end hooks
    09:22:33  1 of 2 START hook: jaffle_shop.on-run-end.0..................................... [RUN]
    09:22:33  1 of 2 OK hook: jaffle_shop.on-run-end.0........................................ [OK in 0.00s]
    09:22:33  2 of 2 START hook: elementary.on-run-end.0...................................... [RUN]
    09:22:33  2 of 2 OK hook: elementary.on-run-end.0......................................... [OK in 0.00s]
    09:22:33
    09:22:33
    09:22:33  Finished running 8 view models, 9 incremental models, 2 table models, 3 seeds, 17 tests, 3 hooks in 44.13s.
    09:22:33
    09:22:33  Completed with 2 errors and 0 warnings:
    09:22:33
    09:22:33  Runtime Error in model filtered_information_schema_columns (models/edr/metadata_store/filtered_information_schema_columns.sql)
    09:22:33    404 Not found: Table sandbox-project:jaffle_shop_elementary.table_monitors_config was not found in location asia-northeast1
    09:22:33
    09:22:33    (job ID: d0f16aa6-b33d-493f-b0d1-8b934a090682)
    09:22:33
    09:22:33  Runtime Error in model filtered_information_schema_tables (models/edr/metadata_store/filtered_information_schema_tables.sql)
    09:22:33    404 Not found: Table sandbox-project:jaffle_shop_elementary.table_monitors_config was not found in location asia-northeast1
    09:22:33
    09:22:33    (job ID: 03fc0b75-262f-49f8-8957-96b55268e9ac)
    
    opened by yu-iskw 7
  • support browser authentication method

    Our dbt_runner expects the log messages of dbt run-operation to be in JSON format. However, the browser authentication scenario returns log messages as plain strings. So instead of failing on JSON parsing, I log the failure and the unsupported log message to our edr log file. This is OK because run-operation log messages should be in JSON format, so anything returned as a string can currently be ignored. And if something breaks in the future, we will have the edr log to understand what happened and what we should change.

    opened by IDoneShaveIt 6
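
The tolerant parsing described in the pull request above can be sketched as follows. This is a simplified illustration, not the project's dbt_runner code: the function name, the logger name, and the sample log lines are assumptions.

```python
import json
import logging

logger = logging.getLogger("edr_sketch")  # hypothetical logger name


def parse_log_lines(lines):
    """Parse JSON log lines, logging and skipping anything that is not a JSON object."""
    parsed = []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Keep the unsupported message for later debugging instead of failing.
            logger.debug("Skipping non-JSON log line: %r", line)
            continue
        if isinstance(entry, dict):
            parsed.append(entry)
        else:
            logger.debug("Skipping JSON value that is not an object: %r", line)
    return parsed


sample = [
    '{"level": "info", "msg": "Running operation"}',
    "Initiating login request with your identity provider.",  # plain string
    '{"level": "info", "msg": "Done"}',
]
print(parse_log_lines(sample))
```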
  • Set "elementary_tests" schema to upper case

    Hi,

    I'm using elementary with Snowflake, where we have set all of our schemas as upper-case. However, in the elementary dbt package the "__tests" is hard-coded as lower-case resulting in a schema named "ELEMENTARY__tests".

    Can you give the option of creating a var for this, so we can choose a lower-case or upper-case suffix for "__tests"?

    (Raised this via slack here: https://elementary-community.slack.com/archives/C02CTC89LAX/p1660646926498189)

    Thanks!

    feature dbt package 
    opened by ltw94 6
  • New monitor: column values distribution

    The feature is a new monitor for column anomalies detection.

    Monitor goal: Detecting a change in the distribution of the different values in a column.

    Example: An example would be an orders table which represents orders placed across multiple stores. There's a column in the table which represents the store the order was placed in. We'd then want to have a test fail if the count of orders by day for a store dropped significantly below the typical value for that store. Something along the lines of:

    select column_name, count(*)
    from table
    group by column_name
    

    Possible implementations:

    Package:

    1. Add new CTE in the current column_monitoring_query
    2. Create a new query for this test, and add it to the flow of test_column_anomalies

    Anomaly detection does not need to change to support this.

    CLI: We need to think about how to present it in the UI. Currently, we have a metric graph for each monitor+column. This test will output a metric per value+monitor+column.

    enhancement 
    opened by Maayan-s 5
  • [Feature] Store elementary results in a single schema

    At the time of writing, with elementary 0.4.1, elementary persists results per dbt schema. Personally, I find this gets messy, as I have a lot of dbt schemas, that is, BigQuery datasets. Consider a dbt project with 60 BigQuery datasets across 5 Google Cloud projects: the same number of new BigQuery datasets is created for elementary. I would like to bundle them in a single BigQuery dataset to keep things clean. Moreover, it would be nice to specify a single destination schema in a database, that is, a single destination BigQuery dataset in a GCP project.

    opened by yu-iskw 5
  • SQL compilation error when using column-level anomalies

    Hey all,

    I added the elementary package to the dbt repository and used dbt run to create all required tables. But when I tried to add column-level anomalies, dbt run gave me the following error:

    19:10:32    001003 (42000): SQL compilation error:
    19:10:32    syntax error line 7 at position 19 unexpected '""'.
    19:10:32    syntax error line 10 at position 15 unexpected ''day''.
    19:10:32    syntax error line 10 at position 26 unexpected '('.
    19:10:32    syntax error line 10 at position 40 unexpected 'as'.
    19:10:32    syntax error line 12 at position 1 unexpected ')'.
    

    The configuration I added to the yml file is as:

      - name: table_name
        config:
          elementary:
            timestamp_column: "_inserted_at"
        tests:
          - elementary.table_anomalies:
              table_anomalies:
                - row_count
                - freshness
        columns:
          - name: "id"
            description: " "
            quote: true
            tests:
              - not_null
              - unique
              - elementary.column_anomalies:
                  column_anomalies:
                    - missing_count
                    - min_length
    

    However, table-level anomalies worked as expected. I tried to look up compiled SQL files from target/compiled and target/run, but couldn't find any models relevant to this problem. Any ideas?

    opened by nbdaw-st 5
  • Running process got stuck

    Hi team, I tried to run elementary, but it just got stuck after logging into Snowflake. Nothing has changed on my screen for at least an hour.

    Do you have any ideas about what could go wrong?

    opened by TrololoLi 4
  • Issue #301 : add support for `timezone` param in slack alerts

    Providing support to set a timezone parameter in the config.yml for Slack alerts.

    This way users are able to configure the timezone that the timestamp will be converted to. By default, and for backwards compatibility in case the parameter is not set, we still convert to the user's local time.

    opened by Nic3Guy 3
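
The behaviour described in the pull request above (convert to a configured timezone, otherwise fall back to local time) could be sketched like this. The function and parameter names are hypothetical and not taken from the project; the sketch assumes Python 3.9+ with time zone data available to zoneinfo.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def localize_alert_time(ts_utc: datetime, tz_name: Optional[str] = None) -> datetime:
    """Convert an aware UTC timestamp to the configured timezone.

    When no timezone is configured, fall back to the system's local
    timezone, which mirrors the backwards-compatible default.
    """
    if tz_name:
        return ts_utc.astimezone(ZoneInfo(tz_name))
    return ts_utc.astimezone()  # no argument: system local timezone


detected_at = datetime(2022, 9, 7, 12, 0, tzinfo=timezone.utc)
print(localize_alert_time(detected_at, "Asia/Tokyo").isoformat())
print(localize_alert_time(detected_at).isoformat())  # depends on the machine
```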
  • Deleted scripts that are prone to errors due to their dependence on the developer's environment.

    The only way to deploy a package should be using the pypi-release workflow. Manually deploying the package is more susceptible to errors and mistakes as we've seen in the past. Also some of those scripts don't work.

    opened by elongl 0
  • Allow Boto3 to determine credentials, or have the local IAM role as an option for S3 report uploads.

    I want to run Elementary as part of an Airflow job, running in an AWS EKS Pod that has a role that allows access to the bucket. The current way that AWS credentials are handled in the S3 Client requires that the user provide either an AWS credential profile, which also ultimately has to have credentials, or AWS Secret/Access Keys which ends up stopping boto3 from finding the IAM role and using that for authentication. I suppose it could be a flag in the CLI, and the validation in config.py could check for the flag as well to skip the other credentials. If there's a preference either way, I could submit a PR for this, it's a small change I think.

    enhancement CLI 
    opened by luthes 1
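
One way to approach the request above is to pass credentials to boto3 only when they were explicitly configured, and otherwise pass nothing so that boto3's default credential chain (which includes IAM roles) takes over. The helper below is a hypothetical sketch, not the project's S3 client code; only the keyword names profile_name, aws_access_key_id, and aws_secret_access_key are real boto3.Session parameters.

```python
from typing import Dict, Optional


def build_session_kwargs(
    profile_name: Optional[str] = None,
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for boto3.Session from optional settings."""
    if profile_name:
        return {"profile_name": profile_name}
    if access_key_id and secret_access_key:
        return {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
        }
    # Nothing configured: let boto3 resolve credentials on its own,
    # for example from the IAM role attached to the pod or instance.
    return {}


# Usage, assuming boto3 is installed:
#   import boto3
#   s3 = boto3.Session(**build_session_kwargs()).client("s3")
print(build_session_kwargs())
print(build_session_kwargs(profile_name="reports"))
```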
Releases(v0.4.11)
  • v0.4.11(Sep 7, 2022)

    New Features

    • Support uploading the report to flexible path in S3 & GCS buckets 😎
    • Support configuring slack channel also at the test level 💯

    Bug Fixes

    • Lineage screen fixes and improvements ✌🏻
    • Fix Slack rate limit error

    Contributions & Acknowledgements

    Thanks @YashPimple for making his first contribution.

  • v0.4.10(Aug 30, 2022)

  • v0.4.9(Aug 29, 2022)

    New Features

    • New Lineage screen 🥳 🎉 🎈 dbt lineage enriched with test results.
    • Browser authentication support via SSO in profiles.yml.
    • Custom report name in send-report.
    • edr returns exit codes according to whether it succeeded or failed.
    • A new Github Action for running edr in an automated manner.

    Infrastructure

    • Fixed a report sidebar issue that occurred when the string "files" was part of the models path.
    • Added CI to automatically run E2E tests using Github Actions.
    • Added more logs when CLI fails to expedite incident resolution.

  • v0.4.8(Aug 15, 2022)

  • v0.4.7(Aug 14, 2022)

    New Changes

    • New! Databricks support (beta)!! ✌🏻💯
    • New! Dimension values monitoring!! 💪🏻
    • New! S3 / GCS integration (upload report & static website support)!! 😎
    • New! Docs are a first-class citizen and part of the repository!! 🤯

    Acknowledgements & Contributions

    • @hahnbeelee for making first contribution 👏🏻
    • @hanywang2 for making first contribution 👏🏻
    • @Aylr for making first contribution 👏🏻
  • v0.4.6(Jul 27, 2022)

    Same as v0.4.5 but with the following fixes:

    • Fixed dependencies issue between platforms (BigQuery, Redshift, Snowflake)
    • Fixed edr monitor missing alert modules
    • Fixed duplicated values in UI filters
  • v0.4.5(Jul 25, 2022)

    New Changes

    • New! Inspect upstream and downstream test results in UI
    • New! Alerts on models and snapshots failures and errors
    • Added the option to subscribe for alerts
    • Added custom Slack channel for alerts
    • Long tests queries support in alerts
    • Configurable name for report file
    • Flag for sampling passed Elementary anomaly tests

    Bug Fixes

    • Fixed error status tests failing the report
    • Fixed multiple owners in alerts
    • Fixed race condition in alerts when multiple dbt test jobs are running
    • Fixed Slack token integration bug due to Slack API pagination
  • v0.4.4(Jul 14, 2022)

    New Changes

    • Added an error page on a failed report.

    Bug Fixes

    • Handled a race condition when multiple dbt test runs execute concurrently with edr monitor.
  • v0.4.2(Jul 6, 2022)

    New Changes

    • New test runs screen to monitor test executions!!!
    • Added support for sending Elementary's report via Slack!!!
    • Added filters and sorting to the UI table.
    • Made Elementary 'pass' tests expandable as well for visibility.
    • Added 'error' status support for tests that didn't complete successfully.
    • Improved Slack alerts reliability.
    • Added support for Slack tokens.

    Bug Fixes

    • Slack alerts with long text failed due to Slack limitation.
    • Failed to parse a list of model owners in the Slack alerts.

    Acknowledgements & Contributions

    • @nimrodne
    • @shahafa
  • v0.4.1(Jun 20, 2022)

    New Changes

    • Elementary now supports showing results and sending alerts also for dbt's Singular tests!
    • Added status code to the CLI to get better indication of a CLI successful run
    • Added support for showing dbt sub types in the UI
  • v0.4.0(Jun 17, 2022)

    New Changes

    • Elementary now supports showing its dbt package results in a UI! 🥇
    • New CLI command to open Elementary UI - edr monitor report
    • Elementary UI shows test results and metrics for both Elementary and regular dbt tests

    Acknowledgements & Contributions

    • @IDoneShaveIt for making his first contribution
    • @nimrodne for FE development
    • @shahafa for FE development
    • @elongl

    (Bumped version from 0.2.9 to 0.4.0 for compatibility with the dbt package version)

  • v0.2.9(May 18, 2022)

  • v0.2.7(May 12, 2022)

    • Added detailed alerts on regular dbt test failures
    • Added rich metadata to alerts including owners, tags, params, query, sample rows and more
    • Added slack webhook CLI param
    • Added new alerts formatting
  • v0.2.6(Apr 25, 2022)

  • v0.2.5(Mar 28, 2022)

  • v0.2.4(Mar 21, 2022)

  • v0.2.3(Mar 20, 2022)

  • v0.2.2(Mar 16, 2022)

    This version presents the following enhancements:

    • New alerts aggregation
    • Supports our new dbt package:
      • Monitors are natively defined as dbt tests
      • Monitors run as part of dbt test
  • v0.2.1(Mar 10, 2022)

  • v0.2.0(Mar 4, 2022)

    This version presents the following enhancements:

    • Configuration directly from dbt yml files 👍
    • Auto-upload of dbt artifacts to the DWH 💯
    • New anomaly detection module 👍
    • New dbt artifacts module 🥇
    • New alerts for table and column level anomalies 💯
  • v0.1.5(Feb 16, 2022)

  • v0.1.4(Feb 10, 2022)

  • v0.1.3(Feb 10, 2022)

    • Faster and much more accurate lineage based on the new access_history feature in Snowflake!
    • Fast graph explorations as now graph generation is a separate command from visualization
    • New graph filters on db, schema and table based on '+' operator
  • v0.1.2(Feb 1, 2022)

  • v0.1.1(Feb 1, 2022)

  • v0.1.0(Jan 30, 2022)

    • New monitoring module! (see the docs to learn more)
    • New internal dbt package to automatically detect schema changes
    • Slack alerts support
    • New command line interface, that supports both data lineage and monitoring
    • Faster lineage!
    • New interactive progress bars and spinners
  • v0.0.23(Nov 29, 2021)

    • Added support for MERGE queries in Snowflake
    • Changed ignore_schema default so now lineage by default will present every table in the database
  • v0.0.22(Nov 10, 2021)

    • Added query id to the monitoring details
    • Added progress bars to all major phases
    • Fixed a bug with wrong volume reporting in Snowflake (specifically when pulling from account_usage)
  • v0.0.21(Nov 9, 2021)

    • Fixed missing monitoring details when using table filters
    • Fixed usage issue when using both ignore_schema and table flags
    • Added safeguard to avoid failing the entire run when parsing of a specific query fails
    • Added query duration to the monitoring details
    • Fixed some issues with anonymous metrics
    • Saved all failed queries to a failed_queries.json
    • Changed the --export-query-history flag default to be True (now it exports the history by default)
  • v0.0.20(Nov 3, 2021)
