dbt-firebolt
dbt-firebolt supports dbt 0.21 and newer
Installation
First, download the JDBC driver and place it wherever you'd prefer. If you've never installed a Java Runtime Environment you will need to download and install one from either OpenJDK or Oracle.
Now install dbt-firebolt. For the current version:
pip install dbt-firebolt
Setup
Engines
For dbt to function with Firebolt you must have an engine connected to your database and available. In addition, these needs must be met:
- The engine must be a general-purpose read-write engine.
- You must have permissions to access the engine.
- The engine must be running.
- If you're not using the default engine for the database, the name of the engine must be specified.
YAML configuration file
You'll need to add your project to the profiles.yml
file. These fields are necessary:
type
user
password
database
schema
jar_path
These fields are optional:
engine_name
(See note above.)host
(The host defaults toapi.app.firebolt.io
. If you want to use a dev account you must include thehost
field and set it toapi.dev.firebolt.io
.)
Note that, although the value of type
is always firebolt
, it must be included either in profiles.yml
or in the dbt_project.yml file for your application.
Example file:
my_project:
target: fb_app
fb_app:
type: firebolt
user:
password:
database:
schema:
jar_path:
threads: 1 # The following two fields are optional. Please see the notes above. engine_name:
host: api.app.firebolt.io
dbt feature support
feature | supported |
---|---|
tables/views | |
ephemeral | |
tests | |
docs | |
incremental | |
snapshot | |
source_freshness | |
Model Configuration in Firebolt
Dimension and Fact Tables
Both fact and dimension tables are supported. When materialized='table'
, table type
and primary index
configurations can be set. table type
can be either fact
or dimension
. primary_index
is required for fact tables, and can be a string or a list of strings.
These configs can be set by either:
- a config block at the top of that model's SQL file (like below), or
{{
config(
materialized = 'table',
table_type = 'dimension',
primary_index = ['customer_id', 'first_name']
)
}}
- in the
dbt_project.yml
or model schema YAML file.
Read more in dbt docs on configuring models
Join and Aggregating Indexes
In addition to primary indexes, Firebolt also supports:
dbt-firebolt follows the same convention for indexes as was introduced to dbt-postgres (more info on indexes usage in dbt-postgres).
Naming
In dbt-firebolt, indexes are named as follows, with the number being a unix timestamp at the time of execution
- template:
table-name__key-column__index-type_unix-timestamp
- join index:
my_orders__order_id__join_1633504263
- aggregating index:
my_orders__order_id__aggregating_1633504263
Usage
The index
argument takes a list of dictionaries, where each dictionary is an index you'd like to define. there are two types
of indexes that can be defined here: aggregating
and join
. The required fields for each index are as follows:
aggregating
:key_column
(string) andaggregation
(string of list of strings)join
:join_column
(string) anddimension_column
(string of list of strings)
Fact table with aggregating index
-- orders.sql
{{
config(
materialized = 'table',
table_type = 'fact',
primary_index = 'id',
indexes = [
{
'type': 'aggregating',
'key_column': 'order_id',
'aggregation': ['COUNT(DISTINCT status)', 'AVG(customer_id)']
}
]
)
}}
Dimension table with join index
-- orders.sql
{{
config(
materialized = 'table',
table_type = 'dimension',
indexes = [
{
'type': 'join',
'join_column': 'order_id',
'dimension_column': ['customer_id', 'status']
}
]
)
}}
Fact table with two aggregating indexes and one join index
-- orders.sql
{{
config(
materialized = 'table',
table_type = 'dimension',
indexes = [
{
'type': 'aggregating',
'key_column': 'order_id',
'aggregation': ['COUNT(DISTINCT status)', 'AVG(customer_id)']
},
{
'type': 'aggregating',
'key_column': 'customer_id',
'aggregation': 'COUNT(DISTINCT status)'
},
{
'type': 'join',
'join_column': 'order_id',
'dimension_column': ['customer_id', 'status']
}
]
)
}}
Recommended dbt project configurations
quote_columns
To prevent a warning, you should add a configuration as below to your dbt_project.yml
. For more info, see the relevant dbt docs page
seeds:
+quote_columns: false # or `true` if you have csv column headers with spaces
dbt projects with concurrent users
Currently, with dbt-firebolt
, all models will be run in the same schema, so the schema
provided isn't used, but is still required. If you are a team of analytics engineers using a the same database and engine, a recommended practice is to add the following macro to your project. It will prefix the model name/alias with the schema value to provide namespacing so that multiple developers are not interacting with the same set of models.
For example, consider two analytics engineers on the same project: Shahar and Eric.
If in their .dbt/profiles.yml
, Sharar provides schema=sh
, and Eric, schema=er
, when they each run the customers
model, the models will land in the database as sh_customers
and er_customers
, respectively.
-- macros/generate_alias_name.sql
{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
{%- if custom_alias_name is none -%}
{{ node.schema }}_{{ node.name }}
{%- else -%}
{{ node.schema }}_{{ custom_alias_name | trim }}
{%- endif -%}
{%- endmacro %}
External Tables
Documentation on dbt's use of external tables can be found in the dbt documentation.
Documentation on using external tables including properly configuring IAM can be found in the Firebolt documentation.
Installation
To install and use dbt-external-tables with firebolt, you must:
- Add the package to your
packages.yml
,packages: - package: dbt-labs/dbt_external_tables version:
- add this to your
dbt_project.yml
, anddispatch: - macro_namespace: dbt_external_tables search_order: ['dbt', 'dbt_external_tables']
- call
dbt deps
Usage
To use external tables, you must define a table as EXTERNAL
in your project.yml file. Every external table must contain fields for url, type, and object pattern. Note that the Firebolt external table specification differs slightly from the dbt specification in the dbt documentation in that it does not require all the fields shown in dbt's documentation.
In addition to specifying the columns, an external table may specify partitions. Partitions are not columns and a partition name cannot have the same name as a column. An example yaml file follows. In order to avoid yml parsing errors it is likely necessary to quote at least the url
, object_pattern
, and regex
values.
sources: - name: firebolt_external schema: "{{ target.schema }}" loader: S3 tables: - name:external: url: 's3:// / ' object_pattern: '' type: '' credentials: internal_role_arn: arn:aws:iam::id:/ external_role_id:object_pattern: ' ' compression: '' partitions: - name:data_type: regex: ' ' columns: - name:data_type:
Changelog
See our changelog or our release history for more info