Context
Palm currently requires commands to be built more or less by hand as strings, which are then passed to the container/host runner. We need a smarter way to chain commands so users aren't limited to a single command-line execution for their custom commands.
Is your feature request related to a problem? Please describe.
We currently chain composite commands together with "&&", which requires every step to complete successfully before any later step runs at all. This is a bug for palm-dbt (detailed in #17) because idempotency of dev/CI environments requires cleanup to execute regardless of whether model execution fails or succeeds.
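To make the "&&" problem concrete, here is a minimal sketch in Python. FAIL and CLEANUP are stand-in commands (not real dbt invocations), and run_chained / run_with_cleanup are hypothetical helper names, not existing palm functions. It assumes a POSIX shell.

```python
import shlex
import subprocess
import sys

# Stand-ins for real steps: one that exits non-zero, one that reports cleanup.
FAIL = shlex.join([sys.executable, "-c", "import sys; sys.exit(1)"])
CLEANUP = shlex.join([sys.executable, "-c", 'print("cleanup ran")'])


def run_chained(*steps):
    """Join steps with && and run them as a single shell command line."""
    return subprocess.run(
        " && ".join(steps), shell=True, capture_output=True, text=True
    )


def run_with_cleanup(steps, cleanup):
    """Run steps one at a time, stop at the first failure, always run cleanup."""
    failed = None
    try:
        for step in steps:
            if subprocess.run(step, shell=True).returncode != 0:
                failed = step
                break
    finally:
        # Runs whether the loop finished, broke early, or raised.
        cleanup_result = subprocess.run(
            cleanup, shell=True, capture_output=True, text=True
        )
    return failed, cleanup_result.stdout.strip()
```

With run_chained(FAIL, CLEANUP) the cleanup step never starts, which is the behavior described above. With run_with_cleanup([FAIL], CLEANUP) the failure is still reported, but cleanup runs anyway.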
Describe the solution you'd like
Writing palm commands would be done through a palm library function for command building that provides chaining and the ability to handle errors. This would streamline development of new commands with repeated steps because you could simply chain together a set of existing commands in the desired execution order. The interface for a new command would no longer look like, e.g., dbt clean && dbt deps && dbt run && dbt test && dbt run-operation drop_branch_schemas. Instead, each command would be stored in a queue of some kind and executed one by one, with customization for whether a failure is raised to the process or handled, and customizable behavior depending on whether the preceding step succeeded or failed.
Why? Because a one-size-fits-all approach bakes in a lot of our opinionated processes. Since I work most in dbt, I'll focus on that use case. Palm-dbt has an extremely prescriptive approach to dbt runs, which is fine for now, but it doesn't feel scalable and does feel brittle. Every command in the plugin has repeated code inside it (dbt clean && dbt deps && dbt seed --full-refresh && ... run-operation drop_branch_schemas) and there are bugs in the way commands execute (like the example mentioned above).
Chaining commands would allow us to specify how each piece of our completed command works independently. Then, you can chain them together safely with the confidence that it will just work. Users of plugins can use the building blocks of all the pieces that exist, with the ability to chain them together in different orders, with different default behavior, without reinventing the whole wheel.
E.g., new functionality for palm cycle inside a dbt project, utilizing palm-dbt:
def dbt_run(..., raise_error=False):
    result = None  # stays None if the run fails and the error is swallowed
    try:
        result = run_in_docker('dbt run')
    except model_build_failures:
        if raise_error:
            raise  # re-raise the original exception
    return result
...
# palm cycle
def cli(count=2):
    cmd = palm.command_builder()
    cmd.add(dbt_clean)
    cmd.add(dbt_deps)
    while count > 0:
        cmd.add(dbt_run)
        cmd.add(dbt_test, depends_on_previous=True)
        count -= 1
    cmd.add(dbt_cleanup, raise_error=True)
    cmd.execute()  # executes all commands in the builder in the order above
And the interface for a user running the command:
$ palm cycle 1
Executing `dbt clean`...
Clean as a whistle!
Executing `dbt deps`...
All dependencies installed!
Executing `dbt run`...
Oops, some models failed to build. Skipping test...
`dbt test` skipped due to model build failures
Executing `dbt run-operation drop_branch_schemas`...
Schema `test.mrogers_test_new_feature_01993049` cleaned up.
Then, a user of palm-dbt decides they want dbt test to run even if dbt run has model failures. They override palm cycle with their own command, changing only one line from our version:
def cli(count=2):
    cmd = palm.command_builder()
    cmd.add(dbt_clean)
    cmd.add(dbt_deps)
    while count > 0:
        cmd.add(dbt_run)
        cmd.add(dbt_test)
        count -= 1
    cmd.add(dbt_cleanup, raise_error=True)
    cmd.execute()
$ palm cycle 1
Executing `dbt clean`...
Clean as a whistle!
Executing `dbt deps`...
All dependencies installed!
Executing `dbt run`...
Oops, some models failed to build.
Executing `dbt test`...
Oops, some tests failed!
Executing `dbt run-operation drop_branch_schemas`...
Schema `test.mmoulds_test_another_feature_01856927` cleaned up.
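For discussion, here is a rough sketch of how the builder used in the examples above might work internally. Everything in it is an assumption: the Step and CommandBuilder names, the choice to have the builder (rather than each command function) interpret raise_error, and the rule that a skipped step leaves the "previous step failed" state unchanged. It is one possible shape, not a proposal for the final API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    func: Callable[[], object]
    depends_on_previous: bool = False  # skip this step if the prior one failed
    raise_error: bool = False          # propagate failures instead of handling


@dataclass
class CommandBuilder:
    steps: List[Step] = field(default_factory=list)

    def add(self, func, depends_on_previous=False, raise_error=False):
        """Queue a step; returns self so calls can be chained."""
        self.steps.append(Step(func, depends_on_previous, raise_error))
        return self

    def execute(self) -> List[Tuple[str, str]]:
        """Run queued steps in order and return a (name, status) log."""
        log = []
        previous_ok = True
        for step in self.steps:
            name = step.func.__name__
            if step.depends_on_previous and not previous_ok:
                log.append((name, "skipped"))
                continue  # previous_ok stays False for any dependent steps
            try:
                step.func()
            except Exception:
                if step.raise_error:
                    raise  # surface the failure to the calling process
                previous_ok = False
                log.append((name, "failed"))
            else:
                previous_ok = True
                log.append((name, "ok"))
        return log
```

With this shape, a step added without depends_on_previous always runs, so a cleanup step at the end of the queue executes even when an earlier step failed. Whether raise_error should live on the builder or on each command function (as in dbt_run above) is an open design question.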
Describe alternatives you've considered
Option: Keep things as they are.
Pros: This is simple for users writing new commands because they only have to think about how to execute code exactly as they would in the terminal. Single-step commands are straightforward because they only require one execution. Advanced users can make use of our initial progress in this direction with our dbt_palm_utils functions, or the way palm cycle is implemented with a make_cmd() function.
Cons: Usefulness of these existing functions is limited because they are currently only available in palm-dbt; commands aren't reusable with library functions, so there's a lot of repeated code. Commands are built as strings concatenated together before being passed to the host/container runners, which is very brittle and prone to introducing bugs, since the interpreter can't identify bugs in the commands.
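As an illustration of the string-concatenation concern, commands could instead be kept as named argument lists and only rendered to a string at the last moment. This is a sketch under assumptions: the dbt() and render() helpers and the constant names are hypothetical and do not exist in palm or palm-dbt today.

```python
import shlex
from typing import List


def dbt(*args: str) -> List[str]:
    """Build a dbt invocation as an argument list instead of a string."""
    return ["dbt", *args]


# Reusable building blocks; a misspelled name here is a NameError at import
# time, whereas a typo inside a concatenated string only fails at run time.
DBT_CLEAN = dbt("clean")
DBT_DEPS = dbt("deps")
DROP_BRANCH_SCHEMAS = dbt("run-operation", "drop_branch_schemas")


def render(argv: List[str]) -> str:
    """Quote and join arguments only when the runner needs a single string."""
    return shlex.join(argv)
```

Because shlex.join quotes each argument, values containing spaces (for example a --vars payload) survive the trip to the shell without manual escaping.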
Option: Break all existing commands apart at the "&&" and run them in sequence with separate run_in_docker calls.
Pros: Better for error handling.
Cons: No clean way for modular, reusable commands to exist because each call to the runner will still execute independently and commands still won't have a library function interface.
Additional context
This feels awkwardly like trying to implement workflow orchestration. If using an existing orchestrator inside palm were an option, I would totally go with that, as long as it is as platform agnostic as palm currently is. Otherwise, I think we could still implement this simply enough to provide a ton of value.
Is there an existing feature request for this?
- [X] I have searched the existing issues