Skip to main content

Programmatic invocations

In v1.5, dbt-core added support for programmatic invocations. The intent is to expose the existing dbt Core CLI via a Python entry point, such that top-level commands are callable from within a Python script or application.

The entry point is a dbtRunner class, which allows you to invoke the same commands as on the CLI.

from dbt.cli.main import dbtRunner, dbtRunnerResult

# initialize
dbt = dbtRunner()

# create CLI args as a list of strings
cli_args = ["run", "--select", "tag:my_tag"]

# run the command
res: dbtRunnerResult = dbt.invoke(cli_args)

# inspect the results
for r in res.result:
print(f"{r.node.name}: {r.status}")

Parallel execution not supported

dbt-core doesn't support safe parallel execution for multiple invocations in the same process. This means it's not safe to run multiple dbt commands at the same time. It's officially discouraged and requires a wrapping process to handle sub-processes. This is because:

  • Running simultaneous commands can unexpectedly interact with the data platform. For example, running dbt run and dbt build for the same models simultaneously could lead to unpredictable results.
  • Each dbt-core command interacts with global Python variables. To ensure safe operation, commands need to be executed in separate processes, which can be achieved using methods like spawning processes or using tools like Celery.

To run safe parallel execution, you can use the dbt Cloud CLI or dbt Cloud IDE, both of which does that additional work to manage concurrency (multiple processes) on your behalf.

dbtRunnerResult

Each command returns a dbtRunnerResult object, which has three attributes:

  • success (bool): Whether the command succeeded.
  • result: If the command completed (successfully or with handled errors), its result(s). Return type varies by command.
  • exception: If the dbt invocation encountered an unhandled error and did not complete, the exception it encountered.

There is a 1:1 correspondence between CLI exit codes and the dbtRunnerResult returned by a programmatic invocation:

ScenarioCLI Exit Codesuccessresultexception
Invocation completed without error0Truevaries by commandNone
Invocation completed with at least one handled error (e.g. test failure, model build error)1Falsevaries by commandNone
Unhandled error. Invocation did not complete, and returns no results.2FalseNoneException

Commitments & Caveats

From dbt Core v1.5 onward, we making an ongoing commitment to providing a Python entry point at functional parity with dbt-core's CLI. We reserve the right to change the underlying implementation used to achieve that goal. We expect that the current implementation will unlock real use cases, in the short & medium term, while we work on a set of stable, long-term interfaces that will ultimately replace it.

In particular, the objects returned by each command in dbtRunnerResult.result are not fully contracted, and therefore liable to change. Some of the returned objects are partially documented, because they overlap in part with the contents of dbt artifacts. As Python objects, they contain many more fields and methods than what's available in the serialized JSON artifacts. These additional fields and methods should be considered internal and liable to change in future versions of dbt-core.

Advanced usage patterns

caution

The syntax and support for these patterns are liable to change in future versions of dbt-core.

The goal of dbtRunner is to offer parity with CLI workflows, within a programmatic environment. There are a few advanced usage patterns that extend what's possible with the CLI.

Reusing objects

Pass pre-constructed objects into dbtRunner, to avoid recreating those objects by reading files from disk. Currently, the only object supported is the Manifest (project contents).

from dbt.cli.main import dbtRunner, dbtRunnerResult
from dbt.contracts.graph.manifest import Manifest

# use 'parse' command to load a Manifest
res: dbtRunnerResult = dbtRunner().invoke(["parse"])
manifest: Manifest = res.result

# introspect manifest
# e.g. assert every public model has a description
for node in manifest.nodes.values():
if node.resource_type == "model" and node.access == "public":
assert node.description != "", f"{node.name} is missing a description"

# reuse this manifest in subsequent commands to skip parsing
dbt = dbtRunner(manifest=manifest)
cli_args = ["run", "--select", "tag:my_tag"]
res = dbt.invoke(cli_args)

Registering callbacks

Register callbacks on dbt's EventManager, to access structured events and enable custom logging. The current behavior of callbacks is to block subsequent steps from proceeding; this functionality is not guaranteed in future versions.

Overriding parameters

Pass in parameters as keyword arguments, instead of a list of CLI-style strings. At present, dbt will not do any validation or type coercion on your inputs. The subcommand must be specified, in a list, as the first positional argument.

from dbt.cli.main import dbtRunner
dbt = dbtRunner()

# these are equivalent
dbt.invoke(["--fail-fast", "run", "--select", "tag:my_tag"])
dbt.invoke(["run"], select=["tag:my_tag"], fail_fast=True)
0