Hi @ChristianRRL,
To address both your original question and your follow-up about the open-source angle:
CURRENT STATE ON DATABRICKS
Lakeflow Spark Declarative Pipelines (SDP), the current name for what was previously known as DLT, runs on its own managed pipeline compute on Databricks. The two supported options are:
1. Serverless compute (recommended, and the default for new pipelines)
2. Classic pipeline compute (you configure worker/driver instance types, autoscaling, etc.)
SDP does not run on all-purpose (interactive) compute today. When you start a pipeline update, the system provisions dedicated compute for that pipeline. This is by design: the SDP runtime includes specialized configurations and optimizations on top of the Databricks Runtime that are not present on standard all-purpose clusters.
For the development workflow, use the Lakeflow Pipelines Editor in the workspace to iteratively develop and validate your pipeline code. Running in "development mode" speeds up iteration: the pipeline reuses its cluster between updates instead of restarting it, and automatic retries are disabled so errors surface immediately.
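To make the two compute options concrete, here is a minimal sketch of how the pipeline settings might differ between serverless and classic pipeline compute. The field names (serverless, development, clusters, label, node_type_id, autoscale, mode) follow my understanding of the pipeline settings JSON; the pipeline name, instance type, and worker counts are hypothetical placeholders, so please check the configuration docs linked below before relying on any of it.

```python
import json


def pipeline_settings(name, serverless=True, development=True,
                      node_type_id=None, min_workers=1, max_workers=4):
    """Build a pipeline settings dict (illustrative helper, not a Databricks API)."""
    settings = {
        "name": name,
        "development": development,  # development mode on/off
        "serverless": serverless,    # option 1: serverless compute
    }
    if not serverless:
        # Option 2: classic pipeline compute, where you pick instance types
        # and autoscaling yourself.
        if node_type_id is None:
            raise ValueError("classic pipeline compute needs a node_type_id")
        settings["clusters"] = [{
            "label": "default",
            "node_type_id": node_type_id,  # hypothetical instance type
            "autoscale": {
                "min_workers": min_workers,
                "max_workers": max_workers,
                "mode": "ENHANCED",
            },
        }]
    return settings


# "orders_pipeline" and "i3.xlarge" are placeholder values.
serverless_cfg = pipeline_settings("orders_pipeline")
classic_cfg = pipeline_settings("orders_pipeline", serverless=False,
                                node_type_id="i3.xlarge")

print(json.dumps(classic_cfg, indent=2))
```

Note that neither variant points at an existing all-purpose cluster: in both cases the pipeline describes its own compute.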
Documentation for pipeline compute configuration:
https://docs.databricks.com/aws/en/ldp/configure-pipeline
https://docs.databricks.com/aws/en/ldp/develop
REGARDING THE OPEN-SOURCE VERSION
You are correct that Databricks contributed Spark Declarative Pipelines to the Apache Spark open-source project. The sql/pipelines module is available in the Apache Spark repository:
https://github.com/apache/spark/tree/master/sql/pipelines
There are some important distinctions to understand:
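1. The open-source Apache Spark Declarative Pipelines provides the core declarative programming model: you define tables and views in SQL or in Python with decorators such as @dp.table, @dp.materialized_view, and @dp.temporary_view (from the pyspark.pipelines module, conventionally imported as dp; the older Databricks dlt module uses names like @dlt.table and @dlt.view). It can run on any Spark cluster whose Spark version includes the module.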
2. The Databricks version (Lakeflow SDP) adds significant platform integrations on top of that core, including Unity Catalog integration, managed pipeline compute orchestration, the Lakeflow Pipelines Editor, monitoring and observability, expectations/data quality enforcement at scale, enhanced autoscaling, and Photon acceleration.
3. When a future Databricks Runtime LTS ships with Spark 4.x that includes the open-source declarative pipelines module, it would theoretically be possible to use the core open-source APIs on all-purpose compute. However, you would not get the managed pipeline orchestration, automatic compute lifecycle, or the deeper platform integrations that the Databricks-managed SDP experience provides.
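If the "declarative programming model" in point 1 feels abstract, the toy sketch below may help. It is deliberately NOT the pyspark.pipelines API: the dataset decorator, the registry, and run_pipeline are all hypothetical names I made up, and plain Python lists stand in for tables. It only illustrates the idea that you declare what each dataset is, and the framework works out the execution order from the dependencies.

```python
import inspect
from graphlib import TopologicalSorter

# Toy registry of dataset definitions (hypothetical, not part of Spark).
_DATASETS = {}


def dataset(fn):
    """Register fn as a dataset definition, keyed by its function name."""
    _DATASETS[fn.__name__] = fn
    return fn


# Definitions can appear in any order; dependencies are the parameter names.
@dataset
def order_total(clean_orders):
    return sum(o["amount"] for o in clean_orders)


@dataset
def clean_orders(raw_orders):
    return [o for o in raw_orders if o["amount"] > 0]


@dataset
def raw_orders():
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}]


def run_pipeline():
    """Infer the dependency graph, then materialize datasets in a valid order."""
    graph = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in _DATASETS.items()
    }
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn = _DATASETS[name]
        args = [results[p] for p in inspect.signature(fn).parameters]
        results[name] = fn(*args)
    return results


results = run_pipeline()
print(results["order_total"])
```

The real framework does the same kind of dependency resolution over Spark tables and views, and the Databricks-managed version layers the compute lifecycle and platform integrations from point 2 on top.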
In short, the open-source contribution means the declarative programming model becomes portable across Spark environments, but the full managed experience on Databricks will continue to run on dedicated pipeline compute. For production workloads on Databricks, the recommended path remains using SDP pipelines with serverless or classic pipeline compute.
Keep an eye on the Databricks release notes for any updates as the Spark 4.x line matures:
https://docs.databricks.com/aws/en/release-notes/
* I used an agent system I built to research and draft this reply, drawing on the documentation I have available and on previous memory. I personally review each draft for obvious issues, monitor the system's reliability, and update it when I detect drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.
If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.