06-04-2024 01:28 PM
Hi there, I would like to understand whether there is any way to get DLT jobs running on an *existing* (and currently running) all-purpose compute cluster rather than spinning up an "ephemeral" (not-yet-initialized) job compute cluster?
06-04-2024 01:57 PM
Hi @ChristianRRL , you cannot run a DLT pipeline on an all-purpose compute cluster.
DLT clusters are managed resources. Here are some details on how to configure a DLT cluster: https://docs.databricks.com/en/delta-live-tables/settings.html#configure-your-compute-settings.
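For context, a DLT pipeline's compute is defined inside the pipeline settings themselves rather than by attaching an existing cluster. A minimal sketch of the `clusters` block from the settings page linked above (the worker counts and autoscale mode here are illustrative, not a recommendation):

```json
{
  "clusters": [
    {
      "label": "default",
      "autoscale": {
        "min_workers": 1,
        "max_workers": 5,
        "mode": "ENHANCED"
      }
    }
  ]
}
```

The pipeline service provisions and tears down this cluster for you, which is why there is no field for pointing it at an all-purpose cluster ID.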
06-04-2024 02:00 PM
I suspected this might be the case. Thank you for your confirmation!
06-04-2024 02:11 PM
@raphaelblg Actually, one follow-up question. A brief out-loud thought: why doesn't Databricks at least offer the option to use our existing all-purpose compute clusters? If my understanding is correct, DLT clusters are meant to run jobs more efficiently (and therefore more cheaply) than traditional all-purpose clusters. But if we already have dedicated all-purpose clusters running, and there is no way to let our DLT jobs run on them, then no matter how efficient or cheap DLT is, it will always be an additional cost on top of our existing setup.
In short, we already have a "baked-in" cost from running our all-purpose clusters, so adding DLT will only ever *drive up* costs rather than cut them, since there is no way to have DLT jobs run on those clusters.