Legacy Autoscaling (Workflow) vs. Enhanced Autoscaling (DLT)

chsoni12
New Contributor II

I conducted a proof of concept (POC) to compare the performance of a DLT pipeline and a Databricks Workflow using the same workload, task, code, and cluster configuration. Both were configured with autoscaling enabled, with a minimum of 1 worker node and a maximum of 5 worker nodes.

The differences were as follows:

  1. The DLT pipeline used enhanced autoscaling, while the Databricks Workflow used standard (legacy) autoscaling.

  2. The Databricks Workflow wrote a Delta table, while the DLT pipeline created a materialized view.
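
For reference, the autoscaling part of the two setups can be sketched roughly as follows. This is only an illustration and not the actual configuration from the POC: the field names (`autoscale`, `min_workers`, `max_workers`, `mode`, `label`) follow the Databricks job cluster and DLT pipeline settings JSON as I understand them, and everything else (node type, runtime version, and so on) is left out because it is not stated here. The helper `worker_bounds` is a hypothetical name used only for this comparison.

```python
# Illustrative fragments only; not the real POC configuration.
# Only the autoscaling bounds described in this post are shown.

# Job (Workflow) cluster: standard autoscaling, 1 to 5 workers.
workflow_cluster = {
    "autoscale": {"min_workers": 1, "max_workers": 5},
}

# DLT pipeline cluster: enhanced autoscaling, 1 to 5 workers.
dlt_cluster = {
    "label": "default",
    "autoscale": {"min_workers": 1, "max_workers": 5, "mode": "ENHANCED"},
}


def worker_bounds(cluster):
    """Return (min_workers, max_workers) from a cluster settings dict."""
    autoscale = cluster["autoscale"]
    return autoscale["min_workers"], autoscale["max_workers"]


# Both setups should report the same bounds if they were configured identically.
same_bounds = worker_bounds(workflow_cluster) == worker_bounds(dlt_cluster)
print("Same worker bounds:", same_bounds)
```

The only intended difference between the two fragments is the autoscaling mode; the worker bounds are meant to be identical.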

Results:

I ran both the pipeline and the workflow and observed that the DLT pipeline completed in 9.5 minutes, whereas the Databricks Workflow took 14.32 minutes. Additionally, the cost of running the DLT pipeline was lower.
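
To put the two runtimes side by side, here is a small calculation. It assumes both figures are decimal minutes (so 14.32 is read as 14.32 minutes, not 14 minutes 32 seconds); if that assumption is wrong, the numbers change slightly. The variable names are just for illustration.

```python
# Runtimes as reported in this post, assumed to be decimal minutes.
dlt_minutes = 9.5
workflow_minutes = 14.32

# How many times longer the Workflow run took compared with the DLT run.
slowdown_ratio = workflow_minutes / dlt_minutes

# Fraction of wall-clock time saved by the DLT run relative to the Workflow run.
time_saved_fraction = (workflow_minutes - dlt_minutes) / workflow_minutes

print(f"Workflow took {slowdown_ratio:.2f}x as long as DLT")
print(f"DLT run was {time_saved_fraction:.1%} shorter")
```

Under that assumption the Workflow run took about 1.5 times as long as the DLT run, which means the DLT run was roughly a third shorter.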

Upon reviewing the logs, I found that in the Databricks Workflow, the cluster first upscaled from 1 worker to 19 workers, and then from 19 workers to 50 workers. In contrast, the DLT pipeline completed the entire process using only 5 worker nodes, starting from 1 and scaling up to 5.
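
The Workflow figures above exceed the configured maximum of 5 workers, which is worth double-checking. A minimal sketch for flagging that, using only the worker counts quoted in this post (the function name `over_limit` is hypothetical):

```python
# Configured autoscaling ceiling, as described in this post.
configured_max_workers = 5

# Cluster sizes read from the logs, as quoted in this post.
observed_workflow_workers = [1, 19, 50]
observed_dlt_workers = [1, 5]


def over_limit(observed, limit):
    """Return the observed worker counts that exceed the configured limit."""
    return [count for count in observed if count > limit]


print("Workflow sizes above max:", over_limit(observed_workflow_workers, configured_max_workers))
print("DLT sizes above max:", over_limit(observed_dlt_workers, configured_max_workers))
```

Only the Workflow sizes are flagged, so the mismatch is specific to the Workflow cluster and not to the DLT pipeline.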

Questions:

  1. Why did the DLT pipeline complete with only 5 worker nodes, while the Databricks Workflow required up to 50 worker nodes?

  2. How do autoscaling and enhanced autoscaling function in the background, and what accounts for the observed differences in scaling behavior?