
Limited concurrent running DLT's within a pipeline

JulianKrüger
New Contributor

Hi Champions!

We are currently experiencing some unexplained limitations when executing pipelines with more than 50 DLT tables. It looks like there is some calculation in the background that determines the maximum number of concurrently running DLT tables - in our case 16.

Pipeline Config:
Cloud: Azure Databricks
Product Edition: Pro
Channel: Current
Pipeline mode: Triggered
Storage option: Unity Catalog

Initial Cluster Config:
Cluster Policy: None
Cluster mode: Enhanced autoscaling
Min / Max Worker: 1/6
Photon enabled
Worker / Driver type: Standard_F8

Analyzing the issue:
I checked the details column for the event_type "cluster_resources" in the event_log TVF for the pipeline. The value for "num_task_slots" is limited to 16. This was a first indicator for me that this number influences the maximum number of concurrently running DLT tables.

[Screenshot: event_log "cluster_resources" details showing num_task_slots = 16]
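
For reference, a query along these lines surfaces that value from the event log (a sketch; the <pipeline-id> placeholder and the JSON path into the details column are based on the standard DLT event log schema):

```sql
-- Sketch: latest num_task_slots reported for the pipeline (replace <pipeline-id>).
SELECT
  timestamp,
  details:cluster_resources.num_task_slots AS num_task_slots
FROM event_log('<pipeline-id>')
WHERE event_type = 'cluster_resources'
ORDER BY timestamp DESC
LIMIT 1;
```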

Attempts to increase the concurrency:

  • Changed the Worker and Driver Types
  • Increased the Driver and Worker Size
  • Changed the product edition
  • Enabled and disabled Photon
  • All Cluster modes (Fixed size, legacy autoscaling, enhanced autoscaling)

One configuration change did increase the "num_task_slots": setting the minimum number of workers to 5.

[Screenshot: cluster configuration with minimum workers set to 5]

After that change, the "num_task_slots" increased to 40.
(I cannot really derive why the num_task_slots for 4 workers is 16 and for 5 workers it is 40.)
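
To see when the slot count changes during a run, a query like the following can help (a sketch; num_executors is an assumed field name in the cluster_resources payload and may differ or be absent depending on the channel):

```sql
-- Sketch: track task slots (and, if present, executor count) over the run.
SELECT
  timestamp,
  details:cluster_resources.num_task_slots AS num_task_slots,
  details:cluster_resources.num_executors  AS num_executors   -- assumed field name
FROM event_log('<pipeline-id>')
WHERE event_type = 'cluster_resources'
ORDER BY timestamp;
```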

Concurrency still limited:
When I extend my query against the event_log and calculate the number of concurrently running flows, the pipeline still executes a maximum of 16 DLT tables at the same time:

[Screenshot: concurrency calculated from the event_log, capped at 16]
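
A calculation along these lines approximates how many flows run at the same time (a sketch; the flow_progress event type and the RUNNING/COMPLETED/FAILED status values are assumptions based on the standard DLT event log, and the self-join only approximates true overlap):

```sql
-- Sketch: derive per-flow start/end times, then count how many runs overlap
-- each flow's start time as a proxy for peak concurrency.
WITH flow_runs AS (
  SELECT
    origin.flow_name AS flow_name,
    min(CASE WHEN details:flow_progress.status = 'RUNNING' THEN timestamp END) AS started_at,
    max(CASE WHEN details:flow_progress.status IN ('COMPLETED', 'FAILED') THEN timestamp END) AS ended_at
  FROM event_log('<pipeline-id>')
  WHERE event_type = 'flow_progress'
  -- optionally restrict to a single update: AND origin.update_id = '<update-id>'
  GROUP BY origin.flow_name
)
SELECT
  a.started_at,
  count(*) AS concurrent_flows   -- runs whose interval covers this start time
FROM flow_runs a
JOIN flow_runs b
  ON b.started_at <= a.started_at
 AND (b.ended_at IS NULL OR b.ended_at >= a.started_at)
GROUP BY a.started_at
ORDER BY concurrent_flows DESC;
```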

So maybe the "num_task_slots" does not influence the number of concurrently running DLT tables?

Open questions:

  • Does the "num_task_slots" have anything to do with the maximum parallelism?
    • If yes:
      • Why does the pipeline still cap at 16 concurrently running DLT tables while the number has increased to 40?
      • How does Databricks calculate the "num_task_slots"?
    • If no:
      • What does the number "num_task_slots" tell me?
      • What else determines the maximum number of concurrently running DLT tables?
  • How can I increase the number of concurrently running tables? Is there any cluster config or DLT config I can provide?

I have not found any documented limitation or calculation out there that answers these questions.

I hope that one of you champions out there can help me with this.


Best regards

Julian

1 REPLY

Sidhant07
Databricks Employee

Hi @JulianKrüger,

  • The "num_task_slots" parameter in Databricks Delta Live Tables (DLT) pipelines is related to the concurrency of tasks within a pipeline. It determines the number of concurrent tasks that can be executed. However, this parameter does not directly determine the maximum number of concurrently running DLT pipelines within a workspace.
  • A pipeline might still be capped at 16 concurrently running DLT tables even if the "num_task_slots" has been increased to 40, due to other limitations or configurations in the system, such as cluster configurations or workspace-level limits that are not directly influenced by the "num_task_slots" parameter.
  • The "num_task_slots" is calculated based on the available resources and the specific configuration of the cluster, such as the number of workers and the instance types used. Enhanced autoscaling and other cluster settings can also impact how resources are allocated for task slots.
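
Regarding the "how is it calculated" question: assuming one task slot per worker vCPU (an assumption, not something the event log states) and that Standard_F8 workers expose 8 vCPUs, the observed values could simply reflect how many workers were provisioned at the time:

```sql
-- Rough cross-check, assuming one task slot per worker vCPU and 8 vCPUs per Standard_F8 worker.
SELECT
  5 * 8 AS slots_with_5_workers,  -- 40, matches the value seen after raising min workers to 5
  2 * 8 AS slots_with_2_workers;  -- 16, would match the original cap if only 2 workers were up at that point
```

If that holds, the initial 16 would reflect how many workers enhanced autoscaling had actually provisioned when the event was logged, rather than a hard platform limit.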
