Data Engineering
A Job "pool"? (or task pool)

spott_submittab
New Contributor II

I'm trying to run a single job multiple times with different parameters, where the number of concurrent runs is smaller than the number of parameter sets.

I have a job (or task) J that takes a parameter set p, and I have 100 values of p to run. I only want 10 running at a time (say I have 10 clusters, or I want to run them all on one cluster that only has enough compute for 10 concurrent runs), but I want all 100 to run eventually.

Is this possible? The maximum concurrent runs setting just skips the extra runs, and tasks without dependencies all start at the same time.

I'm aware there is often a way to do something like this in Spark itself, but that feels heavy-handed (and would require rearchitecting my code).
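For illustration, the bounded-concurrency pattern I'm after can be sketched client-side with a thread pool, assuming a hypothetical `run_job(p)` that triggers one run of J and blocks until it finishes (in practice this might wrap the Jobs API, but here it is just a stub):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_job(p):
    # Hypothetical stand-in: trigger one run of job J with parameter
    # set p and block until it completes. A real version would call
    # the Jobs API and poll the run state; this stub just returns.
    return f"finished p={p}"

params = list(range(100))  # the 100 parameter sets

results = []
# max_workers=10 caps the number of runs in flight at any moment;
# the remaining submissions queue up and start as slots free up.
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(run_job, p) for p in params]
    for f in as_completed(futures):
        results.append(f.result())

print(len(results))  # all 100 eventually run
```

The executor's internal queue gives the "job pool" behavior: at most 10 concurrent runs, and every parameter set eventually gets its turn.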

1 REPLY

Aviral-Bhardwaj
Esteemed Contributor III

This is something new and an interesting question. Try reaching out to the Databricks support team; maybe they have some good ideas here.
