
Dynamic variable and multi-instance tasks.

Maverick1
Valued Contributor II

1. How can I pass dynamic values such as "sysdate" to job parameters, so that each run automatically picks up the current value?

2. How can I run multiple instances of a set of tasks in a job with different parameters? For example, the same pipeline or set of tasks in the same job needs to run for various markets/domains, taken from a list or parameter set.

2 REPLIES

BilalAslamDbrx
Databricks Employee

Hey @Saurabh Verma:

  1. We support a limited number of dynamic parameters. We are working on expanding these.
  2. You can run multiple instances of a job with different task parameters. That should just work as it is already supported.
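As a rough illustration of both points, here is a minimal sketch that resolves the date at trigger time (standing in for "sysdate") and builds one request body per market. It assumes the bodies would be sent to the Jobs API `run-now` endpoint and that the job allows enough concurrent runs (the `max_concurrent_runs` job setting); the job ID, the market list, and the parameter names `market` and `run_date` are hypothetical.

```python
from datetime import date, datetime, timezone

# Hypothetical values: replace with your own job ID and market list.
JOB_ID = 123
MARKETS = ["us", "de", "jp"]


def build_run_requests(job_id, markets, run_date=None):
    """Return one run-now style request body per market.

    run_date is resolved when this function is called, which plays
    the role of "sysdate" for each triggered run.
    """
    if run_date is None:
        run_date = datetime.now(timezone.utc).date()
    return [
        {
            "job_id": job_id,
            # Parameter names "market" and "run_date" are illustrative only.
            "notebook_params": {
                "market": market,
                "run_date": run_date.isoformat(),
            },
        }
        for market in markets
    ]


if __name__ == "__main__":
    # Each printed body could be POSTed separately to start one run per market.
    for body in build_run_requests(JOB_ID, MARKETS):
        print(body)
```

Computing the date in the caller keeps the job definition itself static, so this does not depend on which dynamic parameters the platform supports.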

gyapar
New Contributor II

Hey Maverick1,

Did you find a solution for your second question?

I'm taking the same approach. Databricks has workflows, job clusters, tasks, etc.

I'm trying to create one job cluster with a single configuration, running a workflow that has 3 dependent tasks in a straight line. For example, t1->t2->t3.
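To make the setup concrete, here is a minimal sketch of what such a job specification might look like, written as a Python dict in the shape the Jobs API 2.1 uses as far as I understand it (`job_clusters`, `job_cluster_key`, `tasks`, `depends_on`). The job name, notebook paths, and cluster settings are placeholders, not values from a real workspace.

```python
import json


def build_job_spec(cluster_key="shared_cluster"):
    """Build a job spec with three tasks chained as t1 -> t2 -> t3.

    All three tasks reference the same job cluster through job_cluster_key.
    """
    task_keys = ["t1", "t2", "t3"]
    tasks = []
    for index, key in enumerate(task_keys):
        task = {
            "task_key": key,
            "job_cluster_key": cluster_key,
            # Hypothetical notebook location.
            "notebook_task": {"notebook_path": f"/Workspace/pipeline/{key}"},
        }
        if index > 0:
            # Each task waits for the one immediately before it.
            task["depends_on"] = [{"task_key": task_keys[index - 1]}]
        tasks.append(task)

    return {
        "name": "linear-pipeline",  # hypothetical job name
        "job_clusters": [
            {
                "job_cluster_key": cluster_key,
                # Placeholder cluster settings; adjust for your workspace.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",
                    "node_type_id": "i3.xlarge",
                    "num_workers": 2,
                },
            }
        ],
        "tasks": tasks,
    }


if __name__ == "__main__":
    print(json.dumps(build_job_spec(), indent=2))
```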

Databricks also has some constraints: a job specification can contain at most 100 tasks, and there can be at most 1000 concurrent task runs. They also say that Databricks has its own orchestration inside workflows.

The questions below are for the community.

How can I utilize my job cluster?

Does the orchestrator run 1000 concurrent instances even if the workflow has only 3 tasks in the job?

Does Databricks support a queue as the input for a workflow or for a task inside a workflow, instead of passing parameters?

 

Actually, I don't know how Databricks runs things internally. Does it scale tasks up and down, and if so, which metrics does it use? I would like to scale up a specific task, such as 't2' in the example above, in response to events or inputs that are created dynamically in a queue or elsewhere.

Is this possible with the managed workflows and their built-in orchestrator?

Or do we need a custom tool like Apache Airflow?
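To show what I mean by queue-driven input, here is a rough sketch of an external driver that drains a queue and plans one run per event, never exceeding a concurrency cap. This is only an assumption about how it could be done outside the managed orchestrator, not a built-in Databricks feature; the in-memory `queue.Queue`, the `event` parameter name, and the request shape are all illustrative.

```python
import queue


def drain_batch(q, max_items):
    """Pull up to max_items entries from q without blocking."""
    batch = []
    while len(batch) < max_items:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def plan_runs(q, job_id, running, max_concurrent):
    """Return request bodies for as many queued events as free slots allow.

    running is the number of runs already in flight; events that do not
    fit under max_concurrent stay in the queue for the next poll.
    """
    free_slots = max(0, max_concurrent - running)
    return [
        # "event" is a hypothetical parameter name.
        {"job_id": job_id, "notebook_params": {"event": str(item)}}
        for item in drain_batch(q, free_slots)
    ]


if __name__ == "__main__":
    events = queue.Queue()
    for name in ["a", "b", "c"]:
        events.put(name)
    # With one run in flight and a cap of three, two events are planned.
    for body in plan_runs(events, job_id=123, running=1, max_concurrent=3):
        print(body)
```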
