Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Job Clusters With Multiple Tasks

gyapar
New Contributor II

Hi all,

I'm trying to create one job cluster with a single configuration (specification) that runs a workflow, and this workflow needs to have 3 dependent tasks in a straight line, for example t1 -> t2 -> t3.

Databricks also has some constraints, such as a maximum of 100 tasks in a job's specification and a maximum of 1000 concurrent task runs per instance. Besides that, Databricks is said to have its own orchestration inside workflows.

My questions for the community are below.

How can I utilize my job cluster?

Does the orchestrator run 1000 concurrent instances even if the workflow has only 3 tasks in the job?

Does Databricks support a queue as an input for a workflow, or for a task inside a workflow, instead of passing parameters?

 

Actually, I don't know how Databricks runs things internally. Does it scale tasks up and down, and if so, which metrics does it look at? I would like to scale up a specific task, 't2' in my example, in response to events or inputs that are created dynamically in a queue or elsewhere.

Is this possible with the managed workflow powered by your orchestrator?

Or do we need to use a custom tool like Apache Airflow?

 

Thanks.

1 REPLY

Kaniz_Fatma
Community Manager

Hi @gyapar, Certainly! Let’s dive into your questions about Databricks job clusters, orchestration, and scaling. 🚀

 

Utilizing Databricks Job Clusters:

  • A job cluster in Databricks is a non-interactive way to run an application, such as an ETL job or data analysis task. You can configure a job cluster with specific settings (e.g., number of workers, instance types) to execute your tasks.
  • To utilize your job cluster effectively (a minimal example follows this list):
    • Define your job configuration or specification, including the desired cluster size, libraries, and environment settings.
    • Create a workflow (a directed acyclic graph or DAG) that represents the sequence of tasks you want to execute.
    • Submit your job to the cluster, and Databricks will manage the execution of your tasks.
    • Monitor job progress, logs, and performance using the Databricks UI or API.
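Here is a minimal sketch of that flow, using the Jobs API 2.1 over plain HTTP. The host, token, notebook paths, runtime version, and node type are placeholders you would replace with your own values:

```python
# Minimal sketch: one job whose three tasks t1 -> t2 -> t3 share a single job cluster.
# DATABRICKS_HOST, DATABRICKS_TOKEN, notebook paths, runtime and node type are placeholders.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]

job_spec = {
    "name": "three-task-pipeline",
    "max_concurrent_runs": 1,           # at most one run of this job at a time
    "job_clusters": [{
        "job_cluster_key": "shared_cluster",
        "new_cluster": {
            "spark_version": "13.3.x-scala2.12",   # pick a supported runtime
            "node_type_id": "i3.xlarge",           # cloud-specific node type
            "num_workers": 2,
        },
    }],
    "tasks": [
        {"task_key": "t1", "job_cluster_key": "shared_cluster",
         "notebook_task": {"notebook_path": "/Workspace/pipeline/t1"}},
        {"task_key": "t2", "job_cluster_key": "shared_cluster",
         "depends_on": [{"task_key": "t1"}],
         "notebook_task": {"notebook_path": "/Workspace/pipeline/t2"}},
        {"task_key": "t3", "job_cluster_key": "shared_cluster",
         "depends_on": [{"task_key": "t2"}],
         "notebook_task": {"notebook_path": "/Workspace/pipeline/t3"}},
    ],
}

resp = requests.post(f"{host}/api/2.1/jobs/create",
                     headers={"Authorization": f"Bearer {token}"},
                     json=job_spec)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])
```

Because all three tasks reference the same job_cluster_key, they run in order on one shared job cluster instead of spinning up a separate cluster per task.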

Orchestration and Concurrent Instances:

  • Databricks now supports task orchestration within jobs. You can run multiple tasks as a DAG, simplifying the creation and management of data and machine learning workflows.
  • The orchestrator handles task dependencies, ensuring that tasks run in the correct order.
  • Regarding concurrent instances:
    • Even if your workflow has only 3 tasks, the orchestrator only schedules those 3 task runs; the 1000 concurrent task runs figure is an upper limit on the platform, not something the orchestrator spawns on its own.
    • The orchestrator manages task execution in dependency order regardless of how many tasks your workflow contains (see the sketch after this list).
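To make this concrete, here is a small sketch (same placeholder host/token as above, plus a placeholder job_id) that triggers a run and polls it; you will only ever see the three defined task runs, whatever the platform-wide concurrency limit is:

```python
# Sketch: trigger one run of the job and watch its task states.
# Only the defined tasks (t1, t2, t3) get run instances; 1000 is just an upper limit.
import os
import time
import requests

host = os.environ["DATABRICKS_HOST"]
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

run_id = requests.post(f"{host}/api/2.1/jobs/run-now",
                       headers=headers, json={"job_id": 123}).json()["run_id"]  # placeholder job_id

while True:
    run = requests.get(f"{host}/api/2.1/jobs/runs/get",
                       headers=headers, params={"run_id": run_id}).json()
    states = {t["task_key"]: t["state"]["life_cycle_state"] for t in run["tasks"]}
    print(states)   # e.g. {'t1': 'TERMINATED', 't2': 'RUNNING', 't3': 'PENDING'}
    if run["state"]["life_cycle_state"] in ("TERMINATED", "INTERNAL_ERROR", "SKIPPED"):
        break
    time.sleep(30)
```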

Input Queues and Parameters:

  • Databricks primarily uses parameters for task input. However, if you need to pull data from external sources dynamically (e.g., a queue), you can implement that logic within your tasks.
  • While Databricks doesn't directly support input queues, you can design your workflow so that each task fetches its work items from an external system as needed (a rough sketch follows this list).
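As an illustration of that "custom logic" idea, here is a hypothetical sketch of a task that drains its own work items from a queue at run time instead of receiving them as parameters. It assumes an AWS SQS queue with a placeholder URL; any messaging system (Kafka, Azure Storage Queues, etc.) could play the same role:

```python
# Hypothetical sketch: a task that pulls its own work items from a queue at run time.
import json
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/111111111111/pipeline-events"  # placeholder

def fetch_pending_events(max_messages: int = 10) -> list:
    """Drain up to max_messages events from the queue and return their payloads."""
    resp = sqs.receive_message(QueueUrl=queue_url,
                               MaxNumberOfMessages=max_messages,
                               WaitTimeSeconds=5)
    events = []
    for msg in resp.get("Messages", []):
        events.append(json.loads(msg["Body"]))
        # Remove the message once it has been read so it is not processed twice.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
    return events

events = fetch_pending_events()
print(f"Processing {len(events)} queued events in this task run")
```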

Scaling Specific Tasks (e.g., ‘t2’):

  • Databricks scales the cluster, not individual tasks, based on the cluster configuration; with autoscaling enabled, the number of workers grows and shrinks between the minimum and maximum you set, driven by the cluster's load.
  • If you want to give a specific task (like 't2') more capacity, consider the following approaches (a sketch of a per-task autoscaling cluster follows this list):
    • Custom Logic: Implement custom logic within your task to adjust its behaviour based on events or inputs.
    • Apache Airflow Integration: Databricks integrates well with Apache Airflow. You can use Airflow to manage complex workflows, including dynamic scaling based on external events.
    • Managed Workflow: While Databricks provides orchestration, for more advanced use cases, combining it with custom tools like Airflow might be beneficial.
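For instance, one hedged sketch of the cluster-level route: inside the same job specification as above, 't2' can point at its own autoscaling job cluster so the heavier middle task can scale out while 't1' and 't3' stay small. All values are placeholders:

```python
# Hedged sketch: give 't2' its own autoscaling job cluster inside the same job spec.
t2_cluster = {
    "job_cluster_key": "t2_autoscaling_cluster",
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "autoscale": {"min_workers": 2, "max_workers": 16},  # Databricks scales within this range
    },
}

t2_task = {
    "task_key": "t2",
    "job_cluster_key": "t2_autoscaling_cluster",   # t2 uses the autoscaling cluster
    "depends_on": [{"task_key": "t1"}],
    "notebook_task": {"notebook_path": "/Workspace/pipeline/t2"},
}
```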

Managed Workflow and Orchestrator:

  • Databricks’ managed workflow, powered by the orchestrator, simplifies task coordination and execution.
  • However, for highly customized scaling or intricate workflows, using additional tools like Apache Airflow allows greater flexibility (see the Airflow sketch below).
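If you go the Airflow route, a rough sketch might look like the following. It assumes Airflow with the apache-airflow-providers-databricks package installed, a configured databricks_default connection, and the job_id of the job created earlier (all placeholders):

```python
# Rough sketch: an Airflow DAG that triggers the existing Databricks job on demand.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="trigger_databricks_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,          # trigger externally, e.g. from an event/queue consumer
    catchup=False,
) as dag:
    run_pipeline = DatabricksRunNowOperator(
        task_id="run_three_task_job",
        databricks_conn_id="databricks_default",
        job_id=123,                                # placeholder: the job created earlier
        notebook_params={"source": "airflow"},     # forwarded to the job's notebook tasks
    )
```

An upstream sensor or external event consumer could then decide when, and how often, to trigger this DAG, which is one way to layer event-driven behaviour on top of the Databricks orchestrator.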

In summary, Databricks offers robust orchestration capabilities, but for specialized scenarios, consider integrating with tools like Apache Airflow. Choose the approach that best aligns with your specific requirements and complexity. 🌟
