Topics with Label: Multiple Tasks

Forum Posts

by dave_hiltbrand • New Contributor II

06-22-2023 7:47:26 PM

1167 Views
3 replies
0 kudos

I have a job with multiple tasks running asynchronously and I don't think it's leveraging all the nodes on the cluster based on runtime.

I have a job with multiple tasks running asynchronously and I don't think it's leveraging all the nodes on the cluster based on runtime. I open the Spark UI for the cluster and check out the executors and don't see any tasks for my worker nodes. How ca...

Data Engineering

Latest Reply

Anonymous
Not applicable

06-23-2023 12:18:56 AM

0 kudos

Hi @Dave Hiltbrand, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer. Thanks.

2 More Replies

by thib • New Contributor III

06-14-2022 10:52:04 AM

3088 Views
3 replies
2 kudos

Can we use multiple git repos for a job running multiple tasks?

I have a job running multiple tasks:
Task 1 runs a machine learning pipeline from git repo 1
Task 2 runs an ETL pipeline from git repo 1
Task 2 is actually a generic pipeline and should not be checked in repo 1, and will be made available in another re...

Data Engineering

Latest Reply

trijit
New Contributor II

05-11-2023 1:15:35 AM

2 kudos

The way to go about this would be to create Databricks repos in the workspace and then reference them when defining the tasks. This way we can refer to multiple repos in different tasks.

2 More Replies

by swzzzsw • New Contributor III

01-24-2022 11:17:24 AM

5394 Views
5 replies
10 kudos

"Run now with different parameters" - different parameters not recognized by jobs involving multiple tasks

I'm running a Databricks job involving multiple tasks and would like to run the job with a different set of task parameters. I can achieve that by editing each task and changing the parameter values. However, it gets very manual when I have a lot of tas...

Data Engineering

Latest Reply

erens
New Contributor II

12-14-2022 4:32:48 AM

10 kudos

Hello, I am also facing the same issue. The problem is described below: I have a multi-task job. This job consists of multiple "spark_python_task" tasks that execute a Python script in a Spark cluster. This pipeline is created within a CI/CD ...

4 More Replies

by assapin • New Contributor

11-12-2022 2:54:32 AM

799 Views
0 replies
0 kudos

{{start_time}} isn't accurate and doesn't behave logically for multi-task jobs

I am trying to run an incremental data processing job using a Python wheel. The job is scheduled to run e.g. every hour. For my code to know what data increment to process, I inject it with the {{start_time}} as part of the command line, like so: ["end_dat...

Data Engineering

by Arun_tsr • New Contributor III

11-08-2022 10:30:02 PM

988 Views
2 replies
0 kudos

Spark SQL output multiple small files

We have multiple joins involving a large table (about 500 GB in size). The output of the joins is stored in multiple small files, each 800 KB to 1.5 MB in size. Because of this the job is split into multiple tasks and is taking a long time to complete....

Data Engineering

Latest Reply

Debayan
Esteemed Contributor III

11-08-2022 11:32:03 PM

0 kudos

Hi @Arun Balaji, could you please provide the error message you are receiving?

1 More Replies

by RJB • New Contributor II

03-03-2022 1:16:27 PM

7432 Views
6 replies
0 kudos

Resolved! How to pass outputs from a python task to a notebook task

I am trying to create a job which has 2 tasks as follows:
A Python task which accepts a date and an integer from the user and outputs a list of dates (say, a list of 5 dates in string format).
A notebook which runs once for each of the dates from the d...

Data Engineering

Latest Reply

BilalAslamDbrx
Honored Contributor II

10-22-2022 1:14:35 AM

0 kudos

Just a note that this feature, Task Values, has been generally available for a while.

5 More Replies
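
A minimal sketch of how Task Values could carry a list of dates from a Python task to a downstream notebook. In a Databricks job the real object is `dbutils.jobs.taskValues`, which offers `set(key, value)` and `get(taskKey, key, default, debugValue)`. The `LocalTaskValues` class, the task key `make_dates`, the key name `dates`, and the helper `make_dates()` are all hypothetical, added only so the example runs outside Databricks. It loops over the dates inside one notebook, which may differ from what the original poster needed.

```python
from datetime import date, timedelta


def make_dates(start: date, count: int) -> list[str]:
    """Return `count` consecutive dates starting at `start`, as ISO strings."""
    return [(start + timedelta(days=i)).isoformat() for i in range(count)]


class LocalTaskValues:
    """Hypothetical in-memory stand-in for dbutils.jobs.taskValues (local runs only)."""

    def __init__(self, current_task: str):
        self._store = {}
        self.current_task = current_task  # assumed task key of the producing task

    def set(self, key, value):
        # Values are stored under the task that set them, keyed by name.
        self._store[(self.current_task, key)] = value

    def get(self, taskKey, key, default=None, debugValue=None):
        # debugValue is accepted only to mirror the real signature; unused here.
        return self._store.get((taskKey, key), default)


# Inside a Databricks job this would be: task_values = dbutils.jobs.taskValues
task_values = LocalTaskValues(current_task="make_dates")

# Upstream Python task: publish a JSON-serializable list of date strings.
task_values.set(key="dates", value=make_dates(date(2022, 3, 3), 5))

# Downstream notebook task: read the list back and process each date.
dates = task_values.get(taskKey="make_dates", key="dates", default=[])
for d in dates:
    print(d)
```

Keeping the values as plain strings avoids serialization surprises, since task values are meant to be small and JSON-friendly.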

by RKNutalapati • Valued Contributor

07-11-2022 6:41:22 AM

1041 Views
2 replies
0 kudos

Jobs API "run now" - How to set task-wise parameters

I have a job with multiple tasks like Task1 -> Task2 -> Task3. I am trying to call the job using the API "run now". Task details are below:
Task1 - It executes a notebook with some input parameters
Task2 - It runs using "ABC.jar", so it's a JAR-based task ...

Data Engineering

Latest Reply

Prabakar
Esteemed Contributor III

07-11-2022 8:29:43 AM

0 kudos

@Rama Krishna N you can refer to https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsRunNow

"jar_params": [ "john", "doe", "35" ], "notebook_params": { "name": "john doe", "age": "35" },

1 More Replies
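
A minimal sketch of assembling the request body that the reply above points to, using the `jar_params` and `notebook_params` fields it quotes. The helper name `build_run_now_payload` and the job ID `123` are hypothetical. The sketch only builds and prints the JSON; sending it (for example as a POST to the `run-now` endpoint with an access token) is left out. As far as I understand the API, these fields are grouped by task type rather than by individual task, so every notebook task in the job would receive the same `notebook_params`.

```python
import json


def build_run_now_payload(job_id, notebook_params=None, jar_params=None):
    """Assemble a JSON-serializable body for a Jobs API "run now" request."""
    payload = {"job_id": job_id}
    if notebook_params:
        # Key/value pairs intended for notebook tasks.
        payload["notebook_params"] = dict(notebook_params)
    if jar_params:
        # Positional arguments intended for JAR tasks.
        payload["jar_params"] = list(jar_params)
    return payload


payload = build_run_now_payload(
    job_id=123,  # hypothetical job ID
    notebook_params={"name": "john doe", "age": "35"},
    jar_params=["john", "doe", "35"],
)
print(json.dumps(payload, indent=2))
```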

by swzzzsw • New Contributor III

01-24-2022 11:34:29 AM

2962 Views
5 replies
2 kudos

Resolved! Pass variable values from one task to another

I created a Databricks job with multiple tasks. Is there a way to pass variable values from one task to another? For example, if I have tasks A and B as Databricks notebooks, can I create a variable (e.g. x) in notebook A and later use that value in ...

Data Engineering

Latest Reply

-werners-
Esteemed Contributor III

01-25-2022 7:26:43 AM

2 kudos

You could also consider using an orchestration tool like Data Factory (Azure) or Glue (AWS). There you can inject and use parameters from notebooks. The job scheduling of Databricks also has the possibility to add parameters, but I do not know if yo...

4 More Replies