Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

swzzzsw
by New Contributor III
  • 9623 Views
  • 4 replies
  • 9 kudos

"Run now with different parameters" - different parameters not recognized by jobs involving multiple tasks

I'm running a Databricks job involving multiple tasks and would like to run the job with a different set of task parameters. I can achieve that by editing each task and changing the parameter values. However, it gets very manual when I have a lot of tas...

Latest Reply
VijayNakkonda
New Contributor II

Dear Team, for now I found a workaround: disconnect the bundle source on Databricks and edit the parameters you want to run with. After execution, redeploy your code from the repository.

3 More Replies
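A minimal sketch of the scriptable alternative to editing each task by hand: trigger the job through the Jobs API 2.1 run-now endpoint with job-level parameter overrides. The host, token, job id, and parameter names below are placeholders, and job_parameters only takes effect when the job defines matching job-level parameters.

```python
# Sketch (not from the thread): trigger the whole job with overridden
# job-level parameters instead of editing each task by hand.
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

payload = {
    "job_id": 12345,  # placeholder: the multi-task job's id
    # Job-level parameters; the job must define matching job parameters,
    # and every task in the run can reference them.
    "job_parameters": {"run_date": "2024-01-31", "env": "staging"},
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["run_id"])  # id of the triggered run
```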
thib
by New Contributor III
  • 6346 Views
  • 4 replies
  • 3 kudos

Can we use multiple git repos for a job running multiple tasks?

I have a job running multiple tasks: Task 1 runs a machine learning pipeline from git repo 1. Task 2 runs an ETL pipeline from git repo 1. Task 2 is actually a generic pipeline and should not be checked into repo 1, and will be made available in another re...

Latest Reply
tors_r_us
New Contributor II

Had this same problem. Fix was to have two workflows with no triggers, each pointing to the respective git repo. Then set up a 3rd workflow with appropriate triggers/schedule which calls the first 2 workflows. A workflow can run other workflows.

3 More Replies
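A sketch of the parent/child pattern tors_r_us describes, expressed as a Jobs API 2.1 jobs/create payload whose tasks use run_job_task to call the two untriggered child jobs. The job ids, name, and cron schedule are hypothetical.

```python
# Hypothetical parent job: it has no git settings of its own, it just runs
# the two existing jobs (ids 111 and 222), each of which points at its own
# git repo. The parent carries the trigger; the children have none.
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

parent_job = {
    "name": "parent-orchestrator",
    "schedule": {
        "quartz_cron_expression": "0 0 6 * * ?",  # illustrative: daily at 06:00
        "timezone_id": "UTC",
    },
    "tasks": [
        {"task_key": "ml_pipeline", "run_job_task": {"job_id": 111}},
        {
            "task_key": "etl_pipeline",
            "depends_on": [{"task_key": "ml_pipeline"}],
            "run_job_task": {"job_id": 222},
        },
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=parent_job,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["job_id"])  # id of the newly created parent job
```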
RKNutalapati
by Valued Contributor
  • 2064 Views
  • 3 replies
  • 0 kudos

Jobs API "run now" - How to set task wise parameters

I have a job with multiple tasks like Task1 -> Task2 -> Task3. I am trying to call the job using the "run now" API. Task details are below: Task1 executes a notebook with some input parameters; Task2 runs using "ABC.jar", so it's a JAR-based task ...

Latest Reply
Harsha777
New Contributor III

Hi, it would be a good feature to pass parameters at the task level. We have scenarios where we would like to create a job with multiple tasks (notebook/dbt) and pass parameters at the task level.

2 More Replies
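For context on what "run now" supports today: parameters are accepted per task type rather than per task, so notebook_params reach every notebook task and jar_params every JAR task in the run. A hedged sketch with placeholder values:

```python
# Hedged sketch: "run now" with per-task-type parameters. Job id and values
# are placeholders. notebook_params are read via dbutils.widgets in Task1;
# jar_params become the main-class arguments of ABC.jar in Task2.
import requests

payload = {
    "job_id": 12345,
    "notebook_params": {"input_date": "2024-01-31"},
    "jar_params": ["2024-01-31", "full-load"],
}

resp = requests.post(
    "https://<your-workspace>.cloud.databricks.com/api/2.1/jobs/run-now",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
```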
dave_hiltbrand
by New Contributor II
  • 6069 Views
  • 3 replies
  • 0 kudos

I have a job with multiple tasks running asynchronously and I don't think it's leveraging all the nodes on the cluster based on runtime.

I have a job with multiple tasks running asynchronously and I don't think it's leveraging all the nodes on the cluster based on runtime. I open the Spark UI for the cluster and check out the executors and don't see any tasks for my worker nodes. How ca...

Latest Reply
Anonymous
Not applicable

Hi @Dave Hiltbrand, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer. Thanks.

2 More Replies
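The thread records no resolution, but one common first check (a sketch, assuming df stands in for one of the job's DataFrames) is to compare the data's partition count with the cluster's parallelism; too few partitions leaves executors idle regardless of how many tasks run concurrently.

```python
# Sketch, not from the thread: check whether the data has enough partitions
# to occupy every worker core. `df` is a placeholder DataFrame from the job.
parallelism = spark.sparkContext.defaultParallelism  # ~ total worker cores
n_parts = df.rdd.getNumPartitions()

if n_parts < parallelism:
    # Spark can only run as many concurrent tasks per stage as there are
    # partitions, so too few partitions means idle executors.
    df = df.repartition(parallelism)
```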
Arun_tsr
by New Contributor III
  • 2324 Views
  • 2 replies
  • 0 kudos

Spark SQL output multiple small files

We have multiple joins involving a large table (about 500 GB in size). The output of the joins is stored in multiple small files, each 800 KB-1.5 MB in size. Because of this, the job is split into many tasks and takes a long time to complete....

[Screenshot: Spark UI metrics]
Latest Reply
Debayan
Databricks Employee

Hi @Arun Balaji, could you please provide the error message you are receiving?

1 More Replies
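The thread does not record a resolution, but the usual remedies for this symptom are to let adaptive query execution coalesce post-shuffle partitions, or to repartition explicitly before the write. The sketch below assumes hypothetical DataFrames large_df, dim1, and dim2, and an illustrative target of 128 output partitions.

```python
# Hedged sketch of the usual fixes; DataFrame names and the partition count
# are placeholders.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

result = large_df.join(dim1, "key1").join(dim2, "key2")  # placeholder joins

# Write fewer, larger files: 128 is illustrative; size it so each output
# file lands in the 100-200 MB range rather than 800 KB-1.5 MB.
result.repartition(128).write.mode("overwrite").parquet("/mnt/output/joined")
```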
RJB
by New Contributor II
  • 12213 Views
  • 6 replies
  • 0 kudos

Resolved! How to pass outputs from a Python task to a notebook task

I am trying to create a job which has 2 tasks as follows: a Python task which accepts a date and an integer from the user and outputs a list of dates (say, a list of 5 dates in string format), and a notebook which runs once for each of the dates from the d...

Latest Reply
BilalAslamDbrx
Databricks Employee

Just a note that this feature, Task Values, has been generally available for a while.

5 More Replies
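A minimal sketch of the Task Values feature the reply refers to, assuming both tasks can call dbutils (e.g. notebook tasks) and using placeholder task and key names. Values must be small and JSON-serializable.

```python
# Upstream task (placeholder task_key: "generate_dates"):
dates = ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]
dbutils.jobs.taskValues.set(key="dates", value=dates)  # must be JSON-serializable

# Downstream notebook task (declared with depends_on "generate_dates"):
dates = dbutils.jobs.taskValues.get(
    taskKey="generate_dates",
    key="dates",
    debugValue=["2024-01-01"],  # used only when run outside a job
)
for d in dates:
    print(d)  # process each date
```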
swzzzsw
by New Contributor III
  • 6214 Views
  • 5 replies
  • 2 kudos

Resolved! Pass variable values from one task to another

I created a Databricks job with multiple tasks. Is there a way to pass variable values from one task to another? For example, if I have tasks A and B as Databricks notebooks, can I create a variable (e.g. x) in notebook A and later use that value in ...

Latest Reply
-werners-
Esteemed Contributor III

You could also consider using an orchestration tool like Data Factory (Azure) or Glue (AWS); there you can inject and use parameters in notebooks. The job scheduling of Databricks also has the option to add parameters, but I do not know if yo...

4 More Replies
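A short sketch of the injection route -werners- mentions: whether the values come from Data Factory, Glue, or the Databricks job scheduler, they arrive in the notebook as widgets. The widget name is hypothetical.

```python
# In the receiving notebook: declare a widget with a default, then read the
# injected value. The name "x" is a placeholder.
dbutils.widgets.text("x", "0")     # default used for interactive runs
x = int(dbutils.widgets.get("x"))  # overridden by the injected parameter in a job run
print(f"notebook received x = {x}")
```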