Data Engineering

Forum Posts

cmilligan
by Contributor II
  • 1533 Views
  • 3 replies
  • 2 kudos

Dropdown for parameters in a job

I want to be able to denote the type of run from a predetermined list of values that a user can choose from when kicking off a run using different parameters. Our team does standardized job runs on a weekly cadence but can have timeframes that change...

Latest Reply
dev56
New Contributor II

Hi @cmilligan, I have a similar requirement and would really be grateful if you could provide me with any information on how to fix this issue. Thanks a lot!

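A minimal sketch of one way to get the predetermined-list behavior the post asks about, using a notebook dropdown widget; the widget name and choices are illustrative, not from the thread, and a job parameter value outside the list should be rejected at run time:

```python
# Sketch (names illustrative): constrain a run parameter to a fixed list.
# dbutils is predefined in Databricks notebooks.
dbutils.widgets.dropdown(
    name="run_type",
    defaultValue="weekly",
    choices=["weekly", "adhoc", "backfill"],
    label="Run type",
)

run_type = dbutils.widgets.get("run_type")  # value chosen when kicking off the run
print(f"Running as: {run_type}")
```
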
pgruetter
by Contributor
  • 3102 Views
  • 7 replies
  • 2 kudos

Run Task as Service Principal with Code in Azure DevOps Repo

Hi all, I have a task of type Notebook, source is Git (Azure DevOps). This task runs fine with my user, but if I change the Owner to a service principal, I get the following error: Run result unavailable: run failed with error message Failed to checkout...

Latest Reply
Anonymous
Not applicable

@pgruetter: To enable a service principal to access a specific Azure DevOps repository, you need to grant it the necessary permissions at both the organization and repository levels. Here are the steps to grant the service principal the necessary per...

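As a companion to the reply above: beyond the Azure DevOps permissions, the service principal also needs Git credentials registered in Databricks. A hedged sketch using the Git Credentials REST API; the workspace URL, tokens, and username are placeholders, and the bearer token must belong to the service principal itself:

```python
# Sketch: register Azure DevOps Git credentials for a service principal.
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
SP_TOKEN = "<databricks-token-issued-for-the-service-principal>"        # placeholder
ADO_PAT = "<azure-devops-pat-with-code-read-scope>"                     # placeholder

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/git-credentials",
    headers={"Authorization": f"Bearer {SP_TOKEN}"},
    json={
        "git_provider": "azureDevOpsServices",
        "git_username": "my-service-principal",  # placeholder
        "personal_access_token": ADO_PAT,
    },
)
resp.raise_for_status()
print(resp.json())
```
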
Diego_MSFT
by New Contributor II
  • 2444 Views
  • 1 reply
  • 4 kudos

Automating the re-run of a job (with several tasks) // automating the notification of a specific failed task after retrying // error handling on an Azure Data Factory pipeline with a Databricks notebook

Hi Databricks experts: I'm using Databricks on Azure. I'd like to understand the following: 1) if there is a way of automating the re-run of some specific failed tasks from a job (with several tasks), for example if I have 4 tasks, and task 1 and 2 h...

Latest Reply
Lindberg
New Contributor II

You can use "retries". In Workflows, select your job, then the task, and in the options below, configure retries. You can also see more options at: https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations

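For reference, the retry options the reply points at live on the task object in the Jobs API. A minimal sketch of the relevant fields; the task key and notebook path are placeholders:

```python
# Sketch: task-level retry settings as accepted by the Jobs API.
task_settings = {
    "task_key": "task_1",  # placeholder
    "notebook_task": {"notebook_path": "/Repos/project/my_notebook"},  # placeholder
    "max_retries": 3,                    # re-run a failed task up to 3 times
    "min_retry_interval_millis": 60000,  # wait 1 minute between attempts
    "retry_on_timeout": False,           # do not retry when the task times out
}
print(task_settings)
```
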
Michael_Papadop
by New Contributor II
  • 4177 Views
  • 3 replies
  • 0 kudos

How can I set the status of a Databricks job as skipped via Python?

I have a basic 2-task job. The first notebook (task) checks whether the source file has changes and, if so, refreshes a corresponding materialized view. In case we have no changes, I use dbutils.jobs.taskValues.set(key = "skip_job", value = 1) ...

Latest Reply
karthik_p
Esteemed Contributor

@Michael Papadopoulos: usually that should not be the case, I think. At the task level we have 3 notification levels (success, failure, start), whereas at the whole-job level a skip option is available to discard notifications. Will see if someone from the commu...

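A minimal sketch of the pattern the question describes, assuming two tasks where the first sets the flag and the second exits early; the task key task_1 and the exit message are illustrative, and note the downstream task then reports success rather than a true skipped status:

```python
# --- task 1 notebook: record whether downstream work is needed ---
source_changed = True  # placeholder for the real change-detection logic
dbutils.jobs.taskValues.set(key="skip_job", value=0 if source_changed else 1)

# --- task 2 notebook: read the flag and exit early if nothing to do ---
skip = dbutils.jobs.taskValues.get(taskKey="task_1", key="skip_job", default=0)
if skip == 1:
    dbutils.notebook.exit("skipped: no source changes")  # shows as succeeded, not skipped
```
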
mmenjivar
by New Contributor II
  • 1121 Views
  • 2 replies
  • 0 kudos

How to get the run_id from a previous task in a Databricks jobs

Hi, is there any way to share the run_id from task_A to task_B within the same job when task_A is a dbt task?

Latest Reply
Debayan
Esteemed Contributor III

Hi, you can pass {{job_id}} and {{run_id}} in job arguments, print that information, and save it wherever it is needed. Please find the documentation for the same below: https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-varia...

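A minimal sketch of the reply's suggestion; the parameter names are illustrative:

```python
# In the task's parameters (Jobs UI or API), pass the variables the reply names:
#   {"run_id": "{{run_id}}", "job_id": "{{job_id}}"}
# Then read them back in the notebook; dbutils is predefined in Databricks notebooks.
run_id = dbutils.widgets.get("run_id")
job_id = dbutils.widgets.get("job_id")
print(f"job {job_id}, run {run_id}")  # save wherever the downstream task needs it
```
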
Chanu
by New Contributor II
  • 989 Views
  • 2 replies
  • 2 kudos

Databricks JAR task type functionality

Hi, I would like to understand Databricks JAR-based workflow tasks. Can I interpret JAR-based runs to be something like a spark-submit on a cluster? In the logs, I was expecting to see the spark-submit --class com.xyz --num-executors 4 etc. And, the...

Latest Reply
Chanu
New Contributor II

Hi, I did try using Workflows > Jobs > Create Task > JAR task type, uploaded my JAR and class, created a job cluster, and tested this task. This JAR reads some tables as input, does some transformations, and writes some other tables as output. I would like t...

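For context, a hedged sketch of what a JAR task payload looks like; the class name and library path are placeholders. A JAR task runs the named main class against the cluster's existing Spark runtime (Databricks has a separate "Spark Submit" task type), which is why a literal spark-submit command line does not appear in the logs:

```python
# Sketch: a JAR task definition for the Jobs API; names/paths are placeholders.
task_settings = {
    "task_key": "jar_task",
    "spark_jar_task": {"main_class_name": "com.xyz.Main"},      # placeholder class
    "libraries": [{"jar": "dbfs:/FileStore/jars/my_app.jar"}],  # placeholder path
}
print(task_settings)
```
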
Choolanadu
by New Contributor
  • 1978 Views
  • 1 reply
  • 0 kudos

Airflow - How to pull XComs value in the notebook task?

Using Airflow, I have created a DAG with a sequence of notebook tasks. The first notebook returns a batch id; the subsequent notebook tasks need this batch_id. I am using the DatabricksSubmitRunOperator to run the notebook task. This operator pushes ...

Latest Reply
daniel_sahal
Esteemed Contributor

From what I understand, you want to pass a run_id parameter to the second notebook task? You can: create a widget param inside your Databricks notebook (https://docs.databricks.com/notebooks/widgets.html) that will consume your run_id; pass the paramet...

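A minimal sketch combining the reply's two steps, assuming the Airflow Databricks provider; task ids, the notebook path, and the cluster id are placeholders. DatabricksSubmitRunOperator templates its json field, so the XCom pull is rendered at run time, and the operator pushes the Databricks run id to XCom under the key 'run_id':

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

with DAG("notebook_chain", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    second = DatabricksSubmitRunOperator(
        task_id="second_notebook",
        databricks_conn_id="databricks_default",
        json={
            "existing_cluster_id": "<cluster-id>",  # placeholder
            "notebook_task": {
                "notebook_path": "/Repos/project/second_notebook",  # placeholder
                "base_parameters": {
                    # consumed by dbutils.widgets.get("batch_id") in the notebook
                    "batch_id": "{{ ti.xcom_pull(task_ids='first_notebook', key='run_id') }}",
                },
            },
        },
    )
```
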
rammy
by Contributor III
  • 4506 Views
  • 6 replies
  • 5 kudos

How I could read the Job id, run id and parameters in python cell?

I have tried the following ways to get job parameters, but none of them are working: runId='{{run_id}}' jobId='{{job_id}}' filepath='{{filepath}}' print(runId," ",jobId," ",filepath) r1=dbutils.widgets.get('{{run_id}}') f1=dbutils.widgets.get('{{file...

Latest Reply
rammy
Contributor III

Thanks for your response. I found the solution. The code below gives me all the job parameters: all_args = dbutils.notebook.entry_point.getCurrentBindings() print(all_args) Thanks for your support.

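Expanding the accepted answer slightly: getCurrentBindings() is an internal, undocumented entry point, and it returns a Java map, so copying it into a Python dict makes it easier to work with. A minimal sketch (the dict conversion is an added convenience, not from the thread):

```python
# dbutils is predefined in Databricks notebooks.
all_args = dbutils.notebook.entry_point.getCurrentBindings()  # internal API
params = {key: all_args[key] for key in all_args}  # Java map -> Python dict
print(params)
```
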
arthur_wang
by New Contributor
  • 2503 Views
  • 3 replies
  • 1 kudos

How does Task Orchestration compare to Airflow (for Databricks-only jobs)?

One of my clients has been orchestrating Databricks notebooks using Airflow + REST API. They're curious about the pros/cons of switching these jobs to Databricks jobs with Task Orchestration. I know there are all sorts of considerations - for example,...

Latest Reply
Shourya
New Contributor III

@Kaniz Fatma: Hello Kaniz, I'm currently working with a major enterprise client looking to make the choice between Airflow and Databricks for job scheduling. Our entire code base is in Databricks and we are trying to figure out the complexities t...

Robbie
by New Contributor III
  • 1627 Views
  • 4 replies
  • 5 kudos

Resolved! Why can't I create new jobs? ("You are not entitled to run this type of task...")

This morning I encountered an issue when trying to create a new job using the Workflows UI (in browser). I never had this issue before. The error message that appears is: "You are not entitled to run this type of task, please contact your Databricks admi...

Screenshot including the error message
Latest Reply
Kaniz
Community Manager

Hi @Robbie Capps, I'm glad we could help you. Thank you for marking the best answer for us.

User16826994223
by Honored Contributor III
  • 3595 Views
  • 2 replies
  • 2 kudos

Multi-task - restart of the failed jobs

Hi team, I am using multi-task jobs and I am trying to restart only the failed task, but it seems like I have to restart the complete workflow again and again. Is there any way or workaround?

Latest Reply
TheOptimizer
Contributor

One way that works is to go to your task definition, click advanced options, and set a retry policy. The task will restart per those instructions. Does that work for you?

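The reply covers automatic retries; a related technique, not mentioned in the thread, is the Jobs 2.1 repair-run endpoint, which re-runs only selected failed tasks of an existing run. A hedged sketch; host, token, run id, and task keys are placeholders:

```python
# Sketch: re-run only specific failed tasks of a multi-task job run.
import requests

DATABRICKS_HOST = "https://<workspace-url>"  # placeholder
TOKEN = "<databricks-token>"                 # placeholder

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/runs/repair",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "run_id": 123456,                  # placeholder: the failed run
        "rerun_tasks": ["failed_task_1"],  # placeholder: only these tasks re-run
    },
)
resp.raise_for_status()
```
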
saipujari_spark
by Valued Contributor
  • 4888 Views
  • 2 replies
  • 3 kudos

Resolved! How to restrict the number of tasks per executor?

In general, one task per core is how Spark executes tasks. If we want to restrict the number of tasks submitted to the executor, to get a higher memory-per-task ratio, how can we achieve that?

Latest Reply
saipujari_spark
Valued Contributor

We can use a config called "spark.task.cpus". This specifies the number of cores to allocate for each task. The default value is 1. If we specify, say, 2, it means fewer tasks will be assigned to the executor.

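A minimal sketch of the accepted answer's config: with 8 cores per executor and spark.task.cpus=2, at most 4 tasks run concurrently per executor, so each task gets a larger share of executor memory. The setting must be in place when the application starts; on Databricks that means the cluster's Spark config rather than an already-running session:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("restrict-tasks-per-executor")
    .config("spark.task.cpus", "2")  # cores reserved per task (default 1)
    .getOrCreate()
)
```
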
Anonymous
by Not applicable
  • 1143 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III

There are two types of autoscaling in Databricks: Standard and Optimized. In both scenarios, when tasks are submitted the cluster will begin scaling immediately to execute as many of them in parallel as possible. Scaling down is different. In optimized autoscalin...

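For reference, a hedged sketch of where autoscaling is configured in a cluster spec as accepted by the Clusters/Jobs APIs; the runtime, node type, and worker counts are placeholders, and the Standard-vs-Optimized behavior the reply describes is a platform property, not a field in this payload:

```python
new_cluster = {
    "spark_version": "13.3.x-scala2.12",  # placeholder runtime
    "node_type_id": "Standard_DS3_v2",    # placeholder node type
    "autoscale": {"min_workers": 2, "max_workers": 8},  # scale between 2 and 8 workers
}
print(new_cluster)
```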