I want to be able to denote the type of run from a predetermined list of values that a user can choose from when kicking off a run using different parameters. Our team does standardized job runs on a weekly cadence but can have timeframes that change...
Hi @cmilligan , I have a similar requirement and would really be grateful if you could provide me with any information on how to fix this issue. Thanks a lot!
Hi all, I have a task of type Notebook, source is Git (Azure DevOps). This task runs fine with my user, but if I change the Owner to a service principal, I get the following error: Run result unavailable: run failed with error message Failed to checkout...
@pgruetter: To enable a service principal to access a specific Azure DevOps repository, you need to grant it the necessary permissions at both the organization and repository levels. Here are the steps to grant the service principal the necessary per...
Hi Databricks Experts: I'm using Databricks on Azure.... I'd like to understand the following: 1) if there is a way of automating the rerun of some specific failed tasks from a job (with several tasks), for example if I have 4 tasks, and the task 1 and 2 h...
You can use "retries". In Workflows, select your job, then the task, and configure retries in the options below. You can also see more options at: https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations
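For anyone who defines jobs through the API rather than the UI, here is a minimal sketch of the same retry setting. It assumes the task-level fields `max_retries`, `min_retry_interval_millis` and `retry_on_timeout` from the Jobs API task settings; the helper name `with_retries`, the task key and the notebook path are made up for illustration.

```python
import json


def with_retries(task, max_retries=2, min_retry_interval_millis=60_000,
                 retry_on_timeout=False):
    """Return a copy of a task definition with retry settings added.

    The three retry fields are assumed to match the Jobs API task settings;
    check them against the API version your workspace uses.
    """
    updated = dict(task)  # shallow copy so the caller's dict is not modified
    updated.update(
        {
            "max_retries": max_retries,
            "min_retry_interval_millis": min_retry_interval_millis,
            "retry_on_timeout": retry_on_timeout,
        }
    )
    return updated


# Hypothetical task definition, for illustration only.
task = {
    "task_key": "task_1",
    "notebook_task": {"notebook_path": "/Repos/example/task_1"},
}

print(json.dumps(with_retries(task, max_retries=3), indent=2))
```

The resulting dictionary would go into the `tasks` list of a job create or update request. Retries rerun a failing task automatically; they do not help once the whole run has already finished as failed.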
I have a basic 2 task job. The 1st notebook (task) checks whether the source file has changes and if so then refreshes a corresponding materialized view. In case we have no changes then I use dbutils.jobs.taskValues.set(key = "skip_job", value = 1) &...
@Michael Papadopoulos usually that should not be the case, I think. At the task level we have three notification types (success, failure, start), whereas at the whole-job level a skip option is available to discard notifications. Will see if some one from commu...
Hi, you can pass {{job_id}} and {{run_id}} in the job arguments, then print that information or save it wherever it is needed. Please find the documentation for the same below: https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-varia...
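As a rough illustration of the suggestion above, the sketch below builds a notebook task whose parameters carry the {{job_id}} and {{run_id}} variables. It assumes the `notebook_task` / `base_parameters` layout of the Jobs API; the helper name, task key and notebook path are hypothetical.

```python
import json


def notebook_task_with_run_info(task_key, notebook_path):
    """Build a notebook task that receives the job and run IDs as parameters.

    The double-brace values are left as literal strings here; Databricks is
    expected to substitute them when the run starts.
    """
    return {
        "task_key": task_key,
        "notebook_task": {
            "notebook_path": notebook_path,
            "base_parameters": {
                "job_id": "{{job_id}}",
                "run_id": "{{run_id}}",
            },
        },
    }


# Hypothetical names, for illustration only.
task = notebook_task_with_run_info("load_data", "/Repos/example/load_data")
print(json.dumps(task, indent=2))

# Inside the notebook, the values would then be read by parameter name,
# not by the template string, for example:
#   run_id = dbutils.widgets.get("run_id")
```

Note that the widget is looked up by its key (`"run_id"`), while the `{{run_id}}` template only appears on the job side as the parameter value.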
Hi, I would like to understand Databricks JAR based workflow tasks. Can I interpret JAR based runs to be something like a spark-submit on a cluster? In the logs, I was expecting to see the spark-submit --class com.xyz --num-executors 4 etc., And, the...
Hi, I did try using the Workflows>Jobs>CreateTask>JarTaskType>UploadedMyJAR and Class and created JobCluster and tested this task. This JAR reads some tables as input, does some transformations and output as writing some other tables. I would like t...
Using Airflow, I have created a DAG with a sequence of notebook tasks. The first notebook returns a batch id; the subsequent notebook tasks need this batch_id. I am using the DatabricksSubmitRunOperator to run the notebook task. This operator pushes ...
From what I understand, you want to pass a run_id parameter to the second notebook task? You can:
Create a widget param inside your Databricks notebook (https://docs.databricks.com/notebooks/widgets.html) that will consume your run_id
Pass the paramet...
I have tried the following ways to get job parameters, but none of them are working.
runId='{{run_id}}'
jobId='{{job_id}}'
filepath='{{filepath}}'
print(runId," ",jobId," ",filepath)
r1=dbutils.widgets.get('{{run_id}}')
f1=dbutils.widgets.get('{{file...
Thanks for your response. I found the solution. The code below gives me all the job parameters:
all_args = dbutils.notebook.entry_point.getCurrentBindings()
print(all_args)
Thanks for your support
One of my clients has been orchestrating Databricks notebooks using Airflow + REST API. They're curious about the pros/cons of switching these jobs to Databricks jobs with Task Orchestration. I know there are all sorts of considerations - for example,...
@Kaniz Fatma​ Hello Kaniz, I'm currently working with a major Enterprise Client looking to make the choice between the Airflow vs Databricks for Jobs scheduling. Our Entire code base is in Databricks and we are trying to figure out the complexities t...
This morning I encountered an issue when trying to create a new job using the Workflows UI (in browser). Never had this issue before. The error message that appears is: "You are not entitled to run this type of task, please contact your Databricks admi...
Hi Team, I am using a multitask job and I am trying to restart only the failed task, but it seems like I have to restart the complete workflow again and again. Is there any way or workaround?
One way that works is to go to your task definition, click advanced options, and set retry policy. The task will restart per those instructions. Does that work for you?
In general, Spark executes one task per core. If we want to restrict the number of tasks submitted to an executor so that each task gets a larger share of memory, how can we achieve that?
We can use a config called "spark.task.cpus". This specifies the number of cores to allocate for each task. The default value is 1. If we specify, say, 2, fewer tasks will be assigned to the executor at the same time.
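To make the arithmetic concrete, here is a small sketch of how the number of concurrent task slots per executor follows from the setting. The function name and the 8-core executor are just example assumptions, not values from the question.

```python
def concurrent_tasks_per_executor(executor_cores, task_cpus=1):
    """Number of tasks an executor can run at once.

    Each task reserves `task_cpus` cores (the value of spark.task.cpus),
    so the slot count is the integer division of the two.
    """
    if executor_cores < 1 or task_cpus < 1:
        raise ValueError("executor_cores and task_cpus must be >= 1")
    return executor_cores // task_cpus


# Example: a hypothetical executor with 8 cores.
for task_cpus in (1, 2, 4):
    slots = concurrent_tasks_per_executor(8, task_cpus)
    print(f"spark.task.cpus={task_cpus}: {slots} concurrent tasks")
```

With the same executor memory, halving the number of concurrent tasks roughly doubles the memory available to each one. The setting would typically go into the cluster's Spark config as `spark.task.cpus 2`.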
There are two types of autoscaling in Databricks: Standard and Optimized. In both scenarios, when tasks are submitted the cluster will immediately begin scaling up to execute as many of them in parallel as possible. Scaling down is different. In optimized autoscalin...