Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Automating the re-run of a job (with several tasks) // automating the notification of specific tasks that still fail after retrying // error handling of an Azure Data Factory pipeline with a Databricks notebook

Diego_MSFT
New Contributor II

Hi Databricks experts,

I'm using Databricks on Azure. I'd like to understand the following:

1) Is there a way to automate re-running specific failed tasks of a job (with several tasks)? For example, if I have 4 tasks, and tasks 1 and 2 succeeded while tasks 3 and 4 failed, I'd like to re-run tasks 3 and 4 one more time. I know Jobs already has built-in functionality to re-run failed tasks, but it has to be triggered manually, and I want to automate it. More info here: https://docs.databricks.com/data-engineering/jobs/jobs.html ("Repair an unsuccessful job run")
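For what it's worth, that manual "repair" step can be scripted against the Jobs API 2.1 `/jobs/runs/repair` endpoint, which takes the original run id plus the keys of the tasks to re-run. A minimal sketch using only the standard library; the host, token, run id, and task keys are placeholders you would supply:

```python
# Hedged sketch: re-run only the failed tasks of a job run via the Jobs API 2.1
# repair endpoint. HOST/TOKEN/run_id/task keys below are placeholders.
import json
import urllib.request

def build_repair_payload(run_id, failed_task_keys):
    # The repair endpoint wants the original run_id and the task keys to re-run.
    return {"run_id": run_id, "rerun_tasks": list(failed_task_keys)}

def repair_failed_tasks(host, token, run_id, failed_task_keys):
    """POST /api/2.1/jobs/runs/repair to re-run the given task keys."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/repair",
        data=json.dumps(build_repair_payload(run_id, failed_task_keys)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (placeholders):
#   repair_failed_tasks("https://<workspace>.azuredatabricks.net",
#                       "<personal-access-token>", 12345, ["task3", "task4"])
```

You could trigger this from a scheduled "watchdog" job or an external scheduler after inspecting the run's task states.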

2) If some tasks still fail after retrying twice, I'd like to send a notification by invoking an https URL so we can decide what to do next. I've seen this documentation (about retrying a notebook several times): https://docs.databricks.com/notebooks/notebook-workflows.html?_ga=2.96427868.191663080.1659650759-13....
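The retry pattern from the notebook-workflows docs can be combined with a webhook call on final failure. A hedged sketch, written generically around a callable so it is easy to test; the webhook URL and notebook path are placeholders:

```python
# Hedged sketch: retry a callable a fixed number of times, and if it still
# fails, POST a JSON payload to an https endpoint of your own. The webhook URL
# and notebook path in the comments below are placeholders.
import json
import urllib.request

def run_with_retry(run_fn, max_retries=2):
    """Call run_fn; on failure retry up to max_retries more times, then re-raise."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return run_fn()
        except Exception as exc:
            last_error = exc
    raise last_error

def notify_webhook(url, payload):
    """POST a JSON payload describing the failure to an https endpoint you own."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# On Databricks you would wire it up roughly like this (all names are placeholders):
#   try:
#       run_with_retry(lambda: dbutils.notebook.run("/jobs/task3", 3600, {}))
#   except Exception as exc:
#       notify_webhook("https://example.com/on-failure",
#                      {"task": "task3", "error": str(exc)})
```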

This reference suggests that email is the only notification method available for a failed job: https://stackoverflow.com/questions/61586505/azure-databricks-job-notification-email ... has this changed?

Additionally:

3) Is there a best practice for orchestrating notebooks from Azure Data Factory and handling this type of problem there? I've seen this documentation: https://azure.microsoft.com/es-mx/blog/operationalize-azure-databricks-notebooks-using-data-factory/

It seems that if a notebook fails, the failure can be caught in the Data Factory pipeline and the error handled there (for example, by sending an email).
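On that pattern: a notebook can also hand a status string back to the calling ADF pipeline via `dbutils.notebook.exit`; ADF surfaces it as the Notebook activity's `runOutput`, which downstream activities can branch on. A hedged sketch; the field names in the payload are made up for illustration:

```python
# Hedged sketch: build a small JSON status object for dbutils.notebook.exit so
# the ADF pipeline can inspect it via the activity's runOutput. The "status" /
# "error" field names are illustrative, not an ADF requirement.
import json

def make_exit_payload(status, error=None):
    """Serialize a status object to pass back to the caller as a string."""
    return json.dumps({"status": status, "error": error})

# In the notebook's failure handler on Databricks you would call, e.g.:
#   dbutils.notebook.exit(make_exit_payload("failed", "source table not found"))
```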

Any help would be appreciated.

Thanks

Diego

1 REPLY

Lindberg
New Contributor II

You can use "retries".

In Workflows, select your job, then the task, and configure retries in the options below.

You can also see more options at:

https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations
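As a concrete illustration of the task-level settings the reply describes, here is roughly what they look like in a Jobs API task definition (field names per the Jobs API docs linked above; the task key, notebook path, and email address are made-up examples):

```python
# Hedged sketch: task-level retry and notification settings as they appear in a
# Jobs API job definition. Task key, notebook path, and email are placeholders.
task_settings = {
    "task_key": "task3",
    "notebook_task": {"notebook_path": "/jobs/task3"},  # placeholder path
    "max_retries": 2,                     # re-run up to 2 more times on failure
    "min_retry_interval_millis": 60000,   # wait 1 minute between attempts
    "retry_on_timeout": False,            # don't retry when the task times out
    "email_notifications": {"on_failure": ["team@example.com"]},
}
```

This covers automatic retries per task; combining it with the repair API or a webhook (as discussed above in the question) is still needed for anything beyond email notifications.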
