Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Automating the re-run of a job (with several tasks) // automating notification of specific failed tasks after retrying // error handling of a Databricks notebook in an Azure Data Factory pipeline

Diego_MSFT
New Contributor II

Hi Databricks experts,

I'm using Databricks on Azure, and I'd like to understand the following:

1) Is there a way to automate re-running specific failed tasks from a job (with several tasks)? For example, if I have 4 tasks and tasks 1 and 2 have succeeded while tasks 3 and 4 have failed, I'd like to re-run tasks 3 and 4 one more time. I know Jobs has built-in functionality to re-run failed tasks, but it has to be triggered manually, and I want to automate it. More info here: https://docs.databricks.com/data-engineering/jobs/jobs.html (Repair an unsuccessful job run)
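As a sketch of one way to automate this: the Jobs 2.1 API exposes a repair endpoint (`POST /api/2.1/jobs/runs/repair`) that re-runs failed tasks of an existing run. The helper names, host, and token below are hypothetical; only the endpoint and payload fields come from the Jobs API documentation.

```python
import json
import urllib.request


def build_repair_payload(run_id, task_keys=None):
    """Build the body for POST /api/2.1/jobs/runs/repair.

    With no task_keys, ask the service to re-run all failed tasks;
    otherwise name the specific task keys to re-run.
    """
    payload = {"run_id": run_id}
    if task_keys:
        payload["rerun_tasks"] = task_keys
    else:
        payload["rerun_all_failed_tasks"] = True
    return payload


def repair_run(host, token, run_id, task_keys=None):
    """Call the Jobs 2.1 repair endpoint (hypothetical host/token)."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/repair",
        data=json.dumps(build_repair_payload(run_id, task_keys)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

A monitoring script could poll `/api/2.1/jobs/runs/get` for a terminal state and call `repair_run` when the run's `result_state` is `FAILED`.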

2) If some tasks still fail after retrying twice, I'd like to send some kind of notification by invoking an HTTPS URL so we can decide what to do next. I've seen this documentation (about retrying a notebook several times): https://docs.databricks.com/notebooks/notebook-workflows.html?_ga=2.96427868.191663080.1659650759-13....
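The retry pattern from that notebook-workflows page can be combined with a webhook call once retries are exhausted. A minimal sketch in Python; `run_with_retry` and `notify_failure` are hypothetical helper names, and the webhook URL is whatever HTTPS endpoint you control:

```python
import json
import urllib.request


def run_with_retry(run_task, max_retries=2):
    """Try a task up to max_retries + 1 times; return (result, last_error)."""
    last_error = None
    for _attempt in range(max_retries + 1):
        try:
            return run_task(), None
        except Exception as e:
            last_error = e
    return None, last_error


def notify_failure(url, task_name, error):
    """POST a JSON payload describing the failure to an HTTPS endpoint."""
    body = json.dumps({"task": task_name, "error": str(error)}).encode()
    req = urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)


# On Databricks you would wrap the notebook call, e.g.:
#   result, err = run_with_retry(lambda: dbutils.notebook.run("task3", 600))
#   if err is not None:
#       notify_failure("https://example.com/hook", "task3", err)
```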

I've seen this reference, which suggests email is the only notification method allowed for a failed job: https://stackoverflow.com/questions/61586505/azure-databricks-job-notification-email ... Has this changed?
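As a hedged note on that question: in the Jobs 2.1 API, job settings include a `webhook_notifications` field alongside `email_notifications`, so a failure can also hit an HTTPS notification destination configured in the workspace. A sketch of the relevant fragment of a job definition (the destination ID placeholder is illustrative):

```json
{
  "settings": {
    "email_notifications": {
      "on_failure": ["team@example.com"]
    },
    "webhook_notifications": {
      "on_failure": [{"id": "<notification-destination-id>"}]
    }
  }
}
```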

Additionally:

3) Is there a best practice for orchestrating notebooks from Azure Data Factory and handling these kinds of problems there? I've seen this documentation: https://azure.microsoft.com/es-mx/blog/operationalize-azure-databricks-notebooks-using-data-factory/

There it seems that if a notebook fails, the failure can be caught in the Data Factory pipeline and the error handled there (for example, by sending an email).
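One common pattern for that hand-off is to have the notebook return a status string via `dbutils.notebook.exit`, which the Data Factory Notebook activity surfaces as `runOutput`. A minimal sketch; the payload schema and helper name are hypothetical:

```python
import json


def make_exit_payload(status, failed_tasks=None):
    """Build a JSON string a notebook can return via dbutils.notebook.exit
    so the calling ADF pipeline can branch on it (hypothetical schema)."""
    return json.dumps({"status": status, "failed_tasks": failed_tasks or []})


# In the notebook's last cell on Databricks:
#   dbutils.notebook.exit(make_exit_payload("FAILED", ["task3", "task4"]))
# In ADF, the value is then available as:
#   @activity('RunNotebook').output.runOutput
```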

Any help will be appreciated.

Thanks

Diego

1 REPLY

Lindberg
New Contributor II

You can use "retries".

In Workflows, select your job, then the task, and configure retries in the task options.

You can also see more options in the Jobs API reference:

https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations
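For reference, task-level retries can also be set declaratively in the job definition; per the Jobs API, each task accepts `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout`. A sketch of a task fragment:

```json
{
  "task_key": "task3",
  "max_retries": 2,
  "min_retry_interval_millis": 60000,
  "retry_on_timeout": true
}
```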

Master's student in Big Data and Digital Marketing, MBA in Data Science, Analytics and BI, specialist in digital forensics and cybercrime, data engineer, data scientist, database administrator, and university professor. Azure Cloud (DP 203 - DP 100 - DP 300 - DP 420 - AZ 900 - DP 900 - AI 900 - PL 900 - SC 900), plus AWS and GCP. Studies in quantum technologies. Works with Python, Spark, ETL, SSIS, SSRS, Power BI, the Hadoop ecosystem, and more. Member of ANPPD (Associação Nacional dos Profissionais de Privacidade de Dados).
