
TaskSensor - check if a task has succeeded

kmodelew
New Contributor II

Hi,

I would like to check whether a task within a job has succeeded (even if the job is marked as failed because one of its other tasks failed).

I need to create dependencies on tasks within other jobs. The case is that I have one job that loads all tables for one country, and a report uses some of the tables produced by that job.

I'm looking for something similar to ExternalTaskSensor in Airflow.

What is the best practice in this area? I see the option to use the API, but maybe you see other, better possibilities.

1 ACCEPTED SOLUTION


Kaniz
Community Manager

Hi @kmodelew

  • Databricks Jobs now supports task orchestration, allowing you to run multiple tasks as a directed acyclic graph (DAG). This simplifies the creation, management, and monitoring of your data and machine learning workflows.
  • You can easily orchestrate tasks using the Databricks UI and API, making it accessible to data scientists and analysts. This feature is available at no additional cost.
  • Benefits include conditional task execution: by default, a task in a Databricks job runs only when its dependencies have succeeded, but you can configure tasks to run based on specific criteria.
  • For example, you can set up an “If/else condition” task that runs only when certain conditions are met. This allows for more fine-grained control over task execution.
  • While Databricks doesn’t have an exact equivalent of Airflow’s ExternalTaskSensor, you can achieve similar functionality if Airflow orchestrates your Databricks jobs: the ExternalTaskSensor waits for a task in a different DAG (which may even target a different Databricks workspace) to complete before the current task proceeds.
  • You can use this sensor to create dependencies across different jobs or even different Databricks workspaces. It’s particularly useful when you have interdependent tasks across different workflows.
  • If you prefer programmatic control, you can use the Databricks APIs to query the status of tasks or jobs.
  • For example, you can use the Jobs API to check the status of specific tasks within a job run, as sketched below.
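
Here is a minimal sketch of that check using the Databricks Python SDK (databricks-sdk); the run ID and task key are placeholders, and authentication is assumed to come from DATABRICKS_HOST / DATABRICKS_TOKEN environment variables:

```python
# Minimal sketch: check the result of one task inside a job run.
# Assumes the databricks-sdk package; run_id and task_key are placeholders.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # reads DATABRICKS_HOST / DATABRICKS_TOKEN from the environment

run = w.jobs.get_run(run_id=123456789)          # hypothetical run_id
for task in run.tasks or []:
    if task.task_key == "load_country_tables":  # hypothetical task key
        # result_state can be SUCCESS even when a sibling task failed
        # and the overall job run is marked as failed.
        print(task.task_key, task.state.result_state)
```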


2 REPLIES


kmodelew
New Contributor II

@Kaniz 

Thank you. In my case I plan to use the API:

https://docs.databricks.com/api/workspace/jobs/listruns - to find the latest run_id

https://docs.databricks.com/api/workspace/jobs/getrun - to get the task status for that run_id

A Python script will run every 10 minutes within a predefined time window and check the status of the tasks.
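
Roughly, the script will look something like this (the workspace URL, token, job ID, and task key are placeholders; error handling and the time window check are simplified):

```python
# Rough sketch of the planned poller: every 10 minutes, find the latest run
# of the upstream job and check whether a specific task in it has succeeded.
import time
import requests

HOST = "https://<workspace>.cloud.databricks.com"   # placeholder
TOKEN = "<personal-access-token>"                    # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
JOB_ID = 123456789                                   # placeholder job_id

def latest_run_id(job_id: int) -> int:
    # https://docs.databricks.com/api/workspace/jobs/listruns
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers=HEADERS,
        params={"job_id": job_id, "limit": 1},
    )
    resp.raise_for_status()
    return resp.json()["runs"][0]["run_id"]

def task_succeeded(run_id: int, task_key: str) -> bool:
    # https://docs.databricks.com/api/workspace/jobs/getrun
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/get",
        headers=HEADERS,
        params={"run_id": run_id},
    )
    resp.raise_for_status()
    for task in resp.json().get("tasks", []):
        if task["task_key"] == task_key:
            return task.get("state", {}).get("result_state") == "SUCCESS"
    return False

# Poll every 10 minutes, for at most 12 attempts (~2 hours).
for _ in range(12):
    if task_succeeded(latest_run_id(JOB_ID), "load_country_tables"):  # hypothetical task key
        print("Upstream task succeeded; the report tables are ready.")
        break
    time.sleep(600)
```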
